Introducing Integrate
About Integrate
Quantemplate Integrate takes raw input data and processes it via a transformation pipeline to produce cleansed output data.
This video shows you how to get started creating and running pipelines. For more detailed video walk-throughs see Tutorials.
Data table concepts
Quantemplate Integrate takes data in unstructured or semi-structured formats, such as spreadsheets, and transforms it into harmonised, structured data tables.


Data tables have a simple structure consisting of a fixed set of columns and an arbitrary number of rows.
Columns are the named set of values that each row of the data table will contain. Conventionally these are represented as a set of headers at the top of a table.
Data values are the individual data points. In Quantemplate, data values can be of any type, but other data processing systems you may wish to connect to Quantemplate sometimes apply strict rules about the type of data that may be held in each column.
Rows are collections of related data values, with one value for each column, usually represented as horizontal rows in Quantemplate.
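
The table structure described above can be pictured with a small Python sketch. This is an illustrative model only, not how Quantemplate stores data: the `DataTable` class, its method names and the sample column names are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataTable:
    """Illustrative model: a fixed set of named columns and any number of rows."""

    columns: list[str]
    rows: list[list[Any]] = field(default_factory=list)

    def add_row(self, values: list[Any]) -> None:
        # Each row must hold exactly one data value per column.
        if len(values) != len(self.columns):
            raise ValueError(
                f"expected {len(self.columns)} values, got {len(values)}"
            )
        self.rows.append(list(values))

    def column(self, name: str) -> list[Any]:
        # Collect the data values held under one column header.
        index = self.columns.index(name)
        return [row[index] for row in self.rows]


# Hypothetical column names and values, for illustration only.
table = DataTable(columns=["policy_id", "premium", "currency"])
table.add_row(["P-001", 1200.50, "GBP"])
table.add_row(["P-002", "n/a", "USD"])  # mixed value types in one column

print(table.column("premium"))
```

Note that the `premium` column holds both a number and a piece of text. A system with strict column typing would reject the second row, which is why it can be worth checking column types before connecting data to other tools.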

How Quantemplate Integrate is different from Excel

Unlike spreadsheet tools such as Excel, Quantemplate Integrate applies a rules-based approach to configure batch processing actions across multiple datasets. This allows you to quickly cleanse and harmonise data at scale, and makes your processes repeatable on similar sets of source data.
Because it's built for defining data processing rules, Quantemplate Integrate does not allow:
Components

Pipelines

A pipeline is a data transformation process built for a set of input datasets, transforming them to a desired set of output datasets. A pipeline comprises:
The uploaded input datasets

Stages and operations required to transform the data

Transformed output datasets

Validation report on the results of any validation operations in the pipeline

Run log detailing the pipeline run




Input data

Raw data for cleansing is uploaded directly via the inputs interface. Quantemplate supports data in XLS, XLSX, CSV and GZipped CSV formats. Cleansed data such as other pipeline outputs or reference codes can be connected in from your data repo.
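
If you prepare files outside Quantemplate before uploading them, a quick local check of the format can be useful. The sketch below uses only the Python standard library. The file suffixes (in particular `.csv.gz` for GZipped CSV), the function names and the sample data are assumptions for illustration, not part of Quantemplate.

```python
from __future__ import annotations

import csv
import gzip
import io

# Assumed mapping from file suffix to the formats named above.
SUPPORTED_SUFFIXES = {
    ".xls": "XLS",
    ".xlsx": "XLSX",
    ".csv": "CSV",
    ".csv.gz": "GZipped CSV",
}


def detect_format(filename: str) -> str | None:
    """Return the matching format name, or None if the suffix is not recognised."""
    lowered = filename.lower()
    for suffix, format_name in SUPPORTED_SUFFIXES.items():
        if lowered.endswith(suffix):
            return format_name
    return None


def read_gzipped_csv(payload: bytes) -> list[list[str]]:
    """Decompress GZipped CSV bytes and return the rows as lists of strings."""
    with gzip.open(io.BytesIO(payload), mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle))


# Hypothetical sample data, compressed in memory for the example.
payload = gzip.compress(b"policy_id,premium\nP-001,1200.50\n")

print(detect_format("claims_2023.csv.gz"))
print(read_gzipped_csv(payload))
```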

Stages and operations

Data transformations are sequenced and configured via stages and operations.
Stages are structural components with a defined number of input and output datasets, whilst operations are individual transformation processes, grouped together in a transform stage. See Stages and operations for more details.
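
The relationship between a transform stage and its operations can be sketched in Python as a sequence of functions applied in order. This is a conceptual illustration only: the operation names, the `run_transform_stage` helper and the sample data are hypothetical and do not correspond to Quantemplate's own operations.

```python
from typing import Callable

Row = dict[str, str]
Dataset = list[Row]
Operation = Callable[[Dataset], Dataset]


def trim_whitespace(dataset: Dataset) -> Dataset:
    # Hypothetical operation: strip surrounding spaces from every value.
    return [{key: value.strip() for key, value in row.items()} for row in dataset]


def uppercase_currency(dataset: Dataset) -> Dataset:
    # Hypothetical operation: normalise the "currency" column to upper case.
    return [{**row, "currency": row["currency"].upper()} for row in dataset]


def run_transform_stage(dataset: Dataset, operations: list[Operation]) -> Dataset:
    # A transform stage groups operations and applies them in sequence,
    # passing the result of each operation to the next.
    for operation in operations:
        dataset = operation(dataset)
    return dataset


raw = [{"policy_id": " P-001 ", "currency": "gbp "}]
cleansed = run_transform_stage(raw, [trim_whitespace, uppercase_currency])

print(cleansed)
```

Because each operation returns a new dataset rather than editing the input in place, the raw data stays unchanged and the same sequence can be re-applied to similar source files.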

Output data

Each stage creates output datasets, which are the result of applying the stage's transformations to its input data. A stage’s output datasets can be connected to the inputs of a subsequent stage for further transformations, or can be exported to your data repo, downloaded or shared with another organisation.

Pipeline runs

Pressing the run button executes the pipeline transformations and creates the output datasets. The output datasets from each run of the pipeline are retained, and can be previewed, exported or downloaded. This allows comparison of different versions of your pipeline outputs, which may be useful if source data or transformation operations have changed.
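
To illustrate why retaining the outputs of each run is useful, the sketch below keeps the output rows of every run and reports what changed between two of them. The `RunHistory` class and its methods are hypothetical and only model the idea; they are not a Quantemplate API.

```python
from __future__ import annotations

from dataclasses import dataclass, field

OutputRow = tuple[str, ...]


@dataclass
class RunHistory:
    """Hypothetical store that keeps the output rows of every pipeline run."""

    outputs_by_run: list[list[OutputRow]] = field(default_factory=list)

    def record(self, output_rows: list[OutputRow]) -> int:
        # Keep a copy of the run's output and return its 1-based run number.
        self.outputs_by_run.append(list(output_rows))
        return len(self.outputs_by_run)

    def diff(self, earlier: int, later: int) -> dict[str, list[OutputRow]]:
        # Compare two retained runs and list the rows that differ.
        old = set(self.outputs_by_run[earlier - 1])
        new = set(self.outputs_by_run[later - 1])
        return {"added": sorted(new - old), "removed": sorted(old - new)}


history = RunHistory()
first = history.record([("P-001", "GBP"), ("P-002", "usd")])
# A second run, after an operation to upper-case currencies was added.
second = history.record([("P-001", "GBP"), ("P-002", "USD")])

print(history.diff(first, second))
```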