Quantemplate Integrate takes raw input data and
processes it via a transformation pipeline to produce cleansed output
data.
This video shows you how to get started creating and running pipelines.
For more detailed video walk-throughs see Tutorials.
Data table concepts
Quantemplate Integrate takes data in unstructured or semi-structured
formats, such as spreadsheets, and transforms it into harmonised, structured data tables.
Data tables have a simple structure consisting of a fixed set of columns and an
arbitrary number of rows.
Columns are the named set of values that each row of the data table will contain.
Conventionally these are represented as a set of headers at the top of a table.
Data values are the individual data points. In Quantemplate, the data values can be
of any type, but other data processing systems you may wish to connect to Quantemplate
sometimes apply strict rules about the type of data that may be held in each column.
Rows are collections of related data values, with one value for each column, usually
represented as horizontal rows in Quantemplate.
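The table structure described above can be sketched in a few lines of Python. This is only a conceptual model, not how Quantemplate stores data: the `DataTable` class, its methods, and the sample policy values are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataTable:
    """Illustrative model only: a fixed set of named columns plus any number of rows."""

    columns: list[str]
    rows: list[list[Any]] = field(default_factory=list)

    def add_row(self, values: list[Any]) -> None:
        # Each row must supply exactly one value per column.
        if len(values) != len(self.columns):
            raise ValueError(
                f"expected {len(self.columns)} values, got {len(values)}"
            )
        self.rows.append(list(values))

    def column(self, name: str) -> list[Any]:
        # Collect the values held under one column header, in row order.
        index = self.columns.index(name)
        return [row[index] for row in self.rows]


# Hypothetical sample data.
table = DataTable(columns=["Policy ID", "Premium", "Currency"])
table.add_row(["P-001", 1200.0, "GBP"])
table.add_row(["P-002", "n/a", "USD"])  # a column may hold values of mixed type
```

In this sketch the "Premium" column holds both a number and a text value, which mirrors the point that values can be of any type. A stricter downstream system might reject the second row.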
How Quantemplate Integrate is different from Excel
Unlike spreadsheet tools such as Excel, Quantemplate Integrate applies a rules-based approach
to configure batch processing actions across multiple datasets. This allows you to quickly cleanse and harmonise data at scale,
and makes your processes repeatable on similar sets of source data.
Because it's built for defining data processing rules, Quantemplate Integrate does not allow:
Cell-level alterations of individual values.
To change a value, set up a transformation targeting the
type of value you wish to change.
Table presentation such as multiple layers of header,
headers along the side of the table, or totals of rows or columns.
Quantemplate Analyse provides tools to configure table presentation, add totals, apply filters, etc.
Presentational number formatting such as millions, billions, %.
Quantemplate Analyse provides tools to apply presentational formatting to numbers.
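To make the contrast with cell-level editing concrete, here is a minimal Python sketch of a rule that targets a type of value rather than a single cell. It is an assumption-laden illustration, not Quantemplate's implementation: `apply_rule`, its parameters, and the sample datasets are hypothetical.

```python
from typing import Any, Callable

Row = dict[str, Any]
Dataset = list[Row]


def apply_rule(
    datasets: list[Dataset],
    column: str,
    matches: Callable[[Any], bool],
    replace_with: Any,
) -> list[Dataset]:
    """Replace every value in `column` that satisfies `matches`, in every dataset."""
    return [
        [
            {**row, column: replace_with}
            if column in row and matches(row[column])
            else dict(row)
            for row in dataset
        ]
        for dataset in datasets
    ]


# Hypothetical source data: the same country written two ways across two files.
q1 = [{"Country": "UK"}, {"Country": "France"}]
q2 = [{"Country": "U.K."}, {"Country": "UK"}]

cleaned = apply_rule(
    [q1, q2],
    column="Country",
    matches=lambda value: value in {"UK", "U.K."},
    replace_with="United Kingdom",
)
```

Because the rule describes what kind of value to change instead of where one value sits, the same call can be repeated on next quarter's files without modification. The source datasets are left untouched and new ones are returned.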
Components
Pipelines
A pipeline is a data transformation process built for a
set of input datasets, transforming them to a desired set of output datasets. A
pipeline comprises:
Input data
Raw data for cleansing is uploaded directly via the
inputs interface. Quantemplate supports data in XLS, XLSX, CSV and GZipped CSV formats. Cleansed
data such as other pipeline outputs or reference codes can be connected in from
your data repo.
Stages and operations
Data transformations are sequenced and configured via
stages and operations.
Stages are structural components with a defined number of input and output datasets,
whilst operations are individual transformation processes, grouped together in a
transform stage. See Stages and operations for more details.
Output data
Each stage creates output datasets, which are the result
of applying the stage’s transformations to its input data. A stage’s output datasets can
be connected to the inputs of a subsequent stage for further transformations,
or can be exported to your data repo, downloaded or shared with another organisation.
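The relationship between operations, stages and chained outputs can be sketched as plain function composition. The Python below is a conceptual illustration under that assumption. Every name in it (`transform_stage`, `trim_text`, `drop_empty_rows`, `uppercase_text`) is hypothetical and none of them is a Quantemplate operation.

```python
from typing import Any, Callable

Row = dict[str, Any]
Dataset = list[Row]
Operation = Callable[[Dataset], Dataset]


def transform_stage(operations: list[Operation]) -> Operation:
    """Group operations into one stage that applies them in order."""

    def run(dataset: Dataset) -> Dataset:
        for operation in operations:
            dataset = operation(dataset)
        return dataset

    return run


def trim_text(dataset: Dataset) -> Dataset:
    # Strip surrounding whitespace from every string value.
    return [
        {key: value.strip() if isinstance(value, str) else value
         for key, value in row.items()}
        for row in dataset
    ]


def drop_empty_rows(dataset: Dataset) -> Dataset:
    # Remove rows in which every value is empty or missing.
    return [
        row for row in dataset
        if any(value not in ("", None) for value in row.values())
    ]


def uppercase_text(dataset: Dataset) -> Dataset:
    # Convert every string value to upper case.
    return [
        {key: value.upper() if isinstance(value, str) else value
         for key, value in row.items()}
        for row in dataset
    ]


cleanse = transform_stage([trim_text, drop_empty_rows])
standardise = transform_stage([uppercase_text])

raw = [{"Name": "  Acme "}, {"Name": "   "}]  # hypothetical raw input
stage_one_output = cleanse(raw)
# The output of one stage becomes the input of the next.
stage_two_output = standardise(stage_one_output)
```

Order matters inside a stage: `trim_text` runs first, so the whitespace-only row becomes empty and `drop_empty_rows` can then remove it.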
Pipeline runs
Pressing the run button executes the pipeline
transformations and creates the output datasets. The output datasets
from each run of the pipeline are retained, and can be previewed, exported
or downloaded. This allows comparison of different versions of your pipeline
outputs, which may be useful if source data or transformation operations
have changed.
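The idea of retaining each run's outputs for later comparison can be illustrated with a short Python sketch. It assumes a simple in-memory model. The `Pipeline` class, its `run` and `diff` methods, and the sample data are hypothetical and do not reflect how Quantemplate stores or compares runs.

```python
from typing import Any, Callable

Row = dict[str, Any]
Dataset = list[Row]


class Pipeline:
    """Illustrative only: keeps the output of every run so runs can be compared."""

    def __init__(self, transform: Callable[[Dataset], Dataset]) -> None:
        self.transform = transform
        self.runs: list[Dataset] = []

    def run(self, source: Dataset) -> int:
        # Execute the transformation, retain the output, and return the run number.
        self.runs.append(self.transform(source))
        return len(self.runs)

    def diff(self, run_a: int, run_b: int) -> dict[str, Dataset]:
        # Report rows present in one run's output but not in the other's.
        a, b = self.runs[run_a - 1], self.runs[run_b - 1]
        return {
            "removed": [row for row in a if row not in b],
            "added": [row for row in b if row not in a],
        }


# Hypothetical transformation: keep only rows with a positive premium.
pipeline = Pipeline(lambda dataset: [row for row in dataset if row["Premium"] > 0])

first = pipeline.run([{"ID": 1, "Premium": 100}, {"ID": 2, "Premium": 0}])
second = pipeline.run([{"ID": 1, "Premium": 100}, {"ID": 3, "Premium": 50}])
changes = pipeline.diff(first, second)
```

Here the second run uses changed source data, and `changes` shows which output rows appeared or disappeared between the two runs.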