Help Centre

Calculate

About Calculate

The Calculate operation creates new columns of data calculated from your source data. For example, from a column of Premium and a column of Incurred Losses it is possible to calculate a new column
of Profit.

The new column is appended to the right side of the output data.

In addition to a full set of arithmetic operators, the Calculate operation utilises a library of advanced functions, such as IF statements. For example, to check whether a policy is profitable create a new column called ‘In Profit’; if the value in the ‘Profit’ column is greater than 0 write ‘Y’, and if not, write ‘N’.

See the Syntax and functions section for more details.

Multiple calculated columns can be added within a single Calculate operation, up to a maximum of 51.

To create a calculated column:

  1. Add a calculate operation from the enrich section in the operations palette.
  2. Name the calculated column.
  3. Type in the equation field and select from the suggested column names and functions.
  4. Run the pipeline to create the calculated column in your output datasets.

Column ordering and referencing

Calculated columns are added to the right end of the dataset in the order shown in the panel, from top to bottom, and can be reordered via drag and drop.

Calculated columns can reference other calculated columns higher in the list.

The example below demonstrates adding calculated columns, referencing other calculated columns and reordering them.

Note that the column referencing will fail when:

Bulk-copy calculated columns

Calculated columns can be copied and pasted all at once to another Calculate operation. This is useful for duplicating a set of calculated columns across multiple pipelines.

To copy columns: open a Calculate operation and with no fields selected click ctrl+c (Win) or cmd+c (Mac).

To paste columns: open a Calculate operation and click ctrl+v (Win) or cmd+v (Mac). This will add the copied columns below any existing columns.

To duplicate a Calculate operation: copy the columns from one operation, create a new Calculate operation and paste them in.

Note that the Calculate operation may need the same input column names for the calculations to work.

If the number of pasted columns will cause the total number of columns to to exceed 51, the excess columns will not be pasted.

Functions index

The Equation Editor in the Calculate and Validate operations supports a range of Excel-like functions for general purpose data cleansing and set of functions specifically for cleansing dates. To define columns, values and operators in functions, use the syntax described in the Operators and syntax section.

General Functions

ABS

Returns the absolute value of a number.

ABS(<column>)

Example

VALUE
ABS('VALUE')
1
1
-1
1

CONCATENATE

Joins multiple values together to create a single text value.

Note that if named columns are not present, the run will fail. If this is not desired behviour, consider using the Combine Columns operation instead.

CONCATENATE(<column name or value> … )

Example

ID
CONCATENATE("X", 'ID')
123
X123
1234
X1234

FIND

Finds a text value and returns position of its first character.

FIND(<search within>, <search for>)

Note:

Examples

Find the position of the character ‘d’ within the ID column.

ID
FIND('ID’, “d”)
abcd
4
defg
1
efgh
0

Extract the first word of a text value, for example extracting a first name from a full name. Note that in this example, if the name does not contain a space, a blank will be returned.

NAME
LEFT('NAME', FIND('NAME', " ") - 1)
Ronald Davies
Ronald
Jane Alice Smith
Jane
F. Scott Fitzgerald
F.
Madonna

Check whether a column contains a given text value, for example check whether a post code contains SE4.

Postcode
IF(FIND('Postcode', "SE4") > 0, "Yes", "No")
SE4 5BX
Yes
SE13 4WZ
No
SE47RS
Yes

Remove the first word, for example remove a building number from a street address. This expression looks for the position of the first space and returns everything after that space, returning ‘Unknown’ if no space is found.

Street
IF(FIND('Street', " ") = 0, "Unknown", MID('Street', FIND('Street', " ") + 1, LENGTH('Street')))
23 Acacia Avenue
Acacia Avenue
361–364 Park Road
Park Road
London
Unknown

IF

Checks whether a condition is met and returns one value if true, and another value if false.

IF(<logical query>, <value if true>, <value if false>)

Examples of logical queries:

'Line_of_Business' = "Property"
'Region' = "Americas"
'Claim' > 1000000

IF statement example

If the Line of Business is Property, then multiply premium by 1.2. If it is not Property leave the Premium value unchanged.

IF('Line_of_Business' = "Property", 'Premium' * 1.2, 'Premium')

Using ‘and/or’ in IF statements

The calculation editor supports and/or syntax in IF statements and validations. The words ‘and’ ‘or’ should be entered in lowercase.

‘and’ statement

IF('Column 1' > 'Column 2' and 'Column 3' < 'Column 4',
value_if_true, value_if_false)

‘or’ statement

IF('Column 1' > 'Column 2' or 'Column 3' < 'Column 4',
value_if_true, value_if_false)

Combined ‘and/or’ statement

IF(('Column 1' > 'Column 2' or 'Column 1' < 'Column 5')
and ('Column 3' < 'Column 4'), value_if_true, value_if_false)

Using ‘and/or’ in validations

Validations work like an IF statement, returning a warning or failure if the result is false. Be aware of this when formulating a validation with ‘and/or’ and ‘=/!=’: it may be necessary to invert the ‘and/or’ logic to get the desired result.

‘and/or’ validation examples

Zip must be populated and 5 digits long

ISNUMBER('Zip') and LENGTH('Zip')=5

‘Gross Premium’ must be a negative value when ‘Policy Transaction Type’ = ‘Cancellation’

'Gross Premium'<0 or 'Transaction Type' !=“Cancellation"

If ‘Transaction Type’ = ‘Cancellation’, then ‘Gross Premium’ cannot be greater than sum of ‘Gross Premium’ on all prior transactions

('Compare cancelled GWP against prior periods'>=0) or ('Cancellation flag'!="Cancelled”)

IFERROR

Returns a value if a calculation generates an error.

IFERROR(<calculation>, <value>)

Example

If the calculation ‘claim_amount / claim_count’ generates an error, return the value 0.

IFERROR('claim_amount' / 'claim_count', 0)

ISNUMBER

Returns TRUE if value is a number, and FALSE if not (for use with IF statements).

ISNUMBER(<column>)

Example

value
IF(ISNUMBER('value'), "Number", "Not a number")
1
Number
"1"
Number
1.2
Number
""
Not a number
"text"
Not a number

ISLIKE

Returns TRUE if value matches a regular expression, and FALSE if not (for use with IF statements)

ISLIKE(<column>, "regular expression")

Example

ID
IF(ISLIKE('ID', "4$"), "Ends with 4", "Doesn't end with 4")
123
Doesn’t end with 4
1234
Ends with 4

LEFT

Returns the first characters from a value, given the number of characters

LEFT(<column>, <no. of characters>)

Example

Address
LEFT('SIC Code', 4)
0800 Forestry
0800
2080 Beverages
2080

LENGTH

Returns the length of each column value.

LENGTH(<column>)

Example

ID
LENGTH('ID')
123
3
1234
4

LOWER

Converts text to lowercase.

LOWER(<column name or text value> … )

MAX

Returns the maximum value across multiple columns or constant values.

Each parameter can be a numeric value or a column name. If the parameter is a column name then the function will get the value in that column. All the values are compared and the largest value is returned.

MAX(<column name or numeric value>, … )

Example

VALUE
MAX('VALUE',0)
1
1
-1
0

MID

Returns characters from a value, given starting position (where first character is 1) and number of characters.

MID(<column>, <start position>, <no. of characters>)

Example

ID
MID('ID', 4, 2)
12345678
45
abcdefghijkl
de

MIN

Returns the minimum value across multiple columns or constant values.

Each parameter can be a numeric value or a column name. If the parameter is a column name then the function will get the value in that column. All the values are compared and the smallest value is returned.

MIN(<column name or numeric value>, … )

Example

VALUE
MIN('VALUE',0)
1
0
-1
-1

PROPER

Converts the first letter of each word to uppercase and the other letters to lowercase.

PROPER(<column name or text value> … )

Example

VALUE
PROPER('VALUE')
SAN FRANCISCO
San Francisco
sao paulo
Sao Paulo
SAN DIEGO, CA
San Diego, Ca

Returns the last characters from a value, given the number of characters

RIGHT(<column>, <no. of characters>)

Example

Address
RIGHT('Address', 8)
London, EC1Y 1AA
EC1Y 1AA
Penzance, TR19 7AA
TR19 7AA

ROUND

Rounds a number to a given number of digits.

ROUND(<column name or numeric value>, <no. of digits>)

Numbers 1–4 are rounded down and 5–9 are rounded up. Numbers are rounded to a specified level of precision. They can be rounded to the right or left of the decimal point.

If num_digits > 0, number is rounded to the specified number of decimal places to the right of the decimal point.

If num_digits < 0, number is rounded to the left of the decimal point (i.e. to the nearest 10, 100, 1000, etc.).

If num_digits = 0, number is rounded to the nearest integer.

Examples

Formula
Description
Result
ROUND(3.14, 1)
Rounds 3.14 to one decimal place
3.1
ROUND(12.34, -1)
Rounds 12.34 to one decimal place
to the left of the decimal point
10
ROUND(876.5, -3)
Rounds 876.5 to the nearest 1000
1000

SUBSTITUTE

Finds a text value and replaces it with another text value.

SUBSTITUTE(<search within>, <search for>, <replace with>)

Note:

Example

Remove dashes from an ID value.

ID
SUBSTITUTE('ID', "-", "")
QUAN-1234-785
QUAN1234785
QUAN7436528
QUAN7436528

SUBSTRING

Returns characters from a value, given starting position (where first character is 0) and number of characters.

SUBSTRING(<column>, <start position>, <no. of characters>)

Example

ID
SUBSTRING('ID', 4, 2)
12345678
56
abcdefghijkl
ef

TRIM

Removes extra spaces from text: leading and trailing whitespace and repeated spaces between words.

TRIM(<column name or text value> … )

UPPER

Converts text to uppercase.

UPPER(<column name or text value> … )

Date Functions

Date functions operate on cleansed date values which are in Basic ISO format

yyyymmdd
.

See Working with dates to learn more about how Quantemplate processes dates.

DATECALC

Returns a new basic-ISO formatted date by adding or removing a term (constant value or column reference) from a basic-ISO date.

DATECALC(<iso date>, <term (numeric value)>, <unit>)

Example

Premium
DATECALC('Premium', -1, "year")
DATECALC('Premium', 6, "months")
20120629
20110629
20121229
20200229
20190228
20200829
""
""
""
2012-06-29
(pipeline error)
(pipeline error)

DURATION

Returns the number of whole elapsed units between two basic-ISO dates.

DURATION(<start>, <end>, <unit>)

Example

Start
End
DURATION('Start', 'End', "days")
20120629
20120630
1
20120630
20120629
-1
""
""
""
2012-06-29
2012-06-30
(pipeline error)

ENDOFQUARTER

Moves a date to the last day in a quarter.

ENDOFQUARTER(<iso date>)

ENDOFMONTH

Moves a date to the last day in a month.

ENDOFMONTH(<iso date>)

ENDOFYEAR

Moves a date to the last day in a year.

ENDOFYEAR(<iso date>)

STARTOFQUARTER

Moves a date to the first day in a quarter.

STARTOFQUARTER(<iso date>)

STARTOFMONTH

Moves a date to the first day in a month.

STARTOFMONTH(<iso date>)

STARTOFYEAR

Moves a date to the first day in a year.

STARTOFYEAR(<iso date>)

Operators and syntax

Valid operators in the equation editor:
+
  plus

-
  minus

/
  divide

*
  multiply

^
  exponent

=
  equals

!=
  not equals

>
  greater than

<
  less than

>=
  greater than or equal to

<=
  less than or equal to
Standard syntax must be followed for the equation editor to function. Quantemplate automatically inserts correct syntax where possible.
‘ ’

“ ”
IF(...)
,
Single quotes for fields in your data. These are added automatically added,
and lozenged if validated.
Double quotes for categorical values in your data, eg ‘Region’=“America”
Brackets must follow a function.
Comma must follow a field or value in the function.

Notation

Certain combinations of letters and numbers will be interpreted as numbers in arithmetic operations.

The most useful of these is the exponent indicator ‘e’, so a string ‘5e5’ would be interpreted as the number ‘500,000’ when used in an arithmetic expression.

Characters f, F, d and D indicate the type of floating-point number. F or f indicates the number is a float; otherwise its type is double and it can optionally be suffixed with a letter D or d.