Help Centre

Legacy date operations
About legacy date operations
Legacy date operations are script operations which have been superseded by the easy-to-use operations Date Cleanse and Date Output, along with date functions in Calculate. They will soon be deprecates so they can no longer be added to a pipeline. They will continue to work in existing pipelines. Updating existing pipelines to remove legacy date operations is advised.
Add Term to Date
Given a column containing a term and a column containing a date, this operation creates a new column containing the result of adding (or subtracting) the term and the date. Dates in source column can be in any of the formats checked by the date cleansing operation.
Usage
addTermToDate:
    dateColumn: "<insert column name>"
    termColumn: "<insert column name>"
    resultColumn: "<insert new column name>"
    unit: "<days, months or years>"

Notes

1.
  Output dates are in ISO format.

2.
  The operation is date aware, so it will account for leap years.

3.
  Adding months to dates will output the same day of the subsequent month. On the last days of the month where the subsequent month is shorter, it will output the last day of the month. For example, adding 1 month to 30 January will set an output date of 28 February (in a non-leap year).

Example
Calculate the end date from a start date and a given term.
addTermToDate:
    dateColumn: "start date"
    termColumn: "term"
    resultColumn: "end date"
    unit: "days"

ID
Start date
Term
End date
1
2016-01-31
1
20160229
2
20180101
3
20180401

Date Cleansing
This operation reads dates from a wide variety of formats and cleanses into ISO format.
Usage
cleanseDateColumns:
    columns:
      - "<insert date column name>"
      - "<insert date column name>"
Alternatively, apply date cleansing to columns whose name contains a specified keyword:
cleanseDateColumnsWithNames:
    containing: "header"
For example, cleanseDateColumnsWithNames: {containing: "Date"} would cleanse the date of columns with headers: ‘Date of Birth’ and ‘Date of Inception’ but ignore headers with: ‘Premium’, ‘DoB’, ‘date of birth’. Note that the matching is case sensitive.
The date cleansing goes through the following formats in order to try and get a match:
yyyyMMdd
dd-MM-yyyy
dd/MM/yyyy
yyyy-MM-dd
dd-MMM-yyyy
dd-MMM-yy
ddMMyyyy
Excel date format (days since Jan 1st 1900)

The date will be output in basic ISO format (yyyyMMdd). If another output format is required, follow the Date Cleansing operation with the Extract date operation.
US date formats
To convert US date format to ISO format, use the following configuration:
cleanseDateColumns:
    columns:
      - "<date column name>"
    format: us
This will check the following formats:
MM-dd-yyyy
MM/dd/yyyy
MMddyyyy
The date will be output in basic ISO format (yyyyMMdd). If another output format is required, follow the Date Cleansing operation with the Extract date operation.
Duration
Given two date columns this operation calculates the days, months or years between these dates. Dates in source columns can be in any of the formats checked by the date cleansing operation.
Usage
yearsBetween:
    from: "<insert column name>"
    to: "<insert column name>"
    resultColumn: "<insert new column name>"
Replace
yearsBetween
with:
  • monthsBetween
    to give the number of whole months between the dates
  • daysBetween
    to give the number of days between the dates
Example
yearsBetween:
    from: "Date of Birth"
    to: "Report date"
    resultColumn: "Age"
ID
Date of birth
Report date
Age
1
19870123
20170223
30
2
19461224
20170112
70


Extract Date
Take a column containing a date in one format and create a new column containing that date in another format. This operation also handles partial date formats. Allowing you to convert year-month or year-quarter into a date.
Usage
extractDate:
    column: "<insert column name>"
    resultColumn: "<insert column name>"
Example
extractDate:
    column: filename
    resultColumn: reporting date eom
    input:
        pattern: '(\d{4}) (\d{2})'
        groupKey:
            year: 1
            month: 2
    output:
        format: ISO
        dayOfMonthRule: EOM
         
There are two optional arguments: Input and Output.
Input
used to determine the date in the column value

if not specified it tries a range of partial date formats:
Year month (2013-01)

Year quarter (2013Q1)

Named Month Year (Jan 2013)

Arguments
Either specify a named format, or a pattern and group key to identify the year, month and optionally day in any text.
Format
One of:
YearMonth

YearNamedMonth

ISO

YearQuarter


Pattern
Enter a regular expression containing 2 or 3 groupings, eg:
(\d{4}) (\d{2})

Group Key
Identify the groups in the pattern
year: 1
month: 2
day: 3 (optional)
Output
Once the date components are identified the data in the output column can be defined. If not specified then the output is the basic ISO date. If the input is a partial date (i.e. has no day part) then the output will be the last day in that month.
dayOfMonthRule:

Either

1. EOM
2. PreviousEOM
3. SubsequentEOM

Format

One of:
1. "ISO"
2. "YearMonth"
3. "Year-Month"
4. "YearQuarter"

Pattern
A pattern can be used instead of a preset format and should be a Java DateTimeFormatter string:
using predefined constants, such as   ISO_LOCAL_DATE

using pattern letters, such as yyyy-MM-dd

See the Java DateTimeFormatter documentation for full details.
Year Quarter
Extracts the calendar year and calculates the year quarter from a given date. Results are given in a new column. Dates in the source column can be in any of the formats checked by the date cleansing operation.
Usage
extractYearQuarter:
    dateColumn: "<insert column name>"
    resultColumn: "<insert new column name>"
Example
For input data
ID
DT
1
20120101
2
2013-12-30

extractYearQuarter:
    dateColumn: "dt"
    resultColumn: "quarter"
ID
DT
Quarter
1
20120101
2012Q1
2
2013-12-30
2013Q4