1.3.7.4 Date Profiler

The Date Profiler analyzes a Date attribute, and shows the distribution of date values in that attribute in terms of:

  • Day of the week

  • Day of the month

  • Day of the year

  • Month

  • Year

A Valid/Null view is also included. Invalid dates are by definition Null, as any data value in a DATE attribute must be a valid date.

Use the Date Profiler to see if there are any unusual trends in your Date attributes - for example to see if there is a default date such as 01/01/1970 that has commonly been used instead of a real date value.

The following table describes the configuration options:

Configuration Description

Inputs

Specify a single Date attribute.

Options

None.

Outputs

Describes any data attribute or flag attribute outputs.

Data Attributes

The data attributes are:

  • [Attribute Name].dayofweek: adds the day of week. Possible values: Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday

  • [Attribute Name].dayofmonth: adds the day of the month in a new attribute. Possible values are 1-31.

  • [Attribute Name].dayofyear: adds the day of the year in a new attribute. Possible values are 1 Jan - 31st Dec.

  • [Attribute Name].month: adds the month in a new attribute. Possible values are January-December.

  • [Attribute Name].year: adds the year in a new attribute. Possible values are any four-digit year.

Note that splitting out the date values in the way above may be useful for downstream processing, for example, if you want to write out the data and perform matching based on the day, month and year values in separate attributes.

Flags

None.

The Date Profiler looks for trends in batches of records with date values. It therefore requires a batch of records to produce its statistics. It must run to completion before its results are available, and is not suitable for a process that requires a real time response.

When executed against a batch of transactions from a real time data source, it will finish its processing when the commit point (transaction or time limit) configured on the Read Processor is reached.

The following table describes the statistics produced for the Day in Week view:

Statistic Description

Day in week

The day of the week (Sunday-Saturday).

Count

The number of records with dates that fell on that day of the week.

%

The percentage of records with dates that fell on that day of the week.

The following table describes the statistics produced for the Day in Month view:

Statistic Description

Day in month

The day of the month (1-31).

Count

The number of records with dates that fell on that day of the month.

%

The percentage of records with dates that fell on that day of the month.

The following table describes the statistics produced for the Day in Year view:

Statistic Description

Day in year

The day of the year (for example, 1st Jan).

Count

The number of records with dates that fell on that day of the year.

%

The percentage of records with dates that fell on that day of the year.

The following table describes the statistics produced for the Month view:

Statistic Description

Month

The month (January - December).

Count

The number of records with dates that fell in that month.

%

The percentage of records with dates that fell in that month.

The following table describes the statistics produced for the Year view:

Statistic Description

Year

The year.

Count

The number of records with dates that fell in that year.

%

The percentage of records with dates that fell in that year.

The following table describes the statistics produced for the Valid/Null view:

Statistic Description

Valid

The number of records with a valid date in the DATE attribute analyzed.

Null

The number of records with a null value in the DATE attribute analyzed.

Clicking on the Additional Information button from the Valid/Null view shows the statistics as percentages of the total number of records analyzed.

Examples

In this example, the Date Profiler analyzes the distribution of dates in an attribute storing the date of the last payment made by a Customer. In this case, the user is most interested in the distribution of dates across years. The year summary:

Year Count %

2003

369

18.4

2002

303

15.1

2001

250

12.5

2000

219

10.9

1999

174

8.7

1998

159

7.4

2004

152

7.6

1997

126

6.3

1996

103

5.1

1994

73

3.6

1995

42

2.1

1993

27

1.3