1.3.4.8.28 Comparison: Year Too Different

The Year Too Different comparison compares two String or String Array values, each containing a space-separated list of years. It returns true only if ALL of the years on one side of the comparison are significantly different from ALL of the years on the other side, according to the configuration options. For each pair of years, the comparison calculates both the typographical edit distance and the absolute (numeric) difference. Two years are considered 'too different' if the edit distance exceeds the Maximum allowed typos setting AND the absolute difference exceeds the Maximum difference setting.

Use the Year Too Different comparison to eliminate obvious record mismatches based on the year component of a date value. This is useful where the full date is incomplete, or where there is low confidence in the day and month components of a date field.

This comparison does not support the use of result bands.

The following table describes the configuration options:

| Option | Type | Description | Default Value |
|---|---|---|---|
| Maximum allowed typos | Integer | Tolerance to consider two values similar if the Levenshtein edit distance between them is less than or equal to the specified value. | 2 |
| Maximum difference | Integer | Tolerance to consider two years similar if the absolute difference between them is less than or equal to the specified value. | 3 |

Example

This example demonstrates the effect of using the Year Too Different comparison.

Table 1-67 Example Options: Year Too Different

| Option | Setting |
|---|---|
| Maximum allowed typos | 1 |
| Maximum difference | 5 |

Example results:

Table 1-68 Example Results: Year Too Different

| Value A | Value B | Comparison Result |
|---|---|---|
| 1981 | 1988 | False |
| 1989 | 1990 | False |
| 2014 | 2009 | False |
| 2014 2015 | 2009 | False |
| 2013 2014 | 2007 | True |
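
The example results can be reproduced with a minimal Python sketch of the rule described at the top of this section. This is an illustration only, not the product's implementation: the function names (levenshtein, pair_too_different, year_too_different) and parameter names are hypothetical, and raising an error for empty input is an assumption of the sketch, since the handling of empty or non-numeric values is not described here.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein edit distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def pair_too_different(year_a: str, year_b: str,
                       max_typos: int = 2, max_difference: int = 3) -> bool:
    """A pair is 'too different' only if BOTH tolerances are exceeded."""
    typos = levenshtein(year_a, year_b)
    difference = abs(int(year_a) - int(year_b))
    return typos > max_typos and difference > max_difference


def year_too_different(value_a: str, value_b: str,
                       max_typos: int = 2, max_difference: int = 3) -> bool:
    """True only if every year in value_a is too different from every year in value_b."""
    years_a, years_b = value_a.split(), value_b.split()
    if not years_a or not years_b:
        # Assumption of this sketch: empty input is rejected rather than compared.
        raise ValueError("both values must contain at least one year")
    return all(pair_too_different(a, b, max_typos, max_difference)
               for a in years_a for b in years_b)


if __name__ == "__main__":
    # Rows from the example results, using the example settings (1 typo, difference 5).
    rows = [("1981", "1988"), ("1989", "1990"), ("2014", "2009"),
            ("2014 2015", "2009"), ("2013 2014", "2007")]
    for value_a, value_b in rows:
        print(value_a, "|", value_b, "|",
              year_too_different(value_a, value_b, max_typos=1, max_difference=5))
```

For instance, 2014 and 2009 have an edit distance of 2 (above the limit of 1) but differ by exactly 5 years (not above the limit of 5), so the pair is not too different. In the last row, both 2013 and 2014 exceed both limits against 2007, so the comparison returns true.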