1.3.4.9.12 Match Transformation: Last N Characters

The Last N Characters transformation lets matching ignore the beginning of values when performing comparisons, by reducing each value to its last N characters, counted from the right of the value.

It is also useful when you want to cluster using the last few characters of an identifier as the cluster key.

Use the Last N Characters transformation when you are matching on an identifier where the beginning of the value might be 'noise'. It is often used in a secondary match rule with an exact match comparison (see Comparison: Exact String Match) to find possible matches where the key part of the identifier (at the end of the value) is the same, but the rest of the value differs too much for other comparisons to find. For example, when matching on a telephone number identifier, the beginning of the value may be in different formats (such as +44(0)1223, 01223, or 1223), but the last five digits should identify the telephone number to a good degree of accuracy. Matching on only these characters may therefore help to identify matching records.
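The secondary-rule idea described above can be sketched outside the product as a simple grouping step. The following is a minimal illustration in Python, not the product's implementation: the record list and the names suffix_key, clusters, and candidate_groups are hypothetical, and returning short values unchanged is an assumption.

```python
from collections import defaultdict

# Hypothetical records: (record id, telephone number as entered).
records = [
    (1, "+44(0)1223 321430"),
    (2, "01223 321430"),
    (3, "1223 321430"),
    (4, "07775 571260"),
]

def suffix_key(value: str, n: int = 5) -> str:
    """Return the last n characters of value (hypothetical helper).

    Assumption: a value shorter than n is returned unchanged.
    """
    return value[-n:] if n > 0 else ""

# Group record ids by the exact value of their last five characters.
clusters = defaultdict(list)
for record_id, phone in records:
    clusters[suffix_key(phone)].append(record_id)

# Keys shared by more than one record are candidate matches.
candidate_groups = {key: ids for key, ids in clusters.items() if len(ids) > 1}
print(candidate_groups)
```

Records 1, 2, and 3 share the same last five characters even though their prefixes are formatted differently, so they fall into one group for further comparison.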

The following table describes the configuration options:

Configuration    Description

Options          Specify the following options:

  • Number of characters: the number of characters (counted from the right) that you want to keep and use when transforming values for an identifier. Type: Integer. Default value: 1.

  • Characters to ignore: an optional number of characters (counted from the right of the value) to skip before counting the characters to keep in the transformed value. This allows you to skip over common suffixes before transforming values. Type: Integer. Default value: 0.

Note:

Whitespace characters such as spaces and carriage returns are counted as characters like any others, if they exist in the values. You may want to use a Trim Whitespace transformation before using this transformation, in order to ensure that you are selecting data characters.
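The interaction of the two options and the effect of whitespace can be sketched as follows. This is a minimal Python illustration of the behavior described above, not the product's code: the function name last_n_characters, its parameter names, and the sample value "AB12345-UK" are hypothetical, and the handling of values shorter than the requested length (return whatever characters remain) is an assumption.

```python
def last_n_characters(value: str,
                      number_of_characters: int = 1,
                      characters_to_ignore: int = 0) -> str:
    """Keep characters counted from the right, after skipping a suffix.

    Hypothetical sketch: the defaults mirror the documented defaults
    (1 and 0). Assumption: if the value is too short, the characters
    that remain are returned rather than raising an error.
    """
    if number_of_characters < 0 or characters_to_ignore < 0:
        raise ValueError("options must be non-negative integers")
    # Skip the ignored suffix first, then count back from that point.
    end = max(len(value) - characters_to_ignore, 0)
    start = max(end - number_of_characters, 0)
    return value[start:end]

# Keep the last five characters of a telephone number.
print(last_n_characters("01223 321430", 5))

# Skip a three-character suffix ("-UK") before keeping five characters.
print(last_n_characters("AB12345-UK", 5, 3))

# A trailing space counts as a character, so it ends up in the result
# unless the value is trimmed first.
print(repr(last_n_characters("01223 321430 ", 5)))
print(repr(last_n_characters("01223 321430 ".strip(), 5)))
```

The last two calls show why trimming whitespace first matters: the untrimmed value yields a result that includes the trailing space and only four digits.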

Example configuration

In this example, the Last N Characters transformation is used to match the last five digits of a telephone number identifier.

Number of characters: 5

Characters to ignore: 0

Example transformations

The following table shows examples of transformations using the above configuration:

Table 1-84 Example Transformations for Last N Characters

Value                          Transformed Value

01223 321430                   21430

+44(0)1223 321430              21430

07775 571260                   71260

(Mobile) +44 (0) 7775 71260    71260