1.3.4.9.15 Match Transformation: Metaphone

The Metaphone transformation generates a common metaphone key for values that sound the same but are spelled differently, for example because of misspellings.

The Metaphone transformation is useful both when clustering and when performing comparisons, especially when matching data that may contain misspellings, such as names.

When clustering, it divides records into cluster groups whose values all share the same phonetic key. For example, the same metaphone key ("KLT") is produced from all of the following surnames: "Gold", "Gould", and "Gauld".
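The grouping step described above can be sketched in a few lines of Python. This is only an illustration: the keys are copied from the examples in this section rather than computed, and the names RECORDS and cluster_by_key are hypothetical, not part of the product.

```python
from collections import defaultdict

# (value, metaphone key) pairs. The keys are the ones quoted in this
# section; they are not computed by this sketch.
RECORDS = [
    ("Gold", "KLT"),
    ("Gould", "KLT"),
    ("Gauld", "KLT"),
    ("Lewis", "LS"),
    ("Louis", "LS"),
    ("Pearce", "PRS"),
]


def cluster_by_key(records):
    """Group values into clusters that share the same metaphone key."""
    clusters = defaultdict(list)
    for value, key in records:
        clusters[key].append(value)
    return dict(clusters)


clusters = cluster_by_key(RECORDS)
for key, values in clusters.items():
    print(key, values)
```

Only records that fall into the same cluster would then need to be compared with each other, which is what makes a phonetic key a convenient cluster identifier.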

When using comparisons, it is often useful to include a positive metaphone match to strengthen a match rule. An edit distance of 2 or 3 characters on a name field may be quite a weak match on its own, but if both values still sound the same, the match becomes more credible. For example, "John Clarke" might well be the same person as "Jon Clarke", but is far less likely to be the same person as "John Darke".

This helps to find misspellings, which often arise because the person entering the data did not hear the name correctly.
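The following Python sketch shows one way such a combined rule could look. It is an assumption-laden illustration, not the product's match rule logic: the function names, the threshold of 3, and the "strong"/"weak" labels are hypothetical, and the keys in ASSUMED_KEYS are assumed for the three names rather than taken from the product.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# Assumed phonetic keys, for illustration only. The product may
# generate different keys for these names.
ASSUMED_KEYS = {
    "John Clarke": "JNKLRK",
    "Jon Clarke": "JNKLRK",
    "John Darke": "JNTRK",
}


def classify(a: str, b: str, max_distance: int = 3) -> str:
    """Hypothetical rule: a small edit distance alone is a weak match;
    agreement of the phonetic keys upgrades it to a strong match."""
    if edit_distance(a, b) > max_distance:
        return "no match"
    if ASSUMED_KEYS[a] == ASSUMED_KEYS[b]:
        return "strong match"
    return "weak match"


print(classify("John Clarke", "Jon Clarke"))
print(classify("John Clarke", "John Darke"))
```

Both pairs fall within the edit distance threshold, so the distance alone cannot separate them; only the key comparison distinguishes the pair that still sounds the same.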

The following table describes the configuration options:

Configuration   Description

Options         Specify the following option:

  • Maximum metaphone length to return: allows you to vary the sensitivity of the metaphone transformation. With a smaller maximum length, long values whose sounds differ only towards the end still generate the same key. Type: Integer. Default value: 12.
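The effect of the maximum length can be sketched as follows, assuming that the option simply keeps the first N characters of the key. That behaviour, the function name limit_key, and the two keys used here are assumptions made for illustration.

```python
def limit_key(key: str, max_length: int = 12) -> str:
    """Keep at most max_length characters of a phonetic key.

    Assumes the option acts as a plain truncation of the key.
    """
    if max_length < 1:
        raise ValueError("max_length must be a positive integer")
    return key[:max_length]


# Assumed keys for two names that differ only in the surname.
clarke = "JNKLRK"  # assumed key for "John Clarke"
darke = "JNTRK"    # assumed key for "John Darke"

# With the default length the two keys stay distinct.
print(limit_key(clarke), limit_key(darke))

# With a maximum length of 2 both names collapse to the same key.
print(limit_key(clarke, 2), limit_key(darke, 2))
```

A small maximum length therefore makes the transformation less sensitive: more values share a key, which widens clusters but also admits more false matches.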

Example

In this example, the Metaphone transformation is used to strengthen name matches. An Exact String Match comparison (see Comparison: Exact String Match) is performed on the transformed value, effectively forming a comparison that determines whether two values sound the same.

Example transformations

The following table shows examples of transformations using the above configuration:

Table 1-87 Example Transformations for Metaphone

Value              Transformed Value

Ellen Wilson       ALNLSN
Eileen Wilson      ALNLSN
Pauline Bedham     PLNPTM
Pauline Beedham    PLNPTM
Lewis              LS
Louis              LS
Lees               LS
Pearce             PRS
Pierce             PRS
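To make the "sounds the same" comparison concrete, the sketch below pairs a deliberately simplified phonetic key with an exact match on the result. The function toy_key is not the Metaphone algorithm and not the product's implementation: its handful of rules (drop spaces, turn a leading vowel into "A", drop other vowels and H, W, Y, merge B/D/G into P/T/K, soften C before E, I, or Y) were chosen only because they happen to reproduce the values in Table 1-87. All names are hypothetical.

```python
VOWELS = set("AEIOU")
MERGE = {"B": "P", "D": "T", "G": "K"}  # voiced -> unvoiced, toy rule


def toy_key(value: str, max_length: int = 12) -> str:
    """Toy phonetic key; a rough stand-in, NOT real Metaphone."""
    letters = [ch for ch in value.upper() if ch.isalpha()]
    out = []
    for i, ch in enumerate(letters):
        if i > 0 and ch == letters[i - 1]:
            continue  # treat a doubled letter as a single letter
        if ch in VOWELS:
            if i == 0:
                out.append("A")  # a leading vowel becomes a generic "A"
            continue
        if ch in "HWY":
            continue  # dropped in this toy version
        if ch == "C":
            nxt = letters[i + 1] if i + 1 < len(letters) else ""
            out.append("S" if nxt in {"E", "I", "Y"} else "K")
        else:
            out.append(MERGE.get(ch, ch))
    return "".join(out)[:max_length]


def sounds_alike(a: str, b: str) -> bool:
    """Exact string match performed on the transformed values."""
    return toy_key(a) == toy_key(b)


# Values and keys copied from Table 1-87.
TABLE = {
    "Ellen Wilson": "ALNLSN", "Eileen Wilson": "ALNLSN",
    "Pauline Bedham": "PLNPTM", "Pauline Beedham": "PLNPTM",
    "Lewis": "LS", "Louis": "LS", "Lees": "LS",
    "Pearce": "PRS", "Pierce": "PRS",
}

for value, expected in TABLE.items():
    print(value, toy_key(value), toy_key(value) == expected)
```

Because the comparison is an exact match on the key, sounds_alike("Lewis", "Louis") is true while sounds_alike("Lewis", "Pearce") is false. For real data, a full phonetic algorithm should be used instead of these toy rules.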