Family Name Cluster (dnClusterFamilyName)

The Family Name cluster provides a backup to the full name clusters. This is especially important where the given name data is incomplete, making it difficult to form a complete cluster key for two names.

For example, the following three records do not share any Full Name cluster keys, because of the initials in the second record and the spacing and spelling variations throughout:

Table 5-3 Family Name Cluster

dnFullName           Name Tokens and Trimmed Values  Cluster Keys            dnClusterFullNameTrim
STEPHEN JEQE NKOMO   JEQE | JEQ                      JEQNKO, JEQSTE, NKOSTE  JEQNKO|JEQSTE|NKOSTE
                     NKOMO | NKO
                     STEPHEN | STE
S J NKOMO            S | S                           NKO                     NKO
                     NKOMO | NKO
                     J | J
STEPHEN JEKE N KOMO  JEKE | JEK                      JEKKOM, JEKSTE, KOMSTE  JEKKOM|JEKSTE|KOMSTE
                     KOMO | KOM
                     N | N
                     STEPHEN | STE

Clustering only on the family name circumvents this issue, but results in large clusters and a concomitant increase in the processing required to cross-check all the records.

The Family Name cluster builder counters spacing and punctuation differences by generating Metaphone keys both for each token of the family name and for the whole family name with all white space removed. This ensures that family names such as those in the last two rows of Table 5-4 (NKOMO and N KOMO) are clustered together despite the spacing differences.
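To make the contrast concrete, the following Python sketch checks which pairs of the three example records share at least one cluster key. The full name keys are copied from Table 5-3 and the family name keys from Table 5-4. It assumes the family names of the three records are NKOMO, NKOMO and N KOMO. The helper name clustered_pairs is hypothetical and is not part of the product.

```python
from itertools import combinations

# Keys copied from the dnClusterFullNameTrim column of Table 5-3.
full_name_keys = {
    "STEPHEN JEQE NKOMO": {"JEQNKO", "JEQSTE", "NKOSTE"},
    "S J NKOMO": {"NKO"},
    "STEPHEN JEKE N KOMO": {"JEKKOM", "JEKSTE", "KOMSTE"},
}

# Keys copied from Table 5-4, assuming the family names are
# NKOMO, NKOMO and N KOMO respectively.
family_name_keys = {
    "STEPHEN JEQE NKOMO": {"NKM"},
    "S J NKOMO": {"NKM"},
    "STEPHEN JEKE N KOMO": {"N", "KM", "NKM"},
}


def clustered_pairs(keys_by_record):
    """Return every pair of records that shares at least one cluster key."""
    return [
        (a, b)
        for a, b in combinations(keys_by_record, 2)
        if keys_by_record[a] & keys_by_record[b]
    ]


print(clustered_pairs(full_name_keys))    # no pair shares a full name key
print(clustered_pairs(family_name_keys))  # every pair shares the key NKM
```

With these values, no pair is brought together by the full name keys, while all three pairs are brought together by the family name key NKM.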

The default logic of the cluster builder is as follows:
  1. Trim all white space from the normalized family name.
  2. Apply the Metaphone transformation to the result, outputting a key with a length of up to 4 characters.
  3. Strip common name qualifiers, such as Abd and Al, from the normalized family name.
  4. Split the family name into several name tokens, using a space delimiter.

    Note:

    Many other punctuation and noise characters are normalized to spaces before generating the cluster. For more information see Name Normalization.
  5. Apply the Metaphone transformation to each name token, outputting a key with a length of up to 4 characters. If no tokens remain after stripping common name qualifiers, apply the Metaphone transformation to each name token of the original normalized family name instead.
  6. Concatenate all the generated Metaphone keys.
  7. Deduplicate the list of keys.
The following table provides some examples.

Table 5-4 Metaphone Transformations for Family Name Cluster

dnFamilyName    Tokens Derived from dnFamilyName  Metaphone Transformations  dnClusterFamilyName
ZHONG           ZHONG                             JNK                        JNK
XIAOJIAN        XIAOJIAN                          SJN                        SJN
ABACHE          ABACHE                            APX                        APX
ABANDA          ABANDA                            APNT                       APNT
ABD AL HAFIZ    HAFIZ, ABDALHAFIZ                 HFS, APTL                  HFS|APTL
AL BUTHE        BUTHE, ALBUTHE                    P0, ALP0                   P0|ALP0
AL              AL                                AL                         AL
SOLEIMAN HAMAD  SOLEIMAN, HAMAD, SOLEIMANHAMAD    SLMN, HMT, SLMN            SLMN|HMT
GOODRIDGE       GOODRIDGE                         KTRJ                       KTRJ
GOODRICH SR     GOODRICH, SR, GOODRICHSR          KTRX, SR, KTRK             KTRX|SR|KTRK
NKOMO           NKOMO                             NKM                        NKM
N KOMO          N, KOMO, NKOMO                    N, KM, NKM                 N|KM|NKM
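
The default logic described above can be sketched in Python as follows. This is a minimal illustration, not the product's implementation. The function name family_name_cluster_keys, the QUALIFIERS set (limited to the two qualifiers named in the text) and the encode parameter are all assumptions. Because Python has no built-in Metaphone function, the phonetic encoder is passed in as a callable, and the demonstration uses a lookup of Metaphone values copied from Table 5-4 rather than computing them. Token keys are emitted before the whole-name key to match the ordering shown in the table.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical qualifier list: the text names only "Abd" and "Al" as examples.
QUALIFIERS: FrozenSet[str] = frozenset({"ABD", "AL"})


def family_name_cluster_keys(
    family_name: str,
    encode: Callable[[str], str],
    qualifiers: FrozenSet[str] = QUALIFIERS,
    max_len: int = 4,
) -> List[str]:
    """Return deduplicated phonetic keys for a normalized family name."""
    all_tokens = family_name.split()
    if not all_tokens:
        return []

    # Steps 1-2: key for the whole name with all white space removed.
    whole_key = encode("".join(all_tokens))[:max_len]

    # Steps 3-4: split on spaces and drop qualifier tokens.
    tokens = [t for t in all_tokens if t not in qualifiers]
    # Step 5 fallback: if only qualifiers were present, keep the original tokens.
    if not tokens:
        tokens = all_tokens
    token_keys = [encode(t)[:max_len] for t in tokens]

    # Steps 6-7: concatenate, then deduplicate while preserving order.
    return list(dict.fromkeys(k for k in token_keys + [whole_key] if k))


# Stand-in encoder: Metaphone values copied from Table 5-4, not computed.
TABLE_5_4_CODES: Dict[str, str] = {
    "HAFIZ": "HFS", "ABDALHAFIZ": "APTL", "AL": "AL",
    "SOLEIMAN": "SLMN", "HAMAD": "HMT", "SOLEIMANHAMAD": "SLMN",
    "N": "N", "KOMO": "KM", "NKOMO": "NKM",
}


def lookup_encode(token: str) -> str:
    return TABLE_5_4_CODES[token]


for name in ("ABD AL HAFIZ", "AL", "SOLEIMAN HAMAD", "N KOMO"):
    print(name, "->", "|".join(family_name_cluster_keys(name, lookup_encode)))
```

In practice the lookup would be replaced by a real Metaphone encoder. Different Metaphone implementations can produce slightly different codes, so the values in Table 5-4 remain the reference for expected output.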