6.2.3 Individual Given Names Cluster (dnClusterGivenNames)

The Given Names cluster acts as a further backup to the other clusters, particularly for cases where names are not cleanly structured into family and given names.

Depending on the quality and cultural origin of the name data, this cluster is often not required. To measure how many additional alerts it identifies, run matching once with the cluster disabled and once with it enabled. Comparing the two sets of relationships shows which relationships are found only when this cluster is used.
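The comparison described above amounts to a set difference between the two runs. The following is a minimal sketch, assuming each relationship can be exported as a pair of record identifiers; the `Relationship` type, the function name, and the sample identifiers are hypothetical and are not part of the product.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: (customer record id, watchlist record id).
Relationship = Tuple[str, str]


def added_relationships(
    without_cluster: Iterable[Relationship],
    with_cluster: Iterable[Relationship],
) -> Set[Relationship]:
    """Return the relationships that appear only when the cluster is enabled."""
    return set(with_cluster) - set(without_cluster)


# Illustrative data only; real runs would load exported match results.
baseline = [("C1", "W9")]
enabled = [("C1", "W9"), ("C2", "W4")]
extra = added_relationships(baseline, enabled)
```

If `extra` is empty or very small for representative data, the cluster is probably adding little for that data set.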

The default logic of the cluster builder is as follows:
  1. Split the normalized full name into several name tokens, using space as a delimiter.

    Many other punctuation and noise characters are normalized to spaces before generating the cluster. For further information see Name Normalization.

  2. Standardize the normalized given names before clustering. This ensures, for example, that names such as 'William' and 'Bill' will be clustered together, although their raw Metaphone values are not the same. A space delimiter is used to split the name before standardizing.
  3. Apply the Metaphone transformation to the whole of the given names value after token standardization, outputting a key with a length of up to 4 characters.

    The following table shows example values for the Given Names cluster.

    Table 6-8 Given Names Cluster

    dnGivenNames            Metaphone values   dnClusterGivenNames
    XIAO JIAN               SJN                SJN
    ZHONG                   JNK                JNK
    MOHAMMED SANI           MHMT               MHMT
    JOSEPH TSANGA           JSFT               JSFT
    ABD AL WAHAB            APTL               APTL
    SULIMAN HAMD SULEIMAN   SLMN               SLMN
    AL BUTHE                ALP0               ALP0
    REGINALD B              RJNL               RJNL
    STEPHEN JEQE            STFN               STFN
    S J                     SJ                 SJ
    STEPHEN JEKE            STFN               STFN
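
The three steps of the default logic can be sketched as a small pipeline. This is an illustration under stated assumptions, not the product's implementation: the `NICKNAMES` map is a hypothetical stand-in for the standardization reference data, and `crude_phonetic_key` is a deliberately simple placeholder that is NOT Metaphone, so its keys will not match the values in Table 6-8. The phonetic function is passed as a parameter so that a real Metaphone encoder can be substituted.

```python
import re
from typing import Callable, Dict

# Hypothetical standardization map; the real reference data is not shown here.
NICKNAMES: Dict[str, str] = {
    "BILL": "WILLIAM",
    "WILL": "WILLIAM",
    "STEVE": "STEPHEN",
}


def crude_phonetic_key(text: str) -> str:
    """Placeholder encoder (NOT Metaphone): keep the first letter,
    then drop spaces and vowels from the rest."""
    letters = text.replace(" ", "")
    if not letters:
        return ""
    return letters[0] + re.sub(r"[AEIOU]", "", letters[1:])


def given_names_cluster_key(
    normalized_given_names: str,
    phonetic: Callable[[str], str] = crude_phonetic_key,
    max_len: int = 4,
) -> str:
    # Step 1: split the normalized value into tokens on whitespace.
    tokens = normalized_given_names.upper().split()
    # Step 2: standardize each token (for example, BILL becomes WILLIAM).
    standardized = [NICKNAMES.get(token, token) for token in tokens]
    # Step 3: encode the whole standardized value and keep up to max_len characters.
    return phonetic(" ".join(standardized))[:max_len]


key_bill = given_names_cluster_key("BILL")
key_william = given_names_cluster_key("WILLIAM")
```

Because standardization runs before encoding, `key_bill` and `key_william` are identical even though the raw names would encode differently. Truncating to four characters is also why long values that differ only in later tokens, such as STEPHEN JEQE and STEPHEN JEKE in the table, fall into the same cluster.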