Master Data

Soundex

To select the engine used for screening, specify it in the Master Data > Power Data > Configurations > Service Preference window.

This engine uses a phonetic algorithm and fixed-length keys to index names by sound, as pronounced in English. The Soundex code for a string consists of a letter followed by three digits. The letter is the first letter of the name, and the digits encode the remaining consonants. Similar-sounding consonants share the same digit. Vowels can affect the coding but are not coded themselves, except as the first letter.

Matching is performed in two steps:

  1. Encoding of the party and the restricted party details, using the following rules:
    • b, f, p, v => 1
    • c, g, j, k, q, s, x, z => 2
    • d, t => 3
    • l => 4
    • m, n => 5
    • r => 6
    • h, w are not coded
    • Two adjacent letters with the same number are coded as a single number.
    • Letters with the same number separated by an h or w are also coded as a single number.
    • Continue until you have one letter and three numbers. If you run out of letters, fill in 0s until there are three numbers.
  2. Matching the encoded output using the Dice Engine.
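
The encoding rules in step 1 can be sketched in Python as follows. This is a minimal illustration and not the engine's actual implementation: the function name `soundex` and the table `CODES` are hypothetical. It assumes that only the letters a to z are considered, that a vowel between two letters with the same number keeps them from merging (one reading of "vowels can affect the coding"), and that the first letter's own number also suppresses an immediately following letter with the same number, as in the commonly used American Soundex.

```python
# Digit assigned to each coded consonant, taken from the rules in step 1.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a 4-character Soundex code (hypothetical helper)."""
    # Assumption: ignore anything that is not a letter a-z.
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()          # the first letter is kept as-is
    prev = CODES.get(letters[0], "")     # number of the previous coded letter
    for ch in letters[1:]:
        if ch in "hw":
            continue                     # h and w do not separate equal numbers
        digit = CODES.get(ch, "")        # vowels (and y) have no number
        if digit and digit != prev:
            result += digit
            if len(result) == 4:         # one letter and three numbers
                break
        prev = digit                     # a vowel resets prev, so repeats count
    return result.ljust(4, "0")          # fill in 0s if letters run out


print(soundex("MULLAH"))
print(soundex("MAULAVI"))
```

For the two names used in the example in this topic, the sketch yields M400 and M410 respectively.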

Detailed Example of Matching the Encoded Output:

Consider a party named “MULLAH” and a restricted party named “MAULAVI”. The following steps show how the encoded output is matched.

  1. Encoding.

    Party: Full Name

    Restricted Party: Full Name

    Encoding(MULLAH) = M400

    Encoding(MAULAVI) = M410

  2. Dice Engine matching:
    1. Bigrams(M400) = {M4, 40, 00}, Bigrams(M410) = {M4, 41, 10}
    2. Common Bigrams = {M4}
    3. Dice Coefficient = (2*1)/(1+3) = 0.50
    4. Total Number of Letter Matches for Party word (MULLAH) = Dice Coefficient * Number of Letters in the word = 0.5 * 6 = 3
    5. Match factor of Full Name = Number of Letter Matches / Total Number of Letters on Party = 3/6 = 0.5 = 50%
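
The bigram and scoring steps can be sketched as follows. All function names here are hypothetical. `dice_coefficient` uses the textbook Sorensen-Dice definition over sets of bigrams, 2 * |common| / (|A| + |B|); the product's Dice Engine may count or weight bigrams differently, so the value it returns is not guaranteed to equal the coefficient shown in the example. `letter_matches` and `match_factor` follow steps 4 and 5 for a single-word name.

```python
def bigrams(code: str) -> set:
    """Return the set of adjacent character pairs in an encoded string."""
    return {code[i:i + 2] for i in range(len(code) - 1)}


def dice_coefficient(a: str, b: str) -> float:
    """Textbook Dice coefficient on bigram sets (assumed definition)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def letter_matches(coefficient: float, party_word: str) -> float:
    """Step 4: scale the coefficient by the length of the party word."""
    return coefficient * len(party_word)


def match_factor(coefficient: float, party_word: str) -> float:
    """Step 5: letter matches divided by total letters (single-word name)."""
    return letter_matches(coefficient, party_word) / len(party_word)


print(sorted(bigrams("M400")))                 # bigrams of the party code
print(sorted(bigrams("M400") & bigrams("M410")))  # common bigrams
print(letter_matches(0.5, "MULLAH"), match_factor(0.5, "MULLAH"))
```

Passing the coefficient 0.5 from step 3 into `letter_matches` and `match_factor` reproduces the 3 letter matches and the 50% match factor from steps 4 and 5.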
