2.3.1 Name Matching Rules
Table 2-15 Name Matching Rules
Group Code | Matching Rule | Logic Summary | Example Matching Data | Example Matching Data |
---|---|---|---|---|
I010 | Exact name | Given names and family name match exactly. | Given Names | Family Name |
JOSEPH | TSANGA | |||
JOSEPH | T’SANGA | |||
I020 | Original script name exact | The original script Name fields match exactly. | Original Script Name | Original Script Name |
АЛЕКСАНДР ОСОКИН | АЛЕКСАНДР ОСОКИН | |||
I030 | Standardized given name | Given names match after name standardization using the Given Name Map. Family name matches exactly. | Given Names | Family Name |
BILL | JONES | |||
WILLIAM | JONES | |||
I040 | Full name | The full name matches exactly, after standardization of all name tokens using the Given Name Map. | Full Names | - |
JOHN MIKE SMITH | - | |||
JOHN MICHAEL SMITH | - | |||
I050 | Full name without titles | The full name matches exactly, after standardization of all name tokens using the Given Name Map and removal of titles. | Full Names | - |
DR DOUGLAS BAKER | - | |||
DOUGLAS BAKER | - | |||
I060 | Abbreviated standardized given name | Given names match using a 'Starts With' comparison, after name standardization using the Given Name Map. Family name matches exactly. | Given Names | Family Name |
JOSEPH | TSANGA | |||
ABANDA | - | |||
JOSEPH | T’SANGA | |||
I070 | Given name similar and sounds like | Given name matches with an Edit Distance of 1 or 2 after name standardization. At least one of the given names, excluding initials, must match by a 4-character Metaphone key. Family name matches exactly. | Given Names | Family Name |
JOSEPH | ABANDA | |||
JOSEPH | ABANDA | |||
I080 | First name similar and sounds like | The first given name matches with an Edit | Given Names | Family Name |
AMER MOHAMMAD RASHEED | AL UBAIDI | |||
AMIR RASHID MOHAMMED | AL UBAIDI | |||
I090 | Additional given names | All name tokens from the given names field with fewest tokens must be present in the other given names field. Family name matches exactly. | Given Names | Family Name |
MOHAMMED | HANIF | |||
DIN MOHAMED | HANIF | |||
I100 | Additional names | All name tokens from the full name with fewest tokens must be present in the other full name. At least 2 name tokens must match with the same matching logic; that is, if a name only has one token it is not considered a match. At least 2 name tokens must exist in the Full Name. Note: Word Match Count may return >1 if a single name matches twice in a longer name string. For example, ‘ABDUL’ matches ‘ABDUL ABDUL’ with a Word Match Count of 2. | Full Name | - |
LOTFI RIHANI | - | |||
LOTFI BEN ABDUL HAMID BEN | - | |||
ALI RIHANI | - | |||
I110 | Original script name in any order | All names in the original script name fields match, regardless of order. | Original Script Name | Original Script Name |
Καρλος Μολινα | Μολινα Καρλος | |||
I120 | Original script name with typos | Original script name fields match with an | Original Script Name | Original Script Name |
Καρλος Μολινα | Καρλος | |||
I130 | All names in any order | All names in the full name match (using a Word Edit Distance of 0) after name token standardization, in any order. A single typo (1 character edit) is allowed in each name token. | Full Name | - |
ABDUL JABBER OMARI | - | |||
OMARI ABDUL JABBER | - | |||
I140 | Abbreviated given name | Given names match using a 'Starts With' comparison. Family name is a close Metaphone match. | Given Names | Family Name |
CHRIS | HUNT | |||
CHRISTOPHER | HUNTER | |||
I150 | Abbreviated given name and family name typos | Given names match using a 'Starts With' comparison, after name standardization using the Given Name Map. Family name matches with an Edit Distance of 1 or 2. At least one of the family name tokens, excluding initials, must match by a 4-character Metaphone key. | Given Names | Family Name |
IBRAHIM | MOHAMED | |||
ABDUL SALAM | BOYASSEER | |||
IBRAHIM | BOYASEER | |||
I160 | Abbreviated given name without titles and family name with typos | The first given name matches using a 'Starts With' comparison, after name token standardization and removal of titles. Family name matches with an Edit Distance of 1 or 2. At least one of the family name tokens, excluding initials, must match by a 4-character Metaphone key. | Given Names | Family Name |
SAHIR | BARHAN | |||
DR SAHIR MUSA | BERHIN | |||
I170 | Original script name in any order with typos | All names in the original script name fields match, regardless of order, with each name requiring an 80%+ Character Match Percentage score. | Original Script Name | Original Script Name |
Хасан Ченгић | Ченгић Хасcан | |||
I180 | First name and full name similar and sounds like | The full name matches with a Character Match Percentage of 80% or above, after name token standardization. At least one of the family name tokens, excluding initials, must match by a 4-character Metaphone key. | Given Names | Family Name |
MOHAMMAD HUSAYN | MASTASAEED | |||
MOHAMMAD HASSAN | MASTASAEED | |||
I190 | Given name similar and family names and sounds like | The given name matches with an Edit Distance of 1 or 2, after name standardization. The given name matches by 4-character Metaphone key, after name standardization. The family name matches with an Edit Distance of 1 or 2. The family name matches by 4-character Metaphone key. | Given Names | Family Name |
AMER | AL UBAIDI | |||
I200 | Abbreviated given name and family name similar | The first given name matches using a 'Starts With' comparison, after name token standardization. The family name matches with an Edit Distance of 1 or 2. The family name matches by 4-character Metaphone key. | Given Names | Family Name |
VIKTOR ANATOLYEVICH | BOUT | |||
VICTOR | BOOT | |||
I210 | Original script name additional names | All names in one original script name field must be fully contained within the other field, provided there are at least two names in each field. | Original Script Name | Original Script Name |
Миленко Врачар | Миленко | |||
I220 | Additional names typo tolerant | All name tokens from the full name with fewest tokens must be present in the other full name. A character error tolerance of 20% is allowed (that is, one character edit every 5 characters). At least 2 name tokens must match with the same matching logic. If a name contains only one token it is not considered a match according to this rule. Note: Word Match Count may return >1 if a single name matches twice in a longer name string. For example, ‘ABDUL’ matches ‘ABDUL ABDUL’ with a Word Match Count of 2. | Full Name | - |
ABDUL WAHED SHAFIQ | - | |||
ABDUL WAHAD | - | |||
I230 | Full name contained and multiple names in common | The full name matches with a 'Contains' match, after standardization of all name tokens using the Given Name Map. At least 2 name tokens must match in the full name. | Full Name | - |
ABU BAKAR | - | |||
ABU BAKAR BA’ASYI | - | |||
I240 | Full name characters longer | The full name matches with a Longest Common Substring Sum Percentage of 90%+, relating to the longer string, and considering substrings of 5 characters or more in length, after name standardization. | Full Name | - |
MOHAMMEDAL GHABRA | - | |||
ALGHABRA MUHAMAD | - | |||
RAMATULLAH WAHIDYAR FAQIR MOHAMMAD | - | |||
WAHIDYAR RAMA TULLAH | - | |||
I250 | Original script name additional names with typos | All names in one original script name field must be fully contained within the other field, provided there are at least two names (all of which have an 80%+ Character Match Percentage) in each field. | Original Script Name | Original Script Name |
Юри Неёлов | Юрий Васильевич Неёлов | |||
I260 | Abbreviated first name | The first given name matches using a 'Starts With' comparison, after name token standardization. Family name matches exactly. | Given Names | Family Name |
KHADAF | JANJALANI | |||
ABUBAKAR | - | |||
KHADAFFI | JANJALANI | |||
I270 | Additional names in any order | All name tokens from the full name with fewest tokens must be present in the other full name. At least 2 name tokens must match with the same matching logic. If a name contains only one token it is not considered a match according to this rule. Note: Word Match Count may return >1 if a single name matches twice in a longer name string. For example, ‘ABDUL’ matches ‘ABDUL ABDUL’ with a Word Match Count of 2. Matching is not order-sensitive. | Full Name | - |
HA THI NGUYEN | - | |||
THI HA | - | |||
I280 | Additional names in any order typo tolerant | All name tokens from the full name with fewest tokens must be present in the other full name. A character error tolerance of 20% is allowed (that is, one character edit every 5 characters). At least 2 name tokens must match with the same matching logic. If a name contains only one token it is not considered a match according to this rule. Note: Word Match Count may return >1 if a single name matches twice in a longer name string. For example, ‘ABDUL’ matches ‘ABDUL ABDUL’ with a Word Match Count of 2. Matching is not order-sensitive. | Full Name | - |
STEPHENS MARTIN | - | |||
MARRTIN JOHN STEPHENS | - | |||
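
Several rules in Table 2-15 are built from the same comparisons: Edit Distance, 'Starts With', and token containment with an optional typo tolerance. The Python sketch below illustrates how those comparisons could behave on the example data in the table. It is not the product's implementation. The function names, the `chars_per_edit` parameter, the use of Levenshtein distance for Edit Distance, and the greedy one-to-one token pairing are all assumptions made for illustration. Name standardization (Given Name Map), title removal, and Metaphone keys are not modeled.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed definition of 'Edit Distance')."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def starts_with_match(a: str, b: str) -> bool:
    """'Starts With' comparison: one name is a prefix of the other."""
    a, b = a.upper(), b.upper()
    return a.startswith(b) or b.startswith(a)


def tokens_match(a: str, b: str, chars_per_edit: int = 0) -> bool:
    """Compare two tokens; chars_per_edit=5 allows one edit per 5 characters
    (the 20% tolerance in I220/I280), chars_per_edit=0 requires equality."""
    allowed = max(len(a), len(b)) // chars_per_edit if chars_per_edit else 0
    return edit_distance(a, b) <= allowed


def additional_names_match(name_a: str, name_b: str,
                           chars_per_edit: int = 0,
                           min_tokens: int = 2) -> bool:
    """Order-insensitive containment in the style of I270 (exact tokens)
    and I280 (typo tolerant): every token of the shorter name must pair
    with a distinct token of the longer name."""
    tokens_a, tokens_b = name_a.upper().split(), name_b.upper().split()
    shorter, longer = sorted((tokens_a, tokens_b), key=len)
    if len(shorter) < min_tokens:
        return False  # a single-token name is never a match under these rules
    remaining = list(longer)
    for token in shorter:
        # Greedy pairing: a simplification, not an optimal assignment.
        hit = next((c for c in remaining
                    if tokens_match(token, c, chars_per_edit)), None)
        if hit is None:
            return False
        remaining.remove(hit)
    return True


if __name__ == "__main__":
    print(edit_distance("BOUT", "BOOT"))                        # 1
    print(starts_with_match("KHADAF", "KHADAFFI"))              # True
    print(additional_names_match("HA THI NGUYEN", "THI HA"))    # True
    print(additional_names_match("STEPHENS MARTIN",
                                 "MARRTIN JOHN STEPHENS",
                                 chars_per_edit=5))             # True
```

With `chars_per_edit=0` the last call returns False, because MARTIN and MARRTIN differ by one character. That mirrors the difference between the exact (I270) and typo-tolerant (I280) variants of the rule.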