Understanding the Sun Match Engine

The First Name Category File (personFirstName*.dat)

The first name category file defines standardized versions of first names and assigns a gender classification to each name. The file is used to standardize first names when person names are compared, and the gender classification helps refine the match. The Sun Match Engine uses this file when a first name field is defined for normalization or standardization in the Match Field file.

The syntax of this file is:

original-value standardized-form gender-class

You can modify or add entries in this file as needed. Table 10 describes the columns in the personFirstName*.dat file.

Table 10 First Name Category File

Column              Description

original-value      The original value of the first name.

standardized-form   The standardized version of the original value. A zero (0) in this field indicates that the original value is already in its standardized form.
                    If this column contains a name instead of a zero, that name must also be listed in a separate entry as an original value with a standardized form of “0”.

gender-class        An indicator of the gender with which the first name is associated. The possible values are:
                      • N – The name is gender-neutral and can be used for either males or females.
                      • F – The name is used for females.
                      • M – The name is used for males.
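
The column rules above can be checked mechanically before a modified file is deployed. The following is a minimal sketch, not part of the Sun Match Engine: it assumes the three columns are separated by whitespace (as in the shipped excerpt) and that names contain no embedded spaces. The names NameEntry, parse_entries, and validate are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple

VALID_GENDERS = {"N", "F", "M"}


class NameEntry(NamedTuple):
    original: str      # original-value column
    standardized: str  # standardized-form column; "0" means already standard
    gender: str        # gender-class column


def parse_entries(text):
    """Split each non-blank line into its three columns (assumes whitespace delimiters)."""
    entries = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(fields)}")
        entries.append(NameEntry(*fields))
    return entries


def validate(entries):
    """Return a list of human-readable problems; an empty list means the rules hold."""
    problems = []
    # Names that appear as an original value with a standardized form of 0.
    standard_names = {e.original for e in entries if e.standardized == "0"}
    for e in entries:
        if e.gender not in VALID_GENDERS:
            problems.append(f"{e.original}: unknown gender class {e.gender!r}")
        if e.standardized != "0" and e.standardized not in standard_names:
            problems.append(
                f"{e.original}: standardized form {e.standardized} "
                "is not listed as an original value with 0"
            )
    return problems


# Illustrative input: SUSAN is deliberately missing its own entry.
sample = """\
STEVE           STEPHEN         M
STEPHEN         0               M
SU              SUSAN           F
"""
entries = parse_entries(sample)
problems = validate(entries)
for p in problems:
    print(p)
```

In this sample, only the SU row is reported, because SUSAN does not appear as an original value with a standardized form of 0.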

The following is an excerpt from the personFirstNameUS.dat file. Certain rows contain a zero (0) for the standardized form, indicating that the name is already standard (for example, STEPHEN, STERLING, and SUMMER).


STEPHEN         0               M
STEPHENIE       STEPHANIE       F
STEPHIE         STEPHANIE       F
STEPHINE        STEPHANIE       F
STEPHNIE        STEPHANIE       F
STERLING        0               M
STEVE           STEPHEN         M
STEVEN          STEPHEN         M
STEVIE          STEPHEN         N
STEW            STUART          M
STEWART         STUART          M
STU             STUART          M
STUART          0               M
SU              SUSAN           F
SUE             SUSAN           F
SUHANTO         0               M
SULLIVAN        0               F
SULLY           SULLIVAN        F
SUMMER          0               F
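
To illustrate how rows like these can be read, the following minimal sketch loads a few of the rows above into a lookup table and resolves a name to its standardized form and gender class. It is an illustration only and does not describe the Sun Match Engine's internal implementation. The functions load_table and standardize are hypothetical, and two behaviors are assumptions: input is converted to uppercase before lookup, and a name that is not in the table is returned unchanged with no gender class.

```python
# A few rows copied from the excerpt above.
EXCERPT = """\
STEPHEN         0               M
STEVE           STEPHEN         M
STEVIE          STEPHEN         N
SULLIVAN        0               F
SULLY           SULLIVAN        F
"""


def load_table(text):
    """Map each original value to (standardized form, gender class)."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed lines in this sketch
        original, standardized, gender = fields
        # A zero means the original value is already the standardized form.
        resolved = original if standardized == "0" else standardized
        table[original] = (resolved, gender)
    return table


def standardize(table, first_name):
    """Look up a first name; unknown names pass through unchanged (assumption)."""
    key = first_name.strip().upper()  # uppercase lookup is an assumption
    return table.get(key, (key, None))


table = load_table(EXCERPT)
print(standardize(table, "Steve"))    # ('STEPHEN', 'M')
print(standardize(table, "Stephen"))  # ('STEPHEN', 'M')
print(standardize(table, "Stevie"))   # ('STEPHEN', 'N')
print(standardize(table, "Sully"))    # ('SULLIVAN', 'F')
```

Note that the gender class returned is the one on the looked-up row, so STEVIE resolves to STEPHEN but keeps its own classification of N.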