Understanding Sun Master Index Processing (Repository)

Sun Match Engine Match Types (Repository)

The Sun Master Index wizard match types fall into four primary categories.

The actual standardization and match types entered into the Match Field file vary for each match type you select in the wizard. The match and standardization types for each type of field are listed in the following descriptions. The match types entered into the Match Field file correspond to the match types defined in the match configuration file, MatchConfigFile.cfg.

Person Match Types

The Person match types include PersonLastName and PersonFirstName. These match types are used to normalize and phonetically encode name fields for person matching. For each field with one of these match types, the wizard adds two fields to the object structure: a phonetic version and a standardized version of the name. If you specify a field with a Person match type for blocking in the wizard, the phonetic version of the name is automatically added to the blocking query. The following fields are created when you specify one of the Person match types for a field (field_name refers to the name of the field specified for Person matching).

The corresponding standardization and match types in the Match Field file are listed in Table 24.

Table 24 Person Name Standardization and Match Types

| eView Wizard Match Type | Match Field File Standardization Type | Match Field File Match Type |
| --- | --- | --- |
| PersonLastName | PersonName | LastName |
| PersonFirstName | PersonName | FirstName |
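
The mapping in Table 24 can be read as a simple lookup from the wizard match type to the pair of types written to the Match Field file. The following Python sketch is illustrative only: the dictionary and function names are hypothetical and are not part of Sun Master Index; only the type names are taken from Table 24.

```python
# Illustrative only: mirrors Table 24; this is not a Sun Master Index API.
PERSON_MATCH_TYPES = {
    # wizard match type: (standardization type, match type)
    "PersonLastName": ("PersonName", "LastName"),
    "PersonFirstName": ("PersonName", "FirstName"),
}


def match_field_types(wizard_match_type):
    """Return (standardization type, match type) for a Person wizard match type."""
    try:
        return PERSON_MATCH_TYPES[wizard_match_type]
    except KeyError:
        raise ValueError(
            f"not a Person match type: {wizard_match_type!r}"
        ) from None


if __name__ == "__main__":
    for wizard_type in PERSON_MATCH_TYPES:
        std_type, match_type = match_field_types(wizard_type)
        print(f"{wizard_type}: standardization={std_type}, match={match_type}")
```

Both Person match types share the PersonName standardization type and differ only in the match type.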

BusinessName Match Types

The BusinessName match type is designed to help parse, normalize, and phonetically encode a business name. BusinessName matching adds several fields to the object structure and to the match string. If you specify a business name field for blocking, each parsed business name field is added to the blocking query. The corresponding standardization type in the Match Field file for all fields selected for BusinessName matching is also BusinessName. The actual match type assigned to each field varies depending on the type of information in each field.

Table 25 lists the fields created when you select the BusinessName match type for a field along with their corresponding match types in the Match Field file (field_name refers to the name of the field selected for BusinessName matching).


Note – Specify this type of matching for only one business name field; otherwise, the wizard will create duplicate entries in the object structure. If more than one field contains the business name, you can add those fields to the standardization structure in the Match Field file after the wizard creates the configuration files.


Table 25 BusinessName Match Types

| Field Name | Description | Added to the Match String? | Match Field File Match Type |
| --- | --- | --- | --- |
| field_name_Name | The parsed and normalized version of the business name. | Yes | PrimaryName |
| field_name_NamePhon | The phonetic version of the business name. | No | |
| field_name_OrgType | The parsed organization type of the business name. | Yes | OrgTypeKeyword |
| field_name_AssocType | The association type for the business. | Yes | AssocTypeKeyword |
| field_name_Industry | The name of the industry for the business. | Yes | IndustryTypeKeyword |
| field_name_Sector | The name of the industry sector (industries are a subset of sectors). | Yes | IndustrySectorList |
| field_name_Alias | An alias for the business name. | No | |
| field_name_Url | The business’ web site URL. | Yes | Url |
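
To make the distinction between fields added to the object structure and fields added to the match string concrete, the following Python sketch models Table 25 as data. It is an illustration under stated assumptions, not the wizard's implementation: the function names and the sample field name CompanyName are hypothetical, while the suffixes and match types come from Table 25.

```python
# Illustrative only: mirrors Table 25; this is not a Sun Master Index API.
# Each entry is (field suffix, match type). A match type of None means the
# field is created in the object structure but not added to the match string.
BUSINESS_NAME_FIELDS = [
    ("_Name", "PrimaryName"),
    ("_NamePhon", None),
    ("_OrgType", "OrgTypeKeyword"),
    ("_AssocType", "AssocTypeKeyword"),
    ("_Industry", "IndustryTypeKeyword"),
    ("_Sector", "IndustrySectorList"),
    ("_Alias", None),
    ("_Url", "Url"),
]


def business_name_fields(field_name):
    """List every field created for a BusinessName field."""
    return [field_name + suffix for suffix, _ in BUSINESS_NAME_FIELDS]


def business_name_match_string(field_name):
    """List the (field, match type) pairs that are added to the match string."""
    return [
        (field_name + suffix, match_type)
        for suffix, match_type in BUSINESS_NAME_FIELDS
        if match_type is not None
    ]


if __name__ == "__main__":
    # "CompanyName" is a hypothetical field selected for BusinessName matching.
    print(business_name_fields("CompanyName"))
    print(business_name_match_string("CompanyName"))
```

In this model, eight fields are created for the selected field, and the six that have a match type are the ones added to the match string.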

Address Match Types

The Address match type is designed to help parse, normalize, and phonetically encode an address for matching or standardizing address information. Address matching adds several fields to the object structure and to the match string. If you specify an address field for blocking, the parsed fields are added to the blocking query. The corresponding standardization type for fields selected for Address matching is Address. The actual match type assigned to each field varies depending on the type of information in each field.

Table 26 lists the fields created when you select the Address match type for a field along with their corresponding match types in the Match Field file (field_name refers to the name of the field selected for Address matching).


Note – Specify this type of matching for only one street address field; otherwise, the wizard will create duplicate entries in the object structure. If more than one field contains the street address, you can define the additional fields in the standardization structure in the Match Field file after the wizard creates the configuration files.


Table 26 Address Match Types

| Field Name | Description | Added to the Match String? | Match Field File Match Type |
| --- | --- | --- | --- |
| field_name_HouseNo | The parsed street number of the address. | Yes | HouseNumber |
| field_name_StDir | The parsed and normalized street direction of the address. | Yes | StreetDir |
| field_name_StName | The parsed and normalized street name of the address. | Yes | StreetName |
| field_name_StPhon | The phonetic version of the street name. | No | |
| field_name_StType | The parsed and normalized street type of the address, such as Boulevard, Street, Drive, and so on. | Yes | StreetType |

If you want to search on street addresses but do not want to use these fields for matching, select the Address match type for only one street address field in the wizard. When the wizard is complete, you can remove the address fields from the match string in the Match Field file.
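
The search-only scenario described above can be sketched as follows. This Python example is an illustration, not the wizard's implementation or the Match Field file format: the function names, the sample field name AddressLine1, and the representation of the match string as a list of pairs are all assumptions made for the sketch. The suffixes and match types come from Table 26.

```python
# Illustrative only: mirrors Table 26; this is not a Sun Master Index API.
ADDRESS_FIELDS = [
    ("_HouseNo", "HouseNumber"),
    ("_StDir", "StreetDir"),
    ("_StName", "StreetName"),
    ("_StPhon", None),  # created, but not added to the match string
    ("_StType", "StreetType"),
]


def wizard_output(field_name):
    """Return (object structure fields, match string entries) for an Address field."""
    object_fields = [field_name + suffix for suffix, _ in ADDRESS_FIELDS]
    match_string = [
        (field_name + suffix, match_type)
        for suffix, match_type in ADDRESS_FIELDS
        if match_type is not None
    ]
    return object_fields, match_string


def remove_from_match_string(match_string, field_name):
    """Drop the parsed address fields from the match string (search-only use)."""
    prefix = field_name + "_"
    return [(f, t) for f, t in match_string if not f.startswith(prefix)]


if __name__ == "__main__":
    # "AddressLine1" is a hypothetical field selected for Address matching.
    object_fields, match_string = wizard_output("AddressLine1")
    match_string = remove_from_match_string(match_string, "AddressLine1")
    print(object_fields)  # parsed fields remain in the object structure
    print(match_string)   # address entries no longer take part in matching
```

The point of the sketch is that removing entries from the match string does not remove the parsed fields from the object structure, so they remain available for searching.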

Miscellaneous Match Types

Several additional match types are defined in the wizard for the Sun Match Engine. These match types indicate matching on string, date, or number fields other than those described above, or matching on a field that contains a single character (such as the gender field, which might accept “F” for female or “M” for male). These match types do not define standardization for the specified field and do not add any fields to the object structure. If you specify one of these match types for a field in the wizard, the field is added to the match string with a match type of String, Date, Number, or Char.
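
By contrast with the Person, BusinessName, and Address types, a miscellaneous match type only adds the original field to the match string. The following Python sketch illustrates that behavior under the same assumptions as a simple list-of-pairs model of the match string; the function name and the sample field names are hypothetical.

```python
# Illustrative only; this is not a Sun Master Index API.
MISC_MATCH_TYPES = {"String", "Date", "Number", "Char"}


def add_misc_field(match_string, field_name, match_type):
    """Return a new match string with the field appended under a miscellaneous type.

    No standardization is defined and no fields are created, so the
    original field name is used unchanged.
    """
    if match_type not in MISC_MATCH_TYPES:
        raise ValueError(f"not a miscellaneous match type: {match_type!r}")
    return match_string + [(field_name, match_type)]


if __name__ == "__main__":
    # "Gender" and "DOB" are hypothetical field names.
    match_string = add_misc_field([], "Gender", "Char")
    match_string = add_misc_field(match_string, "DOB", "Date")
    print(match_string)
```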