Understanding the Sun Match Engine

Sun Match Engine Business Name Processing Fields

When matching on free-form business names, not all fields in a record need to be processed by the Sun Match Engine. The match engine only needs to process fields that must be parsed, normalized, or phonetically converted, and the fields against which matching is performed. These fields are defined in the Match Field file, and processing logic for each field is defined in the standardization and matching configuration files.

Business Name Match String Fields

The match string processed by the Sun Match Engine is defined by the match fields specified in the Match Field file. If you specify a “BusinessName” match type for any field in the wizard, most of the parsed business name fields are automatically added to the match string in the Match Field file, including the name, organization type, association type, sector, industry, and URL. You can remove any of these fields from the match string.

The match engine can process any combination of fields you specify for matching. By default, the match configuration file (matchConfigFile.cfg) includes rows specifically for matching on the fields that are parsed from the business name fields. The file also defines several generic match types. You can use any of the existing rows for matching or you can add rows for the fields you want to match.

Business Name Standardized Fields

The Sun Match Engine expects that business name data will be provided in a free-form text field containing several components that must be parsed. The match engine is designed to parse these components, and to normalize and phonetically encode the business name. You can specify additional fields for phonetic encoding.

If you specify the “BusinessName” match type for any field in the wizard, a standardization structure for that field is defined in the Match Field file. The fields defined as the target fields are listed in the next section, Business Name Object Structure.

Business Name Object Structure

For the default configuration of the business name data type, the address fields specified for standardization are parsed into several additional fields, one of which is also normalized. If you specify the appropriate match type in the wizard, the following fields are automatically added to the object structure and database creation script.

You can add these fields manually if you do not specify a match type in the wizard.