Understanding the Sun Match Engine

Sun Match Engine Business Name Matching Overview

Matching on the business name data type includes standardizing and matching on free-form business name fields. You can implement business name standardization and matching on its own or within a master index application designed to process person information. For example, standardizing business name fields allows you to include these fields as search criteria, even though matching might not be performed against these fields.

The Sun Match Engine can create standardized and phonetic values for business names. Several configuration files are designed specifically to handle business names to define additional logic for the standardization and phonetic encoding process. These include reference files, a patterns file, and key type files. The Sun Match Engine can match on any field as long as the match type for the field is defined in the match configuration file (matchConfigFile.cfg). The business name standardization files are common to all national domains, so no domain-specific configuration is required.

For more information about the fields involved in business name standardization and matching, see Sun Match Engine Business Name Processing Fields.

Sun Match Engine Business Name Processing Fields

When matching on free-form business names, not all fields in a record need to be processed by the Sun Match Engine. The match engine only needs to process fields that must be parsed, normalized, or phonetically converted, and the fields against which matching is performed. These fields are defined in the Match Field file, and processing logic for each field is defined in the standardization and matching configuration files.

Business Name Match String Fields

The match string processed by the Sun Match Engine is defined by the match fields specified in the Match Field file. If you specify a “BusinessName” match type for any field in the wizard, most of the parsed business name fields are automatically added to the match string in the Match Field file, including the name, organization type, association type, sector, industry, and URL. You can remove any of these fields from the match string.

The match engine can process any combination of fields you specify for matching. By default, the match configuration file (matchConfigFile.cfg) includes rows specifically for matching on the fields that are parsed from the business name fields. The file also defines several generic match types. You can use any of the existing rows for matching or you can add rows for the fields you want to match.

Business Name Standardized Fields

The Sun Match Engine expects that business name data will be provided in a free-form text field containing several components that must be parsed. The match engine is designed to parse these components, and to normalize and phonetically encode the business name. You can specify additional fields for phonetic encoding.

If you specify the “BusinessName” match type for any field in the wizard, a standardization structure for that field is defined in the Match Field file. The fields defined as the target fields are listed in the next section, Business Name Object Structure.

Business Name Object Structure

For the default configuration of the business name data type, the address fields specified for standardization are parsed into several additional fields, one of which is also normalized. If you specify the appropriate match type in the wizard, the following fields are automatically added to the object structure and database creation script.

You can add these fields manually if you do not specify a match type in the wizard.