Understanding the Sun Match Engine

Sun Match Engine Address Matching Overview

Processing the address data type involves both standardizing and matching address information in the master index application. You can implement street address standardization and matching on its own, or within a master index application designed to process person or business information. Standardization is useful even where no matching takes place: standardizing address information allows you to include address fields as search criteria even when matching is not performed against those fields.

The Sun Match Engine can create standardized and phonetic values for street address information. Several configuration files specific to address data define additional logic for the standardization and phonetic encoding process. These include the address clues files, a patterns file, and a constants file. The United States address standardization engine is based on work performed at the US Census Bureau. The clues files, in particular, are based on Census Bureau statistics.
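To make the idea of standardized and phonetic values concrete, the following is a minimal sketch and not the engine's own logic. The clue table is a tiny hypothetical stand-in for the clues files, and American Soundex is used purely as a stand-in phonetic encoder, since this section does not say which encoding the engine applies to street names.

```python
# Hypothetical clue table: maps raw street-type tokens to one normalized form.
# The real clues files are much larger and are not reproduced here.
STREET_TYPE_CLUES = {
    "STREET": "St", "ST": "St",
    "AVENUE": "Ave", "AVE": "Ave",
    "BOULEVARD": "Blvd", "BLVD": "Blvd",
}


def normalize_token(token):
    """Return the normalized form of a token, or the token itself if unknown."""
    key = token.strip(".,").upper()
    return STREET_TYPE_CLUES.get(key, token)


# Standard American Soundex letter groups (stand-in encoder, an assumption).
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        SOUNDEX_CODES[ch] = digit


def soundex(name):
    """Encode a name as a four-character Soundex code."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = SOUNDEX_CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (first + "".join(encoded) + "000")[:4]


print(normalize_token("Avenue"))          # Ave
print(soundex("Main"), soundex("Mayne"))  # both spellings share one code
```

Storing a phonetic code alongside the normalized value is what lets spelling variants of the same street name be found by a single search.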

The Sun Match Engine can match on any field as long as the match type for the field is defined in the match configuration file (matchConfigFile.cfg).

For information about the fields involved in address standardization and matching, see Sun Match Engine Address Data Processing Fields.

Sun Match Engine Address Data Processing Fields

When matching on address data, not all fields in a record need to be processed by the Sun Match Engine. The match engine needs to process only the address fields that must be parsed, normalized, or phonetically encoded, and the fields against which matching is performed. These fields are defined in the Match Field file, and the processing logic for each field is defined in the standardization and matching configuration files.

Address Data Match String Fields

The match string processed by the Sun Match Engine is defined by the match fields specified in the Match Field file. If you specify an “Address” match type for any field in the wizard, the default fields that store the parsed data are automatically added to the match string in the Match Field file. These fields include the house number, street direction, street type, and street name. You can remove any of these fields from the match string.

The match engine can process any combination of fields you specify for matching. By default, the match configuration file (matchConfigFile.cfg) includes rows specifically for matching on the fields that are parsed from the street address fields, such as the house number, street direction, and so on. The file also defines several generic match types. You can use any of the existing rows for matching, or you can add rows for the fields you want to match.
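The idea of one configuration row per matched field can be sketched as follows. This is only an illustration under stated assumptions: the row layout, the field names, and the agreement and disagreement weights are all hypothetical, and they reflect neither the actual format of matchConfigFile.cfg nor the engine's scoring method.

```python
# Hypothetical rows: (field name, weight when values agree, weight when they differ).
# These are NOT the contents or syntax of matchConfigFile.cfg.
MATCH_ROWS = [
    ("house_number", 3.0, -2.0),
    ("street_direction", 1.0, -1.0),
    ("street_type", 1.0, -1.0),
    ("street_name", 5.0, -4.0),
]


def match_score(record_a, record_b, rows=MATCH_ROWS):
    """Sum per-field weights for the fields listed in the rows."""
    score = 0.0
    for field, agree, disagree in rows:
        a = record_a.get(field, "").strip().upper()
        b = record_b.get(field, "").strip().upper()
        if not a or not b:
            continue  # in this sketch, a missing value contributes nothing
        score += agree if a == b else disagree
    return score


first = {"house_number": "123", "street_direction": "N",
         "street_type": "St", "street_name": "Main"}
second = dict(first, house_number="125")

print(match_score(first, first))
print(match_score(first, second))
```

Removing a row from the hypothetical table removes that field from the comparison, which mirrors the way a field can be dropped from the match string.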

Address Data Standardized Fields

The Sun Match Engine expects that street address data will be provided in a free-form text field containing several components that must be parsed. By default, the match engine is configured to parse these components and to normalize and phonetically encode the street name. You can specify additional fields for phonetic encoding.
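As a rough illustration of parsing a free-form street address into components, the sketch below splits a string into house number, street direction, street name, and street type. The token rules, the lookup tables, and the output key names are assumptions made for the example. The engine itself relies on its clues, patterns, and constants files, which are not reproduced here.

```python
import re

# Hypothetical lookup tables, far smaller than real address clue data.
DIRECTIONS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}
STREET_TYPES = {"ST": "St", "STREET": "St", "AVE": "Ave", "AVENUE": "Ave",
                "RD": "Rd", "ROAD": "Rd"}


def parse_street_address(text):
    """Split a free-form street address into four parsed components."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)  # drops punctuation such as "."
    parsed = {"house_number": "", "street_direction": "",
              "street_name": "", "street_type": ""}
    # A leading token that starts with a digit is treated as the house number.
    if tokens and tokens[0][0].isdigit():
        parsed["house_number"] = tokens.pop(0)
    # A direction is consumed only if something remains to be the street name.
    if len(tokens) > 1 and tokens[0].upper() in DIRECTIONS:
        parsed["street_direction"] = tokens.pop(0).upper()
    # A trailing street type is normalized through the lookup table.
    if len(tokens) > 1 and tokens[-1].upper() in STREET_TYPES:
        parsed["street_type"] = STREET_TYPES[tokens.pop().upper()]
    parsed["street_name"] = " ".join(t.capitalize() for t in tokens)
    return parsed


print(parse_street_address("123 N. Main Street"))
```

A simple rule set like this mishandles inputs such as "1st Ave", where the leading token is part of the street name. Cases of that kind are why pattern-driven parsing is needed in practice.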

If you specify an “Address” match type for any field in the wizard, a standardization structure for that field is defined in the Match Field file. The fields listed below under Address Data Object Structure are automatically defined as the target fields. Each of these fields has several entries in the standardization structure. This is because different parsed components can be stored in the same field. For example, the house number, post office box number, and rural route identifier are all stored in the house number field. If you do not specify address fields for matching in the wizard but want to standardize the fields, you can create a standardization structure in the Match Field file.
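The many-to-one relationship between parsed components and target fields can be sketched as a simple mapping. The component and field names below are hypothetical and are not the names used in the Match Field file. The rule that the first non-empty component wins is also an assumption of this sketch.

```python
# Hypothetical mapping from parsed component to target field.
# Several components share one target, as with the house number field.
TARGET_FIELD_FOR = {
    "house_number": "house_number_field",
    "po_box_number": "house_number_field",
    "rural_route_id": "house_number_field",
    "street_direction": "street_direction_field",
    "street_name": "street_name_field",
    "street_type": "street_type_field",
}


def store_components(components):
    """Place each non-empty parsed component into its target field."""
    record = {}
    for component, value in components.items():
        if not value:
            continue  # nothing was parsed for this component
        field = TARGET_FIELD_FOR.get(component)
        if field is None:
            continue  # component has no target in this sketch
        # Assumption: if two components map to one field, the first one wins.
        record.setdefault(field, value)
    return record


print(store_components({"po_box_number": "742"}))
print(store_components({"house_number": "123", "street_name": "Main"}))
```

Because each source component needs its own entry pointing at the shared target, a single target field ends up with several entries, which is the pattern the standardization structure follows.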

Address Data Object Structure

The address fields specified for standardization are parsed into several additional fields. If you specify the “Address” match type in the wizard, the following fields are automatically added to the object structure and database creation script.

You can add these fields manually if you do not specify a match type in the wizard.

Note: The object structure for Sun Master Patient Index uses a slightly different naming convention. For the names of the fields defined for Sun Master Patient Index, refer to the Sun SeeBeyond eIndex Single Patient View User’s Guide.