Understanding the Master Index Standardization Engine

Address Patterns File

The address patterns file (patterns.dat) defines the expected input patterns of each individual street address field being standardized so the Master Index Standardization Engine can recognize and process these values. Tokens indicate the type of address component in the input and output fields. This file contains two rows for each pattern. The first row defines the input pattern for each address field and provides an example. The second row defines the output pattern for each address field, the pattern type, the relative importance of the pattern compared to other patterns, and usage flags. Below is an example.


AU A1 TY                01 Oak B Street
NA NA ST                T* 75                TX

When an address is parsed, each line of the address is delineated by a pipe (|) and sent to the parser separately. The output tokens for each line are then concatenated and the output pattern is processed using the address patterns file to determine whether the output pattern is listed in the file. If the pattern is found, output patterns are modified as indicated in the patterns file to resolve any ambiguities that might arise when two lines of address information contain common elements. The relative importance determines which pattern to use when the format of the input field matches more than one pattern. This file should only be modified by personnel with a thorough understanding of address patterns and tokens.
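The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the engine's implementation: the names `split_address_lines`, `concatenate_tokens`, `resolve_pattern`, `PATTERN_TABLE`, and `stub_parse_line` are hypothetical, the per-line parser is replaced by a hard-coded stub, and the single table entry is borrowed from the excerpt of the address patterns file. Which token vocabulary the real per-line parser emits is not specified in this section.

```python
from typing import Callable, Dict, List, Sequence, Tuple

TokenList = List[str]


def split_address_lines(address: str) -> List[str]:
    """Split a pipe-delimited address into its individual lines."""
    return [part.strip() for part in address.split("|") if part.strip()]


def concatenate_tokens(address: str,
                       parse_line: Callable[[str], TokenList]) -> TokenList:
    """Send each line to the parser separately, then join the token lists."""
    tokens: TokenList = []
    for line in split_address_lines(address):
        tokens.extend(parse_line(line))
    return tokens


def resolve_pattern(tokens: Sequence[str],
                    table: Dict[Tuple[str, ...], TokenList]) -> TokenList:
    """Return the pattern listed in the table, or the tokens unchanged."""
    return list(table.get(tuple(tokens), tokens))


# Hypothetical in-memory stand-in for the patterns file (one entry only).
PATTERN_TABLE: Dict[Tuple[str, ...], TokenList] = {
    ("NU", "DR", "TY", "NU", "DR"): ["HN", "PD", "PT", "NA", "SD"],
}

# Stub for the per-line parser: a fixed lookup, purely for illustration.
STUB_LINES: Dict[str, TokenList] = {
    "123 South Avenida": ["NU", "DR", "TY"],
    "1 West": ["NU", "DR"],
}


def stub_parse_line(line: str) -> TokenList:
    return STUB_LINES[line]


combined = concatenate_tokens("123 South Avenida | 1 West", stub_parse_line)
resolved = resolve_pattern(combined, PATTERN_TABLE)
print(combined)
print(resolved)
```

In this sketch a concatenated pattern that is not listed in the table is returned unchanged; how the engine treats unlisted patterns is not covered here.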

The syntax of this file is:

input-pattern example
output-pattern pattern-class pattern-modifier priority usage-flag exclude-flag

You can modify or add entries in this table as needed. The following table describes the columns in the address patterns file.

Table 6 Address Patterns File

Column            Description

input-pattern     Tokens that represent a possible input pattern from an individual unparsed street address field. Each token represents one component. For more information about address tokens, see Address Type Tokens.

example           An example of a street address that fits the specified pattern. This file element is optional.

output-pattern    Tokens that represent the output pattern for the specified input pattern. Each token represents one component of the output of the Master Index Standardization Engine. For more information about address tokens, see Address Type Tokens.

pattern-class     An indicator of the type of address component represented by the pattern. Possible pattern types are listed in Pattern Classes.

pattern-modifier  An indicator of whether the priority of the pattern is averaged against other patterns that match the input. Pattern modifiers are listed in Pattern Modifiers.

priority          The priority weight to use for the pattern when the pattern is a sub-pattern of a larger input pattern. For more information, see Priority Indicators.

usage-flag        A flag indicating how the term is used (for more information, see Pattern Classes). This file element is optional.

exclude-flag      This file element is optional.

Following is an excerpt from the address patterns file.


NU DR TY A1 AU                     01   123 South Avenida B Oak
HN PD PT NA NA                     H* 70

NU DR TY NU DR                     01   123 South Avenida 1 West
HN PD PT NA SD                     H* 70

NU A1 TY AU TY                     01   123 C circle hill drive
HN HS NA NA ST                     H* 70

NU A1 AM A1 TY                     01   123 M & M road
HN NA NA NA ST                     H* 65

NU TY AU A1                        01   123 Avenida Oak B
HN PT NA NA                        H* 60

NU TY NU A1                        01   123 Avenida 1 B
HN PT NA NA                        H* 60
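As a reading aid, the two-row layout shown in the excerpt can be parsed with a short Python sketch. Everything here is an assumption made for illustration: the names `PatternEntry`, `parse_entry`, `parse_patterns`, and `highest_priority` are hypothetical, column groups are assumed to be separated by two or more spaces, the pattern class and modifier are assumed to be fused into one token such as `H*`, and the `01` value in the first row, which this section does not describe, is kept verbatim in `row_code`. The selection helper ignores the pattern modifier and simply picks the largest priority among exact input matches.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence

_GAP = re.compile(r"\s{2,}")  # assumed separator between column groups


@dataclass
class PatternEntry:
    input_tokens: List[str]
    row_code: str          # undocumented value such as "01", kept as-is
    example: str
    output_tokens: List[str]
    pattern_class: str
    pattern_modifier: str
    priority: int
    flags: List[str]       # usage/exclude flags, not told apart here


def parse_entry(input_row: str, output_row: str) -> PatternEntry:
    """Parse one two-row entry of the patterns file."""
    in_parts = _GAP.split(input_row.strip(), maxsplit=1)
    row_code, example = "", ""
    if len(in_parts) > 1:
        rest = in_parts[1].split(None, 1)
        row_code = rest[0]
        example = rest[1] if len(rest) > 1 else ""
    out_parts = _GAP.split(output_row.strip())
    class_mod, priority = out_parts[1].split()
    return PatternEntry(
        input_tokens=in_parts[0].split(),
        row_code=row_code,
        example=example,
        output_tokens=out_parts[0].split(),
        pattern_class=class_mod[0],
        pattern_modifier=class_mod[1:],
        priority=int(priority),
        flags=out_parts[2].split() if len(out_parts) > 2 else [],
    )


def parse_patterns(text: str) -> List[PatternEntry]:
    """Pair up the non-blank rows and parse each pair."""
    rows = [line for line in text.splitlines() if line.strip()]
    if len(rows) % 2:
        raise ValueError("expected an even number of non-blank rows")
    return [parse_entry(rows[i], rows[i + 1]) for i in range(0, len(rows), 2)]


def highest_priority(entries: Sequence[PatternEntry],
                     input_tokens: Sequence[str]) -> Optional[PatternEntry]:
    """Pick the exact input match with the largest priority, if any."""
    matches = [e for e in entries if e.input_tokens == list(input_tokens)]
    return max(matches, key=lambda e: e.priority) if matches else None


EXCERPT = """
NU DR TY A1 AU                     01   123 South Avenida B Oak
HN PD PT NA NA                     H* 70

NU A1 AM A1 TY                     01   123 M & M road
HN NA NA NA ST                     H* 65

NU TY AU A1                        01   123 Avenida Oak B
HN PT NA NA                        H* 60
"""

entries = parse_patterns(EXCERPT)
for entry in entries:
    print(entry.input_tokens, "->", entry.output_tokens, entry.priority)
```

Splitting on runs of two or more spaces keeps multi-word examples such as `123 M & M road` intact, but it would misread a file that separates columns with single spaces or tabs, so the separator rule should be checked against the actual file before relying on it.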