Understanding the Master Index Standardization Engine

Address Clues File

The address clues file (clues.dat) lists common terms in street addresses, specifies a normalized value for each common term, and categorizes the terms into street address component types. A term can be categorized into multiple component types. A relevance value specifies which of the component types the term is most likely to be. For example, the term “Junction” is standardized as “Jct” and is classified as a street type, building unit, and generic term (giving relevance in that order).

This file helps the Master Index Standardization Engine recognize common terms in street addresses in order to parse and normalize the values correctly. The syntax of this file is:

common-term normalized-term ID-number/type-token

You can modify or add entries in this table as needed. The following table describes the columns in the address clues file.

Table 4 Address Clues File Columns

Column 

Description 

common-term

A term commonly found in street addresses. 

normalized-term 

The normalized version of the common term. 

ID-number/type-token 

An ID number and a token indicating the type of address component represented by the common term. The ID number corresponds to an ID number in the address master clues file, and the type token corresponds to the type specified for that ID number in the address master clues file. One term might have several ID number and token type pairs. Their order of appearance indicates their relevance value. 

Following is an excerpt from the US address clues file.


TRLR VLG          Trpk            59BU
TRPK              Trpk            59BU
TRPRK             Trpk            59BU
VILLA             Vlla            305TY          60BU
VLLA              Vlla            305TY          60BU
VILLAS            Vlla            60BU
VILL              Vlg             317TY          61BU        364AU
VILLAG            Vlg             317TY          61BU        364AU
VLG               Vlg             317TY          61BU        364AU
VILLAGE           Vlg             317TY          61BU        364AU
VILLG             Vlg             317TY          61BU        364AU
VILLIAGE          Vlg             317TY          61BU        364AU
VLGE              Vlg             317TY          61BU        364AU
VIVI              Vivi            62BU
VIVIENDA          Vivi            62BU
COLLEGE           Coll            64BU                       0AU
CLG               Coll            64BU
COTTAGE           Cott            65BU           65BP        0AU