The address clues file lists common terms in street addresses, specifies a normalized value for each common term, and categorizes the terms into street address component types. A term can be categorized into multiple component types. The relevance value indicates which of the component types the term most likely represents. For example, the term “Junction” is normalized as “Jct” and is classified as a street type, building unit, and generic term (listed in order of relevance).
This file helps the Sun Match Engine recognize common terms in street addresses and parse and normalize the values correctly. The syntax of this file is:
common-term normalized-term ID-number/type-token
You can modify or add entries in this table as needed. Table 16 describes the columns in the addressClueAbbrev*.dat file.
Table 16 Address Clues File Columns
Column | Description |
---|---|
common-term | A term commonly found in street addresses. |
normalized-term | The normalized version of the common term. |
ID-number/type-token | An ID number and a token indicating the type of address component represented by the common term. The ID number corresponds to an ID number in the address master clues file, and the type token corresponds to the type specified for that ID number in the address master clues file. One term might have several ID number and type token pairs. |
Following is an excerpt from the addressClueAbbrevUS.dat file.

    TRLR VLG    Trpk    59BU
    TRPK        Trpk    59BU
    TRPRK       Trpk    59BU
    VILLA       Vlla    305TY 60BU
    VLLA        Vlla    305TY 60BU
    VILLAS      Vlla    60BU
    VILL        Vlg     317TY 61BU 364AU
    VILLAG      Vlg     317TY 61BU 364AU
    VLG         Vlg     317TY 61BU 364AU
    VILLAGE     Vlg     317TY 61BU 364AU
    VILLG       Vlg     317TY 61BU 364AU
    VILLIAGE    Vlg     317TY 61BU 364AU
    VLGE        Vlg     317TY 61BU 364AU
    VIVI        Vivi    62BU
    VIVIENDA    Vivi    62BU
    COLLEGE     Coll    64BU 0AU
    CLG         Coll    64BU
    COTTAGE     Cott    65BU 65BP 0AU
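
To illustrate how an entry breaks down into the three columns, the following Python sketch splits a line into its common term, normalized term, and ID number/type token pairs. It is not part of the Sun Match Engine. The names `parse_clue_line`, `Clue`, and `PAIR_RE` are hypothetical, and the sketch assumes that columns are separated by whitespace, that each pair is a run of digits followed by two uppercase letters (as in the excerpt), and that a common term can contain spaces while a normalized term cannot. The actual file may use fixed-width columns, so treat this as a reading aid rather than a reference parser.

```python
import re
from typing import List, NamedTuple, Tuple

# Assumption: an ID-number/type-token pair looks like "317TY" or "0AU",
# that is, digits followed by exactly two uppercase letters.
PAIR_RE = re.compile(r"^(\d+)([A-Z]{2})$")


class Clue(NamedTuple):
    common_term: str
    normalized_term: str
    pairs: List[Tuple[int, str]]  # (ID number, type token), in file order


def parse_clue_line(line: str) -> Clue:
    """Split one address clues entry into its three columns."""
    tokens = line.split()

    # Peel ID/type pairs off the right-hand end of the line.
    pairs = []
    while tokens:
        match = PAIR_RE.match(tokens[-1])
        if match is None:
            break
        tokens.pop()
        pairs.append((int(match.group(1)), match.group(2)))
    pairs.reverse()  # restore the order used in the file

    # What remains must be at least a common term and a normalized term.
    if not pairs or len(tokens) < 2:
        raise ValueError("not a valid clue entry: %r" % line)

    normalized_term = tokens.pop()   # last remaining token
    common_term = " ".join(tokens)   # may contain spaces, e.g. "TRLR VLG"
    return Clue(common_term, normalized_term, pairs)


# A few entries copied from the excerpt above.
EXCERPT = [
    "TRLR VLG Trpk 59BU",
    "VILLAGE Vlg 317TY 61BU 364AU",
    "COTTAGE Cott 65BU 65BP 0AU",
]

# Index the parsed entries by common term for lookup.
clues = {}
for entry in EXCERPT:
    clue = parse_clue_line(entry)
    clues[clue.common_term] = clue

print(clues["VILLAGE"])
```

Peeling the pairs from the right-hand end first means a multiword common term such as `TRLR VLG` needs no special handling; the trade-off is that a normalized term that itself looks like a pair would be misread.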