Understanding the Master Index Standardization Engine

Creating Normalization and Lexicon Files

Lexicon files list the possible values for a field so the standardization engine can quickly and accurately recognize different field components. Normalization files list the nonstandard values that might be found in a field, each paired with its standard form, so the standardization engine can present the data in a common form. Create one file for each lexicon or normalization file that you referenced from standardizer.xml.

For more information about normalization and lexicon files, see Lexicon Files and Normalization Files.
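The difference between the two file types can be sketched in a few lines of Python. This is only an illustration of the idea, not the standardization engine's implementation: the names `NORMALIZATION`, `DIRECTION_LEXICON`, `normalize`, and `is_direction` are hypothetical, the sample values are taken from the listings in this topic, and the case-insensitive matching is an assumption.

```python
# Illustrative only: a tiny in-memory stand-in for the two file types.
# How the real standardization engine loads and matches values is not shown here.

# Normalization data: nonstandard value -> standard value.
NORMALIZATION = {
    "CRT": "COURT",
    "CT": "COURT",
    "CT.": "COURT",
    "DR": "DRIVE",
    "DR.": "DRIVE",
    "DRV": "DRIVE",
}

# Lexicon data: every value a field component may take.
# Treating this list as a "direction" lexicon is an assumption for the example.
DIRECTION_LEXICON = {"E", "EAST", "N", "NO", "NORTH", "S", "SO", "SOUTH"}


def normalize(token):
    """Return the standard form of token, or token unchanged if it is not listed."""
    # Upper-casing before lookup is an assumption, not documented engine behavior.
    return NORMALIZATION.get(token.upper(), token)


def is_direction(token):
    """Report whether token appears in the hypothetical direction lexicon."""
    return token.upper() in DIRECTION_LEXICON


print(normalize("Dr."))       # DRIVE
print(is_direction("north"))  # True
print(is_direction("DRIVE"))  # False
```

A normalization file answers "what is the standard form of this value?", while a lexicon file answers "is this value one of the recognized values for this component?".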

To Create Normalization and Lexicon Files

  1. For each normalization file you referenced in standardizer.xml, do the following:

    1. Create a text file in /WorkingDirectory/resource.

    2. Save the file under the name you used to reference it from standardizer.xml.

    3. In the file, enter a list of nonstandard values along with their standardized values, separating the nonstandard value from the standard value with a pipe (|) as shown below.

      COR|COURT
      CRT|COURT
      CR.|COURT
      CT|COURT
      CT.|COURT
      DR|DRIVE
      DR.|DRIVE
      DRV|DRIVE
      ...

    4. When you are finished, save and close the file.

  2. For each lexicon file you referenced in standardizer.xml, do the following:

    1. Create a text file in /WorkingDirectory/resource.

    2. Save the file under the name you used to reference it from standardizer.xml.

    3. In the file, enter a list of all possible values for the field as shown below.

      E
      EAST
      ET
      N
      NO
      NORTH
      NTH
      S
      SO
      SOUTH
      ...

    4. When you are finished, save and close the file.

  3. Continue to Packaging and Importing the Variant.
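
Before packaging, it can help to check that each file follows the layout described in the steps above. The following Python sketch is one possible pre-flight check, not a tool shipped with the product. The function names `parse_normalization` and `parse_lexicon` are hypothetical, and the rules it enforces beyond the documented layout (ignoring blank lines, rejecting duplicate or conflicting entries) are assumptions that the engine itself may not share.

```python
def parse_normalization(text):
    """Parse NONSTANDARD|STANDARD lines into a dict, rejecting malformed lines."""
    table = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        parts = line.split("|")
        # Exactly one pipe, with a non-empty value on each side.
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            raise ValueError(
                f"line {number}: expected NONSTANDARD|STANDARD, got {line!r}"
            )
        nonstandard, standard = parts[0].strip(), parts[1].strip()
        # Assumption: one nonstandard value should not map to two standard values.
        if nonstandard in table and table[nonstandard] != standard:
            raise ValueError(
                f"line {number}: {nonstandard!r} maps to two standard values"
            )
        table[nonstandard] = standard
    return table


def parse_lexicon(text):
    """Parse one-value-per-line text into a list, rejecting duplicate values."""
    seen = set()
    values = []
    for number, raw in enumerate(text.splitlines(), start=1):
        value = raw.strip()
        if not value:
            continue  # assumption: blank lines are ignored
        if value in seen:
            raise ValueError(f"line {number}: duplicate value {value!r}")
        seen.add(value)
        values.append(value)
    return values


# Usage with in-memory samples; in practice, read the text from the files
# you created in /WorkingDirectory/resource.
print(parse_normalization("COR|COURT\nCRT|COURT\nDR.|DRIVE\n"))
print(parse_lexicon("E\nEAST\nN\nNORTH\n"))
```

Parsing from a string rather than a path keeps the check easy to run against pasted content; wrapping it with `open(path).read()` is enough to point it at a real file.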