Configuring Sun Master Indexes

Modifying Master Index Standardization Files

You can fine-tune the standardization process by modifying the standardization files. For example, you can insert additional names or terms into the normalization or lexicon files, such as givenNames.txt and givenNameNormalization.txt. Depending on your data requirements, you might need to modify additional standardization files. Some of the patterns files (most notably the address patterns files) are very complex and should be modified only by personnel who thoroughly understand the defined patterns and tokens. If you modify standardization files, make sure you modify them for each variant specified in mefa.xml.
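Because a change must be repeated for each variant, a small script can help confirm that no variant was missed. The following Python sketch is illustrative only and is not part of the product. It assumes that the file being checked holds one entry per line, which might not match the actual layout of every standardization file, and the function name variants_missing_term and the directory arguments are hypothetical. Pass it the resource directories of the variants you maintain.

```python
from pathlib import Path


def variants_missing_term(variant_dirs, filename, term):
    """Return the variant directories whose copy of `filename` lacks `term`.

    Assumption: the file stores one entry per line. Comparison ignores
    letter case and surrounding whitespace. A variant that has no copy
    of the file at all is also reported as missing the term.
    """
    wanted = term.strip().upper()
    missing = []
    for variant_dir in variant_dirs:
        path = Path(variant_dir) / filename
        if not path.is_file():
            missing.append(str(variant_dir))
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        entries = {line.strip().upper() for line in lines}
        if wanted not in entries:
            missing.append(str(variant_dir))
    return missing
```

An empty list means every listed variant already contains the term. Any directory that is returned still needs the edit.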

You can modify the data configuration files (lexicon and normalization files), and you can also modify the process configuration files that define the data types, variants, and how data is standardized. The process files are more complex and should be modified only by someone who is familiar with standardization concepts and with the Master Index Standardization Engine. Instructions for modifying these files are not included here. For information about these files, see Understanding the Master Index Standardization Engine.

To Modify Standardization Data Configuration Files

  1. In the Projects window, expand the master index project to configure and then expand Standardization Engine.

  2. Expand instance, expand the variant to modify, and then expand resources.

  3. Open the file you want to modify in the NetBeans text editor.

  4. Modify the file in accordance with the information presented for each data type in Understanding the Master Index Standardization Engine.

  5. Save and close the file.
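If the same term must be added to several files, the manual edit in steps 3 through 5 could also be scripted outside NetBeans. The following Python sketch is one possible approach under stated assumptions, not a product utility. It assumes a lexicon file with one entry per line, and the function name add_lexicon_term and the .bak backup suffix are hypothetical. Confirm the actual file format in Understanding the Master Index Standardization Engine before using anything like it.

```python
import shutil
from pathlib import Path


def add_lexicon_term(path, term):
    """Append `term` to a one-entry-per-line file unless it is already there.

    Assumption: each line of the file is a single entry. Returns True if
    the file was changed and False if the term was already present. A
    backup copy with a ".bak" suffix is written before any change.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    existing = {line.strip().upper() for line in text.splitlines()}
    if term.strip().upper() in existing:
        return False
    # Keep the previous version so the edit can be undone.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    # Add a line break first if the file does not already end with one.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + term.strip() + "\n", encoding="utf-8")
    return True
```

The duplicate check is case-insensitive, so running the function twice with the same term leaves the file unchanged the second time.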