Configuring Sun Master Indexes (Repository)

Configuring the Standardization Engine (Repository)

To configure the standardization engine, you can specify which engine to use, configure the files that define data standardization, and load standardization files into the master index application. You need to specify the engine only if you are using one other than the Sun Match Engine.

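Perform any of the following tasks to configure the standardization engine. Each is described in its own section: specifying a standardization engine, modifying the standardization files, and loading standardization files into the master index application.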

Specifying a Standardization Engine for the Master Index (Repository)

Sun Master Index can support standardization engines from different vendors, depending on the adapter configured to communicate with the engine. Default classes are provided for the Sun Match Engine. You can also implement a custom standardization engine along with customized adapters. The standardization engine configuration is defined by the standardizer-api and standardizer-config elements in the Match Field file.


Note –

The default adapters for the Sun Match Engine are com.stc.eindex.matching.adapter.SbmeStandardizerAdapter and com.stc.eindex.matching.adapter.SbmeStandardizerAdapterConfig.


To Specify the Standardization Engine

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.

    The file opens in the NetBeans XML editor.

  2. Scroll to the standardizer-api element in the MatchingConfig section.

  3. Specify the Java class for the standardization adapter to use, using the fully qualified class name as shown below.


    <standardizer-api>
       <class-name>
          com.stc.eindex.matching.adapter.MyStandardizerAdapter
       </class-name>
    </standardizer-api>

  4. In the standardizer-config element, specify the Java class for the configuration of the standardization adapter, using the fully qualified class name as shown below.


    <standardizer-config>
       <class-name>
          com.stc.eindex.matching.adapter.SbmeStandardizerAdapterConfig
       </class-name>
    </standardizer-config>

  5. Save and close the file.
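
After saving, it can be useful to confirm which adapter classes the file actually names. The following is a minimal Python sketch, not part of Sun Master Index, that parses a Match Field fragment and reports the contents of the standardizer-api and standardizer-config elements. The sample XML (including its use of the default adapter class names), the function name adapter_classes, and the assumption that the elements carry no XML namespace are illustrative only; a real Match Field file contains additional elements.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; a real Match Field file contains more elements.
SAMPLE = """
<MatchingConfig>
    <standardizer-api>
       <class-name>
          com.stc.eindex.matching.adapter.SbmeStandardizerAdapter
       </class-name>
    </standardizer-api>
    <standardizer-config>
       <class-name>
          com.stc.eindex.matching.adapter.SbmeStandardizerAdapterConfig
       </class-name>
    </standardizer-config>
</MatchingConfig>
"""


def adapter_classes(xml_text):
    """Return {element name: class name} for the two standardizer elements."""
    root = ET.fromstring(xml_text)
    found = {}
    for tag in ("standardizer-api", "standardizer-config"):
        # iter() searches the whole tree, so nesting depth does not matter.
        for element in root.iter(tag):
            class_name = element.findtext("class-name", default="")
            # Strip the whitespace that surrounds the name in the file.
            found[tag] = class_name.strip()
    return found


if __name__ == "__main__":
    for tag, name in adapter_classes(SAMPLE).items():
        print(f"{tag}: {name}")
```

To check a saved file instead of the sample, read its text and pass it to adapter_classes. An element that is missing from the file is simply absent from the returned dictionary.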

Modifying Master Index Standardization Files (Repository)

You can fine-tune the standardization process by modifying the standardization files. For example, you can insert additional names or terms into the nickname file or the business name files. Depending on your data requirements, you might need to modify additional standardization files. Some of the patterns files (most notably the address patterns files) are very complex and should be modified only by personnel who thoroughly understand the defined patterns and tokens. If you modify standardization files, make sure you modify them for each national domain specified in the Match Field file.

To Modify Standardization Data Configuration Files

  1. In the Projects window, expand the master index project to configure and then expand Standardization Engine.

  2. If the file you want to modify is domain-specific, expand the domain name.

  3. Open the file you want to modify in the NetBeans text editor.

  4. Modify the file in accordance with the information presented for each data type in Understanding the Sun Match Engine.

  5. Save and close the file.
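
Because a change must be repeated for each national domain, a quick consistency check can help catch a domain that was overlooked. The following Python sketch assumes the domain-specific files have been copied to a working folder on disk with one subfolder per domain. The folder layout, the domain names US and UK, the file name nicknames.dat, the sample entries, and the function name domains_missing_term are all hypothetical and are not taken from the product.

```python
from pathlib import Path
import tempfile


def domains_missing_term(engine_dir, file_name, term):
    """Return the domain folders whose copy of file_name lacks term."""
    missing = []
    domain_dirs = sorted(p for p in Path(engine_dir).iterdir() if p.is_dir())
    for domain_dir in domain_dirs:
        target = domain_dir / file_name
        # A missing file is treated the same as a file without the term.
        text = target.read_text(encoding="utf-8") if target.is_file() else ""
        # Compare whole whitespace-separated tokens, not substrings.
        if term not in text.split():
            missing.append(domain_dir.name)
    return missing


if __name__ == "__main__":
    # Build a throwaway layout with two hypothetical domains.
    with tempfile.TemporaryDirectory() as tmp:
        samples = {"US": "BILL WILLIAM\n", "UK": "BOB ROBERT\n"}
        for domain, body in samples.items():
            folder = Path(tmp) / domain
            folder.mkdir()
            (folder / "nicknames.dat").write_text(body, encoding="utf-8")
        print(domains_missing_term(tmp, "nicknames.dat", "BILL"))
```

The check compares whole tokens only, so it is a rough aid and does not validate the file format described in Understanding the Sun Match Engine.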

Loading Standardization Files to a Master Index Application (Repository)

Loading the standardization files brings them into the Repository and the master index project. This procedure is required only for projects that were upgraded from previous versions and do not contain all the needed files. The files are loaded into the Standardization Engine node, with domain-specific files loaded into their own subdirectories. In a fresh installation of Sun Master Index, all files are loaded automatically.

To Load Standardization Files

  1. In the Projects window, expand the master index project, and then expand the master index application.

  2. Right-click the Standardization Engine folder, and then select Load Configuration Files from the context menu.

  3. In the Open dialog box, open the folder containing the files you want to load.

  4. Select the files to load, and then click Open.

  5. In the Information dialog box, click OK.
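
Before loading, it may help to work out which files in the source folder are not yet in the upgraded project. The following Python sketch compares the files in a folder on disk against a list of names you have noted from the Standardization Engine node. The function name files_to_load, the sample file names, and the idea of recording loaded names by hand are assumptions made for illustration; the product does not provide this check.

```python
from pathlib import Path
import tempfile


def files_to_load(source_dir, already_loaded):
    """List files under source_dir whose relative path is not already loaded."""
    source = Path(source_dir)
    loaded = set(already_loaded)
    # Use forward slashes so domain subfolders compare the same on any OS.
    candidates = (
        p.relative_to(source).as_posix() for p in source.rglob("*") if p.is_file()
    )
    return sorted(name for name in candidates if name not in loaded)


if __name__ == "__main__":
    # Throwaway folder with one shared file and one domain-specific file.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "US").mkdir()
        (root / "US" / "example-domain.dat").write_text("sample", encoding="utf-8")
        (root / "example-shared.dat").write_text("sample", encoding="utf-8")
        for name in files_to_load(tmp, ["example-shared.dat"]):
            print(name)
```

Files in a domain subfolder are reported with the subfolder as a prefix, which mirrors the way domain-specific files are loaded into their own subdirectories.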