Configuring Sun Master Indexes (Repository)

Defining Phonetic Encoding for the Master Index (Repository)

Sun Master Index provides configurable phonetic encoding. Phonetic encoding is part of the standardization process and converts the value of a data field to its phonetic equivalent. The encoded values are used to retrieve records with similar field values from the database for matching. You can specify which fields are phonetically encoded before matching and which of the available encoders is used for each field. Phonetic encoding is most commonly applied to first names and street names.

Phonetic encoding is defined in the Match Field file. You can define it either by using the Configuration Editor, which provides a simplified way of defining phonetic encoding, or by modifying the XML file directly. The changes you make on the Phoneticized Field page of the Configuration Editor are reflected in the phonetic encoding structures and the match service section of the Match Field file.

The following sections describe the tasks you can perform to define phonetic encoding rules.

Defining Master Index Fields for Phonetic Encoding (Repository)

You can specify the fields you want to be phonetically encoded, the fields that store the encoded values, and the type of phonetic encoder to use for each field, such as NYSIIS or Soundex. A sample phonetic encoding structure for the XML file is included at the end of these instructions.

Procedure: To Define a Field for Phonetic Encoding (Configuration Editor)

  1. In the Projects window, right-click the master index application you want to modify, and then click Open.

  2. If the Configuration Editor dialog box appears, click Edit to check out the listed files.

    The Configuration Editor appears.

  3. In the object structure in the left pane, create the field that will contain the phonetic value of the field to be encoded.

    For more information, see Adding a Field to the Master Index Object Structure (Repository).

  4. Click the Phoneticized Fields tab.

    The Phoneticized Field page appears.

  5. In the Phoneticized Fields section, click Add.

    The Phoneticized Field dialog box appears.

  6. Select values for the fields described in Master Index Phonetic Encoding Fields and Elements (Repository).

  7. Click OK.

    The phonetic encoding definition is added to the Phoneticized Fields list.

  8. On the Configuration Editor toolbar, click Save.

Procedure: To Define a Field for Phonetic Encoding (XML Editor)

In the Object Definition file, create the field that will contain the phonetic value. For more information, see Adding a Field to the Master Index Object Structure (Repository).

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.

    The file opens in the NetBeans XML editor.

  2. Scroll to the phoneticize-fields element in the PhoneticEncodersConfig element.

  3. Create a new phoneticize-field element within the phoneticize-fields element.

  4. In the new phoneticize-field element, create and define the elements described in Master Index Phonetic Encoding Fields and Elements (Repository).

  5. Save and close the file.


Example 4 Phonetic Encoding Structure


<phoneticize-fields>
   <phoneticize-field>
      <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
      <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
      <encoding-type>Soundex</encoding-type>
   </phoneticize-field>
</phoneticize-fields>

Master Index Phonetic Encoding Fields and Elements (Repository)

The following table lists and describes the Configuration Editor fields and XML file elements used to define how fields will be phonetically encoded.

| Configuration Editor Field | XML File Element | Description |
|---|---|---|
| Unphoneticized Source | unphoneticized-source-field-name | The ePath of the source field in the system object that contains the data to be phonetically encoded (for example, Person.Address[*].StreetName). Note: This can refer to the original field or to a standardized or normalized field. |
| Phoneticized Target | phoneticized-target-field-name | The ePath of the field in the system object that will store the phonetically encoded value of the source field. |
| (none) | phoneticized-object-field-id | A field ID to identify the field to the phonetic encoder. This is not currently used with the Sun Match Engine, but could be used with a custom standardization engine. |
| Encoder | encoding-type | The phonetic encoder to use for this field. This must correspond to an encoder defined in the Encoders section of the Configuration Editor (or the PhoneticEncodersConfig element of the XML file). |
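
As a rough illustration of how these elements fit together, the following sketch encodes the street name from the ePath example in the table. The target field name Person.Address[*].StreetNamePhoneticCode is an assumed name used only for illustration, and it would have to match a field that you have already added to the object structure. The choice of the NYSIIS encoder is likewise only an example.

<phoneticize-field>
   <!-- Source: the field whose value is encoded (ePath from the table above) -->
   <unphoneticized-source-field-name>Person.Address[*].StreetName</unphoneticized-source-field-name>
   <!-- Target: hypothetical field name; create it in the object structure first -->
   <phoneticized-target-field-name>Person.Address[*].StreetNamePhoneticCode</phoneticized-target-field-name>
   <!-- Must correspond to the encoding-type of a defined encoder -->
   <encoding-type>NYSIIS</encoding-type>
</phoneticize-field>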

Modifying a Master Index Phonetic Encoding Definition (Repository)

After you create a phonetic encoding definition, you can modify it as needed. Use caution when modifying definitions after a system is in production, because doing so can cause inconsistent match results.

Procedure: To Modify a Phonetic Encoding Definition (Configuration Editor)

  1. In the Projects window, right-click the master index application you want to modify, and then click Open.

  2. If the Configuration Editor dialog box appears, click Edit to check out the listed files.

    The Configuration Editor appears.

  3. Click the Phoneticized Fields tab.

    The Phoneticized Field page appears.

  4. In the Phoneticized Fields section, select the phonetic encoding definition you want to modify, and then click Edit.

    The Phoneticized Field dialog box appears.

  5. Modify any of the fields listed in Master Index Phonetic Encoding Fields and Elements (Repository), and then click OK.

  6. On the Configuration Editor toolbar, click Save.

Procedure: To Modify a Phonetic Encoding Definition (XML Editor)

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.

    The file opens in the NetBeans XML editor.

  2. Scroll to the phoneticize-fields element.

  3. In the phoneticize-field element that defines the phonetic encoding you want to modify, change the value of any of the elements described in Master Index Phonetic Encoding Fields and Elements (Repository).

  4. Save and close the file.
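
For instance, switching the first name definition shown in Example 4 to a different encoder might look like the following sketch. It assumes that an encoder named NYSIIS is defined in the Match Field file. Whether NYSIIS suits first names in your data is a separate decision.

<phoneticize-field>
   <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
   <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
   <!-- Previously Soundex; only this value was changed -->
   <encoding-type>NYSIIS</encoding-type>
</phoneticize-field>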

Deleting a Master Index Phonetic Encoding Definition (Repository)

After you create a phonetic encoding definition, you can delete it as needed. Use caution when deleting definitions after a system is in production, because doing so can cause inconsistent match results.

Procedure: To Delete a Phonetic Encoding Definition (Configuration Editor)

  1. In the Projects window, right-click the master index application you want to modify, and then click Open.

  2. If the Configuration Editor dialog box appears, click Edit to check out the listed files.

    The Configuration Editor appears.

  3. Click the Phoneticized Fields tab.

    The Phoneticized Field page appears.

  4. In the Phoneticized Fields section, select the phonetic encoding definition you want to delete.

  5. Click Remove.

  6. Click Yes on the dialog box that appears.

  7. On the Configuration Editor toolbar, click Save.

Procedure: To Delete a Phonetic Encoding Definition (XML Editor)

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.

    The file opens in the NetBeans XML editor.

  2. Scroll to the phoneticize-fields element in the PhoneticEncodersConfig element.

  3. Do either of the following:

    • To delete a field currently specified for phonetic conversion, delete the phoneticize-field element that defines the field, including its opening and closing tags and everything between them.

      Using the example below, to delete the first name phonetic field, delete the second phoneticize-field element (the one whose source field is Person.FirstName).


      <phoneticize-fields>
         <phoneticize-field>
            <unphoneticized-source-field-name>Person.LastName</unphoneticized-source-field-name>
            <phoneticized-target-field-name>Person.LastNamePhoneticCode</phoneticized-target-field-name>
            <encoding-type>NYSIIS</encoding-type>
         </phoneticize-field>
         <phoneticize-field>
            <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
            <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
            <encoding-type>Soundex</encoding-type>
         </phoneticize-field>
      </phoneticize-fields>
    • To delete all fields currently specified for phonetic conversion, delete everything between the opening and closing phoneticize-fields tags, but leave the tags themselves in place.

  4. Save and close the file.
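
As a sketch of the expected result, and assuming the file starts with the two definitions shown in the procedure above (last name and first name), removing only the first name definition would leave the following:

<phoneticize-fields>
   <!-- Only the last name definition remains -->
   <phoneticize-field>
      <unphoneticized-source-field-name>Person.LastName</unphoneticized-source-field-name>
      <phoneticized-target-field-name>Person.LastNamePhoneticCode</phoneticized-target-field-name>
      <encoding-type>NYSIIS</encoding-type>
   </phoneticize-field>
</phoneticize-fields>

Removing all definitions would instead leave the enclosing element empty:

<phoneticize-fields>
</phoneticize-fields>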

Defining a Master Index Phonetic Encoder (Repository)

Each type of phonetic encoder provided with Sun Master Index is defined in the PhoneticEncodersConfig section of the Match Field file. These encoders are listed in the Encoders section of the Configuration Editor.

Procedure: To Define a Phonetic Encoder (Configuration Editor)

  1. In the Projects window, right-click the master index application you want to modify, and then click Open.

  2. If the Configuration Editor dialog box appears, click Edit to check out the listed files.

    The Configuration Editor appears.

  3. Click the Phoneticized Fields tab.

    The Phoneticized Field page appears.

  4. In the Encoders section, click Add.

    The Phonetic Encoder dialog box appears.

  5. In the Encoder field, enter a descriptive name for the encoder.

  6. In the Implementation Class field, enter the fully qualified name of the Java class to use for the encoder.


    Note –

    For more information about the encoder class paths, see Master Index Encoder Elements and Types (Repository).


  7. Click OK.

    The phonetic encoder is added to the Encoders list.

  8. On the Configuration Editor toolbar, click Save.

Procedure: To Define a Phonetic Encoder (XML Editor)

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.

    The file opens in the NetBeans XML editor.

  2. Scroll to the PhoneticEncodersConfig section.

  3. Create a new encoder element, and then define the elements described in Master Index Encoder Elements and Types (Repository).

  4. Save and close the file.
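
A new encoder entry might look like the following sketch. It assumes that the encoder element contains the two child elements described in Master Index Encoder Elements and Types (Repository), and it uses the Metaphone class listed there. The entry belongs inside the PhoneticEncodersConfig section. The rest of that section is omitted here.

<encoder>
   <!-- Name that phoneticize-field definitions reference in their encoding-type element -->
   <encoding-type>Metaphone</encoding-type>
   <!-- Fully qualified name of the Java class that implements the encoder -->
   <encoder-implementation-class>com.stc.eindex.phonetic.impl.Metaphone</encoder-implementation-class>
</encoder>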

Master Index Encoder Elements and Types (Repository)

The following table lists and describes the elements that configure the phonetic encoders used by the master index application.

| Element | Description |
|---|---|
| encoding-type | The name of the phonetic encoder, such as NYSIIS, Soundex, or Metaphone. See the following table for a list of default encoders for the Sun Match Engine. |
| encoder-implementation-class | The fully qualified name of the Java class that determines the behavior of the phonetic encoder. See the following table for a complete list of default classes for the Sun Match Engine. |

The following table lists the phonetic encoders supported by the Sun Match Engine along with the names of their default classes.

| Encoding Type | Default Class |
|---|---|
| NYSIIS | com.stc.eindex.phonetic.impl.Nysiis |
| Soundex | com.stc.eindex.phonetic.impl.Soundex |
| Metaphone | com.stc.eindex.phonetic.impl.Metaphone |
| Double Metaphone | com.stc.eindex.phonetic.impl.DoubleMetaphone |
| Refined Soundex | com.stc.eindex.phonetic.impl.RefinedSoundex |
| French Soundex | com.stc.eindex.phonetic.impl.SoundexFR |

Modifying a Master Index Phonetic Encoder (Repository)

After you define a phonetic encoder, you can change the implementation class to use for the encoder. Use caution when modifying encoders after a system is in production, because doing so can cause inconsistent match results.

Procedure: To Modify a Phonetic Encoder (Configuration Editor)

  1. In the Projects window, right-click the master index application you want to modify, and then click Open.

  2. If the Configuration Editor dialog box appears, click Edit to check out the listed files.

    The Configuration Editor appears.

  3. Click the Phoneticized Fields tab.

    The Phoneticized Field page appears.

  4. In the Encoders section, select the encoder you want to modify and then click Edit.

    The Phonetic Encoder dialog box appears.

  5. Change the implementation class, and then click OK.

  6. On the Configuration Editor toolbar, click Save.

Procedure: To Modify a Phonetic Encoder (XML Editor)

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.

    The file opens in the NetBeans XML editor.

  2. Scroll to the PhoneticEncodersConfig section.

  3. Modify the value of any of the elements in the encoder element you want to modify. For more information, see Master Index Encoder Elements and Types (Repository).

  4. Save and close the file.
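
For illustration only, pointing an existing encoder at a different implementation class might look like the following sketch. The class name com.example.phonetic.CustomSoundex is hypothetical and stands in for a class of your own. It is not part of the product.

<encoder>
   <encoding-type>Soundex</encoding-type>
   <!-- Was com.stc.eindex.phonetic.impl.Soundex; the class below is a hypothetical placeholder -->
   <encoder-implementation-class>com.example.phonetic.CustomSoundex</encoder-implementation-class>
</encoder>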

Deleting a Master Index Phonetic Encoder (Repository)

After you define a phonetic encoder, you can delete it if needed. Use caution when deleting encoders after a system is in production, because doing so can cause inconsistent match results. If you delete an encoder that is referenced by a phonetic encoding definition, make sure to modify that definition so that it references an existing encoder.

Procedure: To Delete a Phonetic Encoder (Configuration Editor)

  1. In the Projects window, right-click the master index application you want to modify, and then click Open.

  2. If the Configuration Editor dialog box appears, click Edit to check out the listed files.

    The Configuration Editor appears.

  3. Click the Phoneticized Fields tab.

    The Phoneticized Field page appears.

  4. In the Encoders section, select the encoder you want to delete.

  5. Click Remove.

  6. Click Yes on the dialog box that appears.

  7. On the Configuration Editor toolbar, click Save.

Procedure: To Delete a Phonetic Encoder (XML Editor)

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.

    The file opens in the NetBeans XML editor.

  2. Scroll to the PhoneticEncodersConfig section.

  3. Delete the encoder element that defines the encoder you want to remove, including its opening and closing tags and everything between them.

  4. Save and close the file.
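
The following sketch shows the kind of follow-up change a deletion can require. It assumes that a Soundex encoder was deleted, that a NYSIIS encoder remains, and that the first name definition previously referenced Soundex. The two fragments sit in different parts of the Match Field file and are shown together only for comparison.

<!-- Encoder that remains in the PhoneticEncodersConfig section -->
<encoder>
   <encoding-type>NYSIIS</encoding-type>
   <encoder-implementation-class>com.stc.eindex.phonetic.impl.Nysiis</encoder-implementation-class>
</encoder>

<!-- Definition updated so that it no longer references the deleted encoder -->
<phoneticize-field>
   <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
   <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
   <!-- Matches the encoding-type of the remaining encoder above -->
   <encoding-type>NYSIIS</encoding-type>
</phoneticize-field>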