Sun Master Index provides configurable phonetic encoding. Phonetic encoding is part of the standardization process: it converts the value of a data field to its phonetic equivalent so that records with similar field values can be retrieved from the database for matching. You can specify which fields are phonetically encoded before matching and which encoder is used for each field; several encoders are available. Phonetic encoding is most commonly applied to first names and street names.
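To make "phonetic equivalent" concrete, the following minimal sketch implements the widely documented American Soundex rules in Python. It is an illustration only: it is not the code behind the Soundex encoder shipped with Sun Master Index, whose output may differ in detail, and the function name `soundex` is a hypothetical helper.

```python
# Illustrative American Soundex; not the Sun Master Index implementation.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code for name, or "" if it has no letters."""
    letters = [c for c in name.upper() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = [letters[0]]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")      # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                      # H and W do not separate equal codes
        code = CODES.get(ch, "")          # vowels and Y have no code
        if code and code != prev:
            result.append(code)
        prev = code                       # a vowel resets the previous code
    return ("".join(result) + "000")[:4]  # pad or truncate to four characters


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(soundex("Smith"), soundex("Smyth"))    # S530 S530
```

Because similar-sounding spellings collapse to the same code, a lookup on the stored code can return candidate records that an exact comparison on the original field would miss.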
Phonetic encoding is defined in the Match Field file. You can define phonetic encoding by either using the Configuration Editor or modifying the XML file directly. The changes you make on the Phoneticized Field page of the Configuration Editor are reflected in the phonetic encoding structures and the match service section of the Match Field file. The Configuration Editor provides a simplified way of defining phonetic encoding.
Perform any of the following tasks to define phonetic encoding rules:
Defining Master Index Fields for Phonetic Encoding (Repository)
Modifying a Master Index Phonetic Encoding Definition (Repository)
Deleting a Master Index Phonetic Encoding Definition (Repository)
You can specify the fields you want to be phonetically encoded, the fields that store the encoded values, and the type of phonetic encoder to use for each field, such as NYSIIS or Soundex. A sample phonetic encoding structure for the XML file is included at the end of these instructions.
In the Projects window, right-click the master index application you want to modify, and then click Open.
If the Configuration Editor dialog box appears, click Edit to check out the listed files.
The Configuration Editor appears.
In the object structure in the left pane, create the field that will contain the phonetic value of the field to be encoded.
For more information, see Adding a Field to the Master Index Object Structure (Repository).
Click the Phoneticized Fields tab.
The Phoneticized Field page appears.
In the Phoneticized Fields section, click Add.
The Phoneticized Field dialog box appears.
Select values for the fields described in Master Index Phonetic Encoding Fields and Elements (Repository).
Click OK.
The phonetic encoding definition is added to the Phoneticized Fields list.
On the Configuration Editor toolbar, click Save.
In the Object Definition file, create the field that will contain the phonetic value. For more information, see Adding a Field to the Master Index Object Structure (Repository).
In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.
The file opens in the NetBeans XML editor.
Scroll to the phoneticize-fields element in the PhoneticEncodersConfig element.
Create a new phoneticize-field element within the phoneticize-fields element.
In the new phoneticize-field element, create and define the elements described in Master Index Phonetic Encoding Fields and Elements (Repository).
Save and close the file.
<phoneticize-fields>
  <phoneticize-field>
    <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
    <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
    <encoding-type>Soundex</encoding-type>
  </phoneticize-field>
</phoneticize-fields>
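If you maintain the XML outside the NetBeans editor, the same edit can be scripted. The sketch below uses Python's standard `xml.etree.ElementTree` module on an in-memory fragment that stands in for the phoneticize-fields portion of the Match Field file. The element names come from the sample above; the helper name `add_phoneticize_field` and the `Person.LastName` values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for the phoneticize-fields portion of the Match Field file.
FRAGMENT = """
<phoneticize-fields>
  <phoneticize-field>
    <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
    <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
    <encoding-type>Soundex</encoding-type>
  </phoneticize-field>
</phoneticize-fields>
"""


def add_phoneticize_field(fields, source, target, encoder):
    """Append a phoneticize-field definition to fields and return it."""
    field = ET.SubElement(fields, "phoneticize-field")
    ET.SubElement(field, "unphoneticized-source-field-name").text = source
    ET.SubElement(field, "phoneticized-target-field-name").text = target
    ET.SubElement(field, "encoding-type").text = encoder
    return field


fields = ET.fromstring(FRAGMENT.strip())
# Hypothetical new definition: encode last names with NYSIIS.
add_phoneticize_field(fields, "Person.LastName",
                      "Person.LastNamePhoneticCode", "NYSIIS")
print(ET.tostring(fields, encoding="unicode"))
```

The target field must already exist in the object structure, as described in the first step of the procedure; the script does not create it.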
The following table lists and describes the Configuration Editor fields and XML file elements used to define how fields will be phonetically encoded.
Configuration Editor Field | XML File Element | Description |
---|---|---|
Unphoneticized Source | unphoneticized-source-field-name | The ePath of the source field in the system object that contains the data to be phonetically encoded (for example, Person.Address[*].StreetName). Note: This can refer to the original field or to a standardized or normalized field. |
Phoneticized Target | phoneticized-target-field-name | The ePath of the field in the system object that will store the phonetically encoded value of the source field. |
 | | A field ID to identify the field to the phonetic encoder. This is not currently used with the Sun Match Engine, but could be used with a custom standardization engine. |
Encoder | encoding-type | The phonetic encoder to use for this field. This must correspond to an encoder defined in the Encoders section of the Configuration Editor (or the PhoneticEncodersConfig element of the XML file). |
Once you create a phonetic encoding definition, you can modify it as needed. Use caution when modifying definitions once a system is in production, because doing so can cause inconsistent match results.
In the Projects window, right-click the master index application you want to modify, and then click Open.
If the Configuration Editor dialog box appears, click Edit to check out the listed files.
The Configuration Editor appears.
Click the Phoneticized Fields tab.
The Phoneticized Field page appears.
In the Phoneticized Fields section, select the phonetic encoding definition you want to modify, and then click Edit.
The Phoneticized Field dialog box appears.
Modify any of the fields listed in Master Index Phonetic Encoding Fields and Elements (Repository), and then click OK.
On the Configuration Editor toolbar, click Save.
In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.
The file opens in the NetBeans XML editor.
In the phoneticize-field element that defines the phonetic encoding you want to modify, change the value of any of the elements described in Master Index Phonetic Encoding Fields and Elements (Repository).
Save and close the file.
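As a rough illustration of the XML edit described above, the following Python sketch changes the encoder of one definition in an in-memory fragment. The element names are those used in this document's samples; the helper name `set_encoder` is hypothetical, and the fragment stands in for the relevant part of the Match Field file.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """
<phoneticize-fields>
  <phoneticize-field>
    <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
    <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
    <encoding-type>Soundex</encoding-type>
  </phoneticize-field>
</phoneticize-fields>
"""


def set_encoder(fields, source, new_encoder):
    """Set encoding-type on the definition for source; return True if one was found."""
    for field in fields.findall("phoneticize-field"):
        name = field.findtext("unphoneticized-source-field-name", default="")
        if name.strip() == source:        # tolerate stray whitespace in the file
            enc = field.find("encoding-type")
            if enc is None:               # create the element if it is missing
                enc = ET.SubElement(field, "encoding-type")
            enc.text = new_encoder
            return True
    return False


fields = ET.fromstring(FRAGMENT.strip())
changed = set_encoder(fields, "Person.FirstName", "NYSIIS")
print(changed, fields.find("phoneticize-field").findtext("encoding-type"))
```

The new value must still name an encoder that is defined for the application, as noted in the description of the Encoder field.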
Once you create a phonetic encoding definition, you can delete it as needed. Use caution when deleting definitions once a system is in production, because doing so can cause inconsistent match results.
In the Projects window, right-click the master index application you want to modify, and then click Open.
If the Configuration Editor dialog box appears, click Edit to check out the listed files.
The Configuration Editor appears.
Click the Phoneticized Fields tab.
The Phoneticized Field page appears.
In the Phoneticized Fields section, select the phonetic encoding definition you want to delete.
Click Remove.
Click Yes on the dialog box that appears.
On the Configuration Editor toolbar, click Save.
In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.
The file opens in the NetBeans XML editor.
Scroll to the phoneticize-fields element in the PhoneticEncodersConfig element.
Do either of the following:
To delete a field currently specified for phonetic conversion, delete the phoneticize-field element that defines the field, including its opening and closing tags.
Using the example below, to delete the first name phonetic field, delete the entire phoneticize-field element whose source field is Person.FirstName.
<phoneticize-fields>
  <phoneticize-field>
    <unphoneticized-source-field-name>Person.LastName</unphoneticized-source-field-name>
    <phoneticized-target-field-name>Person.LastNamePhoneticCode</phoneticized-target-field-name>
    <encoding-type>NYSIIS</encoding-type>
  </phoneticize-field>
  <phoneticize-field>
    <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
    <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
    <encoding-type>Soundex</encoding-type>
  </phoneticize-field>
</phoneticize-fields>
To delete all fields currently specified for phonetic conversion, delete everything between the opening and closing phoneticize-fields tags, leaving the tags themselves in place.
Save and close the file.
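The single-field deletion above can be sketched in Python as follows. The fragment reproduces the two-definition example from this procedure; the helper name `remove_phoneticize_field` is hypothetical.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """
<phoneticize-fields>
  <phoneticize-field>
    <unphoneticized-source-field-name>Person.LastName</unphoneticized-source-field-name>
    <phoneticized-target-field-name>Person.LastNamePhoneticCode</phoneticized-target-field-name>
    <encoding-type>NYSIIS</encoding-type>
  </phoneticize-field>
  <phoneticize-field>
    <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
    <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
    <encoding-type>Soundex</encoding-type>
  </phoneticize-field>
</phoneticize-fields>
"""


def remove_phoneticize_field(fields, source):
    """Remove the definition whose source field is source; return True if removed."""
    for field in fields.findall("phoneticize-field"):
        name = field.findtext("unphoneticized-source-field-name", default="")
        if name.strip() == source:
            fields.remove(field)          # drops the element and both of its tags
            return True
    return False


fields = ET.fromstring(FRAGMENT.strip())
remove_phoneticize_field(fields, "Person.FirstName")
remaining = [f.findtext("unphoneticized-source-field-name")
             for f in fields.findall("phoneticize-field")]
print(remaining)  # ['Person.LastName']
```

The enclosing phoneticize-fields element is left in place, which matches the instruction for deleting all fields: remove the children, keep the container.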
Each type of phonetic encoder provided with Sun Master Index is defined in the PhoneticEncodersConfig section of the Match Field file. They are listed in the Encoders section of the Configuration Editor.
In the Projects window, right-click the master index application you want to modify, and then click Open.
If the Configuration Editor dialog box appears, click Edit to check out the listed files.
The Configuration Editor appears.
Click the Phoneticized Fields tab.
The Phoneticized Field page appears.
In the Encoders section, click Add.
The Phonetic Encoder dialog box appears.
In the Encoder field, enter a descriptive name for the encoder.
In the Implementation Class field, enter the fully qualified name of the Java class to use for the encoder.
For more information about the encoder class names, see Master Index Encoder Elements and Types (Repository).
Click OK.
The phonetic encoder is added to the Encoders list.
On the Configuration Editor toolbar, click Save.
In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.
The file opens in the NetBeans XML editor.
Scroll to the PhoneticEncodersConfig section.
Create a new encoder element, and then define the elements described in Master Index Encoder Elements and Types (Repository).
Save and close the file.
The following table lists and describes the elements that configure the phonetic encoders used by the master index application.
Element | Description |
---|---|
 | The name of the phonetic encoder, such as NYSIIS, Soundex, or Metaphone. See the following table for a list of default encoders for the Sun Match Engine. |
 | The fully qualified name of the Java class that determines the behavior of the phonetic encoder. See the following table for a complete list of default classes for the Sun Match Engine. |
The following table lists the phonetic encoders supported by the Sun Match Engine along with the names of their default classes.
Encoding Type | class-name |
---|---|
NYSIIS | com.stc.eindex.phonetic.impl.Nysiis |
Soundex | com.stc.eindex.phonetic.impl.Soundex |
Metaphone | com.stc.eindex.phonetic.impl.Metaphone |
Double Metaphone | com.stc.eindex.phonetic.impl.DoubleMetaphone |
Refined Soundex | com.stc.eindex.phonetic.impl.RefinedSoundex |
French Soundex | com.stc.eindex.phonetic.impl.SoundexFR |
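The table above can be captured as a lookup when generating encoder definitions with a script. In this Python sketch the class names are copied from the table, but the child element names `encoding-type` and `encoder-implementation-class` are assumptions, since this page does not show the contents of an encoder element; check them against your own Match Field file before relying on the output. The helper name `build_encoder` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Encoder names and default classes, copied from the table above.
DEFAULT_ENCODERS = {
    "NYSIIS": "com.stc.eindex.phonetic.impl.Nysiis",
    "Soundex": "com.stc.eindex.phonetic.impl.Soundex",
    "Metaphone": "com.stc.eindex.phonetic.impl.Metaphone",
    "Double Metaphone": "com.stc.eindex.phonetic.impl.DoubleMetaphone",
    "Refined Soundex": "com.stc.eindex.phonetic.impl.RefinedSoundex",
    "French Soundex": "com.stc.eindex.phonetic.impl.SoundexFR",
}


def build_encoder(name, class_name=None):
    """Build an encoder element; fall back to the default class for name."""
    if class_name is None:
        class_name = DEFAULT_ENCODERS[name]   # KeyError for an unknown encoder
    encoder = ET.Element("encoder")
    # Assumed child element names; verify against an existing encoder element.
    ET.SubElement(encoder, "encoding-type").text = name
    ET.SubElement(encoder, "encoder-implementation-class").text = class_name
    return encoder


soundex_encoder = build_encoder("Soundex")
print(ET.tostring(soundex_encoder, encoding="unicode"))
```

Passing `class_name` explicitly covers the case of a custom encoder class, where only the descriptive name and the fully qualified class name are known.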
Once you define a phonetic encoder, you can change the implementation class to use for the encoder. Use caution when modifying definitions once a system is in production, because doing so can cause inconsistent match results.
In the Projects window, right-click the master index application you want to modify, and then click Open.
If the Configuration Editor dialog box appears, click Edit to check out the listed files.
The Configuration Editor appears.
Click the Phoneticized Fields tab.
The Phoneticized Field page appears.
In the Encoders section, select the encoder you want to modify, and then click Edit.
The Phonetic Encoder dialog box appears.
Change the implementation class, and then click OK.
On the Configuration Editor toolbar, click Save.
In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.
The file opens in the NetBeans XML editor.
Scroll to the PhoneticEncodersConfig section.
In the encoder element you want to modify, change the value of any of its elements. For more information, see Master Index Encoder Elements and Types (Repository).
Save and close the file.
Once you define a phonetic encoder, you can delete it if needed. Use caution when deleting encoders once a system is in production, because doing so can cause inconsistent match results. If you delete an encoder that is referenced by a phonetic encoding definition, update that definition to reference an encoder that is still defined.
In the Projects window, right-click the master index application you want to modify, and then click Open.
If the Configuration Editor dialog box appears, click Edit to check out the listed files.
The Configuration Editor appears.
Click the Phoneticized Fields tab.
The Phoneticized Field page appears.
In the Encoders section, select the encoder you want to delete.
Click Remove.
Click Yes on the dialog box that appears.
On the Configuration Editor toolbar, click Save.
In the Projects window, expand the Configuration node in the project you want to modify, and then double-click the Match Field file.
The file opens in the NetBeans XML editor.
Scroll to the PhoneticEncodersConfig section.
Delete the encoder element that defines the encoder you want to remove, including its opening and closing tags.
Save and close the file.
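After deleting an encoder, it can help to confirm that no phonetic encoding definition still refers to it. The following Python sketch performs that check on an in-memory document. Several details are assumptions made for illustration: the `config` wrapper element, the `encoding-type` child of each encoder element, and the helper name `dangling_encoder_refs`. The search uses `iter()` so it does not depend on where the elements sit in the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a Match Field file after the Soundex encoder was deleted.
CONFIG = """
<config>
  <phoneticize-fields>
    <phoneticize-field>
      <unphoneticized-source-field-name>Person.LastName</unphoneticized-source-field-name>
      <phoneticized-target-field-name>Person.LastNamePhoneticCode</phoneticized-target-field-name>
      <encoding-type>NYSIIS</encoding-type>
    </phoneticize-field>
    <phoneticize-field>
      <unphoneticized-source-field-name>Person.FirstName</unphoneticized-source-field-name>
      <phoneticized-target-field-name>Person.FirstNamePhoneticCode</phoneticized-target-field-name>
      <encoding-type>Soundex</encoding-type>
    </phoneticize-field>
  </phoneticize-fields>
  <encoder>
    <encoding-type>NYSIIS</encoding-type>
  </encoder>
</config>
"""


def dangling_encoder_refs(root):
    """Return (source field, encoder name) pairs that name an undefined encoder."""
    defined = {e.findtext("encoding-type", default="").strip()
               for e in root.iter("encoder")}
    missing = []
    for field in root.iter("phoneticize-field"):
        encoder = field.findtext("encoding-type", default="").strip()
        if encoder not in defined:
            source = field.findtext("unphoneticized-source-field-name", default="")
            missing.append((source.strip(), encoder))
    return missing


problems = dangling_encoder_refs(ET.fromstring(CONFIG.strip()))
print(problems)  # [('Person.FirstName', 'Soundex')]
```

An empty list means every definition references an encoder that is still present; each entry otherwise identifies a definition to update.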