5 Normalization, Standardization, and Phonetic Encoding

This chapter provides information and procedures on how to define, modify, and delete normalization and standardization rules; configure the Match and Standardization Engines; and define phonetic encoding for the Master Person Index.

This chapter includes the following sections:

"Defining Master Person Index Normalization Rules"
"Defining Master Person Index Standardization Rules"
"Defining Phonetic Encoding for the Master Person Index"
"Defining the Master Person Index Match String"
"Defining how Master Person Index Query Blocks are Processed"
"Configuring the Standardization Engine"

Defining Master Person Index Normalization Rules

Normalization, a part of the standardization process, changes non-standard values to a common, standard value. For example, the first name a person uses might be a nickname rather than their given name. To ensure that first names are matched properly, nicknames are normalized based on a configurable list. For example, "Liz" and "Elizabeth" would both be normalized to the common value "Elizabeth".

Normalization is defined in mefa.xml. You can define normalization by either using the Configuration Editor or modifying the XML file directly. The changes you make on the Normalization page of the Configuration Editor are reflected in the normalization structures of mefa.xml. The Configuration Editor provides a simplified way of defining normalization.

Perform any of the following tasks to define normalization:

"Defining a Master Person Index Field to be Normalized"
"Modifying a Master Person Index Normalization Definition"
"Deleting a Master Person Index Normalization Definition"

Defining a Master Person Index Field to be Normalized

When you define a field for normalization, you define which field contains the data that needs to be normalized and which field will contain the normalized data. You can also specify one or more variants to use for normalization. A sample normalization structure for the XML file appears at the end of these instructions.

To Define a Field to be Normalized (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
In the object structure in the left pane, add the field that will contain the normalized value.

For more information, see "Adding a Field to the Master Person Index Object Structure".
Click the Normalization tab.

The Normalization page appears.
Click Add.

The Normalized Field dialog box appears.
Enter or select a value for each of the fields described in Master Person Index Normalization and Standardization Structure Properties.
To specify a variant for the type of data being standardized, do the following:
- In the Variant Field Name field, select the field whose value in incoming records will indicate which variant to use.
- In the Variants section, click Add.
- On the dialog box that appears, enter values in the fields described in "Master Person Index Variants Properties".
- Click OK.
  
  If you selected the multiple domain selector, you can add multiple variants; otherwise, you can add one default variant and one field-defined variant.
On the Normalized Field dialog box, click OK.

The new normalization definition appears in the list.
On the Configuration Editor toolbar, click Save.

To Define a Field to be Normalized (XML Editor)

Before you begin, in object.xml create the field that will contain the new normalized value. For more information, see "Adding a Field to the Master Person Index Object Structure".

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
In the structures-to-normalize element, create and name a new group element.

Make sure the new element falls within the structures-to-normalize element, but outside any existing group tags.
In the new group element, define the standardization-type and domain-selector attributes (these are described in "Master Person Index Normalization and Standardization Structure Properties").
If you specified the multiple domain selector for the domain-selector attribute, do the following:
- In the group element, create a locale-field-name element and a locale-maps element (described in "Master Person Index Normalization and Standardization Structure Properties" and "Master Person Index Variants Properties").
- For each variant you want to use, define a locale-codes, value, and locale element in the locale-maps element (described in "Master Person Index Variants Properties").
To specify the source fields to normalize, do the following:
- Create a new unnormalized-source-fields element in the group element.
- Create a source-mapping element in the new unnormalized-source-fields element.
- Define the unnormalized-source-field-name and standardized-object-field-id elements (these are described in "Master Person Index Normalization and Standardization Structure Properties").
To map the normalized data to destination fields, do the following:
- In the group element, create a new normalization-targets element after the unnormalized-source-fields element that defines the fields to map.
- Create a target-mapping element in the new normalization-targets element.
- Define the standardized-object-field-id and standardized-target-field-name elements (these are described in "Master Person Index Normalization and Standardization Structure Properties").

Save and close the file.

Example 5-1 First and Last Name Normalization

<structures-to-normalize>
         <group standardization-type="PersonName" domain-selector=
          "com.sun.mdm.index.matching.impl.MultiDomainSelector">
            <locale-field-name>Person.PobCountry</locale-field-name>
            <locale-maps>
              <locale-codes>
                <value>GB</value>
                <locale>UK</locale>
              </locale-codes>
              <locale-codes>
                <value>UNST</value>
                <locale>US</locale>
              </locale-codes>
              <locale-codes>
                <value>Default</value>
                <locale>US</locale>
              </locale-codes>
            </locale-maps>
            <unnormalized-source-fields>
               <source-mapping>
                  <unnormalized-source-field-name>
                   Person.Alias[*].FirstName
                  </unnormalized-source-field-name>
                  <standardized-object-field-id>FirstName
                  </standardized-object-field-id>
               </source-mapping>
               <source-mapping>
                  <unnormalized-source-field-name>
                   Person.Alias[*].LastName
                  </unnormalized-source-field-name>
                  <standardized-object-field-id>LastName
                  </standardized-object-field-id>
               </source-mapping>
            </unnormalized-source-fields>

            <normalization-targets>
               <target-mapping>
                  <standardized-object-field-id>FirstName
                  </standardized-object-field-id>
                  <standardized-target-field-name>
                     Person.Alias[*].StdFirstName
                  </standardized-target-field-name>
               </target-mapping>
               <target-mapping>
                  <standardized-object-field-id>LastName
                  </standardized-object-field-id>
                  <standardized-target-field-name>
                     Person.Alias[*].StdLastName
                  </standardized-target-field-name>
               </target-mapping>
            </normalization-targets>
         </group>
</structures-to-normalize>

Master Person Index Normalization and Standardization Structure Properties

The following table lists and describes the Configuration Editor fields and their corresponding XML elements that define the fields to be normalized or standardized in the master person index application.

You can specify one or more variants for data to be standardized. A variant is a subset of a data type. For example, if the data type is address, variants are defined for addresses from different countries. The rule set for each country is called a variant. If all data uses a single variant, you need to specify that variant only when the data is not from the United States. If you are standardizing data from multiple countries, use the multiple domain selector. This requires that one field in the object structure identify which variant to use for each field that will be standardized. For example, the value of the Country field in a system record could be used to tell the standardization engine which variant to use for a particular set of data. If you specified the multiple domain selector in the domain-selector attribute, you must also define the identifying field and then map the values that can be populated into that field to their corresponding variants.

The following rules apply to the multiple domain selector:

You can specify a value of Default for the identifying field. The corresponding variant is used if the identifying field is blank, contains the value "Default," or contains a value not defined by any of the value elements.
If a "Default" value is not defined, the system default variant, United States, is used as the default.
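For comparison with the multiple domain selector, the following fragment is a minimal sketch of a single-variant normalization structure. It assumes that a single domain selector needs no locale-field-name or locale-maps elements, which this section does not state explicitly. Person.StdFirstName is a hypothetical target field that would have to exist in the object structure.

```xml
<structures-to-normalize>
   <!-- Single variant: all data in this group is treated as United Kingdom data. -->
   <group standardization-type="PersonName"
          domain-selector="com.sun.mdm.index.matching.impl.SingleDomainSelectorUK">
      <unnormalized-source-fields>
         <source-mapping>
            <unnormalized-source-field-name>Person.FirstName</unnormalized-source-field-name>
            <standardized-object-field-id>FirstName</standardized-object-field-id>
         </source-mapping>
      </unnormalized-source-fields>
      <normalization-targets>
         <target-mapping>
            <standardized-object-field-id>FirstName</standardized-object-field-id>
            <!-- Person.StdFirstName is a hypothetical field name. -->
            <standardized-target-field-name>Person.StdFirstName</standardized-target-field-name>
         </target-mapping>
      </normalization-targets>
   </group>
</structures-to-normalize>
```

The same standardized-object-field-id value ties the source mapping to its target mapping, as in Example 5-1.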

For more information about the fields and elements described in the following table, see Oracle Healthcare Master Person Index Standardization Engine Reference (Part Number E18471-01).

Configuration Editor Field	XML File Element or Attribute	Description
Data Type	standardization-type	The type of standardization to perform on the source fields. This is specific to the type of data being processed.
Domain Selector	domain-selector	The Java class used by the standardization engine to determine the variant of the data being processed. The OHMPI Standardization Engine supports Australia, France, Mexico, United Kingdom, and United States variants. If no selector is specified, the default is US. For the OHMPI Standardization Engine, the possible values for this field are: com.sun.mdm.index.matching.impl.SingleDomainSelectorAU; com.sun.mdm.index.matching.impl.SingleDomainSelectorFR; com.sun.mdm.index.matching.impl.SingleDomainSelectorMX; com.sun.mdm.index.matching.impl.SingleDomainSelectorUK; com.sun.mdm.index.matching.impl.SingleDomainSelectorUS; com.sun.mdm.index.matching.impl.MultiDomainSelector
Variant Field Name	locale-field-name	The ePath to an identifying field in the object structure that identifies which of the defined variants (element locale-codes) to use. If no field is specified for the OHMPI Standardization Engine, the standardization engine defaults to the United States, regardless of whether any variants are defined. This field must be contained in the object that contains the fields defined for normalization in this structure.
Unnormalized Source	unnormalized-source-field-name	The field that contains the data to be normalized. The field is designated by its ePath (for example, Person.FirstName).
Unnormalized Standardization Component	standardized-object-field-id	An identification code that tells the standardization engine which field to normalize. This ID is specific to the standardization engine in use and must correspond to a standardization component defined by that engine.
Normalized Standardization Component	standardized-object-field-id	An identification code that tells the standardization engine which field contains the normalized data. This ID is specific to the standardization engine in use and must correspond to a standardization component defined by that engine.
Normalized Target	standardized-target-field-name	The field that will store the normalized data. The field is designated by its ePath (for example, Person.Alias[*].StdLastName).

Master Person Index Variants Properties

The following table lists and describes the Configuration Editor fields and XML elements that define a variant for normalization or standardization. In the XML file, each value and locale pair is defined within a locale-codes element. A list of locale-codes elements can be defined in the locale-maps element.

Configuration Editor Field	XML File Element or Attribute	Description
Value	value	A value that indicates to the standardization engine which variant to use to standardize the data. When the value is contained in the Variant Field Name field (or the locale-field-name element), the standardization engine uses the corresponding Variant field (or locale element) to determine the variant. To specify a default variant, enter Default.
Variant	locale	A code indicating which variant to use to standardize data when the identifying field value in a transaction matches the corresponding Value field or element. Select one of the following codes: AU (Australia), FR (France), MX (Mexico), UK (United Kingdom), US (United States).

Modifying a Master Person Index Normalization Definition

Once you create a normalization definition, you can modify it as needed. Use caution when modifying normalization definitions once a system is in production. This can cause inconsistent match results.

To Modify a Normalization Definition (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Normalization tab.

The Normalization page appears.
In the Normalization Mappings list, click the definition you want to modify.
Click Edit.
Do any of the following:
- Modify any of the fields described in "Master Person Index Normalization and Standardization Structure Properties".
- To modify a variant, select the variant under Variants, and then click Edit. Modify either field on the dialog box that appears.
- To remove a variant, select the variant under Variants, and then click Remove. Click Yes on the dialog box that appears.
- Click OK.
On the Configuration Editor toolbar, click Save.

To Modify a Normalization Structure (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the structures-to-normalize element in the StandardizationConfig element.
To modify the normalization type, change the value of the standardization-type attribute.
To change the national domain, change the value of the domain-selector attribute as described in "Master Person Index Normalization and Standardization Structure Properties".
To modify an existing source field, scroll to the unnormalized-source-fields element in the appropriate group element, and then change the value of any source field elements (these are described in "Master Person Index Normalization and Standardization Structure Properties").
To modify an existing destination field, scroll to the normalization-targets element in the appropriate group element, and then change the value of any target field elements (these are described in "Master Person Index Normalization and Standardization Structure Properties").
Save and close the file.
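As an illustration of modifying a destination field, the following sketch shows the LastName target mapping from Example 5-1 after its target has been changed. Person.Alias[*].NormLastName is a hypothetical field name; whatever field you point to would first have to exist in the object structure.

```xml
<normalization-targets>
   <target-mapping>
      <standardized-object-field-id>LastName</standardized-object-field-id>
      <!-- Previously Person.Alias[*].StdLastName. NormLastName is a hypothetical field. -->
      <standardized-target-field-name>Person.Alias[*].NormLastName</standardized-target-field-name>
   </target-mapping>
</normalization-targets>
```

Only the standardized-target-field-name value changes here; the standardized-object-field-id still has to match the ID used in the corresponding source mapping.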

Deleting a Master Person Index Normalization Definition

If a defined normalization structure is not needed, you can delete the normalization structure from the standardization configuration. If no data requires normalization, you can delete all normalization structures. It is not recommended that you delete a normalization definition once a system is in production. This can cause inconsistent match results.

To Delete a Normalization Definition (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Normalization tab.

The Normalization page appears.
In the Normalization Mappings list, click the definition you want to delete.
Click Remove.
On the Configuration Editor toolbar, click Save.

To Delete a Normalization Structure (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the structures-to-normalize element in the StandardizationConfig element.
Do either of the following:
- To delete an existing normalization structure, delete the group element that defines the structure, including its opening and closing tags.
- To specify that no objects require normalization, delete all text between the opening and closing structures-to-normalize tags, but not the tags themselves.
Save and close the file.
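If every normalization structure is removed, the result would presumably look like the following sketch: the structures-to-normalize tags remain and contain no group elements.

```xml
<!-- All group elements deleted; the enclosing tags are kept. -->
<structures-to-normalize>
</structures-to-normalize>
```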

Defining Master Person Index Standardization Rules

If any of the fields against which searching or matching is performed are entered in free-form text format, those fields must be standardized before being sent to the match engine. The process of standardization includes reformatting, or parsing, the input data field and then normalizing some of the parsed data to a standard value. For example, street addresses can be parsed into the house number, street name, street type, and so on. The street name and type can then be normalized to their commonly used values. "Ave" might be normalized to "Avenue", "St." to "Street", and so on.

Standardization is defined in mefa.xml. You can define standardization by either using the Configuration Editor or modifying the XML file directly. The changes you make on the Standardization page of the Configuration Editor are reflected in the standardization structures of mefa.xml. The Configuration Editor provides a simplified way of defining standardization.

Perform any of the following tasks to define standardization:

"Defining Master Person Index Fields to be Standardized"
"Modifying a Master Person Index Standardization Definition"
"Deleting a Master Person Index Standardization Definition"

Defining Master Person Index Fields to be Standardized

When you define fields for standardization, you can specify the type of standardization to perform on each field or group of fields, the nationality of the data, and a field that indicates which nationality to use (if you specify more than one). You also specify which fields contain the data that needs to be parsed and normalized, and which fields contain the parsed and normalized data. For each standardization structure, you can specify more than one source field, but they must use the same standardization type. The source fields in one standardization structure are concatenated before being parsed.

A sample standardization structure for the XML file is included at the end of these instructions.

To Define Fields to be Standardized (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
In the object structure in the left pane, create the fields that will contain the parsed components of the new field to be standardized.

For more information, see "Adding a Field to the Master Person Index Object Structure".
Click the Standardization tab.

The Standardization page appears.
Click Add.

The Standardization Type dialog box appears.
Enter values for the Data Type, Domain Selector, and Variant Field Name fields (these are described in "Master Person Index Normalization and Standardization Structure Properties").
To define a variant for the standardization engine to use, do the following:
- In the Variants section, click Add.
- On the Variant dialog box, enter values in the fields described in "Master Person Index Variants Properties".
- Click OK.
  
  If you selected the multiple domain selector, you can add multiple variants; otherwise, you can define one default variant and one defined variant.
Under Source Fields to be Standardized, click Add.

The Select Source Field(s) dialog box appears.
In the left panel, select the field that contains the data that needs to be parsed and normalized, and then click the right arrow.

Note:
If the data is contained in more than one field, select all fields that contain the data. For example, a street address might be contained in two fields, such as Street Address and Unit. Both fields should be selected for standardization; they will be concatenated during the standardization process.
If you add a field in error, select the field in the Selected Source Field(s) list, and then click the left arrow.
Click OK.
For each field in which the parsed and normalized data will be stored, do the following:
- On the Standardized Fields dialog box, click Add under Target Mappings.
  
  The Target Mapping dialog box appears.
- In the Select Target field, select the name of a field that will contain standardized data.
- In the Available Standardization Components list, select the ID associated with the field, and then click Add between the left and right panels.
- To change the priority of a component in the Selected Standardization Components list, select the component and then click Move Up or Move Down.
- If you add a component in error, select the component in the Selected Standardization Components list, and then click Remove.
- Click OK.
  
  Note:
  For more information about standardization components and the fields to which they pertain, see Oracle Healthcare Master Person Index Standardization Engine Reference (Part Number E18471-01).
Click OK on the Standardization Type dialog box.

The new standardization definition appears in the list.
On the Configuration Editor toolbar, click Save.

To Define Fields to be Standardized (XML Editor)

Before you begin, in object.xml create the fields that will contain the parsed components of the field to be standardized. For more information, see "Adding a Field to the Master Person Index Object Structure".

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the free-form-texts-to-standardize element in the StandardizationConfig element.
Create a new group element in the free-form-texts-to-standardize element, and then define the standardization-type and domain-selector attributes (these are described in "Master Person Index Normalization and Standardization Structure Properties").

Make sure the new element falls within the free-form-texts-to-standardize element, but outside any existing group tags.
If you specified the multiple domain selector for the domain-selector attribute, in the group element create a locale-field-name element and a locale-maps element, and define the elements described in "Master Person Index Variants Properties".
To specify the source fields to standardize, do the following:
- If it does not currently exist, create an unstandardized-source-fields element in the appropriate group element (each group element can only include one unstandardized-source-fields element).
- For each field standardized by the specified standardization type, create and name a new unstandardized-source-field-name element in the new unstandardized-source-fields element.
  
  Note:
  If more than one source field is defined, the fields are concatenated prior to standardization with a pipe (|) between them for the OHMPI Standardization Engine. If you want the fields to be processed separately, you need to create two standardization structures. Source fields are designated by their ePaths.
To specify the destination fields for the standardized data, do the following:
- In the group element for which destination fields need to be defined, create a standardization-targets element after the unstandardized-source-fields element.
- In the new element, create a target-mapping element for each destination field, and then define the last two elements described in "Master Person Index Standardization Source and Target Field Elements".

Save and close the file.

Example 5-2 Address Standardization Structure

<group standardization-type="Address" domain-selector=
 "com.sun.mdm.index.matching.impl.MultiDomainSelector">
  <locale-field-name>Person.Address[*].CountryCode
  </locale-field-name>
  <locale-maps>
     <locale-codes>
         <value>GB</value>
         <locale>UK</locale>
      </locale-codes>
      <locale-codes>
         <value>UNST</value>
         <locale>US</locale>
      </locale-codes>
      <locale-codes>
         <value>AU</value>
         <locale>AU</locale>
      </locale-codes>
      <locale-codes>
         <value>Default</value>
         <locale>AU</locale>
      </locale-codes>
   </locale-maps>
   <unstandardized-source-fields>
      <unstandardized-source-field-name>Person.Address[*].AddressLine1
      </unstandardized-source-field-name>
      <unstandardized-source-field-name>Person.Address[*].AddressLine2
      </unstandardized-source-field-name>
   </unstandardized-source-fields>
   <standardization-targets>
      <target-mapping>
         <standardized-object-field-id>HouseNumber
         </standardized-object-field-id>
         <standardized-target-field-name>Person.Address[*].HouseNumber
         </standardized-target-field-name>
      </target-mapping>
      <target-mapping>
         <standardized-object-field-id>MatchStreetName
         </standardized-object-field-id>
         <standardized-target-field-name>Person.Address[*].StreetName
         </standardized-target-field-name>
      </target-mapping>
   </standardization-targets>
</group>

Master Person Index Standardization Source and Target Field Elements

The following table lists and describes the XML elements that define the source and target fields for standardization. The data from the source fields is standardized, and the standardized values are stored in the target fields.

XML File Element or Attribute	Description
unstandardized-source-field-name	The field or fields that contain the data to be standardized. The field is designated by its ePath (for example, Person.FirstName).
standardized-object-field-id	A code that identifies the standardized component from the source field to store in the target field. This is specific to the standardization engine in use and must correspond to a standardization component defined by that engine. For more information, see Oracle Healthcare Master Person Index Standardization Engine Reference (Part Number E18471-01).
standardized-target-field-name	The field that stores the standardized data. You can have multiple target fields, depending on how much of the standardized data you want to store. The fields are designated by their ePaths (for example, Person.Alias[*].StdLastName).
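Because multiple source fields in one standardization structure are concatenated before parsing, fields that should be parsed separately each need their own group element. The following is a minimal sketch of that arrangement, reusing the field names and component IDs from Example 5-2. The single US domain selector is chosen only to keep the sketch short, and Person.Address[*].StreetName2 is a hypothetical target field.

```xml
<free-form-texts-to-standardize>
   <!-- Structure 1: AddressLine1 is parsed on its own. -->
   <group standardization-type="Address"
          domain-selector="com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
      <unstandardized-source-fields>
         <unstandardized-source-field-name>Person.Address[*].AddressLine1</unstandardized-source-field-name>
      </unstandardized-source-fields>
      <standardization-targets>
         <target-mapping>
            <standardized-object-field-id>MatchStreetName</standardized-object-field-id>
            <standardized-target-field-name>Person.Address[*].StreetName</standardized-target-field-name>
         </target-mapping>
      </standardization-targets>
   </group>
   <!-- Structure 2: AddressLine2 is parsed separately, not concatenated with AddressLine1. -->
   <group standardization-type="Address"
          domain-selector="com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
      <unstandardized-source-fields>
         <unstandardized-source-field-name>Person.Address[*].AddressLine2</unstandardized-source-field-name>
      </unstandardized-source-fields>
      <standardization-targets>
         <target-mapping>
            <standardized-object-field-id>MatchStreetName</standardized-object-field-id>
            <!-- StreetName2 is a hypothetical field name. -->
            <standardized-target-field-name>Person.Address[*].StreetName2</standardized-target-field-name>
         </target-mapping>
      </standardization-targets>
   </group>
</free-form-texts-to-standardize>
```

Each group element contains exactly one unstandardized-source-fields element and at least one target-mapping element.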

Modifying a Master Person Index Standardization Definition

You can modify an existing standardization definition. Use caution when modifying standardization after a system is in production because it can cause inconsistent matching results.

To Modify a Standardization Definition (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Standardization tab.

The Standardization page appears.
In the Standardization Types list, select the definition you want to modify, and then click Edit.

The Standardization Type dialog box appears.
Do any of the following:
- Modify any of the fields or perform any of the functions described in "Defining Master Person Index Fields to be Standardized".
- To modify a variant, select the code under Variants, and then click Edit. Modify either field on the dialog box that appears.
- To remove a variant, select the code under Variants, and then click Remove. Click Yes on the dialog box that appears.
- To remove a source field, select the field under Source Fields to be Standardized, and then click Remove. Click Yes on the dialog box that appears.
  
  Note:
  There must be at least one field in this list.
- To edit a target field, select the field in the Specifying Target Mappings list and then click Edit.
  
  Note:
  You can select new components, move selected components up and down in priority, and remove components.
- To delete a target field, select the field in the Specifying Target Mappings list, and then click Remove. Click Yes on the dialog box that appears.
When you are done making changes, click OK on the Standardization Type dialog box.
On the Configuration Editor toolbar, click Save.

To Modify a Standardization Definition (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the free-form-texts-to-standardize element in the StandardizationConfig element.
To modify the standardization type, change the value of the standardization-type attribute.
To change the variant, change the value of the domain-selector attribute (described in "Master Person Index Normalization and Standardization Structure Properties").
To modify an existing source field, scroll to the appropriate group element, and then change the value of the unstandardized-source-field-name element to the ePath of the new field.
To modify an existing destination field, scroll to the target-mapping element in the standardization-targets section, and then change the value of either target mapping element (these are the last two elements described in "Master Person Index Standardization Source and Target Field Elements").
To remove an existing source field, delete all text between and including the unstandardized-source-field-name element that defines the field.

Note:
If no fields require standardization in a defined standardization structure, delete the entire structure as described in "Deleting a Master Person Index Standardization Definition".
To remove an existing destination field, delete all text between and including the target-mapping tags that define the field.

Note:
Each standardization structure must have at least one destination field defined for standardized data. If a structure does not contain any fields that need to be standardized, you can delete the entire structure, as described in "Deleting a Master Person Index Standardization Definition".
Save and close the file.
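As an illustration of removing a destination field, the following sketch shows the standardization-targets element from Example 5-2 after the HouseNumber mapping has been deleted. One target-mapping element remains, so the structure still has the required destination field.

```xml
<standardization-targets>
   <!-- The HouseNumber target-mapping element was deleted, tags included. -->
   <target-mapping>
      <standardized-object-field-id>MatchStreetName</standardized-object-field-id>
      <standardized-target-field-name>Person.Address[*].StreetName</standardized-target-field-name>
   </target-mapping>
</standardization-targets>
```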

Deleting a Master Person Index Standardization Definition

You can delete an existing standardization definition. It is not recommended that a standardization definition be deleted after a system is in production since this can cause inconsistent matching results.

To Delete a Standardization Definition (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Standardization tab.

The Standardization page appears.
In the Standardization Types list, select the definition you want to delete.
Click Remove.
Click Yes on the dialog box that appears.
On the Configuration Editor toolbar, click Save.

To Delete a Standardization Definition (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the free-form-texts-to-standardize element in the StandardizationConfig element.
Do either of the following:
- To delete an existing standardization structure, delete the group element that defines the structure, including its opening and closing tags.
  
  Using the example below, to delete the Address structure, delete the group element whose standardization-type attribute is "Address", including its opening and closing tags.
```
<free-form-texts-to-standardize>
   <group standardization-type="BusinessName" domain-selector=
    "com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
      ...
   </group>
   <group standardization-type="Address" domain-selector=
    "com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
      ...
   </group>
</free-form-texts-to-standardize>
```
- To specify that no fields require standardization, delete all text between the opening and closing free-form-texts-to-standardize tags, but not the tags themselves.
  
  This deletes all standardization structures.
Save and close the file.
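As a rough illustration of the two options above, and assuming the two-group example is the entire element, the section might look like one of the following after the edit. The ellipsis stands for the BusinessName definitions, which are not reproduced here; the two fragments are alternatives, not content to be combined in one file.

```xml
<!-- Option 1: only the Address group was removed -->
<free-form-texts-to-standardize>
   <group standardization-type="BusinessName" domain-selector=
    "com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
      ...
   </group>
</free-form-texts-to-standardize>

<!-- Option 2: every standardization structure was removed -->
<free-form-texts-to-standardize>
</free-form-texts-to-standardize>
```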

Defining Phonetic Encoding for the Master Person Index

Oracle Healthcare Master Person Index provides configurable phonetic encoding capabilities. Phonetic encoding is a part of the standardization process, and is the process of changing the value of a data field to its phonetic equivalent. It is used to retrieve records with similar field values from the database for matching. You can specify which fields are phonetically encoded before matching and how they are encoded, and several encoders are available for this purpose. Phonetic encoding is most commonly applied to first names and street names.

Phonetic encoding is defined in mefa.xml. You can define phonetic encoding by either using the Configuration Editor or modifying the XML file directly. The changes you make on the Phoneticized Field page of the Configuration Editor are reflected in the phonetic encoding structures and the match service section of mefa.xml. The Configuration Editor provides a simplified way of defining phonetic encoding.

Perform any of the following tasks to define phonetic encoding rules:

"Defining Master Person Index Fields for Phonetic Encoding"
"Modifying a Master Index Phonetic Encoding Definition"
"Deleting a Master Person Index Phonetic Encoding Definition"
"Defining a Master Person Index Phonetic Encoder"
"Modifying a Master Person Index Phonetic Encoder"
"Deleting a Master Person Index Phonetic Encoder"

Defining Master Person Index Fields for Phonetic Encoding

You can specify the fields you want to be phonetically encoded, the fields that store the encoded values, and the type of phonetic encoder to use for each field, such as NYSIIS or Soundex. A sample phonetic encoding structure for the XML file is included at the end of these instructions.

To Define a Field for Phonetic Encoding (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
In the object structure in the left pane, create the field that will contain the phonetic value of the field to be encoded.

For more information, see "Adding a Field to the Master Person Index Object Structure".
Click the Phoneticized Fields tab.

The Phoneticized Field page appears.
In the Phoneticized Fields section, click Add.

The Phoneticized Field dialog box appears.
Select values for the fields described in "Master Person Index Phonetic Encoding Fields and Elements".
Click OK.

The phonetic encoding definition is added to the Phoneticized Fields list.
On the Configuration Editor toolbar, click Save.

To Define a Field for Phonetic Encoding (XML Editor)

In object.xml, create the field that will contain the phonetic value. For more information, see "Adding an Object to the Master Person Index Object Structure".

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the phoneticize-fields element in the PhoneticEncodersConfig element.
Create a new phoneticize-field element within the phoneticize-fields element.
In the new phoneticize-field element, create and define the elements described in "Master Person Index Phonetic Encoding Fields and Elements".

Save and close the file.

Example 5-3 Phonetic Encoding Structure

```
<phoneticize-fields>
   <phoneticize-field>
      <unphoneticized-source-field-name>Person.FirstName
      </unphoneticized-source-field-name>
      <phoneticized-target-field-name>Person.FirstNamePhoneticCode
      </phoneticized-target-field-name>
      <encoding-type>Soundex</encoding-type>
   </phoneticize-field>
</phoneticize-fields>
```

Master Person Index Phonetic Encoding Fields and Elements

The following table lists and describes the Configuration Editor fields and XML file elements used to define how fields will be phonetically encoded.

Configuration Editor Field	XML File Element	Description
Unphoneticized Source	unphoneticized-source-field-name	The ePath of the source field in the system object that contains the data to be phonetically encoded (for example, Person.Address[*].StreetName). Note: This can refer to the original field or to a standardized or normalized field.
Phoneticized Target	phoneticized-target-field-name	The ePath of the field in the system object that will store the phonetically encoded value of the source field.
(none)	phoneticized-object-field-id	A field ID to identify the field to the phonetic encoder. This is not currently used with the OHMPI Standardization Engine, but could be used with a custom standardization engine.
Encoder	encoding-type	The phonetic encoder to use for this field. This must correspond to an encoder defined in the Encoders section of the Configuration Editor (or the PhoneticEncodersConfig element of the XML file).
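The elements in the table can be combined for a field in a child object. The following is a minimal sketch, not a definitive configuration: the source ePath is the one used as an example in the table, while the target field name Person.Address[*].StreetNamePhoneticCode is an assumption and must match a field you have actually created in the object structure. NYSIIS is used only as an example of an encoder name that must already be defined.

```xml
<phoneticize-fields>
   <phoneticize-field>
      <!-- Source: may be an original, standardized, or normalized field -->
      <unphoneticized-source-field-name>Person.Address[*].StreetName
      </unphoneticized-source-field-name>
      <!-- Target: assumed field name; create it in the object structure first -->
      <phoneticized-target-field-name>Person.Address[*].StreetNamePhoneticCode
      </phoneticized-target-field-name>
      <!-- Must correspond to an encoder defined in PhoneticEncodersConfig -->
      <encoding-type>NYSIIS</encoding-type>
   </phoneticize-field>
</phoneticize-fields>
```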

Modifying a Master Index Phonetic Encoding Definition

Once you create a phonetic encoding definition, you can modify it as needed. Use caution when modifying definitions once a system is in production. This can cause inconsistent match results.

To Modify a Phonetic Encoding Definition (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Phoneticized Fields tab.

The Phoneticized Field page appears.
In the Phoneticized Fields section, select the phonetic encoding definition you want to modify, and then click Edit.

The Phoneticized Field dialog box appears.
Modify any of the fields listed in "Master Person Index Phonetic Encoding Fields and Elements", and then click OK.
On the Configuration Editor toolbar, click Save.

To Modify a Phonetic Encoding Definition (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the phoneticize-fields element.
In the phoneticize-field element that defines the phonetic encoding you want to modify, change the value of any of the elements described in "Master Person Index Phonetic Encoding Fields and Elements".
Save and close the file.

Deleting a Master Person Index Phonetic Encoding Definition

Once you create a phonetic encoding definition, you can delete it as needed. Use caution when deleting definitions once a system is in production. This can cause inconsistent match results.

To Delete a Phonetic Encoding Definition (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Phoneticized Fields tab.

The Phoneticized Field page appears.
In the Phoneticized Fields section, select the phonetic encoding definition you want to delete.
Click Remove.
Click Yes on the dialog that appears.
On the Configuration Editor toolbar, click Save.

To Delete a Phonetic Encoding Definition (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the phoneticize-fields element in the PhoneticEncodersConfig element.

Do either of the following:

To delete a field currently specified for phonetic conversion, delete all text between and including the phoneticize-field element that defines the field.

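Using the example below, to delete the first name phonetic field, delete the second phoneticize-field element (the one whose source field is Person.FirstName), including its opening and closing tags.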

```
<phoneticize-fields>
   <phoneticize-field>
      <unphoneticized-source-field-name>Person.LastName
      </unphoneticized-source-field-name>
      <phoneticized-target-field-name>Person.LastNamePhoneticCode
      </phoneticized-target-field-name>
      <encoding-type>NYSIIS</encoding-type>
   </phoneticize-field>
   <phoneticize-field>
      <unphoneticized-source-field-name>Person.FirstName
      </unphoneticized-source-field-name>
      <phoneticized-target-field-name>Person.FirstNamePhoneticCode
      </phoneticized-target-field-name>
      <encoding-type>Soundex</encoding-type>
   </phoneticize-field>
</phoneticize-fields>
```

To delete all fields currently specified for phonetic conversion, delete all text between, but not including, the opening and closing phoneticize-fields tags.

Save and close the file.

Defining a Master Person Index Phonetic Encoder

Each type of phonetic encoder provided with Oracle Healthcare Master Person Index is defined in the PhoneticEncodersConfig section of mefa.xml. They are listed in the Encoders section of the Configuration Editor.

To Define a Phonetic Encoder (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Phoneticized Fields tab.

The Phoneticized Field page appears.
In the Encoders section, click Add.

The Phonetic Encoder dialog box appears.
In the Encoder field, enter a descriptive name for the encoder.
In the Implementation Class field, enter the fully qualified Java path of the class to use for the encoder.

Note:
For more information about the encoder class paths, see "Master Person Index Encoder Elements and Types".
Click OK.

The phonetic encoder is added to the Encoders list.
On the Configuration Editor toolbar, click Save.

To Define a Phonetic Encoder (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the PhoneticEncodersConfig section.
Create a new encoder element, and then define the elements described in "Master Person Index Encoder Elements and Types".
Save and close the file.

Master Person Index Encoder Elements and Types

The following table lists and describes the elements that configure the phonetic encoders used by the master person index application.

Element	Description
encoding-type	The name of the phonetic encoder, such as NYSIIS, Soundex, or Metaphone. See the following table for a list of default encoders for the OHMPI Standardization Engine.
encoder-implementation-class	The fully qualified name of the Java class that determines the behavior of the phonetic encoder. See the following table for a complete list of default classes for the OHMPI Standardization Engine.

The following table lists the phonetic encoders supported by the OHMPI Standardization Engine along with the names of their default classes.

Encoding Type	class-name
NYSIIS	com.sun.mdm.index.phonetic.impl.Nysiis
Soundex	com.sun.mdm.index.phonetic.impl.Soundex
Metaphone	com.sun.mdm.index.phonetic.impl.Metaphone
Double Metaphone	com.sun.mdm.index.phonetic.impl.DoubleMetaphone
Refined Soundex	com.sun.mdm.index.phonetic.impl.RefinedSoundex
French Soundex	com.sun.mdm.index.phonetic.impl.SoundexFR
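Putting the two elements together, an encoder definition might look like the following sketch. The element names and class names come from the tables above; the assumption here is that both elements sit directly inside an encoder element within PhoneticEncodersConfig, so verify the nesting against the existing entries in your own mefa.xml before copying it.

```xml
<!-- Assumed placement: inside the PhoneticEncodersConfig section of mefa.xml -->
<encoder>
   <encoding-type>Soundex</encoding-type>
   <encoder-implementation-class>com.sun.mdm.index.phonetic.impl.Soundex
   </encoder-implementation-class>
</encoder>
<encoder>
   <encoding-type>NYSIIS</encoding-type>
   <encoder-implementation-class>com.sun.mdm.index.phonetic.impl.Nysiis
   </encoder-implementation-class>
</encoder>
```

The encoding-type value is the name that a phoneticize-field definition refers to, so the two must match exactly.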

Modifying a Master Person Index Phonetic Encoder

Once you define a phonetic encoder, you can change the implementation class to use for the encoder. Use caution when modifying definitions once a system is in production. This can cause inconsistent match results.

To Modify a Phonetic Encoder (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Phoneticized Fields tab.

The Phoneticized Field page appears.
In the Encoders section, select the encoder you want to modify and then click Edit.

The Phonetic Encoder dialog box appears.
Change the implementation class, and then click OK.
On the Configuration Editor toolbar, click Save.

To Modify a Phonetic Encoder (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the PhoneticEncodersConfig section.
Modify the value of any of the elements in the encoder element you want to modify. For more information, see "Master Person Index Encoder Elements and Types".
Save and close the file.

Deleting a Master Person Index Phonetic Encoder

Once you define a phonetic encoder, you can delete it if needed. Use caution when deleting encoders once a system is in production, because doing so can cause inconsistent match results. If you delete an encoder that is referenced by a phonetic encoding definition, make sure to modify that definition so that it references an existing encoder.

To Delete a Phonetic Encoder (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Click the Phoneticized Fields tab.

The Phoneticized Field page appears.
In the Encoders section, select the encoder you want to delete.
Click Remove.
Click Yes on the dialog box that appears.
On the Configuration Editor toolbar, click Save.

To Delete a Phonetic Encoder (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the PhoneticEncodersConfig section.
Delete the text between and including the encoder element that defines the encoder you want to remove.
Save and close the file.
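If the deleted encoder was still referenced by a phonetic encoding definition, that definition needs to point to an encoder that remains defined. The following sketch assumes the Soundex encoder was removed and that NYSIIS is still defined; the field names are the ones used in the earlier phonetic encoding example.

```xml
<phoneticize-field>
   <unphoneticized-source-field-name>Person.FirstName
   </unphoneticized-source-field-name>
   <phoneticized-target-field-name>Person.FirstNamePhoneticCode
   </phoneticized-target-field-name>
   <!-- Previously Soundex; changed to an encoder that is still defined -->
   <encoding-type>NYSIIS</encoding-type>
</phoneticize-field>
```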

Defining the Master Person Index Match String

The match string defines the fields that are passed to the match engine for probabilistic weighting. By default, the fields defined for matching are the fields you specified for matching in the wizard or Configuration Editor. You can modify and delete fields in the match string if necessary. At least one field must be defined in the match string or no weights will be generated.

If you do modify the match string, you might need to make corresponding changes to the match engine configuration files as well. For more information about modifying these files, see Oracle Healthcare Master Person Index Match Engine Reference (Part Number E18470-01).

Perform either of the following tasks to configure the match string.

"Creating the Master Person Index Match String"
"Modifying the Master Person Index Match String"

You can further configure the match string by defining exclusion lists that filter out unwanted values from the match process. For more information, see "Filtering Default Values From Master Person Index Processes".

Creating the Master Person Index Match String

A default match string is predefined based on the match type information you specified in the wizard. If no match types were defined using the wizard, the structure of the match string is still defined but with no fields. You can use normalized or phonetically encoded fields for the match string.

To Create the Match String (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Expand the object structure so all fields are visible.
To add a field to the match string, click on the field name and then select a value for the Match Type field on the Properties page.

Note:
The match types you can use are listed in the first column of matchConfigFile.cfg. For more information about OHMPI Match Engine match types, see Oracle Healthcare Master Person Index Match Engine Reference (Part Number E18470-01).
Perform the previous step for each field in the match string.
On the Configuration Editor toolbar, click Save.
To remove unwanted or invalid field values from the match process, see "Filtering Default Values From Master Person Index Processes".

To Create the Match String (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
In the MatchingConfig element, scroll to the match-columns element in the match-system-object element.
To add a field to the match string, do the following:
1. In the match-columns element, create a new match-column element.
2. In the new match-column element, create and define a column-name element.
  
  Enter the fully qualified field name of the field on which to match (for example, Enterprise.SystemSBR.Person.Address.City).
3. Following the column-name element, create and define a match-type element.
  
  Enter an ID that identifies the field to the match engine. For the OHMPI Match Engine, this value must correspond to a defined match type.
  
  For example:
```
<match-system-object>
   <object-name>Address</object-name>
   <match-columns>
     <match-column>
       <column-name>Enterprise.SystemSBR.Person.Address.StreetName
       </column-name>
       <match-type>StreetName</match-type>
     </match-column>
   </match-columns>
</match-system-object>
```
Repeat the previous step for each field to add to the match string.
Save and close the file.
To remove unwanted or invalid field values from the match process, see "Filtering Default Values From Master Person Index Processes".
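For a match string with more than one field, the repeated step produces one match-column element per field. The following is a sketch only: the Person field names follow the fully qualified naming pattern shown above, and the match types FirstName and LastName are assumptions that must be checked against the first column of matchConfigFile.cfg.

```xml
<match-system-object>
   <object-name>Person</object-name>
   <match-columns>
      <match-column>
         <column-name>Enterprise.SystemSBR.Person.FirstName</column-name>
         <!-- Assumed match type; confirm it exists in matchConfigFile.cfg -->
         <match-type>FirstName</match-type>
      </match-column>
      <match-column>
         <column-name>Enterprise.SystemSBR.Person.LastName</column-name>
         <!-- Assumed match type; confirm it exists in matchConfigFile.cfg -->
         <match-type>LastName</match-type>
      </match-column>
   </match-columns>
</match-system-object>
```

If the project normalizes or phonetically encodes these fields, the column-name values could reference the normalized or encoded fields instead.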

Modifying the Master Person Index Match String

Once you define a match string, you can modify or delete information about the fields in the match string as necessary. This should only be done prior to moving to production. Otherwise, unexpected matching results might occur. For more information about OHMPI Match Engine match types and field IDs, see Oracle Healthcare Master Person Index Match Engine Reference (Part Number E18470-01).

To Modify the Match String (Configuration Editor)

In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

The Configuration Editor appears.
Expand the object structure so all fields are visible.
To add a field to the match string, click the field name and then select a value for the Match Type field on the Properties page.

The field is added to the match string.
To modify the match type specified for a field, click the name of the field defined for matching and then select a new value for the Match Type field on the Properties page.
To remove a field from the match string, click the name of the field defined for matching and then select None for the Match Type field on the Properties page.
On the Configuration Editor toolbar, click Save.
To define or modify exclusion lists of values to be removed from the match process, see "Filtering Default Values From Master Person Index Processes".

To Modify the Match String (XML Editor)

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the MatchingConfig element, and then scroll to the match-system-object element.
To add a field to the match string, do the following:
- In the match-columns element, create a new match-column element.
- In the new match-column element, create and define a column-name element.
  
  Enter the fully qualified field name of the field on which to match (for example, Enterprise.SystemSBR.Person.Address.City).
- Following the column-name element, create and define a match-type element.
  
  Enter an ID that identifies the field to the match engine. For the OHMPI Match Engine, this value must correspond to a defined match type.
  
  For example:
```
<match-system-object>
   <object-name>Address</object-name>
   <match-columns>
     <match-column>
       <column-name>Enterprise.SystemSBR.Person.Address.StreetName
       </column-name>
       <match-type>StreetName</match-type>
     </match-column>
   </match-columns>
</match-system-object>
```
To change a field used in the match string, change the value of the column-name element.

Enter the fully qualified field name of the new field (for example, Enterprise.SystemSBR.Person.FirstName).
To change the type of matching to perform for a field, change the value of the match-type element.

Enter an ID that identifies the field to the match engine. For the OHMPI Match Engine, this value must correspond to a defined match type.

To delete a field from the match string, delete all text between and including the match-column element defining the field you want to delete.

Using the example below, to delete the HouseNo field from the match string, delete the second match-column element (the one whose column-name is Enterprise.SystemSBR.Person.Address.HouseNo), including its opening and closing tags.

```
<match-system-object>
   <object-name>Address</object-name>
   <match-columns>
     <match-column>
       <column-name>Enterprise.SystemSBR.Person.Address.StreetName
       </column-name>
       <match-type>StreetName</match-type>
     </match-column>
     <match-column>
       <column-name>Enterprise.SystemSBR.Person.Address.HouseNo
       </column-name>
       <match-type>HouseNumber</match-type>
     </match-column>
   </match-columns>
</match-system-object>
```

Save and close the file.
To define or modify exclusion lists of values to be removed from the match process, see "Filtering Default Values From Master Person Index Processes".

Defining how Master Person Index Query Blocks are Processed

The block picker and pass controller define how query blocks are processed for matching. Default components are defined by Oracle Healthcare Master Person Index, but you can create your own Java classes to define custom versions of these components.

The block picker determines the blocking strategy to use for each match pass. Blocking strategies define how the queries are created that check the database for a subset of the records to be used for matching. The default Block Picker has access to the match results from previous match passes, as well as lists of applicable blocking definitions that have been executed and of those that have not. The default Block Picker class is com.sun.mdm.index.matching.impl.PickAllBlocksAtOnce, which selects all blocks during the first pass.

The pass controller determines whether to continue processing the defined blocks. The matching process can be executed in multiple stages. Each query block in the blocking query is executed in a separate match pass. After a block is evaluated, the pass controller determines if the results found are sufficient or if the query should continue by performing another match pass. The default pass controller is com.sun.mdm.index.matching.impl.PassAllBlocks. This class instructs the match engine to continue calculating match weights until all applicable block definitions have been processed.

These components can only be configured by modifying the XML file directly.

To Specify the Block Picker

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the block-picker element in the MatchingConfig section.
In the class-name element, specify the Java class for the block picker you want to use, using the fully qualified class name.

For example:
```
<block-picker>
   <class-name>com.sun.mdm.index.matching.impl.PickAllBlocksAtOnce
   </class-name>
</block-picker>
```
Save and close the file.

To Specify the Pass Controller Class

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the pass-controller element in the MatchingConfig section.
In the class-name element, specify the Java class for the Pass Controller you want to use, using the fully qualified class name.

For example:
```
<pass-controller>
   <class-name>com.sun.mdm.index.matching.impl.PassAllBlocks
   </class-name>
</pass-controller>
```
Save and close the file.
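If you write your own components, the same two elements reference your classes instead of the defaults. In the sketch below, both class names are hypothetical placeholders, not classes shipped with the product; substitute the fully qualified names of your own implementations, which must be available to the application at run time.

```xml
<!-- Both elements belong in the MatchingConfig section of mefa.xml -->
<block-picker>
   <!-- Hypothetical custom class -->
   <class-name>com.example.mpi.matching.CustomBlockPicker
   </class-name>
</block-picker>
<pass-controller>
   <!-- Hypothetical custom class -->
   <class-name>com.example.mpi.matching.CustomPassController
   </class-name>
</pass-controller>
```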

Configuring the Standardization Engine

You can configure the standardization engine by specifying the standardization engine to use, configuring the files that define data standardization, and plugging in custom standardization and matching rules. You only need to specify the standardization engine to use if you are using an engine other than the OHMPI Standardization Engine.

Perform any of these steps to configure the standardization engine:

"Specifying a Standardization Engine for the Master Person Index"
"Modifying Master Index Standardization Files"
"Importing Standardization Data Types and Variants"
"Deleting a Standardization Variant or Data Type"

Specifying a Standardization Engine for the Master Person Index

Oracle Healthcare Master Person Index can support standardization engines from different vendors depending on the adapter configured to communicate with the engine. Default classes are provided for using the OHMPI Standardization Engine. You can implement a custom standardization engine along with customized adapters. The standardization engine configuration is defined by standardizer-api and standardizer-config elements.

Note:

The default adapters for the OHMPI Standardization Engine are com.sun.mdm.index.matching.adapter.SbmeStandardizerAdapter and com.sun.mdm.index.matching.adapter.SbmeStandardizerAdapterConfig.

To Specify the Standardization Engine

In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

The file opens in the NetBeans XML editor.
Scroll to the standardizer-api element in the MatchingConfig section.

Specify the Java class for the standardization adapter to use, using the fully qualified class name as shown below.

```
<standardizer-api>
   <class-name>
    com.sun.mdm.index.matching.adapter.MyStandardizerAdapter
   </class-name>
</standardizer-api>
```

In the standardizer-config element, specify the Java class for the configuration of the standardization adapter, using the fully qualified class name as shown below.
```
<standardizer-config>
   <class-name>
     com.sun.mdm.index.matching.adapter.SbmeStandardizerAdapterConfig
   </class-name>
</standardizer-config>
```
Save and close the file.
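To return to the OHMPI Standardization Engine after experimenting with a custom adapter, both elements can point back to the default classes named in the note above. This sketch assumes the two elements appear together in the MatchingConfig section, as the steps describe.

```xml
<standardizer-api>
   <class-name>
     com.sun.mdm.index.matching.adapter.SbmeStandardizerAdapter
   </class-name>
</standardizer-api>
<standardizer-config>
   <class-name>
     com.sun.mdm.index.matching.adapter.SbmeStandardizerAdapterConfig
   </class-name>
</standardizer-config>
```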

Modifying Master Index Standardization Files

You can fine-tune the standardization process by modifying the standardization files. For example, you can insert additional names or terms into the normalization or lexicon files, such as givenNames.txt and givenNameNormalization.txt. Depending on your data requirements, you might need to modify additional standardization files. Some of the patterns files (most notably the address patterns files) are very complex and should only be modified by personnel who thoroughly understand the defined patterns and tokens. If you modify standardization files, make sure you modify them for each variant specified in mefa.xml.

You can modify the data configuration files (lexicon and normalization files), and you can also modify the process configuration files that define the data types, variants, and how data is standardized. The process files are more complex, and should only be modified by someone who is familiar with standardization concepts and with the OHMPI Standardization Engine. Instructions for modifying these files are not included here. For information about these files, see Oracle Healthcare Master Person Index Standardization Engine Reference (Part Number E18471-01).

To Modify Standardization Data Configuration Files

In the Projects window, expand the master person index project to configure and then expand Standardization Engine.
Expand instance, expand the variant to modify, and then expand resources.
Open the file you want to modify in the NetBeans text editor.
Modify the file in accordance with the information presented for each data type in Oracle Healthcare Master Person Index Standardization Engine Reference (Part Number E18471-01).
Save and close the file.

Importing Standardization Data Types and Variants

The OHMPI Standardization Engine is based on a very flexible framework that allows you to define new data types and variants so you can standardize any type of data in a custom manner. You can create new data types and variants based on the finite state machine and new variants for the existing rules-based data types. To use a custom data type or variant, you must import its package into NetBeans. During the import, you choose whether to make it available to all master person index applications or only to the current one.

This section only describes importing custom data types and variants after they have been created. For information about creating a custom data type or variant, see Oracle Healthcare Master Person Index Standardization Engine Reference (Part Number E18471-01).

To Import a Data Type or Variant

In the Projects window, expand the main master person index project.
Right-click Standardization Engine, and select Import Standardization Plug-in.
In the dialog box that appears, navigate to the location of the plug-in package.
Select the file containing the plug-in, and then click Open.
Do one of the following:
- To import the plug-in and make it available to all future master person index applications, click Yes.
- To import the plug-in and make it only available to the current master person index application, click No.
The data type or variant is imported into the Standardization Engine node. Data types add folders just beneath the Standardization Engine node; variants add folders under the appropriate data type (as specified in the variant package).
In the Standardization Engine node, navigate to the new data type or variant you added and verify that all of the required files are there.

Deleting a Standardization Variant or Data Type

If you add a data type or variant to a master person index application in error, you can remove it from the Standardization Engine node. You can also delete any of the existing data types or variants if they are not in use.

Caution:

Be careful when removing variants or data types; this action cannot be undone.

To Delete a Variant or Data Type

Back up the source files for the data type or variant in case you need them at a later time.

The default data types and variants are stored in NetBeansHome/soa2/modules/ext/mdm/standardizer/deployment.
In the Projects window, navigate to the Standardization Engine node in the master person index project and then to the data type or variant you want to remove.
Right-click the folder containing the files to remove, and then select Delete.

A confirmation dialog appears.
Click Yes.

The data type or variant is removed from the project.