Configuring Sun Master Indexes

Defining Master Index Fields to be Standardized

When you define fields for standardization, you can specify the type of standardization to perform on each field or group of fields, the nationality of the data, and a field that indicates which nationality to use (if you specify more than one). You also specify which fields contain the data that needs to be parsed and normalized, and which fields contain the parsed and normalized data. For each standardization structure, you can specify more than one source field, but they must use the same standardization type. The source fields in one standardization structure are concatenated before being parsed.

A sample standardization structure for the XML file is included at the end of these instructions.

To Define Fields to be Standardized (Configuration Editor)

  1. In the Projects window, right-click the Configuration node in the project you want to modify, and then click Edit.

    The Configuration Editor appears.

  2. In the object structure in the left pane, create the fields that will contain the parsed components of the new field to be standardized.

    For more information, see Adding a Field to the Master Index Object Structure.

  3. Click the Standardization tab.

    The Standardization page appears.

  4. Click Add.

    The Standardization Type dialog box appears.

  5. Enter values for the Data Type, Domain Selector, and Variant Field Name fields (these are described in Master Index Normalization and Standardization Structure Properties).

  6. To define a variant for the standardization engine to use, do the following:

    1. In the Variants section, click Add.

    2. On the Variant dialog box, enter values in the fields described in Master Index Variants Properties.

    3. Click OK.

      If you selected the multiple domain selector, you can add multiple variants; otherwise, you can define one default variant and one defined variant.

  7. Under Source Fields to be Standardized, click Add.

    The Select Source Field(s) dialog box appears.

  8. In the left panel, select the field that contains the data that needs to be parsed and normalized, and then click the right arrow.


    Note –

    If the data is contained in more than one field, select all fields that contain the data. For example, a street address might be contained in two fields, such as Street Address and Unit. Both fields should be selected for standardization; they will be concatenated during the standardization process.


  9. If you add a field in error, select the field in the Selected Source Field(s) list, and then click the left arrow.

  10. Click OK.

  11. For each field in which the parsed and normalized data will be stored, do the following:

    1. On the Standardized Fields dialog box, click Add under Target Mappings.

      The Target Mapping dialog box appears.

    2. In the Select Target field, select the name of a field that will contain standardized data.

    3. In the Available Standardization Components list, select the ID associated with the field, and then click Add between the left and right panels.

    4. To change the priority of a component in the Selected Standardization Components list, select the component and then click Move Up or Move Down.

    5. If you add a component in error, select the component in the Selected Standardization Components list, and then click Remove.

    6. Click OK.


      Note –

      For more information about standardization components and the fields to which they pertain, see Understanding the Master Index Standardization Engine.


  12. Click OK on the Standardization Type dialog box.

    The new standardization definition appears in the list.

  13. On the Configuration Editor toolbar, click Save.

To Define Fields to be Standardized (XML Editor)

Before You Begin

In object.xml, create the fields that will contain the parsed components of the field to be standardized. For more information, see Adding a Field to the Master Index Object Structure.

  1. In the Projects window, expand the Configuration node in the project you want to modify, and then double-click mefa.xml.

    The file opens in the NetBeans XML editor.

  2. Scroll to the free-form-texts-to-standardize element in the StandardizationConfig element.

  3. Create a new group element in the free-form-texts-to-standardize element, and then define the standardization-type and domain-selector attributes (these are described in Master Index Normalization and Standardization Structure Properties).

    Make sure the new element falls within the free-form-texts-to-standardize element, but outside any existing group tags.

  4. If you specified the multiple domain selector for the domain-selector attribute, do the following:

    1. In the group element, create a locale-field-name element and a locale-maps element.

    2. Define the elements described in Master Index Variants Properties.

  5. To specify the source fields to standardize, do the following:

    1. If it does not already exist, create an unstandardized-source-fields element in the appropriate group element (each group element can include only one unstandardized-source-fields element).

    2. For each field standardized by the specified standardization type, create and name a new unstandardized-source-field-name element in the new unstandardized-source-fields element.


      Note –

      If more than one source field is defined, the fields are concatenated prior to standardization; for the Master Index Standardization Engine, a pipe (|) is placed between them. If you want the fields to be processed separately, create a separate standardization structure for each one. Source fields are designated by their ePaths.


  6. To specify the destination fields for the standardized data, do the following:

    1. In the group element for which destination fields need to be defined, create a standardization-targets element after the unstandardized-source-fields element.

    2. In the new element, create a target-mapping element for each destination field, and then define the last two elements described in Master Index Standardization Source and Target Field Elements.

  7. Save and close the file.
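
As a rough sketch of what steps 3, 5, and 6 might produce for a single-domain definition, the fragment below reuses the element names, ePaths, and field IDs that appear in this procedure and in Example 3. It assumes the SingleDomainSelectorUS selector, so the locale-field-name and locale-maps elements from step 4 are left out. Treat the field names as illustrative, not as required values.

```xml
<!-- Step 3: a new group inside free-form-texts-to-standardize,
     placed outside any existing group element -->
<group standardization-type="Address"
       domain-selector="com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
   <!-- Step 5: only one unstandardized-source-fields element per group;
        source fields are identified by ePath -->
   <unstandardized-source-fields>
      <unstandardized-source-field-name>Person.Address[*].AddressLine1</unstandardized-source-field-name>
   </unstandardized-source-fields>
   <!-- Step 6: standardization-targets comes after the source fields,
        with one target-mapping per destination field -->
   <standardization-targets>
      <target-mapping>
         <standardized-object-field-id>HouseNumber</standardized-object-field-id>
         <standardized-target-field-name>Person.Address[*].HouseNumber</standardized-target-field-name>
      </target-mapping>
   </standardization-targets>
</group>
```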


Example 3 Address Standardization Structure


<group standardization-type="Address" domain-selector=
 "com.sun.mdm.index.matching.impl.MultiDomainSelector">
   <locale-field-name>Person.Address[*].CountryCode</locale-field-name>
   <locale-maps>
      <locale-codes>
         <value>GB</value>
         <locale>UK</locale>
      </locale-codes>
      <locale-codes>
         <value>UNST</value>
         <locale>US</locale>
      </locale-codes>
      <locale-codes>
         <value>AU</value>
         <locale>AU</locale>
      </locale-codes>
      <locale-codes>
         <value>Default</value>
         <locale>AU</locale>
      </locale-codes>
   </locale-maps>
   <unstandardized-source-fields>
      <unstandardized-source-field-name>Person.Address[*].AddressLine1</unstandardized-source-field-name>
      <unstandardized-source-field-name>Person.Address[*].AddressLine2</unstandardized-source-field-name>
   </unstandardized-source-fields>
   <standardization-targets>
      <target-mapping>
         <standardized-object-field-id>HouseNumber</standardized-object-field-id>
         <standardized-target-field-name>Person.Address[*].HouseNumber</standardized-target-field-name>
      </target-mapping>
      <target-mapping>
         <standardized-object-field-id>MatchStreetName</standardized-object-field-id>
         <standardized-target-field-name>Person.Address[*].StreetName</standardized-target-field-name>
      </target-mapping>
   </standardization-targets>
</group>
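
The note in the XML Editor procedure says that source fields to be processed separately need their own standardization structures. A minimal sketch of that arrangement follows. It assumes a hypothetical second address object, Person.MailingAddress, which does not appear in the original example and is used only for illustration. Each group has a single source field, so nothing is concatenated before parsing.

```xml
<!-- First structure: parses the primary address line on its own -->
<group standardization-type="Address"
       domain-selector="com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
   <unstandardized-source-fields>
      <unstandardized-source-field-name>Person.Address[*].AddressLine1</unstandardized-source-field-name>
   </unstandardized-source-fields>
   <standardization-targets>
      <target-mapping>
         <standardized-object-field-id>HouseNumber</standardized-object-field-id>
         <standardized-target-field-name>Person.Address[*].HouseNumber</standardized-target-field-name>
      </target-mapping>
   </standardization-targets>
</group>
<!-- Second structure: Person.MailingAddress is a hypothetical object
     (an assumption for this sketch, not part of the original example) -->
<group standardization-type="Address"
       domain-selector="com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
   <unstandardized-source-fields>
      <unstandardized-source-field-name>Person.MailingAddress[*].AddressLine1</unstandardized-source-field-name>
   </unstandardized-source-fields>
   <standardization-targets>
      <target-mapping>
         <standardized-object-field-id>HouseNumber</standardized-object-field-id>
         <standardized-target-field-name>Person.MailingAddress[*].HouseNumber</standardized-target-field-name>
      </target-mapping>
   </standardization-targets>
</group>
```

Both groups sit side by side inside the free-form-texts-to-standardize element, and the target fields in each must already exist in the object structure.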