Understanding the Master Index Standardization Engine

Telephone Number Standardization and Sun Master Index

Master index applications rely on the Master Index Standardization Engine to process telephone number data. To ensure correct processing of telephone information, you need to customize the Matching Service for the master index application according to the rules defined for the standardization engine. This includes modifying mefa.xml to define standardization of the appropriate fields. You can modify mefa.xml using the Master Index Configuration Editor.

Standardization is defined in the StandardizationConfig section of mefa.xml, which is described in detail in Match Field Configuration in Understanding Sun Master Index Configuration Options. To configure the fields that must be parsed, modify the standardization structure in mefa.xml.

The following topics provide information about the fields used in processing telephone data and how to configure telephone number standardization for a master index application. The information provided in these topics is based on the default configuration.

Telephone Number Processing Fields

When standardizing telephone data, the Master Index Standardization Engine does not need to process every field in a record. It only needs to process fields that must be parsed, normalized, or phonetically converted. For a master index application, these fields are defined in mefa.xml, and the processing logic for each field is defined in the Standardization Engine node configuration files.

Telephone Number Standardized Fields

The Master Index Standardization Engine can process telephone data that is contained in one long free-form field and can parse that field into its individual components. By default, the standardization engine separates telephone numbers into these field components: country code, area code, phone number, and extension.

Telephone Number Object Structure

To standardize telephone numbers in a master index application, you must manually define the standardization structure and add the fields that will store the standardized field components to the object structure. In the default implementation, you can store any combination of the telephone number field components that the standardization engine produces: country code, area code, phone number, and extension.

The standardization engine can produce all of these field components, but you only need to store the ones you need in the master index database.
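
As a rough illustration of adding these fields to the object structure, the fragment below sketches how the four components might be declared on a Phone child object. It assumes the object definition uses the nodes and fields layout of a default object.xml file. The field names mirror the target fields in the sample standardization structure later in this topic, and the sizes are illustrative guesses rather than documented defaults. Check it against the object definition generated for your own application before using it.

```xml
<!-- Hypothetical sketch: field names follow the sample target fields; sizes are assumed -->
<nodes>
   <tag>Phone</tag>
   <fields>
      <field-name>CountryCode</field-name>
      <field-type>string</field-type>
      <size>6</size>
      <updateable>true</updateable>
      <required>false</required>
      <key-type>false</key-type>
   </fields>
   <fields>
      <field-name>AreaCode</field-name>
      <field-type>string</field-type>
      <size>6</size>
      <updateable>true</updateable>
      <required>false</required>
      <key-type>false</key-type>
   </fields>
   <fields>
      <field-name>Number</field-name>
      <field-type>string</field-type>
      <size>20</size>
      <updateable>true</updateable>
      <required>false</required>
      <key-type>false</key-type>
   </fields>
   <fields>
      <field-name>Extension</field-name>
      <field-type>string</field-type>
      <size>10</size>
      <updateable>true</updateable>
      <required>false</required>
      <key-type>false</key-type>
   </fields>
</nodes>
```

Any component you do not intend to store can simply be left out of the object definition.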

Configuring a Standardization Structure for Telephone Numbers

For free-form telephone number fields, the source fields you define for standardization should contain the standardization components predefined for the PhoneNumber data type. For example, any field containing telephone number information can include the country code, area code, phone number, and extension. The target fields you define can include any of these parsed components. Follow the instructions under Defining Master Index Standardization Rules in Configuring Sun Master Indexes to define fields for standardization. For the standardization-type element, enter PhoneNumber. For a list of field IDs to use in the standardized-object-field-id element, see Telephone Number Standardization Components.

A sample standardization structure for telephone number data is shown below. No variant is defined in this structure because the standardization rules apply to global numbers.


<free-form-texts-to-standardize>
   <group standardization-type="PHONENUMBER"
    domain-selector="com.sun.mdm.index.matching.impl.MultiDomainSelector">
      <unstandardized-source-fields>
         <unstandardized-source-field-name>Person.Phone[*].PhoneNumber</unstandardized-source-field-name>
      </unstandardized-source-fields>
      <standardization-targets>
         <target-mapping>
            <standardized-object-field-id>countryCode</standardized-object-field-id>
            <standardized-target-field-name>Person.Phone[*].CountryCode</standardized-target-field-name>
         </target-mapping>
         <target-mapping>
            <standardized-object-field-id>areaCode</standardized-object-field-id>
            <standardized-target-field-name>Person.Phone[*].AreaCode</standardized-target-field-name>
         </target-mapping>
         <target-mapping>
            <standardized-object-field-id>phoneNumber</standardized-object-field-id>
            <standardized-target-field-name>Person.Phone[*].Number</standardized-target-field-name>
         </target-mapping>
         <target-mapping>
            <standardized-object-field-id>extension</standardized-object-field-id>
            <standardized-target-field-name>Person.Phone[*].Extension</standardized-target-field-name>
         </target-mapping>
      </standardization-targets>
   </group>
</free-form-texts-to-standardize>
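
Because you only need to store the components you need, the target mappings can cover a subset of the parsed fields. The variation below is a sketch derived from the sample above. It assumes that the other elements stay unchanged and that omitting a target mapping simply leaves that component unstored. It keeps only the area code and phone number.

```xml
<free-form-texts-to-standardize>
   <group standardization-type="PHONENUMBER"
    domain-selector="com.sun.mdm.index.matching.impl.MultiDomainSelector">
      <unstandardized-source-fields>
         <unstandardized-source-field-name>Person.Phone[*].PhoneNumber</unstandardized-source-field-name>
      </unstandardized-source-fields>
      <standardization-targets>
         <!-- countryCode and extension mappings are omitted, so those components are not stored -->
         <target-mapping>
            <standardized-object-field-id>areaCode</standardized-object-field-id>
            <standardized-target-field-name>Person.Phone[*].AreaCode</standardized-target-field-name>
         </target-mapping>
         <target-mapping>
            <standardized-object-field-id>phoneNumber</standardized-object-field-id>
            <standardized-target-field-name>Person.Phone[*].Number</standardized-target-field-name>
         </target-mapping>
      </standardization-targets>
   </group>
</free-form-texts-to-standardize>
```

In this variation the object structure would need only the AreaCode and Number fields on the Phone object.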