Configuring Sun Master Indexes

Master Index Normalization and Standardization Structure Properties

The following table lists and describes the Configuration Editor fields and their corresponding XML elements that define the fields to be normalized or standardized in the master index application.

You can specify one or more variants for the data to be standardized. A variant is a subset of a data type: the rule set that applies to one country. For example, if the data type is address, a separate variant is defined for the addresses of each country.

If all of the data uses a single variant, you need to specify that variant only if the data is not from the United States, which is the default.

If you are standardizing data from multiple countries, use the multiple domain selector. This requires that one field in the object structure identify which variant to use for the fields being standardized. For example, the value of the Country field in a system record could tell the standardization engine which variant to use for a particular set of data. If you specify the multiple domain selector in the domain-selector element, you must also define the identifying field and map each value that can be populated into that field to its corresponding variant.
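As a rough illustration of the multiple domain selector, the fragment below sketches how an identifying field and its value-to-variant mappings might be written. The names standardization-type, domain-selector, locale-field-name, and locale-codes, and the selector class name, come from this section. Everything else is an assumption made for illustration and should be checked against your own configuration file: the wrapper elements (group, locale-maps, value, locale), the placement of standardization-type and domain-selector as attributes, the data type name PersonName, the field Person.Country, the field values (UNST, FRAN), and the variant codes (US, FR).

```xml
<!-- Sketch only: wrapper elements and all sample values are assumptions. -->
<group standardization-type="PersonName"
       domain-selector="com.sun.mdm.index.matching.impl.MultipleDomainSelector">
    <!-- ePath of the identifying field (hypothetical field name) -->
    <locale-field-name>Person.Country</locale-field-name>
    <locale-maps>
        <!-- Each locale-codes entry maps one value of the identifying
             field to the variant used for records carrying that value -->
        <locale-codes>
            <value>UNST</value>
            <locale>US</locale>
        </locale-codes>
        <locale-codes>
            <value>FRAN</value>
            <locale>FR</locale>
        </locale-codes>
    </locale-maps>
</group>
```

In this sketch, a record whose Person.Country value is FRAN would be standardized with the French variant, and a record whose value is UNST with the United States variant.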

The following rules apply to the multiple domain selector:

For more information about the fields and elements described in the following table, see Understanding the Master Index Standardization Engine.

Configuration Editor Field

XML File Element or Attribute 

Description 

Data Type

standardization-type 

The type of standardization to perform on the source fields. This is specific to the type of data being processed. 

Domain Selector 

domain-selector 

The Java class used by the standardization engine to determine the variant of the data being processed. The Master Index Standardization Engine supports Australian, French, United Kingdom, and United States variants. If no selector is specified, the default is the United States variant. Possible values for this field are:

  • com.sun.mdm.index.matching.impl.SingleDomainSelectorAU

  • com.sun.mdm.index.matching.impl.SingleDomainSelectorFR

  • com.sun.mdm.index.matching.impl.SingleDomainSelectorUK

  • com.sun.mdm.index.matching.impl.SingleDomainSelectorUS

  • com.sun.mdm.index.matching.impl.MultipleDomainSelector

Variant Field Name 

locale-field-name 

The ePath to the field in the object structure that identifies which of the defined variants (see the locale-codes element) to use. If no field is specified for the Master Index Standardization Engine, the engine defaults to the United States variant, regardless of whether any variants are defined. This field must be contained in the same object as the fields defined for normalization in this structure.

Unnormalized Source

unnormalized-source-field-name

The field that contains the data to be normalized. The field is designated by its ePath (for example, Person.FirstName). 

Unnormalized Standardization Component

standardized-object-field-id

A code that tells the standardization engine which field to normalize. This ID is specific to the standardization engine in use and must correspond to a standardization component defined by that engine.

Normalized Standardization Component

standardized-object-field-id

A code that tells the standardization engine which field contains the normalized data. This ID is specific to the standardization engine in use and must correspond to a standardization component defined by that engine.

Normalized Target

standardized-target-field-name

The field that will store the normalized data. The field is designated by its ePath (for example, Person.Alias[*].StdLastName).
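
To show how the fields in this table relate to one another, the fragment below sketches a single normalization structure for a first name field. The names standardization-type, domain-selector, unnormalized-source-field-name, standardized-object-field-id, and standardized-target-field-name, the selector class name, and the source field Person.FirstName come from the table. The rest is assumed for illustration and may differ in your configuration file: the wrapper elements (group, unnormalized-source-fields, source-mapping, normalization-targets, target-mapping), the use of attributes on the group element, the data type name PersonName, the component ID FirstName, and the target field Person.StdFirstName.

```xml
<!-- Sketch only: wrapper elements and sample values are assumptions. -->
<group standardization-type="PersonName"
       domain-selector="com.sun.mdm.index.matching.impl.SingleDomainSelectorUS">
    <unnormalized-source-fields>
        <source-mapping>
            <!-- Unnormalized Source: ePath of the field to normalize -->
            <unnormalized-source-field-name>Person.FirstName</unnormalized-source-field-name>
            <!-- Unnormalized Standardization Component (hypothetical ID) -->
            <standardized-object-field-id>FirstName</standardized-object-field-id>
        </source-mapping>
    </unnormalized-source-fields>
    <normalization-targets>
        <target-mapping>
            <!-- Normalized Standardization Component (hypothetical ID) -->
            <standardized-object-field-id>FirstName</standardized-object-field-id>
            <!-- Normalized Target: ePath of the field that stores the result -->
            <standardized-target-field-name>Person.StdFirstName</standardized-target-field-name>
        </target-mapping>
    </normalization-targets>
</group>
```

The same standardized-object-field-id element appears twice: once to name the component that receives the source field, and once to name the component whose normalized output is written to the target field.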