Master Index Normalization and Standardization Structure Properties
The following table lists and describes the Configuration Editor fields
and their corresponding XML elements that define the fields to be normalized
or standardized in the master index application.
You can specify one or more variants for the data to be standardized. A
variant is a subset of a data type: the rule set that applies to data from
one country. For example, if the data type is address, a separate variant is
defined for the addresses of each supported country. If all of your data uses
a single variant, you need to specify that variant only when the data is not
from the United States. If you are standardizing data from multiple countries,
use the multiple domain selector. This requires that one field in the object
structure identify which variant to use for each field that will be
standardized. For example, the value of the Country field in a system record
could tell the standardization engine which variant to use for a particular
set of data. If you specify the multiple domain selector in the
domain-selector element, you must also define the identifying field and map
each value that can populate that field to its corresponding variant.
The following rules apply to the multiple domain selector:
- You can specify a value of “Default” for the identifying
field. The corresponding variant is used if the identifying field is blank,
contains the value “Default”, or contains a value not defined
by any of the value elements.
- If a “Default” value is not defined, the system
default variant, United States, is used as the default.
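The fallback rules above can be illustrated with a minimal Java sketch. This is not the Master Index implementation: the VariantResolver class, the variant codes (AU, FR, UK, US), and the sample field values (AUS, FRA) are hypothetical names chosen for illustration, and exact, case-sensitive matching of field values is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper that mimics the documented fallback rules of the
 * multiple domain selector. Not part of the Master Index API.
 */
public class VariantResolver {
    // Key used for the optional "Default" mapping described in the rules.
    static final String DEFAULT_KEY = "Default";
    // Assumed code for the system default variant (United States).
    static final String SYSTEM_DEFAULT_VARIANT = "US";

    private final Map<String, String> valueToVariant;

    public VariantResolver(Map<String, String> valueToVariant) {
        this.valueToVariant = new HashMap<>(valueToVariant);
    }

    /** Returns the variant for the given identifying-field value. */
    public String resolve(String fieldValue) {
        String key = (fieldValue == null) ? "" : fieldValue.trim();
        if (!key.isEmpty()) {
            String variant = valueToVariant.get(key);
            if (variant != null) {
                return variant; // explicitly mapped value (including "Default")
            }
        }
        // Blank or unmapped value: use the "Default" mapping if one exists,
        // otherwise fall back to the system default variant.
        String fallback = valueToVariant.get(DEFAULT_KEY);
        return (fallback != null) ? fallback : SYSTEM_DEFAULT_VARIANT;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("AUS", "AU");
        mappings.put("FRA", "FR");
        mappings.put(DEFAULT_KEY, "UK");
        VariantResolver withDefault = new VariantResolver(mappings);

        System.out.println(withDefault.resolve("FRA")); // FR
        System.out.println(withDefault.resolve(""));    // UK (blank field)
        System.out.println(withDefault.resolve("XYZ")); // UK (unmapped value)

        mappings.remove(DEFAULT_KEY);
        VariantResolver noDefault = new VariantResolver(mappings);
        System.out.println(noDefault.resolve("XYZ"));   // US (system default)
    }
}
```

In this sketch, a "Default" entry is an ordinary mapping that also serves as the fallback, so a blank field, the literal value "Default", and an unmapped value all resolve to the same variant.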
For more information about the fields and elements described in the
following table, see Understanding the Master Index Standardization Engine.
| Configuration Editor Field | XML File Element or Attribute | Description |
|---|---|---|
| Data Type | standardization-type | The type of standardization to perform on the source fields. This is specific to the type of data being processed. |
| Domain Selector | domain-selector | The Java class that the standardization engine uses to determine the variant of the data being processed. The Master Index Standardization Engine supports Australian, French, United Kingdom, and United States variants. If no selector is specified, the default is US. Possible values for this field are: com.sun.mdm.index.matching.impl.SingleDomainSelectorAU, com.sun.mdm.index.matching.impl.SingleDomainSelectorFR, com.sun.mdm.index.matching.impl.SingleDomainSelectorUK, com.sun.mdm.index.matching.impl.SingleDomainSelectorUS, com.sun.mdm.index.matching.impl.MultipleDomainSelector |
| Variant Field Name | locale-field-name | The ePath to the field in the object structure that identifies which of the defined variants (element locale-codes) to use. If no field is specified for the Master Index Standardization Engine, the standardization engine defaults to the United States, regardless of whether any variants are defined. This field must be contained in the object that contains the fields defined for normalization in this structure. |
| Unnormalized Source | unnormalized-source-field-name | The field that contains the data to be normalized. The field is designated by its ePath (for example, Person.FirstName). |
| Unnormalized Standardization Component | standardized-object-field-id | An ID that identifies, to the standardization engine, the field to normalize. This ID is specific to the standardization engine and must correspond to a standardization component defined by that engine. |
| Normalized Standardization Component | standardized-object-field-id | An ID that identifies, to the standardization engine, the field that contains the normalized data. This ID is specific to the standardization engine in use and must correspond to a standardization component defined by that engine. |
| Normalized Target | standardized-target-field-name | The field that will store the normalized data. The field is designated by its ePath (for example, Person.Alias[*].StdLastName). |