Configuring Sun Master Indexes

About the Master Index Configuration Files

Several XML configuration files, created by the wizard, define the primary characteristics of the master index application, such as how data is processed, queried, and matched, and how it appears on the Master Index Data Manager (MIDM). These files configure the runtime components of the application.

object.xml

In the wizard, you define the objects and fields contained in the object structure, along with properties for those fields. The information you specify is written to object.xml in the master index project. This file defines the objects stored in the master index application and their relationships to one another. It also defines the fields contained in each object and certain properties of each field, such as its length, its data type, whether it is required, and whether it is a unique key. The file contains exactly one parent object; all other objects must be child objects of that parent. The object structure you define in object.xml determines the structure of the database tables that store object data, the structure of the Java API, and the structure of the Object Type Definition (OTD) generated for the project.
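As a rough illustration of the kind of structure this file describes, the following Python sketch models one parent object with child objects and per-field properties. The class names, the Person, Address, and Phone objects, and the field names are all hypothetical; the sketch does not reflect the actual object.xml schema, and the one-table-per-object mapping is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldDef:
    """One field and the properties described for it (hypothetical names)."""
    name: str
    data_type: str
    length: int
    required: bool = False
    key: bool = False


@dataclass
class ObjectDef:
    """An object in the structure, with its fields and any child objects."""
    name: str
    fields: List[FieldDef]
    children: List["ObjectDef"] = field(default_factory=list)


def table_names(parent: ObjectDef) -> List[str]:
    """Return the parent name followed by each child name.

    Assumes one table per object, which is a simplification.
    """
    return [parent.name] + [child.name for child in parent.children]


# A single parent object; every other object is a child of it.
person = ObjectDef(
    name="Person",
    fields=[
        FieldDef("LastName", "string", 40, required=True),
        FieldDef("SSN", "string", 9, key=True),
    ],
    children=[
        ObjectDef("Address", [FieldDef("City", "string", 30)]),
        ObjectDef("Phone", [FieldDef("Number", "string", 20)]),
    ],
)

print(table_names(person))
```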

query.xml

In query.xml, you configure the Query Builder component of the master index application and define the available queries. In this file, you define the types of queries that can be performed from the Master Index Data Manager (MIDM) and the queries that are used during the match process. You can define both phonetic and alphanumeric searches for the MIDM. By default, these are called basic queries. You can also define blocking queries, which define blocks of criteria fields for the match process. The master index application queries the database using the criteria defined in each block, one at a time. After completing a query on the criteria defined in one block, it performs another pass using the next block of defined criteria. Blocking queries can also be used in place of the basic phonetic query in the MIDM.
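The pass-per-block behavior described above can be sketched in Python as follows. This is a minimal in-memory illustration, not the Query Builder implementation: the record layout, the field names, and the rule that a block is skipped when the incoming record lacks one of its criteria are all assumptions.

```python
from typing import Dict, List

Record = Dict[str, str]


def run_blocking_query(candidate: Record, database: List[Record],
                       blocks: List[List[str]]) -> List[Record]:
    """Run one pass per block and collect each record matched by any block."""
    found: List[Record] = []
    for block in blocks:
        # Assumption: skip a block when the incoming record has no value
        # for one of its criteria fields.
        if any(not candidate.get(name) for name in block):
            continue
        for rec in database:
            same = all(rec.get(name) == candidate[name] for name in block)
            if same and rec not in found:
                found.append(rec)
    return found


database = [
    {"id": "1", "first_phonetic": "J500", "last_phonetic": "S530", "ssn": "222"},
    {"id": "2", "first_phonetic": "M600", "last_phonetic": "J520", "ssn": "111"},
    {"id": "3", "first_phonetic": "A000", "last_phonetic": "B000", "ssn": "333"},
]
incoming = {"first_phonetic": "J500", "last_phonetic": "S530", "ssn": "111"}

# Two blocks: the first uses the phonetic name fields, the second uses SSN.
blocks = [["first_phonetic", "last_phonetic"], ["ssn"]]

candidates = run_blocking_query(incoming, database, blocks)
print([rec["id"] for rec in candidates])
```

Record 1 is found by the first block and record 2 by the second, so both reach the match process even though neither matches on every field.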

mefa.xml

In mefa.xml, you configure the Matching Service by specifying the fields to be standardized and the fields to be used for matching, and by defining how those fields are standardized and matched. This file specifies the match and standardization engines to use and the query process for matching. Standardization includes defining fields to be reformatted (or parsed), normalized, or phonetically encoded. For matching, you must also define the data string to be passed to the match engine. The rules you define for standardization and matching depend on the standardization and match engines in use. For the rules that apply to the Master Index Match Engine and the Master Index Standardization Engine, see Understanding the Master Index Match Engine and Understanding the Master Index Standardization Engine.
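To make the terms normalization, phonetic encoding, and match string more concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the nickname table, the field names, and the pipe-delimited match string are hypothetical, and phonetic_key is a deliberately crude stand-in rather than the encoder used by the Master Index Standardization Engine.

```python
from typing import Dict, List

# Hypothetical nickname table used for normalization.
NICKNAMES = {"BILL": "WILLIAM", "BOB": "ROBERT", "LIZ": "ELIZABETH"}


def normalize(first_name: str) -> str:
    """Trim, uppercase, and replace a known nickname with its common form."""
    cleaned = first_name.strip().upper()
    return NICKNAMES.get(cleaned, cleaned)


def phonetic_key(name: str) -> str:
    """Crude phonetic key: keep the first letter, then drop vowels,
    H, W, and Y, and collapse consecutive repeated letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]
    for c in letters[1:]:
        if c in "AEIOUYHW":
            continue
        if c != key[-1]:
            key += c
    return key


def match_string(record: Dict[str, str], match_fields: List[str]) -> str:
    """Join the chosen fields into one string (the delimiter is an assumption)."""
    return "|".join(record.get(name, "") for name in match_fields)


raw = {"first_name": " bill ", "last_name": "Smyth"}
standardized = {
    "first_name_std": normalize(raw["first_name"]),
    "last_name_phon": phonetic_key(raw["last_name"]),
}
print(match_string(standardized, ["first_name_std", "last_name_phon"]))
```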

You can also configure portions of the match process in master.xml, described below, which defines certain match parameters that control weight thresholds, how assumed matches are processed, how potential duplicates are processed, and the query to use for matching.

master.xml

In master.xml, you configure the Manager Service and define properties of the match process. You specify the match and duplicate thresholds in this file, and define certain system parameters, such as the update mode, how to process records above the match threshold, how to manage same system matches, and whether merged records can be updated. This file also specifies which of the queries defined in the Query Builder to use for matching queries.
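One way to picture how the two thresholds divide matching weights is the Python sketch below. The function name, the sample threshold values, and the choice to treat a weight equal to a threshold as meeting it are assumptions made for illustration; they are not taken from master.xml.

```python
def classify(weight: float, duplicate_threshold: float,
             match_threshold: float) -> str:
    """Place a matching weight relative to the two thresholds.

    Assumption: a weight equal to a threshold counts as meeting it.
    """
    if duplicate_threshold > match_threshold:
        raise ValueError("duplicate threshold must not exceed match threshold")
    if weight >= match_threshold:
        return "assumed match"
    if weight >= duplicate_threshold:
        return "potential duplicate"
    return "no match"


# Hypothetical threshold values.
DUPLICATE_THRESHOLD = 7.25
MATCH_THRESHOLD = 29.0

for weight in (31.5, 12.0, 3.0):
    print(weight, classify(weight, DUPLICATE_THRESHOLD, MATCH_THRESHOLD))
```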

This file also configures the EUIDs assigned by the master index application. You can specify an EUID length, whether a checksum value is used for additional verification, and a “chunk size”. Specifying a chunk size allows the EUID generator to obtain a block of EUIDs from the sbyn_seq_table database table so it does not need to query the table each time it generates a new EUID.
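The effect of a chunk size can be sketched in Python as follows. SequenceTable is a hypothetical in-memory stand-in for the sbyn_seq_table row that counts how often it is queried; the class names, the zero-padded numeric format, and the starting value are assumptions, and the optional checksum is omitted.

```python
class SequenceTable:
    """In-memory stand-in for the sequence table; counts each query."""

    def __init__(self, start: int = 1) -> None:
        self.next_value = start
        self.queries = 0

    def reserve(self, chunk_size: int) -> int:
        """Reserve chunk_size consecutive values and return the first one."""
        self.queries += 1
        first = self.next_value
        self.next_value += chunk_size
        return first


class EuidGenerator:
    """Hands out EUIDs from a reserved block, refilling only when it is empty."""

    def __init__(self, table: SequenceTable, chunk_size: int, length: int) -> None:
        self.table = table
        self.chunk_size = chunk_size
        self.length = length
        self._next = 0
        self._limit = 0  # exclusive end of the currently reserved block

    def next_euid(self) -> str:
        if self._next >= self._limit:
            # Block exhausted: query the table once for a new block.
            self._next = self.table.reserve(self.chunk_size)
            self._limit = self._next + self.chunk_size
        value = self._next
        self._next += 1
        return str(value).zfill(self.length)


table = SequenceTable(start=1000)
generator = EuidGenerator(table, chunk_size=5, length=10)
euids = [generator.next_euid() for _ in range(12)]
print(euids[0], euids[-1], table.queries)
```

In this sketch, generating twelve EUIDs with a chunk size of 5 queries the table three times instead of twelve. The trade-off is that values reserved but not yet used are skipped if the generator restarts.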

update.xml

In update.xml, you define formulas that determine which data in an enterprise record is considered the most reliable and how updates to the single best record (SBR) are handled. The survivor calculator uses these formulas to decide what data from each system record to include in each object’s SBR. The SBR is the portion of the enterprise record that holds the data considered most accurate and current for an entity.

The SBR is defined by a mapping of fields from external system records. Since there might be many external systems, you can optionally specify a strategy to select the value for an SBR field from the list of external values. You can also specify any additional fields that might be required by the selection strategy to determine which external system contains the best data, such as the object’s update date and time.
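A per-field selection strategy might look like the following Python sketch. The strategy names (most_recent, prefer_system), the record layout, and the sample systems are hypothetical and are not the strategies shipped with the product; the sketch only shows how an additional field, here the update timestamp, can drive the choice.

```python
from datetime import datetime
from typing import Callable, Dict, List, Optional

SystemRecord = Dict[str, object]
Strategy = Callable[[List[SystemRecord], str], Optional[str]]


def most_recent(records: List[SystemRecord], field_name: str) -> Optional[str]:
    """Pick the non-empty value from the most recently updated system record."""
    usable = [r for r in records if r["fields"].get(field_name)]
    if not usable:
        return None
    return max(usable, key=lambda r: r["updated"])["fields"][field_name]


def prefer_system(system: str) -> Strategy:
    """Build a strategy that trusts one system, falling back to most_recent."""
    def strategy(records: List[SystemRecord], field_name: str) -> Optional[str]:
        for r in records:
            if r["system"] == system and r["fields"].get(field_name):
                return r["fields"][field_name]
        return most_recent(records, field_name)
    return strategy


def build_sbr(records: List[SystemRecord],
              strategies: Dict[str, Strategy]) -> Dict[str, Optional[str]]:
    """Apply each field's strategy to produce the single best record."""
    return {name: pick(records, name) for name, pick in strategies.items()}


records = [
    {"system": "A", "updated": datetime(2023, 1, 1),
     "fields": {"last_name": "Smith", "phone": "555-0100", "city": "Boston"}},
    {"system": "B", "updated": datetime(2024, 6, 1),
     "fields": {"last_name": "Smyth", "phone": "", "city": "Denver"}},
]
strategies = {
    "last_name": prefer_system("A"),
    "phone": most_recent,
    "city": most_recent,
}
sbr = build_sbr(records, strategies)
print(sbr)
```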

You can create Java classes that define special processing to perform against a record when the record is created, updated, merged, or unmerged. These classes must be created in the Source Packages folder of the EJB project and can be specified for each transaction type in update.xml.

filter.xml

You can further configure the survivor calculator, blocking query, and match process by defining exclusion lists in filter.xml. Exclusion lists define values that should not be populated into the SBR, that should not be considered in the composite matching weight, and that should be ignored in the blocking query. The values to filter out are primarily default values that are entered when the actual value for a field is unknown. Default values can cause the blocking query to return records that are not a close match and can skew matching results.
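The idea of separate exclusion lists for the SBR, matching, and blocking can be sketched in Python as follows. The dictionary layout, the purpose names, and the placeholder SSN values are hypothetical and do not reflect the filter.xml format.

```python
from typing import Dict, Set

# Hypothetical exclusion lists: per field, per purpose.
EXCLUSIONS: Dict[str, Dict[str, Set[str]]] = {
    "ssn": {
        "sbr": {"999999999"},
        "matching": {"999999999", "000000000"},
        "blocking": {"999999999", "000000000"},
    },
}


def filtered(record: Dict[str, str], purpose: str,
             exclusions: Dict[str, Dict[str, Set[str]]]) -> Dict[str, str]:
    """Return a copy of the record without values excluded for this purpose."""
    result = {}
    for name, value in record.items():
        excluded = exclusions.get(name, {}).get(purpose, set())
        if value not in excluded:
            result[name] = value
    return result


record = {"ssn": "000000000", "last_name": "DOE"}

# The placeholder SSN is ignored for blocking but still allowed in the SBR.
print(filtered(record, "blocking", EXCLUSIONS))
print(filtered(record, "sbr", EXCLUSIONS))
```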

validation.xml

By default, validation.xml defines certain validations for the local identifiers assigned by each external system. You can create custom Java classes that define rules for validating field values before they are saved to the master index database. You can then specify the Java classes in validation.xml to make them part of the master index application.
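As an illustration of validating a local identifier before a record is saved, consider the Python sketch below. The product expects custom Java classes for this purpose; the sketch shows only the shape of such a rule. The system codes, the identifier patterns, and the ValidationError class are hypothetical.

```python
import re


class ValidationError(Exception):
    """Raised when a field value fails validation."""


# Hypothetical rules: one identifier pattern per external system code.
LOCAL_ID_RULES = {
    "HOSP_A": re.compile(r"\d{8}"),
    "CLINIC_B": re.compile(r"[A-Z]{2}\d{6}"),
}


def validate_local_id(system: str, local_id: str) -> None:
    """Raise ValidationError unless local_id fits the system's pattern."""
    pattern = LOCAL_ID_RULES.get(system)
    if pattern is None:
        raise ValidationError(f"unknown system: {system}")
    if not pattern.fullmatch(local_id):
        raise ValidationError(f"invalid local ID for {system}: {local_id!r}")


def is_valid(system: str, local_id: str) -> bool:
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_local_id(system, local_id)
    except ValidationError:
        return False
    return True


print(is_valid("HOSP_A", "12345678"))
print(is_valid("CLINIC_B", "12345678"))
```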

security.xml

This file defines security roles and permissions for the client applications that access the master index database.

midm.xml

In midm.xml, you configure the appearance and processing properties of the Master Index Data Manager (MIDM). In this file, you define each object and field that appears on the MIDM, along with the properties of each field, such as the field type and length, field labels, format masks, and so on. You can also define the order in which objects and fields appear on the MIDM pages.

This file defines several additional properties of the MIDM, including the types of searches available, whether wildcard characters can be used, the criteria for the searches, and the results fields that appear. You can also specify whether an audit log is maintained that records each instance in which data is accessed through the MIDM. For healthcare-based master index applications, the audit log supports the privacy rules mandated by HIPAA.

Finally, midm.xml defines certain implementation information, such as the application server in use, debugging rules, and security activation.
