Understanding Sun Master Index Configuration Options

About the Configuration Files for Sun Master Index

Several XML configuration files define primary characteristics of the master index application, such as how data is processed, queried, and matched. These files configure runtime components of the master index application.

The configuration files include the following:

Master Index object.xml File

In the wizard, you define the objects and fields contained in the object structure, along with properties for those fields. The information you specify is written to object.xml in the master index project. This file defines the objects stored in the master index application and their relationships to one another. It also defines the fields contained in each object, as well as certain properties of each field, such as length, data type, whether it is required, whether it is a unique key, and so on. This file contains one parent object; all other objects must be child objects to that parent object. The object structure you define in object.xml determines the structure of the database tables that store object data, the structure of the Java API, and the structure of the OTD generated for the project.

Master Index query.xml File

The Query Builder component of the master index application is configured in query.xml, which defines the available queries. In this file, you define the types of queries that can be performed from the Master Index Data Manager (MIDM) and the queries that are used during the match process. You can define both phonetic and alphanumeric searches for the MIDM; by default, these are called basic queries. You can also define blocking queries, which define blocks of criteria fields for the match process. The master index application queries the database using the criteria defined in each block, one block at a time: after completing a query on the criteria in one block, it performs another pass using the next block. Blocking queries can also be used in place of the basic phonetic query in the MIDM.
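The block-by-block passes described above can be sketched in Java. This is a minimal illustration under stated assumptions, not the Query Builder implementation: the class, method, and field names (BlockingPassSketch, candidates, lastNamePhonetic, and so on) are hypothetical, records are held in memory instead of being queried from the database, and a block is assumed to match only when every criteria field is present and equal.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: these names are hypothetical, not the Master Index API.
class BlockingPassSketch {

    // Returns the ids of stored records that agree with the incoming record
    // on every field of at least one block, evaluating one block per pass.
    static Set<String> candidates(Map<String, String> incoming,
                                  Map<String, Map<String, String>> stored,
                                  List<List<String>> blocks) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> block : blocks) {
            for (Map.Entry<String, Map<String, String>> entry : stored.entrySet()) {
                if (agreesOn(block, incoming, entry.getValue())) {
                    result.add(entry.getKey());
                }
            }
        }
        return result;
    }

    // Assumption: a block matches only if every criteria field is present and equal.
    private static boolean agreesOn(List<String> block,
                                    Map<String, String> a,
                                    Map<String, String> b) {
        for (String field : block) {
            String value = a.get(field);
            if (value == null || !value.equals(b.get(field))) {
                return false;
            }
        }
        return true;
    }

    // Convenience builder: record("field", "value", "field", "value", ...).
    static Map<String, String> record(String... pairs) {
        Map<String, String> fields = new HashMap<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            fields.put(pairs[i], pairs[i + 1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> stored = new LinkedHashMap<>();
        stored.put("A", record("lastNamePhonetic", "S530", "dob", "1970-01-01", "ssn", "111"));
        stored.put("B", record("lastNamePhonetic", "J520", "dob", "1980-05-05", "ssn", "222"));
        Map<String, String> incoming =
                record("lastNamePhonetic", "S530", "dob", "1970-01-01", "ssn", "222");
        List<List<String>> blocks = Arrays.asList(
                Arrays.asList("lastNamePhonetic", "dob"),  // first pass
                Arrays.asList("ssn"));                     // second pass
        System.out.println(candidates(incoming, stored, blocks));
    }
}
```

In this sketch, record A is picked up by the first block and record B by the second, so both become candidates for the match process even though neither agrees with the incoming record on every field.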

Master Index mefa.xml File

In mefa.xml, you configure the Matching Service by specifying the fields to be standardized and the fields to be used for matching, and by defining how those fields are standardized and matched. This file also specifies the match and standardization engines to use and the query process for matching. Standardization includes defining fields to be reformatted (or parsed), normalized, or converted to their phonetic version. For matching, you must also define the data string to be passed to the match engine. The rules you define for standardization and matching depend on the match and standardization engines in use. For the rules that apply to the Master Index Match Engine and the Master Index Standardization Engine, see Understanding the Master Index Match Engine and Understanding the Master Index Standardization Engine.

master.xml, described below, also configures the match process by defining match parameters that set the weight thresholds and control how assumed matches and potential duplicates are processed. It also specifies the query to use for matching.

Master Index master.xml File

master.xml configures the Manager Service and defines properties of the match process. You specify the match and duplicate thresholds in this file, and define certain system parameters, such as the update mode, how to process records above the match threshold, how to manage same system matches, and whether merged records can be updated. This file also specifies which of the queries defined in the Query Builder to use for matching queries.
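To illustrate how the two thresholds partition matching weights, here is a minimal sketch. It rests on assumptions: the names ThresholdSketch, Outcome, and classify are hypothetical, the threshold values are invented, and treating a weight exactly equal to a threshold as meeting it is a guess. In the real application, the outcome also depends on the other master.xml parameters mentioned above, such as how records above the match threshold and same system matches are handled.

```java
// Illustrative only: names, values, and boundary handling are assumptions.
class ThresholdSketch {

    enum Outcome { ASSUMED_MATCH, POTENTIAL_DUPLICATE, NO_MATCH }

    // Classifies a matching weight against the duplicate and match thresholds.
    static Outcome classify(double weight, double duplicateThreshold, double matchThreshold) {
        if (duplicateThreshold > matchThreshold) {
            throw new IllegalArgumentException(
                    "duplicate threshold must not exceed match threshold");
        }
        if (weight >= matchThreshold) {
            return Outcome.ASSUMED_MATCH;
        }
        if (weight >= duplicateThreshold) {
            return Outcome.POTENTIAL_DUPLICATE;
        }
        return Outcome.NO_MATCH;
    }

    public static void main(String[] args) {
        double duplicateThreshold = 7.25;  // hypothetical value
        double matchThreshold = 29.0;      // hypothetical value
        for (double weight : new double[] {31.5, 12.0, 3.0}) {
            System.out.println(weight + " -> "
                    + classify(weight, duplicateThreshold, matchThreshold));
        }
    }
}
```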

master.xml also configures the EUIDs assigned by the master index application. You can specify an EUID length, whether a checksum value is used for additional verification, and a “chunk size”. Specifying a chunk size allows the EUID generator to obtain a block of EUIDs from the sbyn_seq_table database table so it does not need to query the table each time it generates a new EUID.
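The effect of the chunk size can be sketched as follows. This is an illustration of block allocation in general, not the EUID generator itself: the SequenceStore interface and InMemoryStore class are hypothetical stand-ins for the sequence table, zero-padding to the configured length is an assumption, and the optional checksum is left out because its algorithm is not described here.

```java
import java.util.Locale;

// Illustrative only: a generic block allocator, not the Master Index EUID generator.
class EuidChunkSketch {

    // Hypothetical stand-in for the sequence table: reserves `count`
    // consecutive values and returns the first one.
    interface SequenceStore {
        long reserve(int count);
    }

    // In-memory store that counts how often it is queried.
    static final class InMemoryStore implements SequenceStore {
        private long next = 1;
        int calls = 0;

        @Override
        public synchronized long reserve(int count) {
            calls++;
            long first = next;
            next += count;
            return first;
        }
    }

    private final SequenceStore store;
    private final int chunkSize;
    private final int length;
    private long nextValue;
    private long remaining;

    EuidChunkSketch(SequenceStore store, int chunkSize, int length) {
        if (chunkSize < 1 || length < 1) {
            throw new IllegalArgumentException("chunkSize and length must be positive");
        }
        this.store = store;
        this.chunkSize = chunkSize;
        this.length = length;
    }

    // Hands out ids from the current chunk; the store is queried only
    // when the chunk is exhausted.
    synchronized String nextEuid() {
        if (remaining == 0) {
            nextValue = store.reserve(chunkSize);
            remaining = chunkSize;
        }
        remaining--;
        // Assumption: ids are zero-padded to the configured length.
        return String.format(Locale.ROOT, "%0" + length + "d", nextValue++);
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        EuidChunkSketch generator = new EuidChunkSketch(store, 3, 10);
        for (int i = 0; i < 4; i++) {
            System.out.println(generator.nextEuid());
        }
        System.out.println("store queries: " + store.calls);
    }
}
```

With a chunk size of 3, generating four identifiers queries the store twice instead of four times. A larger chunk size reduces queries further, at the cost of leaving unused identifiers in a chunk if the generator stops before exhausting it.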

Master Index update.xml File

In update.xml, you can define formulas that determine which data in an enterprise record should be considered the most reliable and how updates to the single best record (SBR) will be handled. The survivor calculator uses these formulas to decide what data from each system record to include in each object’s SBR. The SBR is the portion of the enterprise record that represents the data that is considered to be the most accurate and current for an object.

The SBR is defined by a mapping of fields from external system records. Since there might be many external systems, you can optionally specify a strategy to select the value for an SBR field from the list of external values. You can also specify any additional fields that might be required by the selection strategy to determine which external system contains the best data, such as the object’s update date and time.
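As an illustration of a selection strategy that relies on an additional field, the sketch below picks the value from the most recently updated system record. The names (SurvivorSketch, SystemValue, mostRecentlyUpdated) are hypothetical, and this is only one plausible strategy under the assumption that empty values are skipped; it is not the survivor calculator's actual code.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hypothetical "most recently updated" selection strategy.
class SurvivorSketch {

    // One candidate value for an SBR field, with the update timestamp of the
    // system record it came from.
    static final class SystemValue {
        final String systemCode;
        final String value;
        final long updatedAtMillis;

        SystemValue(String systemCode, String value, long updatedAtMillis) {
            this.systemCode = systemCode;
            this.value = value;
            this.updatedAtMillis = updatedAtMillis;
        }
    }

    // Returns the non-empty value from the most recently updated system
    // record, or null when no system supplies a value.
    static String mostRecentlyUpdated(List<SystemValue> candidates) {
        SystemValue best = null;
        for (SystemValue candidate : candidates) {
            if (candidate.value == null || candidate.value.isEmpty()) {
                continue;  // assumption: empty values never survive
            }
            if (best == null || candidate.updatedAtMillis > best.updatedAtMillis) {
                best = candidate;
            }
        }
        return best == null ? null : best.value;
    }

    public static void main(String[] args) {
        List<SystemValue> addresses = Arrays.asList(
                new SystemValue("SystemA", "10 Elm St", 1000L),
                new SystemValue("SystemB", "12 Oak St", 3000L),
                new SystemValue("SystemC", "", 5000L));
        System.out.println(mostRecentlyUpdated(addresses));
    }
}
```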

This file also lets you specify custom update procedures, which you define in custom Java code that you plug in to the application. You can create Java classes that define special processing to perform against a record when the record is created, updated, merged, or unmerged. These classes must be created in the Source Packages folder of the EJB project and can be specified for each transaction type in update.xml.

Master Index filter.xml File

You can further configure the survivor calculator, blocking query, and match process by defining exclusion lists in filter.xml. Exclusion lists allow you to define values that should not be populated into the SBR, that should not be considered in the composite matching weight, and that should be ignored in the blocking query. Values you would want to filter out primarily include default values that are used when the actual value for a field is unknown. Default values can cause the blocking query to return records that are not a close match and can skew matching results.
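The idea behind an exclusion list can be sketched as a per-field set of values to ignore. The class and method names below are hypothetical, and the placeholder values are invented; the sketch uses a single list for simplicity, whereas the text above describes separate effects on the SBR, the matching weight, and the blocking query.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative only: a hypothetical per-field exclusion list.
class ExclusionListSketch {

    // True when `value` is listed as an excluded value for `field`.
    static boolean isExcluded(Map<String, Set<String>> exclusions, String field, String value) {
        Set<String> excluded = exclusions.get(field);
        return excluded != null && excluded.contains(value);
    }

    // Returns a copy of the record without fields that hold excluded values,
    // so placeholders do not take part in later processing.
    static Map<String, String> withoutExcludedValues(Map<String, String> record,
                                                     Map<String, Set<String>> exclusions) {
        Map<String, String> filtered = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            if (!isExcluded(exclusions, field.getKey(), field.getValue())) {
                filtered.put(field.getKey(), field.getValue());
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> exclusions = new HashMap<>();
        // Hypothetical default values entered when the real SSN is unknown.
        exclusions.put("ssn", new HashSet<>(Arrays.asList("999-99-9999", "000-00-0000")));

        Map<String, String> record = new LinkedHashMap<>();
        record.put("lastName", "Smith");
        record.put("ssn", "999-99-9999");

        System.out.println(withoutExcludedValues(record, exclusions));
    }
}
```

Here the placeholder SSN is dropped before the record is used, which is the situation the exclusion list is meant to handle: two unrelated records that share a default value no longer look alike on that field.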

Master Index validation.xml File

By default, validation.xml defines certain validations for the local identifiers assigned by each external system. You can create custom Java classes that define rules for validating field values before they are saved to the master index database. You can then specify the Java classes in validation.xml to make them part of the Sun Master Index application.
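A custom validation rule might look roughly like the sketch below. The ValueValidator interface is hypothetical and stands in for whatever interface the application actually requires of a validation class; the specific rules (required, maximum length, letters and digits only) are assumptions chosen for illustration, not the default local identifier validations.

```java
// Illustrative only: ValueValidator is a hypothetical interface, not the
// interface that Sun Master Index requires of custom validation classes.
class LocalIdValidatorSketch {

    interface ValueValidator {
        // Throws IllegalArgumentException when the value is not acceptable.
        void validate(String value);
    }

    // Assumed rules: the local ID is required, has a maximum length, and
    // contains only letters and digits.
    static final class LocalIdValidator implements ValueValidator {
        private final int maxLength;

        LocalIdValidator(int maxLength) {
            this.maxLength = maxLength;
        }

        @Override
        public void validate(String value) {
            if (value == null || value.isEmpty()) {
                throw new IllegalArgumentException("local ID is required");
            }
            if (value.length() > maxLength) {
                throw new IllegalArgumentException(
                        "local ID is longer than " + maxLength + " characters");
            }
            for (int i = 0; i < value.length(); i++) {
                if (!Character.isLetterOrDigit(value.charAt(i))) {
                    throw new IllegalArgumentException(
                            "local ID contains an invalid character at position " + i);
                }
            }
        }
    }

    public static void main(String[] args) {
        ValueValidator validator = new LocalIdValidator(10);
        validator.validate("AB12345");  // passes silently
        try {
            validator.validate("AB-12345");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```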

Master Index security.xml File

This file defines security roles and permissions for the client applications that access the master index database.

Master Index midm.xml File

Configuration of the appearance and certain processing properties of the MIDM is contained in midm.xml. In this file, you define each object and field that appears on the MIDM, along with the properties of each field, such as the field type and length, field labels, format masks, and so on. You can also define the order in which objects and fields appear on the MIDM pages.

This file defines several additional properties of the MIDM, including the types of searches available, whether wildcard characters can be used, the criteria for the searches, and the results fields that appear. You can also specify whether an audit log is maintained of each instance in which data is accessed through the MIDM. For healthcare-based master index applications, this supports the privacy rules mandated by HIPAA. This file also includes the configuration of the reports generated from the MIDM.

Finally, midm.xml defines certain implementation information, such as the application server in use, debugging rules, and security activation.

The wizard creates all of the configuration files described above. Together, they define how the master index application processes, queries, and matches data, and how that data appears on the Master Index Data Manager (MIDM).

Match and Standardization Engine Configuration Files

Several match and standardization engine configuration files are included in the project tree. You can customize matching logic and standardization information for the match and standardization engines by modifying these files. The match configuration file, which defines and configures the comparator functions, can be modified using the Master Index Configuration Editor or the NetBeans text editor. The standardization files, which provide information to the standardization engine about how data should be parsed and normalized, can be modified using the text editor.

For information about the structure of these files and how they can be modified, see Understanding the Master Index Match Engine and Understanding the Master Index Standardization Engine.