Configuring Sun Master Indexes (Repository)

Master Index Configuration Overview (Repository)

Sun Master Index provides a very flexible framework for creating a master index application that is customized for your requirements. A Sun Master Index project includes several files in XML format that define the configuration of the runtime environment. You can configure a master index application by modifying the XML files directly or by using the Configuration Editor. Make sure to verify the configuration of the application before deploying the project.


Note –

It is helpful to review the information provided in Understanding Sun Master Index Configuration Options (Repository) to learn about the relationships between the files and what components and processes can be configured. Certain components can only be configured by modifying the XML files directly.


The following topics provide an overview of the configuration files and editors.

About the Master Index Configuration Files (Repository)

Several XML configuration files define primary characteristics of the master index application, such as how data is processed, queried, and matched. These files configure runtime components of the master index application.

Object Definition

In the wizard, you define the objects and fields contained in the object structure, along with properties for those fields. The information you specify is written to the Object Definition file in the master index project. This file defines the objects stored in the master index application and their relationships to one another. It also defines the fields contained in each object, as well as certain properties of each field, such as length, data type, whether it is required, whether it is a unique key, and so on. This file contains one parent object; all other objects must be child objects to that parent object. The object structure you define in the Object Definition file determines the structure of the database tables that store object data, the structure of the Java API, and the structure of the OTD generated for the project.

Candidate Select

In the Candidate Select file, you configure the Query Builder component of the master index application and define the available queries. In this file, you define the types of queries that can be performed from the Enterprise Data Manager (EDM) and the queries that are used during the match process. You can define both phonetic and alphanumeric searches for the EDM. By default, these are called basic queries. You can also define blocking queries, which define blocks of criteria fields for the match process. The master index application queries the database using the criteria defined in each block, one at a time. After completing a query on the criteria defined in one block, it performs another pass using the next block of defined criteria. Blocking queries can also be used in place of the basic phonetic query in the EDM.
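
The pass-by-pass behavior of blocking queries can be pictured with a small sketch. It is not the Query Builder implementation: the records, the field names, and the rule that a block is skipped when the incoming record lacks one of its criteria values are all assumptions made for illustration.

```python
# Hypothetical in-memory stand-in for the master index database.
RECORDS = [
    {"id": 1, "first_phonetic": "J500", "last_phonetic": "S530", "dob": "1970-01-01", "ssn": "111"},
    {"id": 2, "first_phonetic": "J500", "last_phonetic": "S530", "dob": "1982-05-09", "ssn": "222"},
    {"id": 3, "first_phonetic": "M600", "last_phonetic": "J520", "dob": "1970-01-01", "ssn": "111"},
]

# Each block lists the criteria fields that must all match in one query pass.
BLOCKS = [
    ["first_phonetic", "last_phonetic"],
    ["ssn", "dob"],
]


def candidate_ids(incoming, records, blocks):
    """Run one pass per block and return candidate ids in first-seen order."""
    found = []
    for block in blocks:
        # Assumption: a block is skipped if the incoming record has no value
        # for one of its criteria fields.
        if any(not incoming.get(field) for field in block):
            continue
        for record in records:
            matches = all(record.get(field) == incoming[field] for field in block)
            if matches and record["id"] not in found:
                found.append(record["id"])
    return found


incoming = {"first_phonetic": "J500", "last_phonetic": "S530", "dob": "1970-01-01", "ssn": "111"}
candidates = candidate_ids(incoming, RECORDS, BLOCKS)
print(candidates)
```

In this sketch the first pass finds records 1 and 2 on the phonetic name fields, and the second pass adds record 3 on the identifier and birth date, so a record missed by one block can still become a match candidate through another.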

Match Field

In the Match Field file, you configure the Matching Service by specifying the fields to be standardized and the fields to be used for matching, as well as defining how the fields are standardized and matched. This file specifies the match and standardization engines to use and the query process for matching. Standardization includes defining fields to be reformatted (or parsed), normalized, or phonetically encoded. For matching, you must also define the data string to be passed to the match engine. The rules you define for standardization and matching are dependent on the standardization and match engine in use. Understanding the Sun Match Engine describes the rules for the Sun Match Engine.

You can also configure portions of the match process in the Threshold file, described below, which defines certain match parameters that control weight thresholds, how assumed matches are processed, how potential duplicates are processed, and the query to use for matching.
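
To make the normalization and phonetic encoding steps described for the Match Field file more concrete, the following sketch standardizes a first name. It uses a small nickname table and the classic Soundex algorithm purely as stand-ins; the table contents, the function names, and the choice of Soundex are assumptions and do not describe the behavior of the Sun standardization engine.

```python
# Hypothetical normalization table: common nicknames mapped to a standard form.
NICKNAMES = {"BILL": "WILLIAM", "BOB": "ROBERT", "LIZ": "ELIZABETH"}

# Soundex digit for each coded consonant; vowels, H, W, and Y have no code.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def normalize(name):
    """Upper-case the name and replace a known nickname with its standard form."""
    cleaned = name.strip().upper()
    return NICKNAMES.get(cleaned, cleaned)


def soundex(name):
    """Return the four-character Soundex code of a name (empty string if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(c, "")
        if code and code != previous:
            encoded += code
        previous = code
    return (encoded + "000")[:4]


def standardize(first_name):
    """Return the derived fields a match process could compare or block on."""
    normalized = normalize(first_name)
    return {"normalized": normalized, "phonetic": soundex(normalized)}


print(standardize(" bob "))
```

The original value is left untouched and the derived values are returned separately, mirroring the idea that standardized fields are stored alongside the data entered by the user.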

Threshold

In the Threshold file, you configure the Manager Service and define properties of the match process. You specify the match and duplicate thresholds in this file, and define certain system parameters, such as the update mode, how to process records above the match threshold, how to manage same system matches, and whether merged records can be updated. This file also specifies which of the queries defined in the Query Builder to use for matching queries.
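
The role of the two thresholds can be sketched as a simple classification of a match weight. The function name, the labels, the sample threshold values, and the choice to treat a weight equal to a threshold as meeting it are assumptions for illustration only; the Threshold file and the match engine in use determine the actual behavior.

```python
def classify(weight, duplicate_threshold, match_threshold):
    """Classify a matching weight against the duplicate and match thresholds."""
    if duplicate_threshold > match_threshold:
        raise ValueError("duplicate threshold must not exceed match threshold")
    if weight >= match_threshold:
        return "assumed match"
    if weight >= duplicate_threshold:
        return "potential duplicate"
    return "no match"


# Hypothetical threshold values.
DUPLICATE_THRESHOLD = 7.25
MATCH_THRESHOLD = 29.0

for weight in (31.5, 12.0, 3.0):
    print(weight, classify(weight, DUPLICATE_THRESHOLD, MATCH_THRESHOLD))
```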

This file also configures the EUIDs assigned by the master index application. You can specify an EUID length, whether a checksum value is used for additional verification, and a “chunk size”. Specifying a chunk size allows the EUID generator to obtain a block of EUIDs from the sbyn_seq_table database table so it does not need to query the table each time it generates a new EUID.
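
The effect of the chunk size can be illustrated with a small generator that reserves identifiers in blocks. This is a sketch under stated assumptions: the SequenceTable class is an in-memory stand-in for the sequence table, the class and method names are hypothetical, and checksum handling is omitted.

```python
class SequenceTable:
    """In-memory stand-in for the sequence table; counts how often it is queried."""

    def __init__(self, start=1):
        self.next_value = start
        self.queries = 0

    def reserve(self, count):
        """Reserve `count` consecutive values and return the first one."""
        self.queries += 1
        first = self.next_value
        self.next_value += count
        return first


class ChunkedIdGenerator:
    """Hands out zero-padded ids, going back to the table only when a chunk runs out."""

    def __init__(self, table, chunk_size, length):
        self.table = table
        self.chunk_size = chunk_size
        self.length = length
        self._next = 0
        self._limit = 0  # exclusive upper bound of the current chunk

    def next_id(self):
        if self._next >= self._limit:
            first = self.table.reserve(self.chunk_size)
            self._next = first
            self._limit = first + self.chunk_size
        value = self._next
        self._next += 1
        return str(value).zfill(self.length)


table = SequenceTable()
generator = ChunkedIdGenerator(table, chunk_size=10, length=10)
ids = [generator.next_id() for _ in range(25)]
print(ids[0], ids[-1], table.queries)
```

With a chunk size of 10, generating 25 identifiers touches the table three times instead of 25. The trade-off of this design is that values reserved in a chunk but not yet used are skipped if the generator is restarted.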

Best Record

In the Best Record file, you define formulas that determine which data in an enterprise record should be considered the most reliable and how updates to the single best record (SBR) will be handled. The survivor calculator uses these formulas to decide what data from each system record to include in each object’s SBR. The SBR is the portion of the enterprise record that represents the data that is considered to be the most accurate and current for an entity.

The SBR is defined by a mapping of fields from external system records. Since there might be many external systems, you can optionally specify a strategy to select the value for an SBR field from the list of external values. You can also specify any additional fields that might be required by the selection strategy to determine which external system contains the best data, such as the object’s update date and time.
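
A selection strategy of this kind can be sketched as follows. The example picks each SBR field from the most recently updated system record and falls back to older records when the newest value is empty. The record layout, the function names, and the fallback rule are assumptions for illustration and are not the survivor calculator's actual logic.

```python
from datetime import datetime

# Hypothetical system records for one enterprise record.
system_records = [
    {
        "system": "SystemA",
        "updated": datetime(2008, 1, 15),
        "fields": {"phone": "555-0100", "city": "Austin"},
    },
    {
        "system": "SystemB",
        "updated": datetime(2008, 6, 2),
        "fields": {"phone": "", "city": "Dallas"},
    },
]


def most_recent_value(field, records):
    """Return the first non-empty value, checking the newest record first."""
    for record in sorted(records, key=lambda r: r["updated"], reverse=True):
        value = record["fields"].get(field)
        if value:
            return value
    return None


def build_sbr(field_names, records):
    """Build a single best record by applying the strategy to each field."""
    return {name: most_recent_value(name, records) for name in field_names}


sbr = build_sbr(["phone", "city"], system_records)
print(sbr)
```

Here the update timestamp plays the role of the additional field that the strategy needs in order to decide which system holds the best data.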

You can create Java classes that define special processing to perform against a record when the record is created, updated, merged, or unmerged. These classes must be created in the Custom Plug-ins module and can be specified for each transaction type in the Best Record file.

Field Validation

By default, the Field Validation file defines certain validations for the local identifiers assigned by each external system. You can create custom Java classes that define rules for validating field values before they are saved to the master index database. You can then specify the Java classes in the Field Validation file to make them part of the master index application.

Security

This file is not currently used, and is a placeholder to be used in future versions.

Enterprise Data Manager

In the Enterprise Data Manager file, you configure the appearance and processing properties of the Enterprise Data Manager (EDM). In this file, you define each object and field that appears on the EDM, along with the properties of each field, such as the field type and length, field labels, format masks, and so on. You can also define the order in which objects and fields appear on the EDM pages.

This file defines several additional properties of the EDM, including the types of searches available, whether wildcard characters can be used, the criteria for the searches, and the results fields that appear. You can also specify whether an audit log is maintained of each instance in which data is accessed through the EDM. For healthcare-based master index applications, such as Sun Master Patient Index (an application built on the Sun Master Index platform), this supports the privacy rules mandated by the HIPAA regulation for healthcare.

Finally, the Enterprise Data Manager file defines certain implementation information, such as the application server in use, debugging rules, and security activation.

The files that configure the components of the master index application are created by the wizard and define characteristics of the application, such as how data is processed, queried, and matched, and how it appears on the EDM. These files configure the runtime components of the master index application.

Modifying the Master Index XML Files Directly (Repository)

When you modify the configuration files directly, use the Check Out and Check In commands to maintain version control. (Version control is automatic with the Configuration Editor-Repository.) If you open and modify a file without first checking it out, a warning appears when you try to save the file; this warning lets you check out and save the file in one step. After modifying each file, save the changes and verify that the XML syntax is still valid.

There are a few restrictions on modifying these files. In addition to the general rules listed below, the match or standardization engine you choose might place other requirements on customizations. Be sure to review Understanding the Sun Match Engine before modifying the Match Field file.

Keep the following guidelines in mind when modifying the XML files directly.

Using the Master Index Configuration Editor-Repository

The Configuration Editor has built-in validations to ensure that integrity is maintained between the configuration files. For example, it does not allow you to define a field for normalization if that field is not already defined in the object structure. While you can use the Configuration Editor to modify most of the configurable components, some components can only be modified using the XML editor. Following is a summary of which features can be configured using the Configuration Editor and which need to be modified using an XML editor.

Object Definition File

You can modify most elements of the Object Definition file using the Configuration Editor. The following can only be modified using the XML editor:

Changing the database type is not recommended. If you do modify the database type or date format elements, you need to regenerate the application to create the updated database scripts. Regenerating does not recreate the Systems or Code Lists scripts; you need to update those manually.

Candidate Select File

You can modify all elements in the Candidate Select file using the Configuration Editor. If you create a query to use in the EDM or to use for the matching query, you need to add the query to the appropriate file manually (the Threshold file or the Enterprise Data Manager file).

Threshold File

Most elements in the Threshold file cannot be modified using the Configuration Editor. You can only modify the duplicate and match thresholds from the Configuration Editor.

Match Field File

You can use the Configuration Editor to modify all commonly modified elements in the Match Field file, including defining standardization structures, normalization structures, and phonetic encoding. If you create custom classes to implement a block picker, pass controller, match engine, or standardization engine, you need to specify the implementation classes in this file using the XML editor.

Best Record File

The Configuration Editor does not modify the Best Record file. If you make any changes to the object structure (either through the Configuration Editor or the XML editor), review this file to verify that all fields or objects are included in the survivor strategy and that the field and object names are correct.
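
A manual review of this kind can be supported by a small script that compares the field names in the two files. The sketch below is only an illustration: the XML fragments, the element names, and the attribute names are hypothetical and do not reproduce the actual Object Definition or Best Record schemas, so adapt the tag names to the files in your project.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for the object structure.
OBJECT_XML = """
<object-definition>
  <field name="FirstName"/>
  <field name="LastName"/>
  <field name="DOB"/>
</object-definition>
"""

# Hypothetical fragment standing in for the survivor strategy mapping.
BEST_RECORD_XML = """
<survivor-strategy>
  <candidate-field name="FirstName"/>
  <candidate-field name="LastNme"/>
</survivor-strategy>
"""


def field_names(xml_text, tag):
    """Collect the name attribute of every element with the given tag."""
    root = ET.fromstring(xml_text)
    return {element.get("name") for element in root.iter(tag)}


defined = field_names(OBJECT_XML, "field")
mapped = field_names(BEST_RECORD_XML, "candidate-field")

# Fields in the object structure that the survivor strategy does not cover.
missing_from_strategy = sorted(defined - mapped)
# Names in the survivor strategy that no longer exist in the object structure.
unknown_in_strategy = sorted(mapped - defined)

print("Missing from strategy:", missing_from_strategy)
print("Unknown in strategy:", unknown_in_strategy)
```

In this sample data the misspelled name shows up in both lists: once as an unknown name in the strategy and once as a defined field that the strategy does not cover.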

Field Validation File

The Configuration Editor does not modify the Field Validation file. If you create a custom field validation class, you need to specify the implementation class in this file using the XML editor.

Enterprise Data Manager File

Most elements in the Enterprise Data Manager file are not modified using the Configuration Editor. You can add and delete fields that appear on the EDM and modify the display name and the value and input masks. All other field properties can only be modified using the XML editor.

Field integrity is maintained when you delete a field using the Configuration Editor. The field is automatically deleted from the EDM object structure and from any EDM page definitions that include the field, such as a search page or report.

Match Configuration File (matchConfigFile.cfg)

You can modify all components of the Match Configuration file using the Configuration Editor, including adding and removing comparators. The Configuration Editor does not validate the extra parameters that can be used for certain comparators, so you should verify your changes by reviewing the match configuration file manually.

Maintaining Version Control in the Master Index (Repository) Configuration Files

When modifying the XML files directly, be sure to maintain version control by checking files out before you modify them and then checking them back in when you are finished. The Configuration Editor supports version control for the XML configuration files. You can manually check the master index configuration files in and out of the Repository, or you can let the Configuration Editor perform version control for you when you open and close the editor.

Checking Configuration Files Out With the Configuration Editor-Repository

If you access the Configuration Editor-Repository when all of the configuration files are already checked out, the Configuration Editor opens immediately. If any of the configuration files are checked in, a dialog box appears that allows you to choose whether to check out and open the files in edit mode or to open the files in view-only mode without checking them out.

Checking Configuration Files In With the Configuration Editor-Repository

After you modify properties in the Configuration Editor, click Save in the Configuration Editor toolbar to save the configuration files to the work space. This does not check the files back in to the Repository. To check the files in, you need to close the editor.

Procedure: To Check Files In

  1. When you are done making changes, click the Close icon in the upper right corner of the Configuration Editor.

  2. If there are any unsaved changes, a confirmation dialog box appears. Click Yes to save the changes.

    The Check In dialog box appears.

  3. Do one of the following:

    • To close the editor without checking in the files, click Cancel.

    • To check in the files and close the editor, enter a check-in comment and then click Check In.

      The dialog box and editor close and the files are checked in.


      Note –

      If you close the Configuration Editor without making any changes, a dialog box gives you the option to undo the checkout of the configuration files instead of checking them back in at a new revision level.


Working With the XML Editor

Sun Master Index supports the version control functionality provided by Java CAPS. You can check files in and out, retrieve older versions to a workspace, view a version history, and so on. In addition, Sun Master Index supports recursive check-ins and check-outs. When you select Recurse project, you can check in or out all components below the selected node or a subset of those components.

Saving a Configuration File to the Repository

Before modifying a file, be sure to check the file out of the Repository. You can perform this step before or after opening the file. When you are done with your modifications, save the file to the Repository.

Procedure: To Save a Configuration File to the Repository

  1. With the file open in the XML or text editor, right-click in the XML editor to display the XML editor context menu.

  2. On the context menu, click Save.

  3. For XML files, validate the file (described in the following procedure) and check it back in to the Repository.


    Note –

    If you did not check the file out before making changes and attempting to save it, a warning dialog box appears. Click Yes on this dialog box to automatically check out the file, save the changes, and check it back in.


Validating XML Files

Sun Master Index includes one XML schema definition (XSD) file for each configuration file. After saving changes to a file and before checking it back in, validate it against the XSD file to make sure no dependencies have been broken during modification.

Procedure: To Validate XML Syntax

  1. After you save any changes to a configuration file to the Repository, keep the file open and right-click in the text of the file.

  2. On the context menu that appears, click Check XML.

  3. A message appears indicating the status of the validation and, if there were errors, includes a list of errors.

  4. Fix any errors found in the file and revalidate.
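
If you edit a configuration file outside the XML editor, a comparable syntax check can be run with a short script. The sketch below only tests that the text is well-formed XML and reports where parsing failed; it does not validate against the XSD files, and the function name and sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_xml(text):
    """Return a list of syntax error messages; an empty list means well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position
        return [f"line {line}, column {column}: {err}"]
    return []


# A well-formed fragment and one with an unclosed element.
good = "<Configuration><Field name='FirstName'/></Configuration>"
bad = "<Configuration><Field name='FirstName'></Configuration>"

for label, sample in (("good", good), ("bad", bad)):
    errors = check_xml(sample)
    print(label, "OK" if not errors else errors)
```

Because the parser stops at the first problem it meets, fix the reported error and run the check again until no errors remain, just as in the procedure above.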

Procedure: To Validate Against the Schema

  1. After you save any changes to a configuration file to the Repository, right-click that file in the project Explorer.

  2. On the context menu that appears, click Validate.

  3. A message appears indicating the status of the validation and, if there were errors, includes a list of errors.

  4. Fix any errors found in the file and revalidate.

Copying, Cutting, and Pasting Files

You can use standard cut, copy, and paste commands to copy or move files between projects. Sun Master Index follows the standard Java CAPS functionality, with the exception that you can only copy or move a component from one project into the same node of another project. For example, you can only paste a copied configuration file into the Configuration node of another project. In addition, you cannot cut components that are essential to a project, such as the configuration files, match and standardization files, and so on.