Oracle Java CAPS Master Index Configuration Reference
The Update Manager contains the logic used to generate the single best record (SBR) for a given object. The SBR is defined by a mapping of fields from external systems to the SBR, allowing you to define the fields from each system that are kept in the SBR. For each field in the SBR, an ePath denotes the location in the external system records from which the value is retrieved. Since there can be many external systems, you can optionally specify a strategy to select the SBR field from the list of external values. You can also specify any additional fields that might be required by the selection strategy to determine which external system contains the best data (by default, the record’s update date and time is always taken into account). The Update Manager also specifies any custom Java classes to be used for different types of update transactions, such as merges, unmerges, changes to existing records, and new record inserts.
The survivor calculator generates and updates the SBR for each record. The SBR for an enterprise object is created from what is considered to be the most reliable information contained in each system record for a particular object. The information used from each local system to populate the SBR is determined by the survivor calculator defined in the Update Manager. The fields defined in the survivor calculator are also the fields contained in the SBR. You can configure the survivor calculator to determine the best fields for the SBR from a combination of all the source system records. The survivor calculator can consider factors such as the relative reliability of a system, how recent the data is, and whether data entered from the MIDM overwrites data entered from any other system.
The survivor calculator consists of the rules defined for the survivor helper and the weighted calculator.
Note - Phonetic and standardized fields do not need to be defined in update.xml since their field values are determined by the standardization engine for the SBR.
The logic that determines how the fields in the SBR are populated and how certain updates are performed is highly configurable in a master index application, allowing you to design and develop the match strategy that best suits your processing requirements.
Configuring the Update Manager consists of customizing the following components:
The survivor helper defines a list of fields on which survivor calculation is performed, and thus the list of fields included in the SBR. Each field is called a candidate field. For each candidate field, you specify whether to use the default survivor calculation strategy or a custom strategy. The survivor helper must list each field contained in the SBR; any fields that are not listed here will not be populated in the SBR.
For each field, you can specify system fields to be taken into consideration as well as a specific survivorship strategy. There are three basic strategies provided by Oracle Java CAPS Master Index to determine survivorship for each field. You can define and implement custom strategies.
You can further configure the strategy for each field by filtering out unwanted or invalid values from the SBR. For more information, see SBR, Matching, and Blocking Filter Configuration.
This strategy maps fields directly from the local system records to the SBR. When you specify the default survivor strategy for a field, you must also specify the parameter that defines the source system. For example, if you specify the default survivor calculator for the field “Person.LastName” and define the preferred system as “SystemA”, the last name field in the SBR is always taken from SystemA (unless the value is overridden in the MIDM).
The default survivor strategy is com.sun.mdm.index.survivor.impl.DefaultSurvivorStrategy.
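As a rough illustration of the example above, a candidate field that always takes its value from SystemA might be declared along the following lines in the SurvivorHelperConfig section. The element names follow the sample later in this topic. The parameter name preferredSystem and the placement of a parameters element inside survivor-strategy are assumptions made for this sketch and should be checked against the schema for your release.

```xml
<!-- Sketch only: "preferredSystem" is an assumed parameter name. -->
<candidate-field name="Person.LastName">
    <survivor-strategy>
        <strategy-class>com.sun.mdm.index.survivor.impl.DefaultSurvivorStrategy</strategy-class>
        <parameters>
            <parameter>
                <parameter-name>preferredSystem</parameter-name>
                <parameter-type>java.lang.String</parameter-type>
                <!-- SystemA is the example system code used in the text above. -->
                <parameter-value>SystemA</parameter-value>
            </parameter>
        </parameters>
    </survivor-strategy>
</candidate-field>
```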
This strategy is the most complex survivor strategy, and uses a combination of weighted calculations to determine the most reliable source of data for each field. This strategy is highly customizable, and you can define which calculation or set of calculations to use for each field. The calculations can be based on the update date of the data, system reliability, and agreement between systems. In the default configuration, the calculations are defined in the WeightedCalculator section of update.xml.
The weighted survivor strategy is com.sun.mdm.index.survivor.impl.WeightedSurvivorStrategy. You can define general weighted calculations to be performed by default for each field, and you can define specialized calculations to be performed for specific fields.
This strategy combines the data from all source systems to populate the fields in the SBR for which this strategy is specified. For example, if you store aliases for person names in the database, you want to store all possible alias records and not just the “best” alias information. In order to do this, specify the union strategy for the alias object. This means that all alias information from all source systems is stored in the SBR.
The union strategy is applied to entire objects rather than to fields. This strategy combines all child objects from an enterprise object's source systems to populate the SBR. If the source systems contain two or more instances of a child object with the same unique key (such as two home telephone numbers), the union strategy populates only the most current child object in the SBR. For example, if the union strategy is assigned to the address object and each address object is identified by a unique key (such as the address type), the SBR contains only the most current address record of each address type (for example, one home address, one office address, and so on).
The union strategy is com.sun.mdm.index.survivor.impl.UnionSurvivorStrategy.
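The address example above could be expressed roughly as follows. This is a sketch that reuses the pattern shown for the Alias object in the sample later in this topic. Person.Address is a hypothetical child object name, so substitute the object defined in your own object structure.

```xml
<!-- Sketch only: Person.Address is a hypothetical child object. -->
<candidate-field name="Person.Address[*].*">
    <survivor-strategy>
        <strategy-class>com.sun.mdm.index.survivor.impl.UnionSurvivorStrategy</strategy-class>
    </survivor-strategy>
</candidate-field>
```

Because the candidate field names the whole child object (Address[*].*) rather than a single field, the strategy applies to entire address records, as described above.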
By default, the weighted calculator implements the weighted strategy defined above. Use the WeightedCalculator section to define conditions and weights that determine the best information with which to populate the SBR. The weighted calculator selects a single value for the SBR from a set of system fields. The selection process is based on the different qualities defined for each field.
The weighted calculator defines two sets of rules. The default rules apply to all fields in a record except those fields for which rules are specifically defined. The candidate rules only apply to those fields for which they are specifically defined. If you modify the default rules, the changes will apply to all fields except the fields for which candidate rules are defined.
You can define several strategies to help the weighted calculator determine the best information to populate into each field of the SBR. Each of these strategies is defined by a quality, a preference, and a utility. The quality defines the type of weighted calculation to perform, the preference indicates the source being rated, and the utility indicates the reliability. You can define multiple strategies for each field, and a linear summation on the utility score of each strategy determines the best value to populate in the SBR field.
The weighted calculator strategies include:
This strategy indicates the best source system for a field, and is used when the quality of the field in question depends on its origin. For example, to indicate that the data from SystemA for a specific field is of a higher quality than SystemB, define a SourceSystem quality for “SystemA” and one for “SystemB”. Then assign SystemA a higher utility value (85.0, for example) and SystemB a lower utility value (30.0, for example). This indicates that SystemA is a more reliable source for the field. If both SystemA and SystemB contain the specified field, the value from SystemA is populated into the SBR. If the field is empty in SystemA but the field in SystemB contains a value, then the value from SystemB is used.
This strategy prorates the utility score based on the number of systems whose values for the specified field are in agreement. For example, if the first name field for SystemA is “John”, for SystemB is “John”, and for SystemC is “Jon”, SystemA and SystemB together receive two-thirds of the utility score, while SystemC only receives one-third. The value populated into the SBR is “John”. You do not need to define a preference for the SystemAgreement strategy, but you must define source systems.
This strategy ranks the field values from the source systems in descending order according to the time that the object was last modified. The value populated in the SBR comes from the most recently modified object. You do not need to define a preference for the MostRecentModified strategy, but you must define a utility.
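The three strategies above can be combined for a single field in the WeightedCalculator section. The following fragment is a minimal sketch: the SourceSystem utilities (85.0 and 30.0) come from the example above, the utilities for SystemAgreement and MostRecentModified are illustrative values chosen for this sketch, and Person.FirstName is used only as an example field.

```xml
<!-- Sketch only: the 40.0 and 50.0 utilities are illustrative values. -->
<candidate-field name="Person.FirstName">
    <parameter>
        <quality>SourceSystem</quality>
        <preference>SystemA</preference>
        <utility>85.0</utility>
    </parameter>
    <parameter>
        <quality>SourceSystem</quality>
        <preference>SystemB</preference>
        <utility>30.0</utility>
    </parameter>
    <parameter>
        <!-- No preference element, as the description above allows. -->
        <quality>SystemAgreement</quality>
        <utility>40.0</utility>
    </parameter>
    <parameter>
        <!-- No preference element; only a utility is defined. -->
        <quality>MostRecentModified</quality>
        <utility>50.0</utility>
    </parameter>
</candidate-field>
```

With a definition like this, the utility scores earned under each strategy are added together, and the value with the highest total is populated into the SBR field.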
The Update Manager policies specify custom Java classes that provide additional processing logic for each type of update transaction. By default, this additional processing is not defined in a standard master index application. You can define custom update policies by creating the custom classes in the Source Packages node of the EJB project associated with the main master index project. NetBeans also provides the ability to build and compile the custom Java code, and Oracle Java CAPS Master Index automatically incorporates the classes when you generate the application. The Java classes defining the update policies are specified for the master index application in the UpdateManagerConfig element of update.xml.
There are seven types of update policies defined in the Update Manager.
Enterprise Create Policy – The enterprise create policy defines additional processing to perform when a new record is inserted into the master index database. This policy is defined by the EnterpriseCreatePolicy element.
Undo Assume Match Policy – The undo assume match policy defines additional processing to perform when an assumed match transaction is reversed. This policy is defined by the UndoAssumeMatchPolicy element.
The update policy section includes the SkipUpdateIfNoChange flag. When set to “true”, this flag prevents the update policies from being performed when no changes are made to an existing record, which helps increase performance when processing a large number of updates.
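A minimal sketch of the flag in update.xml might look like the following. It assumes that the policy elements can be omitted when no custom policy classes are defined; if the schema for your release requires them, keep them in place as generated.

```xml
<!-- Sketch only: assumes the policy elements may be omitted. -->
<UpdateManagerConfig module-name="UpdateManager"
    parser-class="com.sun.mdm.index.configurator.impl.UpdateManagerConfig">
    <SkipUpdateIfNoChange>true</SkipUpdateIfNoChange>
</UpdateManagerConfig>
```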
The properties for the update process are defined in update.xml. Some of the information entered into the default configuration file is based on the fields defined in the wizard and some is standard across all implementations. For most implementations, this file will require customization.
The following topics provide information about working with update.xml:
You can customize the configuration of the Update Manager by modifying update.xml. This file cannot be modified using the Configuration Editor; you need to modify the file directly. You can modify this file at any time, but it is not recommended after moving into production. The configuration controls how the SBR for each object is created, and modifying the file can cause discrepancies in how SBRs are formed before and after the modifications. It might also cause discrepancies in match results, since matching is performed against the SBR. You must regenerate the application and redeploy the project after modifying this file. The possible modifications to this file are restricted by the schema definition, so be sure to validate the file after making any changes.
This topic describes the structure of the XML file, general requirements, and constraints. It also provides a sample implementation.
Table 12 lists each element in update.xml and provides a description of each element along with any requirements or constraints for each element.
Table 12 update.xml File Structure
Below is a sample of update.xml using a very small object structure based on person data. Note that standardized and phonetic fields are included in the candidate fields to ensure that they are also included in the SBR. In this sample, all fields use the strategy named in the default-survivor-strategy element (here, the weighted strategy) except those included in the Alias object, which uses the union strategy. The value that is populated in the LastName field of the SBR is dependent on the SSN field of the system objects. In addition, custom weighted calculator logic is defined only for the SSN field; the remaining fields use the default logic defined in the default-parameters element.
<SurvivorHelperConfig module-name="SurvivorHelper"
    parser-class="com.sun.mdm.index.configurator.impl.SurvivorHelperConfig">
    <helper-class>com.sun.mdm.index.survivor.impl.DefaultSurvivorHelper</helper-class>
    <default-survivor-strategy>
        <strategy-class>com.sun.mdm.index.survivor.impl.WeightedSurvivorStrategy</strategy-class>
        <parameters>
            <parameter>
                <parameter-name>ConfigurationModuleName</parameter-name>
                <parameter-type>java.lang.String</parameter-type>
                <parameter-value>WeightedSurvivorCalculator</parameter-value>
            </parameter>
        </parameters>
    </default-survivor-strategy>
    <candidate-definitions>
        <candidate-field name="Person.LastName">
            <system-fields>
                <field-name>Person.SSN</field-name>
            </system-fields>
        </candidate-field>
        <candidate-field name="Person.FirstName"/>
        <candidate-field name="Person.MiddleName"/>
        <candidate-field name="Person.DOB"/>
        <candidate-field name="Person.Gender"/>
        <candidate-field name="Person.SSN"/>
        <candidate-field name="Person.FnamePhoneticCode"/>
        <candidate-field name="Person.LnamePhoneticCode"/>
        <candidate-field name="Person.StdFirstName"/>
        <candidate-field name="Person.StdLastName"/>
        <candidate-field name="Person.Alias[*].*">
            <survivor-strategy>
                <strategy-class>com.sun.mdm.index.survivor.impl.UnionSurvivorStrategy</strategy-class>
            </survivor-strategy>
        </candidate-field>
    </candidate-definitions>
</SurvivorHelperConfig>
<WeightedCalculator module-name="WeightedSurvivorCalculator"
    parser-class="com.sun.mdm.index.configurator.impl.WeightedCalculatorConfig">
    <candidate-field name="Person.SSN">
        <parameter>
            <quality>SourceSystem</quality>
            <preference>SBYN</preference>
            <utility>100.0</utility>
        </parameter>
        <parameter>
            <quality>MostRecentModified</quality>
            <utility>75.0</utility>
        </parameter>
    </candidate-field>
    <default-parameters>
        <parameter>
            <quality>MostRecentModified</quality>
            <utility>80.0</utility>
        </parameter>
        <parameter>
            <quality>SourceSystem</quality>
            <preference>SBYN</preference>
            <utility>100.0</utility>
        </parameter>
    </default-parameters>
</WeightedCalculator>
<UpdateManagerConfig module-name="UpdateManager"
    parser-class="com.sun.mdm.index.configurator.impl.UpdateManagerConfig">
    <EnterpriseMergePolicy>com.sun.mdm.index.user.CustomMergePolicy</EnterpriseMergePolicy>
    <EnterpriseUnmergePolicy>com.sun.mdm.index.user.CustomUnmergePolicy</EnterpriseUnmergePolicy>
    <EnterpriseUpdatePolicy>com.sun.mdm.index.user.CustomUpdatePolicy</EnterpriseUpdatePolicy>
    <EnterpriseCreatePolicy>com.sun.mdm.index.user.CustomCreatePolicy</EnterpriseCreatePolicy>
    <SystemMergePolicy>com.sun.mdm.index.user.CustomSystemMergePolicy</SystemMergePolicy>
    <SystemUnmergePolicy>com.sun.mdm.index.user.CustomSystemUnmergePolicy</SystemUnmergePolicy>
    <UndoAssumeMatchPolicy>com.sun.mdm.index.user.CustomUndoMatchPolicy</UndoAssumeMatchPolicy>
    <SkipUpdateIfNoChange>true</SkipUpdateIfNoChange>
</UpdateManagerConfig>
The following sample illustrates how the weighted calculator uses the parameters you define to determine which field values to use in the SBR. Using this sample, if only one of the two system records contains a value, that value is used in the SBR regardless of update date. If both system records contain a value and they were updated at the same time, the SAP field value is used (80.0 > 30.0). If both system records contain a value but CDW was the most recently modified, the value from CDW is populated into the SBR ((30.0 + 70.0) > 80.0).
<default-parameters>
    <parameter>
        <quality>SourceSystem</quality>
        <preference>SAP</preference>
        <utility>80.0</utility>
    </parameter>
    <parameter>
        <quality>MostRecentModified</quality>
        <utility>70.0</utility>
    </parameter>
    <parameter>
        <quality>SourceSystem</quality>
        <preference>CDW</preference>
        <utility>30.0</utility>
    </parameter>
</default-parameters>