When records are transmitted to the master index application, one of the “execute match” methods is usually called and a series of processes is performed to ensure that accurate and current data is maintained in the database. The execute match methods include executeMatch, executeMatchUpdate, executeMatchDupRecalc, and executeMatchUpdateDupRecalc. The EDM uses executeMatchGui. For more information about how these methods differ, refer to the Javadocs.
You can define these processes in the Collaboration using the functions defined in the customized method OTD. The steps performed by the standard executeMatch method are outlined below, and the diagrams on the following pages illustrate the message processing flow. The processing steps performed in your environment might differ from these steps, depending on how you customize the Collaboration and Connectivity Map.
The steps outlined below refer to parameters in the Threshold file. These parameters are described in Threshold Configuration (Repository) in Understanding Sun Master Index Configuration Options (Repository).
There are several decision points in the match process that can be defined by custom logic using custom plug-ins. These decision points are not listed in the steps below, which describe the default processing logic. Master Index Custom Decision Point Logic (Repository) provides the same steps as below with the decision points included.
Step 1: When a message is received by the master index application, a search is performed for any existing records with the same local ID and system as those contained in the message. This search includes only active records, that is, records with a status of A. If a matching record is found, the EUID of that record is returned.
Step 2: If an existing record is found with the same system and local ID as the incoming message, it is assumed that the two records represent the same object. Using the EUID of the existing record, the master index application performs an update of the record’s information in the database.
If the update does not make any changes to the object’s information, no further processing is required and the existing EUID is returned.
If there are changes to the object’s information, the updated record is inserted into the database and the changes are recorded in the sbyn_transaction table.
If there are changes to key fields (that is, fields used for matching or for the blocking query) and the update mode is set to pessimistic, potential duplicates are reevaluated for the updated record.
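The update handling described above can be sketched as a small decision function. This is only an illustration of the documented outcomes, not the master index implementation; the class, enum, and parameter names are hypothetical.

```java
// Illustrative sketch only. All names here are hypothetical and are not part
// of the master index API.
final class UpdateStepSketch {

    /** Possible results of updating a record that was found by system and local ID. */
    enum UpdateOutcome {
        NO_CHANGE,               // nothing differed; the existing EUID is returned
        UPDATED,                 // changes are stored and recorded in sbyn_transaction
        UPDATED_AND_REEVALUATED  // as UPDATED, and potential duplicates are reevaluated
    }

    /**
     * @param anyFieldChanged  true if the update changes any of the object's information
     * @param keyFieldChanged  true if a field used for matching or blocking changed
     * @param pessimisticMode  true if the update mode is set to pessimistic
     */
    static UpdateOutcome decide(boolean anyFieldChanged,
                                boolean keyFieldChanged,
                                boolean pessimisticMode) {
        if (!anyFieldChanged) {
            return UpdateOutcome.NO_CHANGE;
        }
        if (keyFieldChanged && pessimisticMode) {
            return UpdateOutcome.UPDATED_AND_REEVALUATED;
        }
        return UpdateOutcome.UPDATED;
    }
}
```

In this sketch a key field change under a non-pessimistic update mode is treated as a plain update, because the text describes reevaluation only for the pessimistic case.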
Step 3: If no records are found that match the record’s system and local identifier, a second search is performed using the blocking query. A search is performed on each of the defined query blocks to retrieve a candidate pool of potential matches.
Each record returned from the search is weighted using the fields defined for matching in the inbound message.
Step 4: After the search is performed, the number of resulting records is calculated.
If one or more records are returned from the search with a matching probability weight that is equal to or greater than the match threshold, the master index application performs exact match processing (see Step 5).
If no matching records are found, the inbound message is treated as a new record. A new EUID is generated and a new record is inserted into the database.
Step 5: If records were found within the high match probability range, exact match processing is performed as follows:
If only one record is returned from this search with a matching probability that is equal to or greater than the match threshold, additional checking is performed to verify whether the records originated from the same system (see Step 6).
If more than one record is returned with a matching probability that is equal to or greater than the match threshold and exact matching is set to false, then the record with the highest matching probability is checked against the incoming message to see if they originated from the same system (see Step 6).
If more than one record is returned with a matching probability that is equal to or greater than the match threshold and exact matching is true, a new EUID is generated and a new record is inserted into the database.
If no record is returned from the database search, or if none of the matching records have a weight in the exact match range, a new EUID is generated and a new record is inserted into the database.
Exact matching is determined by the OneExactMatch parameter, and the match threshold is defined by the MatchThreshold parameter. For more information about these parameters, see Threshold Configuration (Repository) in Understanding Sun Master Index Configuration Options (Repository).
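The exact match rules above depend only on how many candidates reach the match threshold and on the OneExactMatch setting. The following is a minimal sketch of that decision, assuming the candidate weights are already available as an array. The class, enum, and method names are hypothetical; only OneExactMatch and MatchThreshold come from the Threshold file.

```java
// Illustrative sketch only. Class, enum, and method names are hypothetical.
final class ExactMatchSketch {

    enum Decision {
        INSERT_NEW_RECORD,  // generate a new EUID and insert the record
        CHECK_SAME_SYSTEM   // continue with the same-system check on the best candidate
    }

    /** Applies the exact match rules to the weights returned by the blocking query. */
    static Decision decide(double[] candidateWeights,
                           double matchThreshold,   // MatchThreshold parameter
                           boolean oneExactMatch) { // OneExactMatch parameter
        int atOrAboveThreshold = 0;
        for (double weight : candidateWeights) {
            if (weight >= matchThreshold) {
                atOrAboveThreshold++;
            }
        }
        if (atOrAboveThreshold == 0) {
            return Decision.INSERT_NEW_RECORD;
        }
        if (atOrAboveThreshold == 1) {
            return Decision.CHECK_SAME_SYSTEM;
        }
        // More than one candidate reaches the threshold.
        return oneExactMatch ? Decision.INSERT_NEW_RECORD : Decision.CHECK_SAME_SYSTEM;
    }

    /** Returns the index of the highest weight, or -1 if there are no candidates. */
    static int indexOfHighestWeight(double[] candidateWeights) {
        int best = -1;
        for (int i = 0; i < candidateWeights.length; i++) {
            if (best < 0 || candidateWeights[i] > candidateWeights[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

When the decision is CHECK_SAME_SYSTEM, indexOfHighestWeight picks the candidate that would be compared against the incoming message.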
Step 6: When records are checked for same system entries, the master index application tries to retrieve an existing local ID using the system of the new record and the EUID of the record that has the highest match weight.
If a local ID is found and same system matching is set to true, a new record is inserted and the two records are considered to be potential duplicates. These records are marked as same system potential duplicates.
If a local ID is found and same system matching is set to false, it is assumed that the two records represent the same object. Using the EUID of the existing record, the master index application performs an update, following the process described in Step 2 above.
If no local ID is found, it is assumed that the two records represent the same object and an assumed match occurs. Using the EUID of the existing record, the master index application performs an update, following the process described in Step 2 above.
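The three same-system outcomes above can be summarized in a short sketch. It assumes the local ID lookup has already been done and its result is passed in as a boolean; all names are hypothetical and chosen for illustration.

```java
// Illustrative sketch only. All names are hypothetical.
final class SameSystemSketch {

    enum Action {
        INSERT_AS_SAME_SYSTEM_POTENTIAL_DUPLICATE, // insert new record, mark the pair
        UPDATE_EXISTING,                           // records represent the same object
        UPDATE_EXISTING_ASSUMED_MATCH              // no local ID found: assumed match
    }

    /**
     * @param localIdFound        true if a local ID exists for the new record's system
     *                            under the EUID of the highest-weighted candidate
     * @param sameSystemMatching  the same system matching setting
     */
    static Action decide(boolean localIdFound, boolean sameSystemMatching) {
        if (!localIdFound) {
            return Action.UPDATE_EXISTING_ASSUMED_MATCH;
        }
        return sameSystemMatching
                ? Action.INSERT_AS_SAME_SYSTEM_POTENTIAL_DUPLICATE
                : Action.UPDATE_EXISTING;
    }
}
```

Note that in this sketch the same system matching setting has no effect when no local ID is found, which mirrors the rules as written.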
Step 7: If a new record is inserted, all records that were returned from the blocking query are weighted against the new record using the matching algorithm. If a record is updated and the update mode is pessimistic, the same occurs for the updated record. If the matching probability weight of a record is greater than or equal to the potential duplicate threshold, the record is flagged as a potential duplicate. For more information about thresholds and the update mode, see Threshold Configuration (Repository) in Understanding Sun Master Index Configuration Options (Repository).
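As a rough illustration of the potential duplicate check, the sketch below flags each candidate whose weight is greater than or equal to a duplicate threshold. It assumes the weights have already been computed by the matching algorithm; the class, method, and parameter names are hypothetical.

```java
// Illustrative sketch only. All names are hypothetical.
final class PotentialDuplicateSketch {

    /**
     * Returns one flag per candidate: true if that candidate's weight is greater
     * than or equal to the potential duplicate threshold.
     */
    static boolean[] flagPotentialDuplicates(double[] candidateWeights,
                                             double duplicateThreshold) {
        boolean[] flagged = new boolean[candidateWeights.length];
        for (int i = 0; i < candidateWeights.length; i++) {
            flagged[i] = candidateWeights[i] >= duplicateThreshold;
        }
        return flagged;
    }
}
```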
The following flow charts provide a visual representation of the processes performed in the default configuration. Figure 4 and Figure 5 represent the primary flow of information. Figure 6 expands on update procedures illustrated in Figure 4 and Figure 5.