Bookshelf v7.7: Data Matching Components for the SDQ Matching Server

Data Matching Components for the SDQ Matching Server

The data matching process for the Siebel Data Quality (SDQ) Matching Server consists of two components, Match Key Generation and Search and Match.

Match Key Generation

The matching server must have match keys for the account, contact, and prospect records already in the database. When the matching server performs a matching task, it does not compare the raw data for each record in the database. Instead, it uses the existing match keys in the database to select candidate records for comparison.

These keys are generated by an algorithm that translates each name and word in the name fields into a set of keys, which the matching server can then compare for similarity. The SDQ Matching Server generates multiple keys for each existing customer record, and the number of keys generated depends on the key type that you select. If you select the narrower key type (Limited), the key generation algorithm performs only the most common permutations and generates fewer keys. If you select the wider key type (Standard), the key generation algorithm uses a wider set of permutations to provide the most exhaustive range of keys.
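The relationship between key type and key count can be illustrated with a toy sketch. This is not the SDQ Matching Server algorithm, which is not described here; the functions toy_key and generate_keys, the consonant-based encoding, and the choice of permutations for each key type are all assumptions made for illustration only.

```python
from itertools import permutations
from typing import Set


def toy_key(word: str) -> str:
    """Hypothetical encoding: first letter plus later consonants, capped at 4 characters."""
    letters = "".join(ch for ch in word.upper() if ch.isalpha())
    if not letters:
        return ""
    rest = "".join(ch for ch in letters[1:] if ch not in "AEIOUYHW")
    return (letters[0] + rest)[:4]


def generate_keys(name: str, key_type: str = "Standard") -> Set[str]:
    """Return a set of illustrative match keys for a name.

    Assumption: "Limited" uses only the original and reversed word order,
    while "Standard" uses every ordering of every subset of the words.
    """
    words = [k for k in (toy_key(w) for w in name.split()) if k]
    if not words:
        return set()
    if key_type == "Limited":
        orderings = {tuple(words), tuple(reversed(words))}
    elif key_type == "Standard":
        orderings = set()
        for size in range(1, len(words) + 1):
            orderings.update(permutations(words, size))
    else:
        raise ValueError("unknown key type: " + key_type)
    return {"".join(order) for order in orderings}


limited = generate_keys("John Smith", "Limited")
standard = generate_keys("John Smith", "Standard")
```

In this sketch the Limited keys are always a subset of the Standard keys, and similar spellings such as "John Smith" and "Jon Smyth" produce the same keys, which is what lets a later search find them without comparing raw data.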

For batch processing, keys are generated for all records that meet the object WHERE clause. For real-time data matching, keys are automatically generated for a record whenever the user saves a new record or modifies and commits an existing record to the database.

Search and Match

From the user's perspective, data matching is a single task. However, the matching server actually executes two subtasks to complete the match: search and match.

After keys are generated for the existing data, the matching server can look for matches. This step is referred to as search. In this process, the matching server takes the keys for a selected record (the record entered by a real-time user or the active record in the batch job) and looks for all existing keys that are similar to any key from the selected record. Based on the value specified (Narrow, Typical, or Exhaustive) in the Search Type field in the Data Quality Settings view, the match process scans a smaller range of keys to provide the fastest response (Narrow) or scans a wider range of keys to provide the most exhaustive search (Exhaustive). In general, if you are using a wider (more exhaustive) key type, you should also use a wider search type.
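The search step can be pictured as a lookup against an index of stored keys, where the search type controls how loosely keys are compared. The following is a minimal sketch under stated assumptions: the key_index structure, the find_candidates function, and the prefix lengths used to model each search type are hypothetical and do not reflect how the matching server defines a key range.

```python
from typing import Dict, Iterable, Optional, Set

# Assumption: a search type is modeled as how many leading characters of two
# keys must agree. None means the keys must be identical.
PREFIX_LENGTH: Dict[str, Optional[int]] = {
    "Narrow": None,
    "Typical": 4,
    "Exhaustive": 2,
}


def find_candidates(
    selected_keys: Iterable[str],
    key_index: Dict[str, Set[str]],
    search_type: str = "Typical",
) -> Set[str]:
    """Return IDs of records that have at least one key similar to a selected key."""
    length = PREFIX_LENGTH[search_type]
    selected = list(selected_keys)
    candidates: Set[str] = set()
    for stored_key, record_ids in key_index.items():
        for key in selected:
            if length is None:
                similar = stored_key == key
            else:
                similar = stored_key[:length] == key[:length]
            if similar:
                candidates.update(record_ids)
                break  # one similar key is enough to make these records candidates
    return candidates


# Hypothetical index of stored match keys mapped to record IDs.
key_index = {
    "JNSMT": {"rec-1"},
    "JNSMTH": {"rec-2"},
    "JNSN": {"rec-3"},
    "MRBRWN": {"rec-4"},
}
narrow = find_candidates({"JNSMT"}, key_index, "Narrow")
exhaustive = find_candidates({"JNSMT"}, key_index, "Exhaustive")
```

Here the Narrow search returns only rec-1, while the Exhaustive search also returns rec-2 and rec-3, at the cost of more candidates to score afterward.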

When the matching server finds a set of candidate records whose keys fall into the selected search range, it computes and assigns a match score to each candidate record to indicate its degree of similarity to the selected record. This match score is based on a combined weighting for all the input fields (personal name, company name, address, and identifiers).
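A combined weighting of this kind can be sketched as a weighted average of per-field similarities. The weights, the field names, the 0 to 100 scale, and the use of difflib.SequenceMatcher as the similarity measure are all assumptions for illustration; the matching server's actual scoring method is not documented here.

```python
from difflib import SequenceMatcher
from typing import Dict

# Hypothetical weights for the four input field groups named in the text.
WEIGHTS: Dict[str, float] = {
    "personal_name": 0.4,
    "company_name": 0.3,
    "address": 0.2,
    "identifier": 0.1,
}


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(
    selected: Dict[str, str],
    candidate: Dict[str, str],
    weights: Dict[str, float] = WEIGHTS,
) -> int:
    """Weighted average of field similarities, scaled to 0..100.

    Note: a field missing from both records compares as two empty strings,
    which SequenceMatcher treats as identical.
    """
    total = sum(weights.values())
    weighted = sum(
        weight * field_similarity(selected.get(field, ""), candidate.get(field, ""))
        for field, weight in weights.items()
    )
    return round(100 * weighted / total)


selected = {
    "personal_name": "John Smith",
    "company_name": "Acme Corp",
    "address": "12 Main St",
    "identifier": "123",
}
near_duplicate = dict(selected, personal_name="Jon Smyth")
unrelated = {
    "personal_name": "Maria Garcia",
    "company_name": "Zeta Foods",
    "address": "9 Elm Rd",
    "identifier": "555",
}
```

With these inputs, match_score(selected, near_duplicate) is high but below 100, and match_score(selected, unrelated) is considerably lower.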

When the match results are returned, the value specified in the Match Threshold field in the Data Quality Settings view determines whether or not the application considers a returned record a match. Match results exceeding the threshold are logged to the S_DEDUP_RESULT match results table and displayed in the user interface (as a real-time pop-up window or as a record in the Administration - Data Quality views). Match results below the threshold are not stored.
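The effect of the match threshold can be shown with a short filter. This is only a sketch: filter_matches and the sample scores are hypothetical, the returned list merely stands in for the results that would be logged and displayed, and treating a score equal to the threshold as a non-match is an assumption based on the word "exceeding" above.

```python
from typing import List, Tuple


def filter_matches(
    scored: List[Tuple[str, int]], match_threshold: int
) -> List[Tuple[str, int]]:
    """Keep (record_id, score) pairs whose score exceeds the threshold, best first."""
    kept = [(record_id, score) for record_id, score in scored if score > match_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept


# Hypothetical scores for three candidate records.
scored_candidates = [("rec-1", 95), ("rec-2", 70), ("rec-3", 82)]
matches = filter_matches(scored_candidates, match_threshold=80)
```

With a threshold of 80, rec-1 and rec-3 are kept and rec-2 is discarded. Raising the threshold reduces false matches but can hide genuine duplicates that score lower.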

The results of match key generation and of search and match vary depending on the values you set in the Data Quality Settings view.

Siebel Data Quality Administration Guide