
About Data Matching (Deduplication) Process for SDQ Universal Connector


The data matching (deduplication) functionality of the Siebel Data Quality (SDQ) Universal Connector relies on validated third-party vendor software to provide the matching rules and algorithms and to maintain any match keys.

The methodologies and matching capabilities of external applications vary by vendor. Matching rules and weightings are typically configurable within the external application. After running batch deduplication, the SDQ Universal Connector reports the possible matches in the Duplicate views in the Administration - Data Quality screen. A data administrator can then manually merge the records. For information about merging duplicate records, see Process of Searching for and Merging Duplicate Records.
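To illustrate what configurable matching rules and weightings can look like, the following is a minimal sketch in Python. It is not the logic of any particular vendor: the field names, the weights, the threshold, and the functions `match_score` and `is_possible_match` are all assumptions made for this example, and real products typically use fuzzy rather than exact comparisons.

```python
# Hypothetical field weights (points out of 100). In practice these are
# configured inside the third-party application, not in Siebel.
WEIGHTS = {"last_name": 50, "postal_code": 30, "phone": 20}

# Hypothetical cutoff: pairs scoring at or above this are reported as possible matches.
THRESHOLD = 70


def match_score(a, b, weights=WEIGHTS):
    """Add up the weights of the fields on which two records agree exactly."""
    return sum(
        weight
        for field, weight in weights.items()
        if a.get(field) is not None and a.get(field) == b.get(field)
    )


def is_possible_match(a, b, threshold=THRESHOLD):
    """Flag a pair for review when its score reaches the threshold."""
    return match_score(a, b) >= threshold
```

Raising a field's weight or lowering the threshold makes the rule report more pairs as possible matches, which is the kind of tuning an administrator would do in the external application.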

During the batch deduplication process, all records in the database are passed to the third-party software. The software uses an optimized algorithm to separate records into groups, which reduces the number of record comparisons. One key difference between the SDQ Universal Connector and the SDQ Matching Server is that the SDQ Universal Connector combines key generation and deduplication into one process. While batch deduplication runs using the SDQ Universal Connector, the third-party vendor software saves the key values for records in files. During real-time deduplication, the third-party software uses the key values stored in the files to find potential duplicates.
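The grouping and key lookup described above can be sketched as follows. This is a simplified illustration in Python, not the vendor's algorithm: the key definition in `match_key`, the record fields, and the in-memory dictionary that stands in for the vendor's key files are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def match_key(record):
    """Hypothetical match key: first three letters of the last name plus postal code."""
    return (record["last_name"][:3].upper(), record["postal_code"])


def batch_deduplicate(records):
    """Generate keys and collect candidate pairs in one process."""
    key_index = defaultdict(list)  # stands in for the key files kept by the vendor software
    for record in records:
        key_index[match_key(record)].append(record["id"])

    candidate_pairs = []
    for ids in key_index.values():
        # Only records that share a key are compared with each other,
        # which avoids comparing every record against every other record.
        candidate_pairs.extend(combinations(ids, 2))
    return key_index, candidate_pairs


def realtime_candidates(new_record, key_index):
    """Find potential duplicates of a new record using the stored key values."""
    return list(key_index.get(match_key(new_record), []))
```

In this sketch a real-time lookup against an empty index returns no candidates, because no key values have been stored yet.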

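TIP: Run batch deduplication against a business component before you run real-time deduplication against it, because real-time deduplication uses the key values that the third-party software stores during the batch run.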
