Depending on your business requirements, you might want to use batch jobs to perform data matching on some or all of the records in the supported business components. If you must run a data matching batch job on all the records in a business component, you can often complete the work more quickly by splitting it into a number of smaller batch jobs of no more than 50,000 to 75,000 records each. When data matching has been performed on all of the records in the business component, you can run future data matching batch jobs on just the new or changed records.
If you want to perform data matching for several mutually exclusive subsets of the records in a business component, such as all the records where the value of a field starts with a given letter, use a separate job for each subset, and specify the subset with a WHERE clause as follows:
objwhereclause="[field_name] LIKE 'A*'"
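As a minimal sketch of how the per-letter subsets might be enumerated, the following Python snippet builds one objwhereclause setting for each letter. The helper name letter_where_clauses and the field name Name are hypothetical and used only for illustration. The snippet only constructs the parameter strings; it does not start any jobs.

```python
import string


def letter_where_clauses(field_name, letters=string.ascii_uppercase):
    """Return one objwhereclause setting per leading letter.

    field_name is a placeholder for the business component field
    that you want to split on (hypothetical, not from the guide).
    """
    return [
        f"objwhereclause=\"[{field_name}] LIKE '{letter}*'\""
        for letter in letters
    ]


# "Name" is an assumed field name used only for this example.
clauses = letter_where_clauses("Name")
for clause in clauses:
    print(clause)
```

Records whose field value starts with a digit or another non-letter character are not covered by these 26 clauses, so a real split would need additional subsets for them.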
You must run batch mode key generation on all existing records before you run real-time data matching, because the Universal Connector requires generated keys in the key tables before real-time data matching can run. Key generation is performed within the deduplication task, which is why you must first run deduplication on all existing records. For more information about batch data cleansing and matching, see Batch Data Matching and Data Cleansing.
In a full data matching job, the records for which you want to locate duplicates and the candidate records that can include those duplicates are defined by the same search specification. A full data matching job is specified with the value Yes in the DQSetting parameter (see Table 19).
If you want to perform data matching for some number of nonexclusive subsets of the records in a business component, such as all the records that have been created or updated since you last ran data matching, use a WHERE clause that includes an appropriate timestamp, and also adjust the DqSetting clause of the command as shown in Table 19.
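As a rough illustration of a timestamp-based WHERE clause, the following Python sketch formats the time of the last data matching run into an objwhereclause setting. The helper name incremental_where_clause, the field name Updated, and the timestamp format are all assumptions made for this example; check the field names and date format that your deployment expects. The DqSetting adjustment described in Table 19 is not shown.

```python
from datetime import datetime


def incremental_where_clause(last_run, field_name="Updated",
                             fmt="%m/%d/%Y %H:%M:%S"):
    """Build a WHERE clause that selects records changed after last_run.

    field_name and fmt are assumed defaults for illustration only.
    """
    stamp = last_run.strftime(fmt)
    return f"objwhereclause=\"[{field_name}] > '{stamp}'\""


# Example: the previous data matching job ran on 15 January 2011 at 20:00.
clause = incremental_where_clause(datetime(2011, 1, 15, 20, 0, 0))
print(clause)
```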
This kind of job is considered an incremental data matching job, because data matching was done earlier and does not need to be redone at this time. In an incremental data matching batch job, the records for which you want to locate duplicates are defined by the search specification, but the candidate records that can include those duplicates can be drawn from the whole applicable database table. Incremental data matching batch jobs are useful if you run them regularly, such as once a week. A typical example of a command for an incremental data matching job is as follows:
|Siebel Data Quality Administration Guide||Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Legal Notices.|