Siebel Data Quality Administration Guide > Data Quality Performance Considerations >

Optimizing Data Cleansing Performance


The following recommendations for data cleansing should help you achieve good performance when working with large volumes of data:

  • You can include only new or recently modified records in the batch data cleansing process.

    Cleansing a record a second time can sometimes change it again. However, cleansing every record in the Siebel database on each run can cause performance issues, so it is recommended that you include only new or recently modified records in the batch data cleansing process. You can identify these records with the Object WHERE clause when you submit your server component job, as shown in Table 20.

    Table 20.  Recommended Data Cleansing Object WHERE Clause Solutions

    To Cleanse                 Use This in Your Object WHERE Clause
    -------------------------  ----------------------------------------------------------
    Updated records            [Last Clnse Date] < [Updated]
    New records                [Last Clnse Date] IS NULL
    Updated and new records    [Last Clnse Date] < [Updated] OR [Last Clnse Date] IS NULL

  • You can copy address files to your local machine.

    Address data cleansing (for business address and prospect data) needs to access the address data files frequently, so you should copy these files to your local machine.

  • You can set the ReservedOption parameter.

    To speed up the data cleansing task for large databases, set the ReservedOption component parameter to 0 (ReservedOption=0), and then use an Object WHERE clause to cleanse a smaller number of records at a time. For more information, see About Running Data Cleansing in Batch Mode Using SDQ Universal Connector.

  • You can split the tasks into smaller tasks and run them concurrently.
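
The recommendations above (restrict each run to new or updated records, cleanse fewer records at a time, and split the work into smaller tasks) can be combined by generating one Object WHERE clause per task. The following Python sketch is illustrative only and is not part of Siebel Data Quality. The three base clauses come from Table 20. The helper name partitioned_clauses, the partitioning field [Name], and the LIKE 'a*' prefix pattern are assumptions chosen for the example, so substitute a field and pattern that suit your business component.

```python
from string import ascii_lowercase

# Field names exactly as they appear in Table 20.
LAST_CLEANSED = "[Last Clnse Date]"
UPDATED = "[Updated]"

# The three Object WHERE clauses from Table 20.
# The dictionary keys are hypothetical labels used only in this sketch.
CLAUSES = {
    "updated": f"{LAST_CLEANSED} < {UPDATED}",
    "new": f"{LAST_CLEANSED} IS NULL",
    "updated_and_new": f"{LAST_CLEANSED} < {UPDATED} OR {LAST_CLEANSED} IS NULL",
}


def partitioned_clauses(scope, field="[Name]", prefixes=ascii_lowercase):
    """Return one Object WHERE clause per prefix.

    Each clause keeps the Table 20 filter for `scope` and adds a prefix
    match on `field`, so each clause can be submitted as a separate,
    smaller task. The field and the LIKE 'x*' pattern are assumptions.
    """
    base = CLAUSES[scope]
    # Parenthesize the base filter so its OR does not absorb the added AND.
    return [f"({base}) AND {field} LIKE '{prefix}*'" for prefix in prefixes]
```

Each returned string would be supplied as the Object WHERE clause of its own server component job. Prefix partitioning by letter does not cover values that begin with a digit or a symbol, so a real split needs a catch-all task or a different partitioning field.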