Create Duplicate Identification Batches and Define Subset Rules

You can create a duplicate identification batch and define subset rules that retrieve a subset of records, and then identify duplicates within the batch or against the database.

Subset rules, also known as batch selection criteria rules, specify the criteria for retrieving a subset of records in the duplicate identification batch. The data quality engine identifies potential duplicates from this subset of records. The rules are combined using one of these options:

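  • Match all keywords: Select this option to perform an AND operation, so that only records that meet every rule are retrieved.

  • Match any keyword: Select this option to perform an OR operation, so that records that meet at least one rule are retrieved.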
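
The difference between the two options can be illustrated with a minimal Python sketch. This is not the data quality engine's implementation: the `select_subset` function, the record layout, and the case-insensitive comparison are all assumptions made for illustration, and only the two operators used later in this topic are modeled.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical operator implementations; the keys mirror the UI labels.
# Case-insensitive comparison is an assumption, not documented behavior.
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "Starts with": lambda field, value: field.lower().startswith(value.lower()),
    "Contains": lambda field, value: value.lower() in field.lower(),
}

# A subset rule as (attribute, operator, value).
Criterion = Tuple[str, str, str]


def select_subset(
    records: List[Dict[str, str]],
    criteria: List[Criterion],
    apply_rules: str,
) -> List[Dict[str, str]]:
    """Return the records that satisfy the criteria under the given option."""
    if apply_rules == "Match all keywords":
        combine = all  # AND: every rule must hold
    elif apply_rules == "Match any keyword":
        combine = any  # OR: at least one rule must hold
    else:
        raise ValueError(f"Unknown Apply Rules option: {apply_rules!r}")
    return [
        record
        for record in records
        if combine(
            OPERATORS[operator](record.get(attribute, ""), value)
            for attribute, operator, value in criteria
        )
    ]


# Made-up sample records for illustration only.
records = [
    {"Name": "John Smith", "Address Line 1": "100 Redwood Shores Pkwy"},
    {"Name": "John Doe", "Address Line 1": "5 Main St"},
    {"Name": "Mary Jones", "Address Line 1": "7 Redwood Ave"},
    {"Name": "Ann Lee", "Address Line 1": "9 Oak St"},
]
criteria = [
    ("Name", "Starts with", "John"),
    ("Address Line 1", "Contains", "Redwood"),
]

all_names = [r["Name"] for r in select_subset(records, criteria, "Match all keywords")]
any_names = [r["Name"] for r in select_subset(records, criteria, "Match any keyword")]
print(all_names)  # ['John Smith']
print(any_names)  # ['John Smith', 'John Doe', 'Mary Jones']
```

With these sample records, Match all keywords keeps only the record that satisfies both rules, while Match any keyword also keeps records that satisfy just one of them.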

Now that you have an overview of the task, let's first create a duplicate identification batch to identify duplicate persons in the registry, and then create rules that select a subset of records by person name (starts with John) and by address (contains Redwood). Note that you can use predefined or custom attributes.

  1. Navigate to the Duplicate Identification work area as follows: Navigator > Customer Data Management > Duplicate Identification.

  2. Click the Create menu option or button. The Create Duplicate Identification Batch page appears.

  3. Enter a batch name and description.

    Note: Alternatively, you can copy an existing duplicate identification batch to quickly create a new batch. You can modify the details of the new batch before submitting it.

  4. Specify the Batch Match Mode, such as Against the Registry or Within the Batch.

    In the Within the Batch match mode, duplicate identification is limited to the records in the batch that meet the subset rule conditions. In the Against the Registry match mode, the process aggregates the records in the batch that meet the subset rule conditions, and matches these records against one another as well as against other records in the database.

  5. Specify the Party Type as Person.

  6. Specify the Automatic Processing option as Create Merge Request to merge the duplicate persons.

  7. Provide the Batch Options. The available batch options depend on the selected Automatic Processing option. The following options are available when Create Merge Request is selected as the Automatic Processing option:

    • Select an appropriate value for Cluster Key Level, such as Typical.

    • Enter a value between 1 and 101, such as 70, for Match Threshold.

    • Enter a value between 1 and 101, such as 75, for Automerge Threshold.

      Note: The Automerge Threshold and Autolink Threshold values that you provide in the Batch Options area override the values set on the Manage Customer Hub Profile Options page.

    • Select Send Notifications to notify all interested parties, such as the initiator or submitter, of the status of the duplicate identification batch. For more information about these statuses, see How You Merge Duplicate Records in the Related Topics section. The default value for the Send Notifications field is set using the Merge Request Notifications option in the Manage Customer Data Management Options Setup and Maintenance task. You can't set the value of the Send Notifications field if the Merge Request Notifications option is set to disable all notifications. For more information about Merge Request Notifications, see the Duplicate Resolution Simplified Profile Options topic in the Related Topics section.

  8. Click the Add menu option or button under Duplicate Identification Batch: Selection Criteria.

  9. Set the Apply Rules option to Match any keyword.

  10. Enter the following sample information in the Duplicate Identification Batch: Selection Criteria table:

    Object     Attribute        Operator      Value
    Person     Name             Starts with   John
    Address    Address Line 1   Contains      Redwood

  11. Click Save and Close or Schedule, as required.
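
The two Batch Match Mode values from step 4 differ in which records are compared with which. The following Python sketch shows only that difference, under stated assumptions: the `candidate_pairs` function and the integer record IDs are hypothetical, and the sketch says nothing about how the data quality engine actually scores or clusters records.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def candidate_pairs(
    subset_ids: Iterable[int],
    registry_ids: Iterable[int],
    match_mode: str,
) -> Set[Tuple[int, int]]:
    """Return the record pairs that would be compared in a given match mode.

    subset_ids: records that meet the subset rule conditions (hypothetical IDs).
    registry_ids: all records in the database, including the subset.
    """
    subset = sorted(set(subset_ids))
    # In both modes, the subset records are compared with one another.
    pairs = set(combinations(subset, 2))
    if match_mode == "Within the Batch":
        return pairs
    if match_mode == "Against the Registry":
        # Additionally compare each subset record with every other registry record.
        others = sorted(set(registry_ids) - set(subset))
        pairs.update((s, o) for s in subset for o in others)
        return pairs
    raise ValueError(f"Unknown Batch Match Mode: {match_mode!r}")


within = candidate_pairs([1, 2], [1, 2, 3, 4], "Within the Batch")
against = candidate_pairs([1, 2], [1, 2, 3, 4], "Against the Registry")
print(sorted(within))   # [(1, 2)]
print(sorted(against))  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]
```

In this sketch, records 3 and 4 are never compared with each other in either mode, because neither of them meets the subset rule conditions.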