Batch Duplicate Identification

Batch Duplicate Identification Overview

Although Oracle E-Business Suite applications share the same TCA Registry, each application uses TCA differently depending on the context and need for particular party information. Each application must quickly, accurately, and consistently retrieve information from the TCA Registry for transaction processing. Duplicate data in the Registry can reduce the efficiency and accuracy of party processing and reports.

To resolve duplicates, you can use batch duplicate identification to find duplicate parties that already exist in your TCA Registry and select the parties that you want to merge. The batch duplicate identification process compares all records for each party. For example, each contact point for a party is compared against all contact points.

Batch duplicate identification is powered by Data Quality Management (DQM) functionality. Your administrator can set up batch duplicate identification. See: Setting Up Batch Duplicate Identification, Oracle Trading Community Architecture Administration Guide and Data Quality Management Overview, Oracle Trading Community Architecture Administration Guide.

Process Overview

The batch duplicate identification process involves:

This diagram illustrates the process in more detail:

[Figure: batch duplicate identification process, described in the steps that follow]

  1. In the Submit Duplicate Identification Batch window, define a duplicate identification batch, which can consist of a subset of parties. Specify whether to compare the parties against one another or against all parties.

  2. From the same window, run the DQM Duplicate Identification program, which searches for duplicates of the batch that you defined. The batch provides the input records that the staged schema matches against.

  3. The DQM Duplicate Identification program applies the match rule in the Submit Duplicate Identification Batch window to identify duplicates.

  4. The Duplicate Identification: Batch Review window displays the duplicate candidates for your batch.

  5. Review the results for your duplicate identification batch and use the Match Details window to see more information about the matches.

  6. In the Duplicate Identification: Batch Review window, specify the duplicate parties that you want to merge from and to and indicate which parties you do not want to be identified as duplicates in the future.

  7. In the same window, create a merge batch with the parties that you want to merge.

    Use the Review Party Merge Batches window to submit merge batches to Party Merge for the actual merge process. See: Party Merge Overview.

Related Topics

Using Oracle Trading Community Architecture

Defining Duplicate Identification Batches

Use the Submit Duplicate Identification Batch window to define and submit the batch, that is, the subset of records that you want to find duplicates for. When you submit the batch, the DQM Duplicate Identification program automatically applies the match rule from this window and scores potential duplicates.

If you do not define a subset, the DQM Duplicate Identification program compares all records in the staged schema against one another. This process can take a long time, depending on the detail of your match rule and the size of your staged schema.
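To give a rough sense of why a subset shortens the run, the following sketch counts party pairs under a deliberately naive model in which every record is compared directly with every other record. This model is an assumption made for illustration only; DQM matches against the staged schema and its actual workload depends on the match rule. The function name and the sample sizes are hypothetical.

```python
def comparison_counts(total_parties: int, subset_size: int) -> dict[str, int]:
    """Count unordered party pairs under a naive all-pairs comparison model."""
    n, k = total_parties, subset_size
    if not 0 <= k <= n:
        raise ValueError("subset_size must be between 0 and total_parties")
    pairs_inside_subset = k * (k - 1) // 2
    return {
        # No subset defined: every staged record against every other record.
        "no_subset": n * (n - 1) // 2,
        # Subset records compared only against one another.
        "match_within_subset": pairs_inside_subset,
        # Subset records compared against the entire staged schema.
        "subset_vs_all": pairs_inside_subset + k * (n - k),
    }


# Hypothetical sizes: 100,000 staged parties and a 500-party subset.
counts = comparison_counts(total_parties=100_000, subset_size=500)
for mode, pairs in counts.items():
    print(f"{mode}: {pairs:,} pairs")
```

Under these assumed sizes, comparing within the subset examines several orders of magnitude fewer pairs than comparing the whole staged schema.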

Define a subset of records to compare against the rest of the staged schema or against one another, for two reasons:

You can select up to ten conditions to define the subset, using any of the attributes from the HZ_PARTIES table. You can also manually enter SQL statements to define the subset.

After the DQM Duplicate Identification program finishes, the results are displayed in the Duplicate Identification: Batch Review window.

To define and submit a duplicate identification batch

  1. Navigate to the Submit Duplicate Identification Batch window.

  2. Enter a name for the duplicate identification batch in the Batch Name field.

  3. In the Match Rule field, select from the list of values the match rule to use for identifying and scoring duplicates. The match rule defaults from the DQM Match Rule for Batch Duplicate Identification profile option, if defined.

    Even if the selected match rule is allowed for Automerge, the Automerge feature is not integrated with batch duplicate identification.

    Note: Use a match rule with the Bulk Duplicate Identification type if you want to identify only records that are almost exact duplicates. Match rules with Simple Duplicate Identification type provide fuzzier matches.

  4. In the Number of Workers field, enter the number of parallel workers that you want to use to improve performance.

    Workers are processes that run at the same time to complete a task that would otherwise take longer with a single process. The default number of workers is 1, and you cannot use more than ten workers.

  5. Uncheck the Match within Subset check box if you want to compare the subset against the entire staged schema for duplicates.

    By default, the records in the subset are only compared against one another.

  6. Check the Find Merged Parties check box if you want to include parties that were previously merged in the search.

  7. Navigate to the Define Subset region.

  8. In the Attribute fields, select from the list of values the attributes that you want to define the subset with.

  9. For each attribute, select a condition:

    • Equal To

    • Greater Than

    • Less Than

    • Starts With

  10. In the Value fields, enter a value for each attribute and condition.

    For example, if you enter 1001 for the Party Number attribute with a less than condition, the subset includes only parties with a number of 1000 or lower.

  11. Press the Submit Batch button.

    The DQM Duplicate Identification program runs to identify duplicates for the subset of records that you defined, using the match rule that you specified.
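The subset definition in steps 8 to 10 can be pictured as a filter over party records. The sketch below is illustrative only and is not how the window or the program is implemented. It assumes that multiple conditions are combined with AND and that party numbers compare as integers, neither of which is stated in this chapter. The names `SubsetCondition`, `define_subset`, and the sample records are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

# The four conditions offered in the Define Subset region.
CONDITIONS: dict[str, Callable[[Any, Any], bool]] = {
    "Equal To": lambda actual, value: actual == value,
    "Greater Than": lambda actual, value: actual > value,
    "Less Than": lambda actual, value: actual < value,
    "Starts With": lambda actual, value: str(actual).startswith(str(value)),
}

MAX_CONDITIONS = 10  # The window allows up to ten conditions.


@dataclass(frozen=True)
class SubsetCondition:
    attribute: str   # an attribute of the party record
    condition: str   # one of the keys in CONDITIONS
    value: Any


def define_subset(parties: list[dict], conditions: list[SubsetCondition]) -> list[dict]:
    """Return the parties that satisfy every condition (AND is an assumption)."""
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError(f"at most {MAX_CONDITIONS} conditions are allowed")
    return [
        party
        for party in parties
        if all(CONDITIONS[c.condition](party[c.attribute], c.value) for c in conditions)
    ]


# Hypothetical party records.
parties = [
    {"party_number": 999, "party_name": "Acme Corp"},
    {"party_number": 1000, "party_name": "Acme Corporation"},
    {"party_number": 1001, "party_name": "Zenith Ltd"},
]

# Party Number, Less Than, 1001 selects parties numbered 1000 or lower.
subset = define_subset(parties, [SubsetCondition("party_number", "Less Than", 1001)])
print([p["party_number"] for p in subset])
```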

Related Topics

Batch Duplicate Identification Overview

Submitting the Merge Process

Reviewing Duplicates and Creating Merge Batches

Use the Duplicate Identification: Batch Review window to review the potential duplicates that the DQM Duplicate Identification program found and create a merge batch that consists of parties that you want to merge. For the subset that you defined for the duplicate identification batch, the Duplicate Identification: Batch Review window displays parties with matches as merge-to parties and their potential duplicates as merge-from parties. A merge-from party would be merged into the merge-to party during the party merge process.

You can use the Match Details window to see the reasons why the DQM Duplicate Identification program selected any pair of merge-to and merge-from parties as duplicate candidates. This information helps you determine whether the parties are in fact duplicates or not.

After you evaluate a pair of duplicate candidates, you can:

When you finish evaluating a batch, you can create a merge batch with all the duplicate pairs of parties that you select for merge.

Note: After you become familiar with DQM, you might choose to trust your match rules and submit a merge batch with the results of the DQM Duplicate Identification program without evaluating each duplicate candidate.

Prerequisites

To review duplicates and define merge batches

  1. Navigate to the Duplicate Identification: Batch Review window.

  2. Query the duplicate identification batch that you want to review.

    For the batch, the window displays the information shown in this table:

    Field            Value
    Batch Name       The name of the duplicate identification batch.
    Match Rule       The match rule that the DQM Duplicate Identification program used to match and score duplicate candidates.
    Creation Date    The date that the duplicate identification batch was created.
    Match Threshold  The match threshold defined in the match rule. Records with a score that exceeds the match threshold are selected as matches, or potential duplicates.

    In the Merge-To Parties region, the Duplicate Identification: Batch Review window displays the information shown in this table for each merge-to party:

    Field                Value
    Name                 The party name.
    Number               The party number.
    Identifying Address  The party identifying address.
    Duplicates           The number of potential duplicates, or merge-from parties.

    In the Merge-From Parties region, the window displays all the duplicate candidates for the selected merge-to party. For each potential duplicate, you can see the information shown in this table:

    Field                Value
    Name                 The party name.
    Number               The party number.
    Identifying Address  The party identifying address.
    Score                The match score, which signifies how closely the merge-from party matches the merge-to party. The score is the sum of scores that the match rule assigns to attribute matches between the two parties. The match score must exceed the match threshold of the match rule for the party to be selected as a merge-from party.
    Merge                Yes or no to indicate whether to merge the party into the merge-to party or not. The default of this value depends on the automatic merge threshold of the match rule. A party with a match score that exceeds the threshold is defaulted to be merged.

  3. To view information about why a pair of merge-to and merge-from parties was designated as a potential duplicate match, select the merge-to and merge-from party in the window.

  4. Press the View Match Details button. The Match Details window appears and again shows the name, number, and address of the merge-from and merge-to parties.

    The window also displays the information shown in this table:

    Field                   Value
    Matched Attribute       The attribute that matches in both parties according to the match rule definitions.
    Merge-From Party Value  The value of the matched attribute from the merge-from party.
    Merge-To Party Value    The value of the matched attribute from the merge-to party.
    Score                   The score that the match rule assigns to the merge-from party for the particular attribute match to the merge-to party. The sum of these scores makes up the total match score of the merge-from party.

  5. To view attribute match details between the same merge-to party and another merge-from party, select another merge-from party in the list of values for the Merge-From field.

  6. Press the Close button when you finish viewing match details for the selected merge-to party.

  7. Repeat steps 3 to 6 for each merge-to party that you want to view match details for.

  8. To switch the selected merge-from party to a merge-to party and vice versa, select a merge-to party in the Duplicate Identification: Batch Review window.

  9. Press the Change Merge-To Party button and select the merge-from party that is to replace the selected merge-to party.

    Alternatively, you can use the list of values from the Merge-To Parties Name field to change the merge-to party.

  10. For each merge-from party, specify in the Merge option whether to merge into the merge-to party or not. You can accept the defaults or, based on your evaluation, select not to merge parties that were defaulted for merge.

    Note: You can only override Merge options that are set to Yes.

  11. If you want to specify that a merge-from party is not a duplicate match for the merge-to party, select the merge-from party and check the Not Duplicate of Merge-To Party check box.

    Note: You can only select merge-from parties with the Merge option set to No.

  12. Optionally, enter or remove the end date of the period during which the merge-from party is not to be selected as a duplicate of that specific merge-to party.

  13. When you finish evaluating the batch, press the Create Merge Set button to create a merge batch that consists of merge-from parties with the Merge option set to Yes and their corresponding merge-to parties.

    The Review Party Merge Batches window automatically appears for the next procedure of submitting the sequence of parties in the batch to Party Merge. For more information, see: Submitting Merge Batches.

    If you decide later to change the Merge option to No for some parties in this batch, you can still do so in the Duplicate Identification: Batch Review window as long as you have not yet submitted the merge batch to Party Merge.
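The scoring behavior described in the Score and Merge fields of this procedure can be sketched as follows. This is a minimal illustration rather than the DQM implementation. It assumes that "exceeds" means strictly greater than, and the attribute names, per-attribute scores, and threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MatchRule:
    attribute_scores: dict[str, int]  # score assigned to each attribute match
    match_threshold: int
    automatic_merge_threshold: int


def evaluate_candidate(rule: MatchRule, matched_attributes: list[str]) -> tuple[int, bool, bool]:
    """Return (total score, is duplicate candidate, Merge option defaults to Yes)."""
    # The match score is the sum of the scores for each matched attribute.
    total = sum(rule.attribute_scores[name] for name in matched_attributes)
    # A party is listed as a merge-from party only if its score exceeds the match threshold.
    is_candidate = total > rule.match_threshold
    # The Merge option defaults to Yes only above the automatic merge threshold.
    merge_default = is_candidate and total > rule.automatic_merge_threshold
    return total, is_candidate, merge_default


# Hypothetical rule: scores and thresholds are made up for illustration.
rule = MatchRule(
    attribute_scores={"Party Name": 50, "Address": 30, "Phone Number": 20},
    match_threshold=40,
    automatic_merge_threshold=80,
)

for matched in (["Phone Number"], ["Party Name", "Address"], ["Party Name", "Address", "Phone Number"]):
    print(matched, evaluate_candidate(rule, matched))
```

In this sketch a candidate that scores exactly at a threshold does not pass it, which follows the wording "exceeds" used in the tables.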

Related Topics

Batch Duplicate Identification Overview

Submitting Merge Batches

Use the Review Party Merge Batches window to submit merge batches to Party Merge. Party Merge is the Oracle Trading Community Architecture feature that performs the actual merging of parties. The Review Party Merge Batches window displays all the merge batches that you created from a specific duplicate identification batch.

For each merge batch, the window also displays the pairs of parties to be merged and the sequence in which they are submitted. DQM automatically determines the sequence when you create the merge batch to ensure that parties are successfully submitted and merged.

For example, if you are merging party A to party B, and party B to party C, you must merge party A to B before merging B to C. If you merge party B to C first, party B no longer exists for party A to merge into.
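The ordering constraint in this example is a dependency ordering: every merge into a party must run before that party is itself merged away. The sketch below shows one way to compute such a sequence with a topological sort from the Python standard library. It is an assumption-laden illustration, not a description of how DQM determines the sequence, and the function name `sequence_merges` is hypothetical.

```python
from graphlib import TopologicalSorter


def sequence_merges(pairs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Order (merge_from, merge_to) pairs so each merge-to party still exists when needed."""
    # Index the pairs by their merge-to party.
    by_merge_to: dict[str, list[tuple[str, str]]] = {}
    for pair in pairs:
        by_merge_to.setdefault(pair[1], []).append(pair)

    sorter = TopologicalSorter()
    for pair in pairs:
        merge_from = pair[0]
        # Every merge INTO merge_from must precede the merge that removes merge_from.
        sorter.add(pair, *by_merge_to.get(merge_from, []))

    # static_order() raises graphlib.CycleError if the merges are circular.
    return list(sorter.static_order())


# Party A merges into B, and B merges into C, supplied in the wrong order.
ordered = sequence_merges([("B", "C"), ("A", "B")])
print(ordered)
```

A circular request, such as A into B and B into A, has no valid sequence, so the sketch lets the sorter raise an error instead of guessing an order.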

Prerequisites

To submit merge batches

  1. Navigate to the Review Party Merge Batches window.

    The window automatically appears when you create a merge batch. Otherwise, navigate to the Duplicate Identification: Batch Review window, query the duplicate identification batch that you want to submit merge batches for, and press the View Merge Set button.

  2. You can change the merge batch names in the Party Merge Batch field if you want.

    The default names consist of the original duplicate identification batch name and a sequential number. For example, the first merge batch that you create with duplicate identification batch ABC is named 1-ABC. The second merge batch is called 2-ABC, and so on.

    The Status field displays the status of the merge batch:

    • Ready to Submit: This batch is the one that is submitted when you press the Submit Party Merge Batch button.

    • In Queue: This batch is in the queue for submission.

    • Pending: This batch has been submitted to Party Merge.

    • Partially Complete: This batch has been partially merged with success in Party Merge.

    • Complete: This batch has successfully merged in Party Merge.

    For each merge batch, the Merge Batch Parties region displays the names and numbers of all the merge-from and merge-to parties, in the default sequence that the pairs will be submitted.

  3. Press the Submit Party Merge Batch button to submit the batch with the Ready to Submit status.

    Note: You must submit batches in the order that they are displayed.

    When the submission successfully completes, the status for the batch changes to Pending and the next batch gets the Ready to Submit status.

  4. To continue to the Party Merge process, press the Go To Party Merge button. The Merge Parties window automatically appears for the selected merge batch.

    Note: You can only select merge batches with a Pending status for Party Merge.

    When the actual Party Merge process is run on a merge batch, the concurrent request number of the process is displayed in the Request ID field of the Review Party Merge Batches window.
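The default naming and the status progression in this procedure can be summarized in a small sketch. It models only what the steps above state: names of the form 1-ABC, and a submission that moves the Ready to Submit batch to Pending and promotes the next batch. The starting statuses (first batch Ready to Submit, the rest In Queue) are an inference from those steps, and the class and function names are hypothetical.

```python
def default_merge_batch_name(dup_batch_name: str, sequence_number: int) -> str:
    """Build the default name: sequential number, hyphen, duplicate identification batch name."""
    return f"{sequence_number}-{dup_batch_name}"


class MergeBatchQueue:
    """Illustrative model of the merge batches created from one duplicate identification batch."""

    def __init__(self, dup_batch_name: str, batch_count: int) -> None:
        self.names = [default_merge_batch_name(dup_batch_name, i) for i in range(1, batch_count + 1)]
        # Assumption: only the first batch starts as Ready to Submit.
        self.statuses = ["In Queue"] * batch_count
        if batch_count:
            self.statuses[0] = "Ready to Submit"

    def submit(self) -> str:
        """Submit the Ready to Submit batch and promote the next one in the displayed order."""
        if "Ready to Submit" not in self.statuses:
            raise RuntimeError("no batch is ready to submit")
        index = self.statuses.index("Ready to Submit")
        self.statuses[index] = "Pending"
        if index + 1 < len(self.statuses):
            self.statuses[index + 1] = "Ready to Submit"
        return self.names[index]


queue = MergeBatchQueue("ABC", 3)
print(queue.names, queue.statuses)
submitted = queue.submit()
print(submitted, queue.statuses)
```

The later statuses, Partially Complete and Complete, are set by Party Merge itself and are left out of this sketch.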

Related Topics

Party Merge Overview

Batch Duplicate Identification Overview

Submitting the Merge Process