1 Duplicate Check Cartridge Overview

This chapter provides an overview of the Oracle Communications Offline Mediation Controller duplicate check enhancement processor (EP) cartridge, which identifies duplicate records while processing call detail records (CDRs).

Before reading this chapter, you should be familiar with:

  • Offline Mediation Controller cartridge concepts. For more information, see Offline Mediation Controller Cartridge Development Kit Developer's Guide.

About Duplicate Check

Offline Mediation Controller duplicate check is the process of identifying duplicate records while processing CDRs. The Offline Mediation Controller duplicate check EP cartridge creates partitions (in-memory data structures) and duplicate check files to store duplicate check keys. These keys are used to identify duplicate records in the incoming CDRs.

Before you can use the duplicate check EP, do the following:

About Partitions

When processing CDRs, the Offline Mediation Controller duplicate check EP node performs duplicate checks by storing the duplicate check keys in partitions and in duplicate check files. A duplicate check key is a set of fields in the CDR that uniquely identify the record and the record timestamp. The key is stored in the partition and in the duplicate check file for a predefined time period.

The duplicate check EP node creates partitions based on the CDR timestamp, and each partition covers the configured time period. When a CDR is processed, the duplicate check node creates the partition in memory and adds the duplicate check keys to the partition and to the duplicate check file.
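As an illustration of how a CDR timestamp maps to a partition, the following minimal Python sketch derives the time range that a partition covers. It assumes the timestamp layout (YYYYMMDDhhmmss) and the partition label layout that appear in the examples later in this chapter. The function name partition_label is hypothetical and is not part of the cartridge.

```python
from datetime import datetime, timedelta

# Label layout copied from the examples in this chapter (assumption:
# the millisecond part is always .000 because partitions start on a boundary).
LABEL_FORMAT = "%Y-%m-%dT%H-%M-%S.000"


def partition_label(cdr_timestamp: str, partition_size: str) -> str:
    """Return the label of the partition that covers a CDR timestamp.

    Hypothetical helper: only the Hourly and Daily sizes named in this
    chapter are handled.
    """
    ts = datetime.strptime(cdr_timestamp, "%Y%m%d%H%M%S")
    if partition_size == "Hourly":
        start = ts.replace(minute=0, second=0, microsecond=0)
        end = start + timedelta(hours=1)
    elif partition_size == "Daily":
        start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        end = start + timedelta(days=1)
    else:
        raise ValueError(f"unsupported partition size: {partition_size}")
    return f"{start.strftime(LABEL_FORMAT)}_{end.strftime(LABEL_FORMAT)}"


print(partition_label("20140723104450", "Hourly"))
print(partition_label("20140723104450", "Daily"))
```

Every CDR whose timestamp falls in the same hour (or day) resolves to the same label, which is why all such records share one partition.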

When subsequent CDRs are processed, the duplicate check node compares the keys in the CDR to the keys in the partition. If the key already exists in the partition, the record is rejected and is written to a file in the duplicate record storage directory, and the information related to the duplicate records is written to a log file. If the key does not exist, it is added to the partition and the duplicate check file.
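The comparison described above can be sketched in Python as a set lookup per partition. This is a simplified model, not the cartridge implementation: the class name PartitionStore, the choice of the first two fields as the duplicate check key, and the in-memory lists that stand in for the duplicate check file and the duplicate record storage directory are all assumptions made for illustration.

```python
from collections import defaultdict


class PartitionStore:
    """Toy model of key storage and lookup (hypothetical, for illustration)."""

    def __init__(self):
        self.partitions = defaultdict(set)   # partition id -> keys held in memory
        self.check_file = defaultdict(list)  # stand-in for duplicate check files
        self.duplicates = []                 # stand-in for the duplicate record storage directory

    def check(self, record: str, key_fields=(0, 1)) -> bool:
        """Return True if the record is a duplicate, False otherwise."""
        fields = record.strip().split(",")
        # Assumption: the key is the timestamp plus the second field.
        key = tuple(fields[i] for i in key_fields)
        # Assumption: hourly partitions, identified by the YYYYMMDDhh prefix.
        partition_id = fields[0][:10]
        keys = self.partitions[partition_id]
        if key in keys:
            # Key already known: reject the record and keep a copy.
            self.duplicates.append(record)
            return True
        # New key: add it to the partition and to the duplicate check file.
        keys.add(key)
        self.check_file[partition_id].append(key)
        return False


store = PartitionStore()
print(store.check("20140723104450,9945168238,VOICE,101"))
print(store.check("20140723104450,9945168238,VOICE,101"))
```

The first call stores the key and reports a new record. The second call finds the same key in the partition and reports a duplicate.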

Workflow for CDRs

When a duplicate check EP node is configured in the mediation node chain, depending on the configuration in the NPL rule file, the records are processed as follows:

  1. The collection cartridge (CC) node processes the CDR input file.

  2. If the input file is successfully processed by the CC node, the duplicate check EP node processes the CDR data as follows:

    • Memory partitions and duplicate check files are created based on the configured partition size and the CDR timestamp.

    • When a CDR arrives, the appropriate partition is loaded into memory.

    • The duplicate check keys are stored in the appropriate partition and the duplicate check file based on the CDR timestamp.

    • The keys of incoming CDRs that belong to the same time interval are compared with the keys stored in the partition.

    • If a duplicate key is found in the partition, the record is rejected and the duplicate record is written to a file in the duplicate record storage directory.

    • If a duplicate key is not found, the key is added to the list of keys in the partition and the duplicate check file, and the CDR is distributed to the next node in the mediation node chain.

    • If the maximum number of partitions to be stored in memory is reached, the oldest partition is deleted from memory.

    • When the partition reaches its configured retention time, the partition and the duplicate check file are deleted.

    For example:

    • If a duplicate check EP node is configured with the partition size set to Hourly and the retention time set to 24 hours:

      When a CDR arrives with the following data:

      20140723104450,9945168238,VOICE,101
      20140723105450,9945168239,VOICE,102
      20140723104050,9945168240,VOICE,103
      20140723103050,9945168241,VOICE,104
      

      a new partition is created for the hour, labeled 2014-07-23T10-00-00.000_2014-07-23T11-00-00.000, the keys are stored in the partition and the duplicate check file, and the duplicate flag in each record is set to 0, which marks it as a non-duplicate record.

      When another CDR arrives with the following data:

      20140723104450,9945168238,VOICE,101
      20140723105450,9945168239,VOICE,102
      

      the keys in the CDR data are compared with the keys stored in the partition, and the duplicate flag in each record is set to 1, which marks it as a duplicate record.

      When another CDR arrives with the following data:

      20140722084450,9945168238,VOICE,101 
      20140722085450,9945168239,VOICE,102 
      

      the CDR is identified as an old record because its timestamp is more than 24 hours old, and the duplicate flag in the record is set to -1.

      After 24 hours, the partition and the duplicate check file that have reached the retention time are deleted.

    • If a duplicate check EP node is configured with the partition size set to Daily and the retention time set to 2 days:

      When a CDR arrives with the following data:

      20140723104450,9945168238,VOICE,101
      20140723105450,9945168239,VOICE,102
      20140723104050,9945168240,VOICE,103
      20140723103050,9945168241,VOICE,104
      

      a new partition is created for the day, labeled 2014-07-23T00-00-00.000_2014-07-24T00-00-00.000, the keys are stored in the partition and the duplicate check file, and the duplicate flag in each record is set to 0, which marks it as a non-duplicate record.

      When another CDR arrives with the following data:

      20140723104450,9945168238,VOICE,101
      20140723105450,9945168239,VOICE,102
      

      the keys in the CDR data are compared with the keys stored in the partition, and the duplicate flag in each record is set to 1, which marks it as a duplicate record.

      When another CDR arrives with the following data:

      20140720104450,9945168238,VOICE,101 
      20140720105450,9945168239,VOICE,102 
      

      the CDR is identified as an old record because its timestamp is more than 2 days old, and the duplicate flag in the record is set to -1.

      After 2 days, the partition and the duplicate check file that have reached the retention time are deleted.

  3. The next node in the mediation node chain processes the CDR, which is distributed to the target system.
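The hourly example in step 2 can be reproduced with the following Python sketch. It is a model under stated assumptions, not the behavior of the cartridge itself: the class DuplicateCheckSketch, the explicit now argument, the use of the whole record as the key, the rule that a record is old when its timestamp is earlier than now minus the retention time, and the rule that a partition expires once its end time is that old are all hypothetical choices made so that the example is runnable.

```python
from datetime import datetime, timedelta

NOT_DUPLICATE, DUPLICATE, OLD_RECORD = 0, 1, -1  # flag values from the examples


class DuplicateCheckSketch:
    """Toy model of the hourly workflow (hypothetical, for illustration)."""

    def __init__(self, retention: timedelta, max_in_memory: int = 2):
        self.retention = retention
        self.max_in_memory = max_in_memory
        self.memory = {}  # partition start -> keys currently held in memory
        self.files = {}   # partition start -> keys (stand-in for duplicate check files)

    def flag(self, record: str, now: datetime) -> int:
        fields = record.strip().split(",")
        ts = datetime.strptime(fields[0], "%Y%m%d%H%M%S")
        if ts < now - self.retention:
            return OLD_RECORD  # assumption: older than the retention time
        start = ts.replace(minute=0, second=0, microsecond=0)  # hourly partition
        keys = self._load(start)
        key = tuple(fields)  # assumption: the whole record forms the key
        if key in keys:
            return DUPLICATE
        keys.add(key)  # the same set backs memory and the file stand-in
        return NOT_DUPLICATE

    def _load(self, start: datetime) -> set:
        keys = self.files.setdefault(start, set())
        self.memory[start] = keys
        if len(self.memory) > self.max_in_memory:
            # Drop the oldest partition from memory; its file copy remains.
            del self.memory[min(self.memory)]
        return keys

    def purge(self, now: datetime) -> None:
        """Delete partitions and files whose hour ended before the retention cutoff."""
        cutoff = now - self.retention
        for start in [s for s in self.files if s + timedelta(hours=1) <= cutoff]:
            self.files.pop(start)
            self.memory.pop(start, None)


node = DuplicateCheckSketch(retention=timedelta(hours=24))
now = datetime(2014, 7, 23, 11, 0, 0)
for rec in ("20140723104450,9945168238,VOICE,101",
            "20140723104450,9945168238,VOICE,101",
            "20140722084450,9945168238,VOICE,101"):
    print(rec, node.flag(rec, now))
```

With the assumed reference time of 11:00 on 2014-07-23, the three records are flagged 0, 1, and -1 respectively, matching the hourly example. Calling purge with a reference time 24 hours later removes the 10:00 to 11:00 partition and its file stand-in.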