Identification of Candidate Records


The way in which candidate records are identified differs between the Oracle Data Quality Matching Server and the Universal Connector, as described in the following topics.

Identification of Candidate Records with the Oracle Data Quality Matching Server

When you use the Oracle Data Quality Matching Server for data matching, identification of candidate records is not a separate step, because match candidate acquisition takes place within the Oracle Data Quality Matching Server.

Identification of Candidate Records with the Universal Connector

Data quality queries the database for candidate records by using the Dedup Query Expression parameter that is specific to the current business component. It uses the Dedup Query Expression rather than the related Dedup Token Expression for the following reason. If a user does not specify a value for one or more of the fields that compose the Dedup Token Expression, then the token is constructed with an underscore (_) instead of a value in the part of the expression that corresponds to each such field. If that token were used in a query, the query would seek only records that had NULL values in the corresponding fields. In contrast, the Dedup Query Expression replaces each underscore in the Dedup Token Expression with a question mark (?) wildcard character, which matches any single character, so the query returns candidate records regardless of the value in those fields.
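
The difference between the two expressions can be sketched in Python. This is a minimal illustration, not Siebel code: the helper names (build_key, matches) are hypothetical, the field layout follows the default Contact expressions in Table 1, and the matching function only mimics the single-character wildcard behavior described above.

```python
def left(value, n):
    """Mimic Left(): first n characters, or None when the field is empty."""
    return value[:n] if value else None


def if_null(value, default):
    """Mimic IfNull(): fall back to default when the value is missing."""
    return default if value is None else value


def build_key(postal_code, account, last_name, filler):
    """Build a Contact-style key; filler is '_' for tokens and '?' for queries."""
    return (
        if_null(left(postal_code, 5), filler * 5)
        + if_null(left(account, 1), filler)
        + if_null(left(last_name, 1), filler)
    )


def matches(query, token):
    """Return True when query equals token, treating '?' as any single character."""
    return len(query) == len(token) and all(
        q == "?" or q == t for q, t in zip(query, token)
    )


# Hypothetical stored token of an existing record with all fields populated.
stored = build_key("94065", "Oracle", "Smith", "_")

# New record where the user left the Account field blank.
token = build_key("94065", None, "Smith", "_")
query = build_key("94065", None, "Smith", "?")

print(matches(token, stored))  # False: the literal underscore finds nothing
print(matches(query, stored))  # True: the wildcard accepts any account initial
```

In this sketch the token for the new record is 94065_S and the query string is 94065?S, so only the query string matches the stored token 94065OS.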

You can customize both the Dedup Token Expression and the Dedup Query Expression parameters through the Third Party Administration view. The configuration of these expressions must be consistent with the internal matching logic of the vendor, which differs from vendor to vendor. For optimal results, therefore, change these values only after consulting the relevant vendor. If you change the expressions, then you must regenerate match keys.

See Table 1 for examples of how the default expressions differ between business components.

Table 1. Expressions Used for Keys and Queries (Example)

Business Component: Account
Dedup Token Expression Parameter (Key): "IfNull (Left ([Primary Account Postal Code], 5), '_____') + IfNull (Left ([Name], 1), '_') + IfNull (Mid ([Street Address], FindNoneOf ([Street Address], '1234567890 '), 1), '_')"
Dedup Query Expression Parameter (for Queries): "IfNull (Left ([Primary Account Postal Code], 5), '?????') + IfNull (Left ([Name], 1), '?') + IfNull (Mid ([Street Address], FindNoneOf ([Street Address], '1234567890 '), 1), '?')"

Business Component: Contact
Dedup Token Expression Parameter (Key): "IfNull (Left ([Postal Code], 5), '_____') + IfNull (Left ([Account], 1), '_') + IfNull (Left ([Last Name], 1), '_')"
Dedup Query Expression Parameter (for Queries): "IfNull (Left ([Postal Code], 5), '?????') + IfNull (Left ([Account], 1), '?') + IfNull (Left ([Last Name], 1), '?')"

Business Component: List Mgmt Prospective Contact
Dedup Token Expression Parameter (Key): "IfNull (Left ([Postal Code], 5), '_____') + IfNull (Left ([Account], 1), '_') + IfNull (Left ([Last Name], 1), '_')"
Dedup Query Expression Parameter (for Queries): "IfNull (Left ([Postal Code], 5), '?????') + IfNull (Left ([Account], 1), '?') + IfNull (Left ([Last Name], 1), '?')"

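The third component of the Account expressions in Table 1 is the least obvious: it appears to pick the first character of the street address that is neither a digit nor a space. The following Python sketch shows that reading. It assumes that FindNoneOf returns the 1-based position of the first character that is not in the given set, and that Mid takes a 1-based start position; both are assumptions about the expression language rather than statements from this guide, and the helper names are hypothetical.

```python
def find_none_of(text, chars):
    """Assumed FindNoneOf(): 1-based position of the first character of text
    that is not in chars, or 0 when every character is in chars."""
    for position, ch in enumerate(text, start=1):
        if ch not in chars:
            return position
    return 0


def mid(text, start, length):
    """Assumed Mid(): substring of the given length from a 1-based start.
    Returning None for an invalid start is a choice made for this sketch."""
    if start < 1:
        return None
    return text[start - 1:start - 1 + length] or None


def street_initial(street_address, filler="_"):
    """First non-digit, non-space character of the address, or the filler."""
    if not street_address:
        return filler
    start = find_none_of(street_address, "1234567890 ")
    piece = mid(street_address, start, 1)
    return filler if piece is None else piece


print(street_initial("500 Oracle Parkway"))  # O
print(street_initial(None))                  # _
print(street_initial(None, filler="?"))      # ?
```

Under these assumptions, skipping the leading house number means that addresses on the same street share a key character even when their house numbers differ.
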
The maximum number of candidate records that are sent to the third-party software at one time is determined by the values of the following vendor parameters in the Third Party Administration view:

  • Realtime Max Num of Records. Used in real-time mode. The default value is 200, which is also the highest value that you can set. Usually there are no more than 200 records to send, but if there are more than 200 records, then only the first 200 records are sent.
  • Batch Max Num of Records. Used in batch mode. The default value is 200, which is also the highest value that you can set. If there are more than 200 records to send, then the first 200 records are sent, then up to 200 records in the next iteration, and so on.
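
The difference between the two modes can be sketched as follows. This is an illustrative Python sketch under the limits stated above, not the product's implementation; the function names and the constant are hypothetical.

```python
MAX_RECORDS = 200  # documented default and upper limit for both parameters


def realtime_candidates(candidates, limit=MAX_RECORDS):
    """Real-time mode: only the first `limit` candidates are sent."""
    return candidates[:limit]


def batch_candidates(candidates, limit=MAX_RECORDS):
    """Batch mode: yield successive groups of at most `limit` candidates."""
    for start in range(0, len(candidates), limit):
        yield candidates[start:start + limit]


candidates = list(range(450))  # stand-in for 450 candidate record IDs

print(len(realtime_candidates(candidates)))            # 200
print([len(g) for g in batch_candidates(candidates)])  # [200, 200, 50]
```

With 450 candidates, real-time mode sends only the first 200, whereas batch mode sends all of them over three iterations.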

NOTE: Information in this topic does not apply if you use the Oracle Data Quality Matching Server for data matching, because match candidate acquisition takes place within the Oracle Data Quality Matching Server.

Data Quality Guide for Oracle Customer Hub Copyright © 2018, Oracle and/or its affiliates. All rights reserved.