1.3.4.6 Link

The Link processor links records across data streams where those records represent the same entity, using matching that does not require the original records to be identical.

Use the Link processor to link together matching records across data streams. Like all matching processors, Link offers the ability to match records using both automatic rules and manual decisions.

The output of the Linking process allows you to link together records in external systems.

Link is a type of matching processor. Matching processors consist of several sub-processors, where each sub-processor performs a different step of matching, and requires separate configuration. The following sub-processors make up the Link processor, each performing a distinct function as described below.

Sub-processor   Description
Input           Select the attributes from the data streams to be linked.
Identify        Create identifiers to use in matching, and map them to attributes.
Cluster         Divide the data streams into clusters.
Match           Choose which comparisons to perform, and how to interpret them with match rules.

Inputs

Any attributes from the data streams you want to include in the matching process.

The inputs are configurable in the Input sub-processor. Note that you can input multiple working data streams, and multiple reference data streams, into the linking process. Reference data streams are never compared with each other, and unrelated records from reference data streams are never output from the matching process.
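The comparison restriction described above can be illustrated with a minimal Python sketch. It is not EDQ code: the record layout (the 'id', 'stream' and 'cluster_key' fields) and the function name are hypothetical, and it only shows the idea that comparisons happen within a cluster and that two reference records are never paired.

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(records):
    """Yield pairs of record ids to compare, one cluster at a time.

    Each record is a dict with hypothetical keys:
    'id', 'stream' ('working' or 'reference') and 'cluster_key'.
    """
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec["cluster_key"]].append(rec)

    for members in clusters.values():
        for a, b in combinations(members, 2):
            # Reference records are never compared with each other.
            if a["stream"] == "reference" and b["stream"] == "reference":
                continue
            yield a["id"], b["id"]

records = [
    {"id": "W1", "stream": "working", "cluster_key": "SMITH"},
    {"id": "W2", "stream": "working", "cluster_key": "SMITH"},
    {"id": "R1", "stream": "reference", "cluster_key": "SMITH"},
    {"id": "R2", "stream": "reference", "cluster_key": "SMITH"},
    {"id": "R3", "stream": "reference", "cluster_key": "JONES"},
]
pairs = list(candidate_pairs(records))
```

In this sample, R1 and R2 share a cluster but are never paired, and R3 sits alone in its cluster, so it takes part in no comparison at all.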

Options

All options are configured within the sub-processors above, except for the Advanced Options for Match Processors.

Outputs

The output data streams, and their attributes, are configured in the Match and Merge sub-processors above.

Execution

It is possible to use a Link processor in a Real time Response process, provided the process contains only one match processor.

Calling a link match processor in this way results in special behavior on the response interface. See the Real time matching concept guide.

Execution Mode         Supported
Batch                  Yes
Real time Monitoring   Yes
Real time Response     Yes

Note:

The Link processor always appears with a re-run marker, indicating that it will be completely re-executed each time the process is run, regardless of whether or not its configuration has changed. As a result, processors downstream of the Link processor will also need to be rerun.

Results Browsing

The Link processor produces a number of views of results as follows. Any of the views may be seen by clicking on the Link processor in the process. The views may also be seen by expanding the Link processor to view its sub-processors, and selecting the sub-processor that produces the view.

Input Views (produced by Input)

An Input View is displayed for each input data stream. The selected attributes from each set are shown in the view.

Cluster Views (produced by Cluster)

A Cluster View is displayed for each configured cluster. Use these views to assess the sensitivity of your clustering, to ensure you are not making too many redundant comparisons, and not missing any potential matches. See the Clustering concept guide for further information.

Statistic Meaning

Cluster

Each distinct cluster key value

Group size

The total number of records in the cluster; that is, the number of records with the same distinct cluster key value

Processed?

Indicates whether or not this cluster was actually processed. Values can be:

  • Yes

  • Skipped - cluster size limit

  • Skipped - comparison limit

[Data stream name]

For each input data stream:

A drillable count of the records in each cluster from each input data stream
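As a rough illustration of how the Cluster View statistics relate to the data, the following Python sketch derives them from a list of records. The field names, the stream names and the cluster_size_limit parameter are hypothetical. The rule that a cluster is skipped when it exceeds the limit is an assumption, and the comparison limit is left out.

```python
from collections import Counter, defaultdict

def cluster_view(records, cluster_size_limit):
    """Summarize clusters: key, group size, processed flag, per-stream counts."""
    sizes = Counter(r["cluster_key"] for r in records)
    per_stream = defaultdict(Counter)
    for r in records:
        per_stream[r["cluster_key"]][r["stream"]] += 1

    rows = []
    for key in sorted(sizes):
        # Assumption: a cluster is skipped when its size exceeds the limit.
        within_limit = sizes[key] <= cluster_size_limit
        rows.append({
            "Cluster": key,
            "Group size": sizes[key],
            "Processed?": "Yes" if within_limit else "Skipped - cluster size limit",
            "Streams": dict(per_stream[key]),
        })
    return rows

records = [
    {"stream": "Customers", "cluster_key": "SMITH"},
    {"stream": "Customers", "cluster_key": "SMITH"},
    {"stream": "Watchlist", "cluster_key": "SMITH"},
    {"stream": "Customers", "cluster_key": "JONES"},
]
rows = cluster_view(records, cluster_size_limit=2)
```

With a limit of 2, the SMITH cluster (three records) is reported as skipped, while the JONES cluster is processed.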

Matching View (produced by Match) [Match Review only]

The Matching View summarizes how many records from working data streams were matched against, and therefore linked with, either other working records, or reference records.

Statistic Meaning

Matching records

The number of records from the working data streams that matched other working records, or reference records, with Match relationships; that is, the number of working records that are linked.

Note that this does not include records matched with Review relationships only, unless the advanced option to Use Review relationships in Match Groups is ticked. See Use review relationships in match groups [Match Review only].

Non-matching records

The total number of records that were not matched to any other records (and will therefore not be linked). This figure includes non-matching records from reference data streams.
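The two Matching View figures can be sketched in Python as follows. This is an illustration only: the tuple layouts and the use_review flag (standing in for the Use Review relationships in Match Groups option) are hypothetical, and applying the same flag to the non-matching count is an assumption.

```python
def matching_view(records, relationships, use_review=False):
    """Count matching working records and non-matching records of any stream.

    records: list of (record_id, stream), stream is 'working' or 'reference'.
    relationships: list of (id_a, id_b, kind), kind is 'Match' or 'Review'.
    """
    counted = {"Match", "Review"} if use_review else {"Match"}
    linked = set()
    for a, b, kind in relationships:
        if kind in counted:
            linked.update((a, b))

    # Only working records count towards "Matching records".
    matching = sum(1 for rid, stream in records
                   if stream == "working" and rid in linked)
    # Non-matching records include those from reference data streams.
    non_matching = sum(1 for rid, _ in records if rid not in linked)
    return {"Matching records": matching, "Non-matching records": non_matching}

records = [
    ("W1", "working"), ("W2", "working"), ("W3", "working"),
    ("R1", "reference"), ("R2", "reference"),
]
relationships = [("W1", "R1", "Match"), ("W2", "W3", "Review")]
default_view = matching_view(records, relationships)
review_view = matching_view(records, relationships, use_review=True)
```

In this sample, W2 and W3 are related only by a Review relationship, so they count as matching records only when the flag is set.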

Rules View (produced by Match)

The Rules View displays a summary of the number of relationships created by each automatic match rule:

Statistic Meaning

Rule id

The numeric identifier of the match rule.

Rule name

The name of the match rule.

Relationships

The number of relationships between records that were created by the match rule. Note that each distinct relationship between a pair of records (A and B) can only be created by a single rule. If a higher rule creates the relationship, lower rules will not apply. One of the records in a relationship may be related to another record (for example, A and C) by another rule.
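The rule precedence described above can be sketched in Python: rules are tried in order, and the first one that fires creates the relationship for that pair. The rule names, the comparison fields ('name_sim', 'postcode_equal') and the thresholds are invented for illustration and are not EDQ configuration.

```python
from collections import Counter

def first_matching_rule(comparison, rules):
    """Return the id of the highest-priority rule that fires, or None."""
    for rule_id, _name, predicate in rules:
        if predicate(comparison):
            return rule_id  # lower rules are not evaluated for this pair
    return None

# Hypothetical rules, listed from highest to lowest priority.
rules = [
    (1, "Exact name and postcode",
     lambda c: c["name_sim"] == 1.0 and c["postcode_equal"]),
    (2, "Close name and postcode",
     lambda c: c["name_sim"] >= 0.9 and c["postcode_equal"]),
    (3, "Close name only",
     lambda c: c["name_sim"] >= 0.9),
]

comparisons = {
    ("A", "B"): {"name_sim": 1.0, "postcode_equal": True},
    ("A", "C"): {"name_sim": 0.93, "postcode_equal": False},
    ("B", "C"): {"name_sim": 0.5, "postcode_equal": True},
}

relationships = {}
for pair, comparison in comparisons.items():
    rule_id = first_matching_rule(comparison, rules)
    if rule_id is not None:
        relationships[pair] = rule_id

# Relationships per rule, as summarized in the Rules View.
per_rule = Counter(relationships.values())
```

The pair (A, B) would also satisfy rules 2 and 3, but it is counted once, under rule 1. Record A is separately related to C by rule 3.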

Review Status View (produced by Match)

The Review Status view summarizes relationships by their review status:

Statistic Meaning

Review Status

The review status. A row is displayed for each possible review status, as follows:

  • Automatic match

  • Manual match

  • Pending

  • Awaiting review

  • Manual No match

Relationships

The number of relationships between records of the given review status. See note below.

Note:

The statistics in this view will update automatically based on decisions made during the review process, so the top-level statistics will always provide an up-to-date view of the review status of each relationship. However, the drilldowns to the data are generated on each run of the match processor, and will not update based on review decisions made since the last time the match processor was run. When this happens, the Results Browser informs you that the generated data that you are looking at is out-of-date.

Match Groups View (produced by Match) [Match Review only]

The Match Groups view summarizes the groups of matching records:

Statistic Meaning

Match groups

The total number of groups of matching records. Drill down to see a summary of the groups by group size (in number of records). Note that the match groups will not include records matched to others with Review relationships only, unless the advanced option to Use Review relationships in Match Groups is ticked. See Use review relationships in match groups [Match Review only].

Unmatched output records

The total number of unmatched records from working tables that were output.

Note that unmatched records from reference sources are not output.
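One way to picture match groups is as connected sets of records joined by relationships. The Python sketch below uses a union-find structure for this. Treating groups as the transitive closure of relationships is an assumption made for illustration, and the function name and the use_review flag (standing in for the Use Review relationships in Match Groups option) are hypothetical.

```python
def match_groups(record_ids, relationships, use_review=False):
    """Group records connected by Match (and optionally Review) relationships."""
    parent = {r: r for r in record_ids}

    def find(x):
        # Walk up to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, kind in relationships:
        if kind == "Match" or (use_review and kind == "Review"):
            parent[find(a)] = find(b)

    groups = {}
    for r in record_ids:
        groups.setdefault(find(r), []).append(r)
    # Keep only groups that contain more than one record.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

ids = ["A", "B", "C", "D"]
rels = [("A", "B", "Match"), ("B", "C", "Review")]
groups_default = match_groups(ids, rels)
groups_with_review = match_groups(ids, rels, use_review=True)
```

By default, C stays outside the group because it is related to B only by a Review relationship. With the flag set, it joins A and B. Record D is unmatched in both cases.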

Alert Groups View (produced by Match) [Case Management only]

The Alert Groups view summarizes the groups of matching records:

Statistic Meaning

Alert groups

The total number of alert groups. Drill down to see a summary of the groups by group size (in number of records).

Records not in alerts

The total number of records from the working data that were not included in any alerts.

Note that unmatched records from reference sources are not output.

Groups Output (produced by Match) [Match Review only]

The Groups Output is a Data View of the match groups created by the match processor. The groups that are output in the data view, and the attributes of the view, may vary depending on the options for the Groups Output in the Match sub-processor. For example, the data view may or may not include 'groups' which contain a single record.

Alerts Output (produced by Match) [Case Management only]

The Alerts Output is a Data View of the alerts created by the match processor. The alerts that are output in the data view, and the attributes of the view, may vary depending on the options for the Alerts Output in the Match sub-processor.

Relationships Output View (produced by Match)

The Relationships Output is a Data View of the distinct relationships (links) between pairs of records created by the match processor. The relationships that are output in the data view, and the attributes of the view, may vary depending on the options for the Relationships Output in the Match sub-processor. For example, the view may or may not include relationships formed by particular rules.

Output Filters

The following output filters are available from the Advanced Match processor:

  • Groups

  • Relationships

  • Merged

  • Decisions

The Groups, Relationships and Merged output filters correspond with the Groups Output, Relationships Output and Merged Output, as above.

Decisions Inputs and Outputs

The decisions input has the following purposes:

  • Importing historic match decisions that have been made in other products into EDQ. This is a one-time process. When complete, the data should be unwired from the decisions input.

  • Importing match decisions that have been made (and are regularly being made) in an external review system. This should be a part of the normal run process.

The decisions output enables a full audit trail of match decisions to be stored externally.

Note:

External match review uses the relationships output, which contains the latest match decisions. The decisions output differs, as it contains all decisions that have ever been made, including old ones and those that are no longer associated with a current relationship. For this reason, the decisions output is better suited for audit purposes.

See "Importing Match Decisions" and "Exporting Match Decisions" in Oracle Fusion Middleware Using Oracle Enterprise Data Quality for additional information about using the decision inputs and outputs.
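The difference between the decisions output and the relationships output, as described in the note above, can be sketched in Python. The field names ('pair', 'decision', 'at') and the sample data are hypothetical and do not reflect the actual attribute names of either output.

```python
# Full audit trail: every decision ever made, in the spirit of the decisions output.
decisions = [
    {"pair": ("A", "B"), "decision": "Pending", "at": 1},
    {"pair": ("A", "B"), "decision": "Manual match", "at": 2},
    {"pair": ("C", "D"), "decision": "Manual No match", "at": 3},
]

# Pairs that still have a relationship after the latest run (hypothetical).
current_relationships = {("A", "B")}

def latest_decisions(decisions, current_relationships):
    """Keep only the most recent decision for each current relationship."""
    latest = {}
    for d in sorted(decisions, key=lambda d: d["at"]):
        if d["pair"] in current_relationships:
            latest[d["pair"]] = d["decision"]  # later decisions overwrite earlier
    return latest

latest = latest_decisions(decisions, current_relationships)
```

The audit trail keeps all three decisions, including the superseded one for (A, B) and the one for (C, D), which no longer has a current relationship. The latest view keeps a single entry.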