Input

1.3.4.11.1 Input

The Input sub-processor of matching processors is used to map attributes from input data streams to matching processors.

The Input sub-processor is a necessary part of matching, used to control the data that is used in the matching process.

Normally, all attributes from each input data stream are included in a matching process. However, you may want to restrict the attributes to those you need to match on, to review possible matches, or to make output selections.

Note:

In versions of EDQ earlier than 7.0, the selection of input attributes also had to be configured carefully, because all input attributes were included in the Decision Key used to re-apply ('remember') manual match decisions. It is now possible to configure which of the input attributes are used in the Decision Key; see Advanced options for match processors.

For example, from a typical Customer table, the following attributes might be included in a matching process:

Purpose	Attributes
Needed for matching	First_name, Surname, Birth_date, Address_1, Postcode, Email, Home_tel_number
Needed for the review of possibly matching records	Title, Address_2, Town, County, Customer_type
Needed to identify specific records for data updates	Customer_ID
Needed to make output decisions (for example, to choose the most recent record)	Last_modified_date, Has_active_account

A number of other attributes in the source data might be excluded from the matching process.
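The selection in the example table above can be pictured as a simple filter over a source record. The following Python sketch is illustrative only: in EDQ the selection is made in the Inputs dialog, not in code, and the purpose keys, the function name, and the sample record (including the extra Marketing_opt_in column) are hypothetical.

```python
# Illustrative only: EDQ selects input attributes in the Inputs dialog, not in code.
# Attribute names come from the example Customer table; the purpose keys are hypothetical.
ATTRIBUTES_BY_PURPOSE = {
    "matching": ["First_name", "Surname", "Birth_date", "Address_1",
                 "Postcode", "Email", "Home_tel_number"],
    "review": ["Title", "Address_2", "Town", "County", "Customer_type"],
    "identification": ["Customer_ID"],
    "output_selection": ["Last_modified_date", "Has_active_account"],
}


def select_input_attributes(record):
    """Keep only the attributes that serve one of the listed purposes."""
    wanted = [name for names in ATTRIBUTES_BY_PURPOSE.values() for name in names]
    return {name: record[name] for name in wanted if name in record}


if __name__ == "__main__":
    # Marketing_opt_in stands in for a source attribute that matching does not need.
    customer = {"Customer_ID": 42, "First_name": "Ann", "Surname": "Lee",
                "Marketing_opt_in": True}
    print(select_input_attributes(customer))
```

Grouping the attributes by purpose makes it easier to see why each one is passed into matching, and which source attributes can be left out.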

To input data into matching, you first need to connect the data streams to the match processor on the canvas. Note that the number and type of data streams accepted depend on the type of match processor, as follows:

Match Processor Type	Accepted input data streams
Group and Merge	A single working data stream
Deduplicate	A single working data stream
Enhance	A single working data stream, and any number of reference data streams
Link	Any number of working and reference data streams
Consolidate	Any number of working data streams
Advanced match	Any number of working and reference data streams
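The table above can be read as a set of limits on stream counts. The following Python sketch models it that way, as an aid to reading the table rather than a description of how EDQ works internally: EDQ enforces these rules itself when streams are connected on the canvas. The names STREAM_LIMITS and accepts are hypothetical, and the sketch assumes that the table describes upper bounds only (it does not model any minimum number of streams).

```python
# Hypothetical model of the table of accepted input data streams.
# Each entry maps a processor type to (max working streams, max reference streams).
# None means "any number"; 0 means that stream type is not accepted.
STREAM_LIMITS = {
    "Group and Merge": (1, 0),
    "Deduplicate": (1, 0),
    "Enhance": (1, None),
    "Link": (None, None),
    "Consolidate": (None, 0),
    "Advanced match": (None, None),
}


def accepts(processor_type, working, reference):
    """Return True if the stream counts fit within the limits for the processor type."""
    max_working, max_reference = STREAM_LIMITS[processor_type]

    def within(count, limit):
        return limit is None or count <= limit

    return within(working, max_working) and within(reference, max_reference)


if __name__ == "__main__":
    print(accepts("Deduplicate", working=1, reference=0))
    print(accepts("Deduplicate", working=2, reference=0))
    print(accepts("Enhance", working=1, reference=3))
```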

Data streams are connected to match processors either directly from Readers, or from output filters of other processors.

Once the data streams are connected, you can use the Inputs dialog to select attributes, in the same way as for all processors.

Two additional options appear when configuring the inputs of a match processor (except Group and Merge):

Compare against self - this option controls whether the match processor looks for matches within the data stream (rather than only between data streams). It is set to the most likely default for the type of match processor. Note that working data streams are always compared with each other, and reference data streams are never compared with each other.

Enabled - this option allows you to retain the configuration of an input data stream while switching its use in the match process on or off; for example, to run a match of some working data against some, but not all, of the configured reference data streams.
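The combined effect of the two options can be sketched as a rule for deciding which pairs of data streams are compared. The Python below is a rough model under stated assumptions, not EDQ's implementation: the InputStream class and comparison_pairs function are hypothetical, the sketch assumes that working and reference streams are compared with each other, and it applies the Compare against self flag exactly as given for each stream.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class InputStream:
    """Hypothetical stand-in for one configured input data stream."""
    name: str
    kind: str  # "working" or "reference"
    compare_against_self: bool = False
    enabled: bool = True


def comparison_pairs(streams):
    """List the (stream, stream) name pairs that would be compared."""
    # Disabled streams keep their configuration but take no part in matching.
    active = [s for s in streams if s.enabled]
    # A stream is compared with itself only when Compare against self is set.
    pairs = [(s.name, s.name) for s in active if s.compare_against_self]
    for a, b in combinations(active, 2):
        # Reference data streams are never compared with each other.
        if a.kind == "reference" and b.kind == "reference":
            continue
        # Working streams are always compared with each other; comparing
        # working with reference streams is an assumption of this sketch.
        pairs.append((a.name, b.name))
    return pairs


if __name__ == "__main__":
    streams = [
        InputStream("customers", "working", compare_against_self=True),
        InputStream("prospects", "working"),
        InputStream("ref_a", "reference"),
        InputStream("ref_b", "reference", enabled=False),
    ]
    for pair in comparison_pairs(streams):
        print(pair)
```

In this sketch, disabling ref_b removes it from every comparison without discarding its configuration, which mirrors the purpose of the Enabled option.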