1.3.8.1 Merge Data Streams

The Merge Data Streams processor allows you to merge a number of input data streams into a single stream, by mapping each input data stream to a target structure.

Merge Data Streams does not perform any transformation, matching, or merging of records. All input records are output, mapped to the target structure.

Use Merge Data Streams where you have a number of sources of data that all represent the same type of entity, and where all the sources have similar attribute structures that can be easily mapped to a target structure. Once the data streams have been merged, you can define your processing to act on all the records from all the sources.
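The behavior described above can be sketched outside EDQ in a few lines of Python. This is only an illustration of the concept, not EDQ's implementation or API: the function `merge_streams`, the stream names, and the attribute names (`ContactName`, `Telephone`, and so on) are all hypothetical. It shows that every input record is passed through, reshaped to the target structure, and that records which look identical are not combined.

```python
from typing import Dict, Iterable, Iterator, List, Optional

Record = Dict[str, Optional[str]]


def merge_streams(
    streams: Dict[str, Iterable[Record]],
    mappings: Dict[str, Dict[str, str]],
    target_attributes: List[str],
) -> Iterator[Record]:
    """Yield every input record, reshaped to the target structure.

    mappings[stream_name] maps input attribute names to output attribute names.
    No matching or de-duplication is attempted.
    """
    for stream_name, records in streams.items():
        mapping = mappings[stream_name]
        for record in records:
            # Start with every target attribute empty, then fill the mapped ones.
            out: Record = {name: None for name in target_attributes}
            for input_attr, output_attr in mapping.items():
                out[output_attr] = record.get(input_attr)
            yield out


# Hypothetical sources with similar, but differently named, attributes.
source_a = [{"Name": "Ann Lee", "Phone": "555-0100"}]
source_b = [
    {"FullName": "Ann Lee", "Tel": "555-0100"},
    {"FullName": "Bo Chan", "Tel": None},
]

merged = list(
    merge_streams(
        {"A": source_a, "B": source_b},
        {
            "A": {"Name": "ContactName", "Phone": "Telephone"},
            "B": {"FullName": "ContactName", "Tel": "Telephone"},
        },
        ["ContactName", "Telephone"],
    )
)
```

In this sketch `merged` holds three records, including two identical entries for the same contact, because the records are mapped but never matched or merged.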

Inputs

Any attributes from the data streams you want to merge.

Options

The Merge Data Streams configuration screen lets you map any number of input data streams to a target, working with each input data stream in turn. Use the following instructions to map each of your input data streams to the target output data stream:

  1. To switch between input data streams, use the tabs at the top of the screen. Note that the view of Output Attributes remains the same.

  2. To create new output attributes in the target data stream corresponding to all the latest versions of the attributes in an input data set, click the Add All Attributes button. The input attributes will be mapped to their corresponding output attributes.

  3. To add specific attributes in the target data stream, select one or more input attributes, and click the Create output attribute button. The selected input attributes will be mapped to the newly created output attributes.

  4. To map an input attribute to an existing output attribute, use the Map to output attribute button.

  5. To remove the mapping of an input attribute from an output attribute, without removing the output attribute, select the mapped input attribute on the right-hand side of the screen, and click the Unmap or remove output attribute button.

  6. To remove an output attribute from the target data stream, select the output attribute on the right-hand side of the screen, and click the Unmap or remove output attribute button.

  7. To remove all output attributes from the target data stream, click the Remove all output attributes button.

  8. To reorder the output attributes in the target data stream, use the Up and Down arrow buttons.

  9. To map one or many input attributes to existing output attributes by name, select the input attributes in the right-hand pane and click the Map by Name button. Mappings will be created for all the selected attributes that have the same name and type as an existing output attribute. The name matching is not case sensitive.

  10. Change the default Data Stream Name from 'Merged' to give the output data stream a meaningful name.
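The Map by Name rule in step 9 (same name ignoring case, and same type) can be illustrated with a short Python sketch. It is not EDQ code: the function `map_by_name`, the `(name, type)` tuples, and the type labels used here are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Attribute = Tuple[str, str]  # (attribute name, attribute type), both hypothetical


def map_by_name(
    selected_inputs: List[Attribute], outputs: List[Attribute]
) -> Dict[str, str]:
    """Map each selected input attribute to an output attribute that has the
    same name (compared without regard to case) and the same type."""
    # Index the existing output attributes by lower-cased name and type.
    index = {(name.lower(), typ): name for name, typ in outputs}
    mapping: Dict[str, str] = {}
    for name, typ in selected_inputs:
        target = index.get((name.lower(), typ))
        if target is not None:
            mapping[name] = target
    return mapping


outputs = [("Name", "String"), ("DOB", "Date")]
inputs = [("NAME", "String"), ("dob", "String"), ("Email", "String")]

result = map_by_name(inputs, outputs)
```

Here only `NAME` is mapped (to `Name`). `dob` matches `DOB` by name but not by type, and `Email` has no output attribute of the same name, so both are left unmapped.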

Note on Processor connections

As Merge Data Streams outputs a completely new data stream from the streams input to it, it is not possible to connect processors before Merge Data Streams directly to processors after Merge Data Streams.

As a new data stream is output (but not necessarily completely written out), it is also not possible to link back to the snapshot or staged data used in a reader when drilling down to see results. This means that when drilling down on the results of processors downstream of a Merge Data Streams processor, you will only be able to see the attributes that were actively processed, rather than all attributes in the data set.

Outputs

Data attributes

The data attributes output by Merge Data Streams are user-defined using the configuration screen.

Flags

None

Execution

Execution Mode          Supported

Batch                   Yes

Real time Monitoring    Yes

Real time Response      Yes

Note on Progress reporting

The Merge Data Streams processor takes all of the input records and outputs a completely new data stream. This means that when running a process that contains a Merge Data Streams processor, you may see a higher record count in the progress bar than you expect. This is because EDQ counts all of the input records separately from the output records (in the new data stream). The same is true when running Match Processors as these also output new data streams.
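As a rough illustration of why the count is higher than expected, the following Python sketch assumes that the progress figure is simply the number of input records plus the number of records in the new output stream. That formula is an assumption made for the example; the exact figure that EDQ displays may be derived differently. The function name `expected_progress_count` is hypothetical.

```python
from typing import List


def expected_progress_count(input_record_counts: List[int]) -> int:
    """Estimate the records counted for one Merge Data Streams step,
    assuming input and output records are each counted once."""
    records_in = sum(input_record_counts)
    # Every input record is output, so the new stream has the same total.
    records_out = records_in
    return records_in + records_out


# Two hypothetical sources of 1,000 and 250 records.
total = expected_progress_count([1000, 250])
```

Under this assumption, merging 1,250 input records would contribute 2,500 to the count rather than 1,250.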

Results Browsing

Merge Data Streams presents a view of the target data set only. The input data streams are not shown.

Output Filters

Merge Data Streams outputs a single Merged output filter, with all input records mapped to the target structure.

Example

In this example, a number of sources of records representing business contacts are merged into a single data stream.

Figure 1-1 Records from Source A


Figure 1-2 Records from Source B


Merge Data Streams Configuration