Creating a Processor From a Sequence of Configured Processors

EDQ allows you to create a single processor that performs a single function by combining a number of base (or 'member') processors used in sequence.

Note that the following processors may not be included in a newly created processor:

  • Parse

  • Match

  • Group and Merge

  • Merge Data Sets

However, a single configured instance of any of the above processors may still be published in order to reuse its configuration.

Processor creation example

To take a simple example, you may want to construct a reusable Add Gender processor that derives a Gender value for individuals based on Title and Forename attributes. Building it requires a number of member processors. However, when other users use the processor, you want them to configure only a single processor: they input the Title and Forename attributes (whatever these are named in their data set) and select two Reference Data sets, one to map Title values to Gender values and one to map Forename values to Gender values. Finally, you want the processor to produce three output attributes (TitleGender, NameGender and BestGender).

To do this, start by configuring the member processors you need (or use an existing process from which to create a processor). For example, the screenshot below shows five processors used to add a Gender attribute, as follows:

  1. Derive Gender from Title (Enhance from Map).
  2. Split Forename (Make Array from String).
  3. Get first Forename (Select Array Element).
  4. Derive Gender from Forename (Enhance from Map).
  5. Merge to create best Gender (Merge Attributes).

[Screenshot: Processor example]
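
The data flow through the five member processors can be sketched in plain Python. This is an illustration only, not EDQ code or an EDQ API: the function names simply mirror the processor types listed above, the two sample maps stand in for the Reference Data a user would select, and the matching rules (trimmed, case-insensitive lookups; splitting on whitespace; the first populated value winning the merge, with the Title result taking priority) are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical sample Reference Data; real maps would be selected by the user.
TITLE_TO_GENDER: Dict[str, str] = {"mr": "M", "mrs": "F", "ms": "F", "miss": "F"}
FORENAME_TO_GENDER: Dict[str, str] = {"john": "M", "mary": "F"}


def enhance_from_map(value: Optional[str], mapping: Dict[str, str]) -> Optional[str]:
    """Look up a value in a map; return None when there is no match.

    Trimming and lower-casing the key is an assumption of this sketch.
    """
    if value is None:
        return None
    return mapping.get(value.strip().lower())


def make_array_from_string(value: Optional[str]) -> List[str]:
    """Split a string into an array (here: on whitespace, an assumption)."""
    return value.split() if value else []


def select_array_element(values: List[str], index: int) -> Optional[str]:
    """Return the element at the given index, or None if it does not exist."""
    return values[index] if 0 <= index < len(values) else None


def merge_attributes(*values: Optional[str]) -> Optional[str]:
    """Return the first populated value (assumed merge rule)."""
    for value in values:
        if value:
            return value
    return None


def add_gender(
    title: Optional[str],
    forename: Optional[str],
    title_map: Dict[str, str] = TITLE_TO_GENDER,
    forename_map: Dict[str, str] = FORENAME_TO_GENDER,
) -> Dict[str, Optional[str]]:
    """Chain the five steps and return the three output attributes."""
    title_gender = enhance_from_map(title, title_map)        # step 1
    forenames = make_array_from_string(forename)             # step 2
    first_forename = select_array_element(forenames, 0)      # step 3
    name_gender = enhance_from_map(first_forename, forename_map)  # step 4
    best_gender = merge_attributes(title_gender, name_gender)     # step 5
    return {
        "TitleGender": title_gender,
        "NameGender": name_gender,
        "BestGender": best_gender,
    }
```

With the sample maps above, `add_gender("Dr", "Mary Ann")` finds no Gender for the Title, derives "F" from the first Forename, and so returns "F" as BestGender. The two map parameters play the role of the two Reference Data options that users of the finished processor would set.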

To make these into a processor, select them all on the Canvas, right-click, and select Make Processor.

This immediately creates a single processor on the Canvas and takes you into a processor design view, where you can set up how the single processor will behave.