1.3.11.25 Merge Attributes

The Merge Attributes processor merges a number of attributes into a single attribute by selecting the first non-empty value from an ordered set of input attributes.

Use Merge Attributes:

  • When cleaning data, to create a single merged attribute with cleaned values, using the fixes applied to the invalid records and the original values where the records were deemed valid

  • Where you have applied a number of fixes to different records for the same original attribute, in different processors, and need to merge these to create a single cleaned attribute across all records

  • In other scenarios where you need to select the value for a new attribute based on the values in an ordered set of input attributes

For each record, Merge Attributes selects the first non-empty value from this ordered list of attributes.
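The selection rule can be pictured with a minimal Python sketch. This is an illustration only, not the processor's implementation: the function name first_non_empty and the representation of a record as a dictionary are assumptions, and None stands in for Null.

```python
from typing import Any, Mapping, Optional, Sequence


def first_non_empty(record: Mapping[str, Any], ordered_inputs: Sequence[str]) -> Optional[Any]:
    """Return the first value that is neither None (Null) nor an empty string."""
    for name in ordered_inputs:
        value = record.get(name)  # a missing attribute is treated like Null
        if value is not None and value != "":
            return value
    return None  # every input attribute was empty


record = {"firstname.Replaced": None, "firstname": "Rebecca"}
print(first_non_empty(record, ["firstname.Replaced", "firstname"]))  # Rebecca
```

The order of ordered_inputs is what decides the outcome: the second attribute is consulted only when the first yields nothing.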

Note:

A string that contains only whitespace or other non-text characters, such as line returns, is not the same as an empty string, so it will be selected if it is the first value encountered. To avoid this, first pass the attributes to be merged through a Normalize No Data processor to change these values to Nulls before passing them into the Merge Attributes processor. The Nulls are then ignored by the Merge Attributes processor.
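The effect of normalizing first can be sketched in Python as follows. The function normalize_no_data is a hypothetical stand-in for the Normalize No Data processor, and it assumes that "no data" means Null, an empty string, or whitespace-only text; the set of characters the real processor treats as no data may differ.

```python
from typing import Iterable, Optional


def normalize_no_data(value: Optional[str]) -> Optional[str]:
    # Assumption: "no data" here means None, "", or whitespace-only text.
    if value is None or value.strip() == "":
        return None
    return value


def first_non_empty(values: Iterable[Optional[str]]) -> Optional[str]:
    for value in values:
        if value is not None and value != "":
            return value
    return None


candidates = [" \n", "Cindy"]  # whitespace-only fix, then the original value
raw_choice = first_non_empty(candidates)
clean_choice = first_non_empty(normalize_no_data(v) for v in candidates)
print(repr(raw_choice))    # ' \n'
print(repr(clean_choice))  # 'Cindy'
```

Without normalization the whitespace-only string wins because it is not empty; after normalization it becomes Null and the merge falls through to the next attribute.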

A number of attributes may all be mapped to the same merged attribute, in order. For example, if you have an original firstname attribute, and you verified that it contains a valid Forename by using a List Check, you might apply a fix (for example, using the Replace processor) only on the invalid records. You therefore might have two attributes that you want to merge to a single MergedFirstname attribute. The Merge Attributes processor allows you to do this, by selecting the first non-empty value, considering a number of attributes in order, for example:

  1. Select the fixed name (firstname.Replaced), if it is not empty.

  2. Select the original name (firstname), if the fixed name is empty (which it will be for all the records that were not fixed).

It is possible to create a number of merged attributes in a single Merge Attributes processor.

For example, if you have applied fixes to the values for a title attribute in the same way as above, you might want to create both MergedFirstname and MergedTitle in the same processor.
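One way to picture several merged attributes being produced in one pass is a mapping from each merged attribute name to its ordered inputs. The sketch below is illustrative only: MERGE_CONFIG and merge_record are hypothetical names, and the processor itself is configured through its dialog rather than in code.

```python
from typing import Any, Dict, List, Optional

# Hypothetical mapping: merged attribute name -> input attributes in selection order.
MERGE_CONFIG: Dict[str, List[str]] = {
    "MergedFirstname": ["firstname.Replaced", "firstname"],
    "MergedTitle": ["title.Replaced", "title"],
}


def merge_record(record: Dict[str, Any], config: Dict[str, List[str]]) -> Dict[str, Any]:
    """Return a copy of the record with one new attribute per configured merge."""
    merged = dict(record)
    for output_name, ordered_inputs in config.items():
        chosen: Optional[Any] = None
        for input_name in ordered_inputs:
            value = record.get(input_name)
            if value is not None and value != "":
                chosen = value
                break  # stop at the first non-empty value
        merged[output_name] = chosen
    return merged


row = {"title.Replaced": "Mr", "title": "Mister",
       "firstname.Replaced": None, "firstname": "Paul"}
result = merge_record(row, MERGE_CONFIG)
print(result["MergedTitle"], result["MergedFirstname"])  # Mr Paul
```

The input attributes are left in place, so the original and fixed values remain available alongside the merged ones.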

The following table describes the configuration options:

Configuration Description

Inputs

Specify any attributes that you wish to merge together to create a new attribute. Attributes that are used for selection to create a new Merged Attribute must share the same data type (String, Number or Date).

To map input attributes to create a new merged attribute, select the attributes you wish to merge on the left-hand side, and use the Merge button. Use the up and down arrow buttons on the dialog to change the order of selection of the input attributes within each merged attribute.

Options

Specify the following options:

  • Select empty strings: determines whether empty strings are selected when merging attributes. If set to Yes, an empty string can be selected, and an attribute value is skipped only if it is Null or if no attribute value exists (for example, the attribute was added on a stream the record did not go down). Specified as Yes/No. Default value: No.
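The difference between the two settings can be sketched in Python. The merge function and its select_empty_strings parameter are hypothetical names that mirror the option described above; a key absent from the record dictionary models an attribute for which no value exists.

```python
from typing import Any, Mapping, Optional, Sequence


def merge(record: Mapping[str, Any], ordered_inputs: Sequence[str],
          select_empty_strings: bool = False) -> Optional[Any]:
    for name in ordered_inputs:
        if name not in record:  # no attribute value exists for this record
            continue
        value = record[name]
        if value is None:       # Null is never selected
            continue
        if value == "" and not select_empty_strings:
            continue            # default (No): skip empty strings
        return value
    return None


record = {"title.Replaced": "", "title": "Ms"}
inputs = ["missing.attribute", "title.Replaced", "title"]
print(repr(merge(record, inputs)))                             # 'Ms'
print(repr(merge(record, inputs, select_empty_strings=True)))  # ''
```

With the default setting the empty fix is skipped and the original value is used; with the option enabled the empty string is itself treated as a deliberate value and selected.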

Outputs

Describes any data attribute or flag attribute outputs.

Data Attributes

The following data attributes are output:

  • The new merged attributes as named in the Inputs tab: new attributes containing values merged from the configured input attributes. Value: the first non-empty value from the ordered input attributes (an empty string counts as a value only if Select empty strings is set to Yes).

Flags

None.

The Merge Attributes processor produces no summary view of its results. Use the Data View to check that the configured merge selections are working as expected.

Output Filters

None. All records input are output.

Example

In this example, replacements to title and firstname values have been applied to a subset of records (with Titles and Names that were not recognized as valid in list checks). The replaced values are used where available. Where not available, the original values for title and firstname are used:

title.Replaced  title         MergedTitle  firstname.Replaced  firstname  MergedFirstname
[Null]          Miss          Miss         Cindy               Sindy      Cindy
[Null]          Ms            Ms           [Null]              Rebecca    Rebecca
Mr              Mister        Mr           [Null]              Paul       Paul
[Null]          Ms            Ms           [Null]              Lorraine   Lorraine
Rev             The Reverend  Rev          Claudia             Cluadia    Claudia
Professor       Prof.         Professor    Geoffrey            Geoffry    Geoffrey

After the attributes have been merged, you may wish to re-check the merged attributes - for example, to ensure that MergedFirstname and MergedTitle in the example above now both contain valid data.
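The example data and the follow-up re-check can be reproduced with a short Python sketch. The input values are taken from the example above; the pick helper and the two reference sets are hypothetical stand-ins for the merge selection and for whatever list checks you would run afterwards.

```python
from typing import List, Optional, Tuple

# (title.Replaced, title, firstname.Replaced, firstname) from the example above.
ROWS: List[Tuple[Optional[str], str, Optional[str], str]] = [
    (None, "Miss", "Cindy", "Sindy"),
    (None, "Ms", None, "Rebecca"),
    ("Mr", "Mister", None, "Paul"),
    (None, "Ms", None, "Lorraine"),
    ("Rev", "The Reverend", "Claudia", "Cluadia"),
    ("Professor", "Prof.", "Geoffrey", "Geoffry"),
]


def pick(*values: Optional[str]) -> Optional[str]:
    """Return the first value that is neither None (Null) nor empty."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


# (MergedTitle, MergedFirstname) for each record.
merged = [(pick(tr, t), pick(fr, f)) for tr, t, fr, f in ROWS]

# Hypothetical reference lists standing in for the follow-up list checks.
VALID_TITLES = {"Miss", "Ms", "Mr", "Rev", "Professor"}
VALID_FIRSTNAMES = {"Cindy", "Rebecca", "Paul", "Lorraine", "Claudia", "Geoffrey"}

invalid = [m for m in merged
           if m[0] not in VALID_TITLES or m[1] not in VALID_FIRSTNAMES]
print(merged)
print(invalid)  # []
```

An empty invalid list indicates that every merged value passes the assumed checks; any record left in it would point to a fix that is still missing or to a merge order that needs adjusting.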