1.3.11.14 Denoise

The Denoise processor removes user-defined 'noise' characters from text attributes, and returns the denoised value in a new output attribute.

The noise characters can be entered directly on-screen, supplied from a Reference Data list, or both.

Inconsistent formatting, punctuation, and spurious control characters can mask otherwise consistent values in data.

Use the Denoise processor to remove these 'noise' characters from text attributes before other processing, for example before performing a List Check on a text attribute.
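The effect of denoising ahead of a list check can be illustrated with a short Python sketch. This is not the processor's implementation; the denoise function, the list of valid titles, and the input values are all hypothetical and serve only to show how stray characters can hide otherwise valid values.

```python
def denoise(value: str, noise_chars: str) -> str:
    """Return value with every character in noise_chars removed."""
    # The third argument of str.maketrans lists characters to delete.
    return value.translate(str.maketrans("", "", noise_chars))


# Hypothetical list of valid values and hypothetical raw input.
valid_titles = {"MR", "MRS", "DR"}
raw_titles = ["MR.", "M-R", "DR", "#MRS"]

# A list check on the raw values fails for anything containing noise.
before = [t in valid_titles for t in raw_titles]

# The same check after removing '.', '-' and '#'.
after = [denoise(t, ".-#") in valid_titles for t in raw_titles]

print(before)  # [False, False, True, False]
print(after)   # [True, True, True, True]
```

Only one of the four raw values matches the list as entered; all four match once the noise characters are removed.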

The following table describes the configuration options.

Configuration Description

Inputs

Specify any String or String Array type attributes that you want to denoise. Number and Date attributes are not valid inputs.

Note that if you input an Array attribute, the transformation will apply to all array elements, and an Array attribute will be output.
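The String and Array behavior described above can be sketched in Python as follows. This is an illustration under assumptions, not the product's code: a Python list stands in for an Array attribute, and the function names are hypothetical.

```python
from typing import List, Union


def denoise_one(value: str, noise_chars: str) -> str:
    """Remove every character in noise_chars from a single string."""
    return value.translate(str.maketrans("", "", noise_chars))


def denoise_attribute(
    value: Union[str, List[str]], noise_chars: str
) -> Union[str, List[str]]:
    """Denoise a string, or every element of a list of strings.

    A list input yields a list output of the same length, mirroring
    the Array-in, Array-out behavior described in the text.
    """
    if isinstance(value, list):
        return [denoise_one(element, noise_chars) for element in value]
    return denoise_one(value, noise_chars)


print(denoise_attribute("#SWAN", "#"))            # SWAN
print(denoise_attribute(["#SWAN", "R#AE"], "#"))  # ['SWAN', 'RAE']
```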

Options

Specify the following options:

  • Noise characters Reference Data: list of noise characters. Specified as Reference Data. Default value: *Noise Characters.

  • Noise characters: additional noise characters. Specified as free text. Default value: None.
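One way to picture how the two options combine is as a union of two character sets. The sketch below is in Python and rests on assumptions: each Reference Data row is modeled as a string, every character of the free-text option is treated as a separate noise character, and the function names and sample values are hypothetical rather than taken from the product.

```python
from typing import Iterable, Set


def build_noise_set(reference_rows: Iterable[str], free_text: str = "") -> Set[str]:
    """Combine reference-list characters with free-text characters."""
    noise: Set[str] = set()
    for row in reference_rows:
        noise.update(row)      # add each character found in the row
    noise.update(free_text)    # assumption: each free-text character is noise
    return noise


def denoise(value: str, noise: Set[str]) -> str:
    """Keep only the characters that are not in the noise set."""
    return "".join(ch for ch in value if ch not in noise)


# Hypothetical reference rows plus one extra character entered as free text.
reference_rows = ["#", "*", "\t"]
noise = build_noise_set(reference_rows, free_text="~")

print(denoise("#AB*C~D\t", noise))  # ABCD
```

Supplying only one of the two sources works the same way, since the other simply contributes no characters to the set.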

Outputs

Describes any data attribute or flag attribute outputs.

Data Attributes

The following data attributes are output:

  • [Attribute Name].Denoise: the denoised version of the attribute value. This is a String or an Array, depending on the type of the input attribute. Value: the original attribute value with the noise characters removed.

Flags

None.

The Denoise processor presents no summary statistics on its processing. In the Data view, each input attribute is shown with its new derived denoised attribute to the right.

Output Filters

None.

Example

In this example, the Denoise processor is used to remove all hash characters (#) from a NAME attribute:

| NAME (asc)       | NAME.Denoise     |
| # MCAULAY        | MCAULAY          |
| # RAE            | RAE              |
| # SWAN           | SWAN             |
| # WILLIAM        | WILLIAM          |
| A Test           | A Test           |
| Abigail Anderson | Abigail Anderson |