1.3.11.27 Normalize No Data

The Normalize No Data processor normalizes the different types of 'blank' values that exist in data attributes to Null values, or to a specific value of your choice. This can be important for treating 'No Data' values consistently. For example, if an attribute contains values that consist only of whitespace characters, other processors (for example, comparisons in match processors) will not treat these values as No Data unless they are normalized to Null values.

Note that the Normalize No Data processor can perform the same function (when normalizing to Nulls) as No Data Handling, but allows you to do this within a process. This may be useful in two cases: when you want to be aware of the different types of No Data that exist in your source when profiling, but still treat them as blank values (Nulls) in downstream processors; or when you have introduced blank values in attributes via other transformations, such as denoising or trimming a value until it consists only of whitespace, and want to treat them as Nulls.

Use the Normalize No Data processor where you discover in Pattern Profiling that you have attribute values that contain only whitespace characters, and which you therefore want to consider as containing no data of any use. Downstream processors will then be guaranteed to treat such values as Null. For example, comparisons in match processors will give a No Data comparison result if comparing a Null string with a data value.
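As a rough illustration of why this matters, the Python sketch below uses a hypothetical `compare` function (a simplified stand-in, not an EDQ API) that returns a No Data result only when one side is Null (`None`). Under that assumption, a whitespace-only value is treated as data until it has been normalized.

```python
from typing import Optional


def compare(a: Optional[str], b: Optional[str]) -> str:
    """Hypothetical stand-in for a match comparison; not an EDQ API."""
    if a is None or b is None:
        return "No Data"
    return "Match" if a == b else "No Match"


def normalize_whitespace_only(value: Optional[str]) -> Optional[str]:
    """Map empty or whitespace-only strings to None (Null)."""
    if value is None or value.strip() == "":
        return None
    return value


raw = "   "  # whitespace only
before = compare(raw, "Smith")                            # treated as data
after = compare(normalize_whitespace_only(raw), "Smith")  # treated as Null
```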

Alternatively, you can specifically transform No Data values to a value of your choice to differentiate them from values that were originally Null (before normalization).

The following table describes the configuration options:

Configuration Description

Inputs

Specify any number of String or String Array attributes where you wish to normalize No Data values to Nulls, or a specific value.

Note that if you input an Array attribute, the transformation will apply to all array elements, and an Array attribute will be output.

Options

Specify the following options:

  • No data handling reference data: lists the set of characters that you wish to treat as No Data characters. Values consisting entirely of these characters, and empty strings, will be normalized to Null, or the specified value. Specified as Reference Data (No Data Handling Category). Default value: *No Data Handling.

  • Normalize no data to: determines whether to normalize no data values to Null, or to a custom string value of your choice, specified by the option below. Specified as a Selection (Null values/custom string). Default value: Null values.

  • Custom string: the custom string value to which no data values will be normalized, if they are not being normalized to Null values. Specified as free text. Default value: None.

Outputs

Describes any data attribute or flag attribute outputs.

Data Attributes

The following data attributes are output:

  • [Attribute Name].NoDataNormalized: Holds the new attribute values, after no data values have been normalized. The value is the original attribute value, unless it was Null, an empty string, or contained only no data characters (as defined by the specified reference data list), in which case it is transformed to either a Null value or the custom string.

Flags

None.
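The configuration described above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the processor's implementation: `NO_DATA_CHARS` is an assumed stand-in for the No Data Handling reference data, and passing `custom_string=None` stands in for the "Null values" selection.

```python
from typing import List, Optional, Set, Union

Value = Union[Optional[str], List[Optional[str]]]

# Assumed stand-in for the reference data list of No Data characters.
NO_DATA_CHARS: Set[str] = {" ", "\t", "\n", "\r"}


def normalize_no_data(
    value: Value,
    no_data_chars: Set[str] = NO_DATA_CHARS,
    custom_string: Optional[str] = None,
) -> Value:
    """Return value with No Data normalized to None or to custom_string."""
    if isinstance(value, list):
        # Array input: apply to every element and return an array.
        return [normalize_no_data(v, no_data_chars, custom_string) for v in value]
    # None, empty strings, and strings made up only of No Data characters
    # all count as No Data. all() over an empty string is True.
    if value is None or all(ch in no_data_chars for ch in value):
        return custom_string
    return value
```

For instance, `normalize_no_data(["Ms", " ", None], custom_string="X")` returns a list of the same length in which only the second and third elements are replaced, mirroring the array behavior described under Inputs.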

The Normalize No Data transformer presents no summary statistics on its processing.

In the Data view, each input attribute is shown with its new normalized attribute to the right.

Output Filters

None.

Example

In this example, the Normalize No Data processor is used to normalize all blank values in a TITLE attribute to the custom string '#NO DATA#':

TITLE     TITLE.NoDataNormalized
Ms        Ms
[Null]    #NO DATA#
Mr        Mr
Miss      Miss
[Null]    #NO DATA#
Ms        Ms
[Null]    #NO DATA#
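The example above can be reproduced outside the product with a short Python sketch. The TITLE values and the custom string are taken from the example; the set of No Data characters and the helper name `is_no_data` are assumptions made for illustration.

```python
NO_DATA_CHARS = {" ", "\t"}  # assumed No Data characters
CUSTOM_STRING = "#NO DATA#"

# TITLE values from the example; None represents [Null].
titles = ["Ms", None, "Mr", "Miss", None, "Ms", None]


def is_no_data(value):
    """True for None, empty strings, and strings of only No Data characters."""
    return value is None or all(ch in NO_DATA_CHARS for ch in value)


normalized = [CUSTOM_STRING if is_no_data(t) else t for t in titles]

# Print the two columns side by side.
for original, new in zip(titles, normalized):
    label = "[Null]" if original is None else original
    print(f"{label:<10}{new}")
```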