1.3.11.6 Character Replace

The Character Replace processor replaces individual characters. This enables the standardization or normalization of characters matching a Reference Data map.

Inconsistent characters, such as accented letters or variant forms of symbols (for example, open and closed quotes), can mask otherwise similar data. Use the Character Replace processor to replace every instance of each character listed in the Reference Data map with its replacement character.

In some cases, Character Replace may be used for simple character-to-character transliteration, by mapping characters from one writing system to another.
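As a rough illustration of character-to-character transliteration outside the product, the following Python sketch applies a small Greek-to-Latin map with the standard `str.translate` method. The map contents and the function name `transliterate` are assumptions made for this example; they are not shipped Reference Data and do not describe how the processor is implemented.

```python
# Hypothetical transliteration map (illustrative only; not shipped Reference Data).
GREEK_TO_LATIN = {
    "α": "a",
    "β": "b",
    "γ": "g",
    "δ": "d",
}


def transliterate(value: str, char_map: dict) -> str:
    """Replace each character found in char_map; leave all others unchanged."""
    return value.translate(str.maketrans(char_map))


print(transliterate("αβγδ", GREEK_TO_LATIN))
```

Characters that do not appear in the map pass through unchanged, which mirrors the one-for-one nature of the mapping.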

The following table describes the configuration options:

Configuration Description

Inputs

Specify any String or String Array type attributes where you want to replace characters. Number and Date attributes are not valid inputs.

If you input an Array attribute, the transformation will apply to all array elements, and an Array attribute will be output.

Options

Specify the following options:

  • Ignore case: enables the replacement of both upper and lower case forms of characters (where they exist). Specified as Yes/No. Default value: No.

  • Transform Map Reference Data: maps a character to its replacement character. Specified as Reference Data. Default value: *Standardize Accented Characters.

Outputs

Describes any data attribute or flag attribute outputs.

Data Attributes

The following data attributes are output:

  • [Attribute Name].CharReplace: a new String or Array attribute with the replaced characters. Value is derived from the original attribute value, after character replacement.

Flags

None.

The Character Replace processor presents no summary statistics on its processing. In the Data view, the input attributes are shown with the new attribute, containing the character-substituted string, to their right.

Output Filters

None.
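The input and output behavior described above can be sketched in Python. This is a minimal sketch under stated assumptions: a record is modeled as a plain dictionary, and the names `replace_chars`, `add_char_replace`, and `CHAR_MAP` are hypothetical. It shows a String or Array input producing a new attribute named `[Attribute Name].CharReplace` while the original attribute is kept, with case-sensitive matching as in the default Ignore case = No.

```python
# Hypothetical map for illustration; real maps are supplied as Reference Data.
CHAR_MAP = {"É": "E", "ô": "o"}


def replace_chars(value, char_map):
    """Apply the map to a string, or to every element of a list of strings."""
    table = str.maketrans(char_map)
    if isinstance(value, list):
        # Array input: transform each element and return an array.
        return [element.translate(table) for element in value]
    return value.translate(table)


def add_char_replace(record, attribute, char_map):
    """Return a copy of record with a new '<attribute>.CharReplace' entry."""
    result = dict(record)  # the original attribute is left untouched
    result[f"{attribute}.CharReplace"] = replace_chars(record[attribute], char_map)
    return result


record = {"names": ["Élodie", "Jérôme"]}
print(add_char_replace(record, "names", CHAR_MAP))
```

Because matching is case-sensitive in this sketch, the lower case é in the second name is not replaced; only the characters that appear in the map are affected.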

Example

In this example, the Character Replace processor is used to standardize accented letters in a First name attribute.

Transformation Map Reference Data:

Lookup    Map    Comment
É         E      E acute
È         E      E grave
ô         o      o circumflex

Ignore case = Yes

Results:

accent names    accent names.CharReplace
élise           elise
Aimée           Aimee
Marie-élise     Marie-elise
Cécile          Cecile

Note:

With Ignore case set to Yes, upper case É is transformed to E and lower case é is transformed to e, even though only É appears in the map.
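The example can be approximated in Python to show how Ignore case = Yes might behave. This is a sketch, not the processor's actual implementation: the helper `expand_ignore_case` is hypothetical, and it assumes that each map entry is extended to the other case form of the lookup character, with the replacement converted to the same case.

```python
# Map entries from the example above.
TRANSFORMATION_MAP = {"É": "E", "È": "E", "ô": "o"}


def expand_ignore_case(char_map):
    """Add upper and lower case forms of every single-character mapping."""
    expanded = dict(char_map)
    for source, target in char_map.items():
        for convert in (str.upper, str.lower):
            variant = convert(source)
            # Skip case forms that are not a single character.
            if len(variant) == 1:
                # Entries already present in the map are never overwritten.
                expanded.setdefault(variant, convert(target))
    return expanded


table = str.maketrans(expand_ignore_case(TRANSFORMATION_MAP))
NAMES = ["élise", "Aimée", "Marie-élise", "Cécile"]
results = [name.translate(table) for name in NAMES]

for name, replaced in zip(NAMES, results):
    print(name, "->", replaced)
```

Under these assumptions the sketch yields the same values as the Results table: elise, Aimee, Marie-elise, and Cecile.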