10.1 Abbreviate Processor

The Abbreviate Processor generates an ID-like value from a product description string, and can also be used for general abbreviation of text strings. It takes a single string input, abbreviates it according to the options in Table 10-3, and outputs a single string.

Table 10-1 Input Attributes

| Attribute Name | Data Type | Description | Mandatory |
| --- | --- | --- | --- |
| Input String | String | The input to abbreviate | Y |

Table 10-2 Output Attributes

| Attribute Name | Data Type | Description |
| --- | --- | --- |
| Input.Abbreviated | String | Abbreviated input |

Table 10-3 Options

| Option Name | Data Type | Default | Description |
| --- | --- | --- | --- |
| Characters to replace pre abbreviation | Reference Data | Abbreviate – Characters to Replace | Individual characters to replace before abbreviation. The default reference data in PDS standardizes accented characters and replaces certain punctuation characters with spaces. |
| Characters to strip pre abbreviation | Reference Data | Abbreviate – Characters to Strip | Characters to strip before abbreviation. The default reference data strips diacritics and the remaining punctuation characters. |
| Words to replace pre abbreviation | Reference Data | Abbreviate – Word Replacements | Individual words to replace before abbreviation. |
| Remove vowels in the middle or end of words? (Y/N) | String | Y | Whether to remove vowels in the middle or at the end of words. Applies only to tokens that are of sufficient length (according to the Minimum word length setting) and that do not contain any numeric characters. |
| Remove vowels at the start of words? (Y/N) | String | N | Whether to remove vowels at the start of words. Applies only to tokens that are of sufficient length (according to the Minimum word length setting) and that do not contain any numeric characters. |
| Replace double consonants with single? (Y/N) | String | Y | Whether to replace double consonants with a single consonant. Applies only to tokens that are of sufficient length (according to the Minimum word length setting) and that do not contain any numeric characters. |
| Abbreviate tokens containing numeric characters (Y/N) | String | N | Whether to apply the abbreviation options to tokens containing numeric characters. |
| Don’t abbreviate words of this number of | Integer | <Blank> | Minimum length a token must have for the abbreviation options (the three Y/N options above) to be applied to it. |
| Truncate words of more than this number of characters (after abbreviation) | Integer | <Blank> | Maximum length for a token after it has been processed according to the other options. Any token longer than this is truncated to this length, with characters removed from the end. |
| Standardize words | Reference Data | <Blank> | Reference data for replacing words prior to abbreviation (but after simple standardization/normalization processing). |
| Separator for output words | String | <Blank> | Separator to be output between tokens in the output string. |

Table 10-4 Examples

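Example 1
Input: Sony Bravvia LCD TV, 47", Silver, Abacus
Remove vowels at start: N
Remove vowels middle/end: Y
Replace double consonants: Y
Abbreviate tokens with numerics: N
Max token length: <Blank>
Min token length: <Blank>
Output delimiter: |
Output: SNY|BRV|LCD|TV|47|SLVR|ABCS

Example 2
Input: TTB653SDS Hexarmor Sharpsmaster HV 7082 Needlestick-Resistant
Remove vowels at start: N
Remove vowels middle/end: Y
Replace double consonants: Y
Abbreviate tokens with numerics: N
Max token length: 8
Min token length: <Blank>
Output delimiter: |
Output: TTB653SDS|HXRMR|SHRPSMST|HV|7082|NDLSTCKR|GLVS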
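
The interaction of the options can be illustrated with a minimal Python sketch. It is not the processor's implementation: the class and function names, the small character maps standing in for the reference data, and the order of the steps are all assumptions made for illustration. The sketch also assumes that output is upper-cased, that "start of a word" means its first character only, that tokens shorter than the minimum length are left unabbreviated, and that tokens containing digits are exempt from truncation when numeric abbreviation is off (inferred from Table 10-4, where TTB653SDS keeps nine characters with a maximum of 8).

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

VOWELS = set("AEIOU")
# Matches any upper-case consonant repeated one or more times in a row.
DOUBLE_CONSONANT = re.compile(r"([B-DF-HJ-NP-TV-Z])\1+")


@dataclass
class AbbreviateOptions:
    # Hypothetical stand-ins for the reference data options; these are
    # NOT the default PDS reference data, only enough to run the examples.
    chars_to_replace: Dict[str, str] = field(
        default_factory=lambda: {",": " ", "/": " "})
    chars_to_strip: str = "\"'-."
    word_replacements: Dict[str, str] = field(default_factory=dict)
    standardize_words: Dict[str, str] = field(default_factory=dict)
    remove_mid_end_vowels: bool = True
    remove_start_vowels: bool = False
    single_double_consonants: bool = True
    abbreviate_numeric_tokens: bool = False
    min_word_length: Optional[int] = None   # None models <Blank>
    max_word_length: Optional[int] = None   # None models <Blank>
    separator: str = ""


def abbreviate_token(token: str, opts: AbbreviateOptions) -> str:
    """Apply the abbreviation options to one upper-cased token."""
    has_digit = any(ch.isdigit() for ch in token)
    if has_digit and not opts.abbreviate_numeric_tokens:
        return token  # assumption: also exempt from truncation
    # Assumption: tokens shorter than the minimum are not abbreviated.
    if opts.min_word_length is None or len(token) >= opts.min_word_length:
        head, tail = token[:1], token[1:]
        if opts.remove_start_vowels and head in VOWELS:
            head = ""
        if opts.remove_mid_end_vowels:
            tail = "".join(ch for ch in tail if ch not in VOWELS)
        token = head + tail
        if opts.single_double_consonants:
            token = DOUBLE_CONSONANT.sub(r"\1", token)
    if opts.max_word_length is not None:
        token = token[:opts.max_word_length]  # drop characters from the end
    return token


def abbreviate(text: str, opts: AbbreviateOptions) -> str:
    """Normalize the input, abbreviate each token, and join the results."""
    text = text.upper()
    for old, new in opts.chars_to_replace.items():
        text = text.replace(old, new)
    text = "".join(ch for ch in text if ch not in opts.chars_to_strip)
    words = [opts.word_replacements.get(w, w) for w in text.split()]
    words = [opts.standardize_words.get(w, w) for w in words]
    tokens = [abbreviate_token(w, opts) for w in words]
    return opts.separator.join(t for t in tokens if t)


print(abbreviate('Sony Bravvia LCD TV, 47", Silver, Abacus',
                 AbbreviateOptions(separator="|")))
# SNY|BRV|LCD|TV|47|SLVR|ABCS

# Input as printed in Table 10-4; the table's output also ends with GLVS,
# which has no counterpart in this input string.
print(abbreviate("TTB653SDS Hexarmor Sharpsmaster HV 7082 Needlestick-Resistant",
                 AbbreviateOptions(separator="|", max_word_length=8)))
# TTB653SDS|HXRMR|SHRPSMST|HV|7082|NDLSTCKR
```

Under these assumptions the sketch reproduces the first example exactly and every token of the second example that is present in its input. Removing vowels before collapsing double consonants is a design choice of the sketch; the two examples give the same result in either order.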