1.3.4.9.24 Match Transformation: Strip Words

The Strip Words match transformation removes specified words from String values before they are clustered or compared. It works in exactly the same way as the main Strip Words processor.

The Strip Words transformation is useful when clustering or comparing text values that contain many variant forms of words that are not needed to identify the value. For example, when matching company names, suffixes such as "LIMITED", "LTD", "GRP", "GROUP", and "PLC" may be stripped so that only the meaningful parts of the identifier values are matched.

Example

In this example, the Strip Words transformation is used in a comparison on a company name identifier.

Example configuration

Reference Data used includes the following words in the left-most column:

CORP, CORPORATION, LIMITED, LTD, PLC, GROUP, GRP

Delimiter Reference Data: *Delimiters

Delimiter characters: none

Ignore case?: Yes

Example transformations

The following table shows example transformations using the above configuration of the Strip Words transformation:

Table 1-97 Example Transformations for Strip Words

Value                        Transformed Value

ORACLE CORP                  ORACLE

ORACLE CORPORATION           ORACLE

INTERCHANGE GROUP LIMITED    INTERCHANGE

INTERCHANGE GROUP            INTERCHANGE

INTERCHANGE GRP LTD          INTERCHANGE
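
The behavior in the table can be approximated outside the product with a short Python sketch. This is an illustration only, not the product's actual implementation: the function name strip_words and its parameters are hypothetical, and the sketch assumes that words are separated by whitespace, standing in for the *Delimiters Reference Data, whose contents are not listed in this section.

```python
# Words from the left-most column of the example Reference Data.
STRIP_WORDS = {"CORP", "CORPORATION", "LIMITED", "LTD", "PLC", "GROUP", "GRP"}


def strip_words(value, words=STRIP_WORDS, ignore_case=True):
    """Remove whole words listed in `words` from `value`.

    Assumption: tokens are split on whitespace; the real delimiter set
    comes from the *Delimiters Reference Data and may differ.
    """
    if ignore_case:
        # Compare in upper case so that "Ltd" and "LTD" are both stripped.
        lookup = {w.upper() for w in words}
        kept = [t for t in value.split() if t.upper() not in lookup]
    else:
        kept = [t for t in value.split() if t not in words]
    # Rejoin the remaining tokens with single spaces.
    return " ".join(kept)


if __name__ == "__main__":
    for name in [
        "ORACLE CORP",
        "ORACLE CORPORATION",
        "INTERCHANGE GROUP LIMITED",
        "INTERCHANGE GROUP",
        "INTERCHANGE GRP LTD",
    ]:
        print(f"{name!r} -> {strip_words(name)!r}")
```

With "Ignore case?" set to Yes, as in the example configuration, a mixed-case value such as "Interchange Grp Ltd" is reduced to "Interchange" in this sketch. Passing ignore_case=False leaves such a value unchanged, because none of its words match the upper-case entries exactly.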