1.3.4.8.3 Comparison: Character Edit Distance

The Character Edit Distance comparison compares two String/String Array values and determines how closely they match each other by calculating the minimum number of character edits (deletions, insertions and substitutions) needed to transform one value into the other.

The Character Edit Distance comparison is one of the most powerful and commonly used comparisons. Use it to find exact or close matches between two values of an identifier. It is well suited to matching textual values that may be misspelt and so differ by one or two characters. For example, the edit distance between "Matthews" and "Mathews" is 1, because deleting one "t" transforms the first value into the second.
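The calculation described above can be sketched in Python as a standard Levenshtein distance. This is an illustration only, not the product's implementation; the function name character_edit_distance is hypothetical.

```python
def character_edit_distance(a: str, b: str) -> int:
    """Minimum number of deletions, insertions and substitutions
    needed to transform string a into string b (Levenshtein distance)."""
    # previous[j] is the distance between the characters of a processed
    # so far (initially none) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Transforming the first i characters of a into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


print(character_edit_distance("Matthews", "Mathews"))  # 1
```

Because only deletions, insertions and substitutions are counted, two adjacent characters that have been swapped (for example "jhon" and "john") contribute a distance of 2 rather than 1.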

This comparison supports the use of result bands.

The following table describes the configuration options:

Option: Match No Data pairs?
Type: Yes/No
Default Value: No
Description: This option determines the result of a comparison when it compares two No Data (Null, or containing only whitespace characters) values for an identifier.

If set to No, the comparison will give a 'no data' result when comparing a No Data value against another No Data value.

If set to Yes, the comparison will give a full match (a Character Edit Distance of 0) when comparing a No Data value against another No Data value. A 'no data' result will only be returned if a No Data value is compared against a populated value.

Option: Ignore case?
Type: Yes/No
Default Value: Yes
Description: Sets whether or not to ignore case when comparing values.

For example, if case is ignored, "Oracle Corporation" will match "ORACLE CORPORATION" with a Character Edit Distance of 0.
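The effect of these two options can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the product's code: the names compare, NO_DATA, match_no_data_pairs and ignore_case are hypothetical, and lowercasing both values is only one possible way to ignore case.

```python
from typing import Optional, Union

NO_DATA = "no data"  # hypothetical marker for a 'no data' result


def _is_no_data(value: Optional[str]) -> bool:
    # No Data: Null, or containing only whitespace characters.
    return value is None or value.strip() == ""


def _edit_distance(a: str, b: str) -> int:
    # Standard Levenshtein distance (deletions, insertions, substitutions).
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def compare(value_a: Optional[str], value_b: Optional[str],
            match_no_data_pairs: bool = False,
            ignore_case: bool = True) -> Union[int, str]:
    a_empty, b_empty = _is_no_data(value_a), _is_no_data(value_b)
    if a_empty and b_empty:
        # Two No Data values: full match only if the option is set to Yes.
        return 0 if match_no_data_pairs else NO_DATA
    if a_empty or b_empty:
        # No Data against a populated value always gives 'no data'.
        return NO_DATA
    if ignore_case:
        # Assumption: case is ignored by lowercasing both values.
        value_a, value_b = value_a.lower(), value_b.lower()
    return _edit_distance(value_a, value_b)


print(compare("Oracle Corporation", "ORACLE CORPORATION"))  # 0
print(compare(None, "   "))                                 # no data
print(compare(None, "   ", match_no_data_pairs=True))       # 0
```

The defaults in this sketch mirror the default values listed above (Match No Data pairs? set to No, Ignore case? set to Yes).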

Example

In this example, the Character Edit Distance comparison is used to match email addresses. The following options are specified:

Table 1-35 Example Options: Character Edit Distance

Option                  Setting
Match No Data pairs?    No
Ignore case?            Yes

Example results:

Table 1-36 Example Results: Character Edit Distance

Value A                   Value B                     Comparison Result
john/smith@example.com    john.smith@example.com      1
John.Smith@example.com    john.smith@example.com      0
jhon_smith@hotmail.com    john_smith@hotmail.com      2
tom simpson@gmail.com     tomsimpson@gmail.com        1
andrew_johnson@email.net  andrew.johnstone@email.net  3
<null>                    andrew.johnstone@email.net  no data
<null>                    <null>                      no data