Understanding the Sun Match Engine

Numeric Comparators

The Sun Match Engine provides several comparison functions for matching on numeric fields.

All of the numeric comparison functions except nS can perform either numeric string comparisons or relative distance calculations. When set for a string comparison, the functions compare numeric strings based on the advanced uncertainty comparator. When set for relative distance calculations, the matching weight between two numbers decreases as the numbers become farther apart, until the difference reaches the relative distance plus one. At that point, the numbers are considered non-matches. For example, if the relative distance is “10” and the base number for comparison is “2”, a field value of 8 receives a lower matching weight than a field value of 4, and a field value of 13 is considered a complete non-match (since the distance between 2 and 13 is 11).

Figure 3 illustrates how the weight decreases as the difference between the two compared fields approaches the relative distance. In this diagram, the relative distance is 10 and the light blue line represents the agreement weight. When the difference between two fields reaches 11 (relative distance plus one), the fields are considered a non-match and are given the full disagreement weight.

Figure 3 Numeric Relative Distance Comparison

Figure shows how weights are assigned based on relative distances.
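The behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the default agreement and disagreement weights, and the linear shape of the decay are hypothetical, since the text specifies only that the weight decreases with distance and reaches the full disagreement weight at the relative distance plus one.

```python
def relative_distance_weight(a, b, relative_distance,
                             agreement_weight=10.0,
                             disagreement_weight=-10.0):
    """Illustrative weight for two numbers under a relative distance.

    Assumption: the weight falls linearly from the agreement weight
    (difference 0) to the disagreement weight (difference equal to
    relative_distance + 1). The real engine's curve may differ.
    """
    diff = abs(a - b)
    # Any difference beyond the relative distance is a complete non-match.
    if diff > relative_distance:
        return disagreement_weight
    # Fraction of the way toward the non-match point (relative distance + 1).
    fraction = diff / (relative_distance + 1)
    return agreement_weight - fraction * (agreement_weight - disagreement_weight)


# Example from the text: base number 2, relative distance 10.
w4 = relative_distance_weight(2, 4, 10)
w8 = relative_distance_weight(2, 8, 10)
w13 = relative_distance_weight(2, 13, 10)
```

With these assumed weights, the value 4 scores higher than 8, and 13 receives the full disagreement weight because its difference from 2 is 11.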

Generic Number Comparator (n)

This is a basic numeric comparison function, processing numeric fields as described above. It accepts the parameters listed in Table 39.

Table 39 n, nI, and nR Comparison Function Parameters

| Parameter | Description |
| --- | --- |
| distance-or-string | Specifies whether a relative distance calculation or a direct string comparison is used. Specify “y” to use a relative distance calculation; specify “n” to use a string comparison. |
| relative-distance | The greatest difference between two integers at which the values could still be considered a possible match. When the difference between two numbers is greater than the relative distance, the numbers are considered a non-match (the weight becomes zero when the actual difference is the relative distance plus one). |
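As a rough sketch of how the two parameters in Table 39 relate, the following Python fragment validates them and records which comparison mode is selected. The class and function names are hypothetical and are not part of the match engine; the rule that a relative distance must be non-negative is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NumericComparatorParams:
    """Hypothetical container for the n, nI, and nR parameters."""
    use_relative_distance: bool
    relative_distance: Optional[float]


def parse_numeric_params(distance_or_string, relative_distance=None):
    # "y" selects a relative distance calculation, "n" a string comparison.
    if distance_or_string not in ("y", "n"):
        raise ValueError("distance-or-string must be 'y' or 'n'")
    if distance_or_string == "n":
        # The relative distance is not used for string comparisons.
        return NumericComparatorParams(False, None)
    # Assumption: a relative distance is required and cannot be negative.
    if relative_distance is None or relative_distance < 0:
        raise ValueError("relative-distance must be a non-negative number")
    return NumericComparatorParams(True, relative_distance)


params = parse_numeric_params("y", 10)
```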

Integer Comparator (nI)

This numeric comparison function matches specifically on integers and accepts the parameters listed in Table 39.

Real Number Comparator (nR)

This numeric comparison function matches specifically on real numbers and accepts the parameters listed in Table 39.

Alphanumeric Comparator (nS)

This numeric comparison function is designed specifically for matching on numeric strings and is useful for matching Social Security numbers or other unique identifiers. It is the only numeric comparator that can compare alphanumeric values rather than just numeric values. It accepts the parameters listed in Table 40.

Table 40 nS Comparison Function Parameters

| Parameter | Description |
| --- | --- |
| fixed-length | An optional parameter that takes the length of the field value into account. If a fixed length is specified, the match engine considers any field of a different length to be a non-match. Specify any integer smaller than the size specified for the field (for more information, see Matching Rules). |
| character-type | An indicator of whether the field must be all numeric. Specify “nu” for numeric only, or specify “an” to allow alphanumeric characters. The match engine considers any field containing characters that are not allowed to be a non-match. |
| invalid-characters | A list of invalid characters for the field. If you specify a character, the match engine considers fields that consist of only that character to be a non-match. For example, if you specify “0”, then an SSN field cannot contain all zeros. Specify as many alphanumeric characters as needed, separated by spaces. |
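The three nS parameters act as checks that can rule a field out before any comparison takes place. The Python sketch below illustrates that reading of Table 40. The function name and signature are hypothetical, and it assumes that “an” permits only ASCII letters and digits and that an empty value is a non-match; neither point is stated in the text.

```python
def ns_field_is_eligible(value, fixed_length=None, character_type="an",
                         invalid_characters=()):
    """Return False when the nS rules would treat the value as a non-match.

    Hypothetical helper: it models only the parameter checks, not the
    string comparison that follows for eligible values.
    """
    # fixed-length: a value of any other length is a non-match.
    if fixed_length is not None and len(value) != fixed_length:
        return False
    # character-type: "nu" allows digits only, "an" allows letters and digits.
    if character_type == "nu":
        if not value or any(c not in "0123456789" for c in value):
            return False
    elif character_type == "an":
        if not (value.isascii() and value.isalnum()):
            return False
    else:
        raise ValueError("character-type must be 'nu' or 'an'")
    # invalid-characters: a value made up of only one listed character
    # (for example, an SSN of all zeros) is a non-match.
    for ch in invalid_characters:
        if set(value) == {ch}:
            return False
    return True


# An SSN-style field: nine digits, numeric only, all zeros disallowed.
ok = ns_field_is_eligible("123456789", 9, "nu", ["0"])
all_zeros = ns_field_is_eligible("000000000", 9, "nu", ["0"])
```

Here the first value passes all three checks, while the all-zeros value is rejected by the invalid-characters rule.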