Understanding the Sun Match Engine

Agreement and Disagreement Weight Ranges

Defining agreement and disagreement weight ranges is a more direct way to implement m-probabilities and u-probabilities. Like probabilities, the maximum agreement and minimum disagreement weights you define for each field allow you to define the relative reliability of each field; however, the match weight has a more linear relationship with the numbers you specify. When you use agreement and disagreement weight ranges to determine the match weight, you define a maximum weight for each field when they are in complete agreement and a minimum weight for when they are in complete disagreement. The Sun Match Engine assigns a matching weight to each field that falls between the agreement and disagreement weights specified for the field. This provides a more convenient and intuitive representation of conditional probabilities.

Using the SSN and gender field example above, the SSN field would be assigned a higher maximum agreement weight and a lower minimum disagreement weight than the gender field because it is more reliable. If you assign a maximum agreement weight of “10” and two SSNs match, the match weight for that field is “10”. If you assign a minimum disagreement weight of “-10” and two SSNs are in complete disagreement, the match weight for that field is “-10”. Agreement and disagreement weights are expressed as double values and can have up to 16 decimal points.