Understanding the Sun Match Engine

Weight Ranges Using Probabilities

Determining the match weight ranges when using m-probabilities and u-probabilities is a little more complicated than when using agreement and disagreement weights, because the weights must first be calculated from the probabilities. To determine the maximum match weight that will be generated for each field, use the following formula:


LOG2(m_prob/u_prob)

To determine the minimum match weight that will be generated for each field, use the following formula:


LOG2((1-m_prob)/(1-u_prob))
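The two formulas above can be sketched in Python as follows. This is only an illustration of the arithmetic, not code from the Sun Match Engine; the function names `agreement_weight` and `disagreement_weight` are hypothetical.

```python
import math


def agreement_weight(m_prob: float, u_prob: float) -> float:
    """Maximum weight for a field: LOG2(m_prob/u_prob)."""
    return math.log2(m_prob / u_prob)


def disagreement_weight(m_prob: float, u_prob: float) -> float:
    """Minimum weight for a field: LOG2((1-m_prob)/(1-u_prob))."""
    return math.log2((1 - m_prob) / (1 - u_prob))


if __name__ == "__main__":
    # Example probabilities chosen for illustration only.
    m_prob, u_prob = 0.996, 0.004
    print(round(agreement_weight(m_prob, u_prob), 2))
    print(round(disagreement_weight(m_prob, u_prob), 2))
```

Both probabilities must lie strictly between 0 and 1, otherwise the logarithm is undefined or a division by zero occurs.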

Table 37 below illustrates a sample of m-probabilities and u-probabilities, including the corresponding agreement and disagreement weights that are generated with each combination of probabilities. The maximum match weight is the sum of the agreement weights for all fields, and the minimum match weight is the sum of the disagreement weights. As you can see, the range of match weights generated for a master index application with this configuration is from -35.93 to +38.

Table 37 Sample m-probabilities and u-probabilities

Field Name             m-probability   u-probability   Max Agreement Weight   Min Disagreement Weight
First Name             .996            .004            7.96                   -7.96
Last Name              .996            .004            7.96                   -7.96
Date of Birth          .97             .007            7.11                   -5.04
Gender                 .97             .03             5.01                   -5.01
SSN                    .999            .001            9.96                   -9.96
Maximum Match Weight                                   38
Minimum Match Weight                                                          -35.93
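As a rough check of Table 37, the sketch below recomputes the per-field weights from the sample probabilities and sums them to get the overall range. The `FIELDS` dictionary and the `match_weight_range` function are hypothetical names used only for this illustration. Because the sketch sums unrounded weights, its totals can differ from the table's totals in the second decimal place.

```python
import math

# Sample probabilities from Table 37: field name -> (m-probability, u-probability)
FIELDS = {
    "First Name": (0.996, 0.004),
    "Last Name": (0.996, 0.004),
    "Date of Birth": (0.97, 0.007),
    "Gender": (0.97, 0.03),
    "SSN": (0.999, 0.001),
}


def match_weight_range(fields):
    """Return (minimum, maximum) total match weight for the given fields."""
    # Maximum: every field agrees, so sum LOG2(m_prob/u_prob).
    maximum = sum(math.log2(m / u) for m, u in fields.values())
    # Minimum: every field disagrees, so sum LOG2((1-m_prob)/(1-u_prob)).
    minimum = sum(math.log2((1 - m) / (1 - u)) for m, u in fields.values())
    return minimum, maximum


if __name__ == "__main__":
    low, high = match_weight_range(FIELDS)
    print(f"Match weight range: {low:.2f} to {high:.2f}")
```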