Configuring Sun Master Indexes (Repository)

Match Comparator Configuration Properties for Sun Master Index (Repository)

The following table lists and describes the Configuration Editor fields used to define the comparison functions. For each field, it also gives the corresponding element and column number in the match configuration file, in case you want to modify the file directly.

Configuration Editor Field 

Match Configuration File Element and Column Number 

Description 

Match Type

match-type (column 1) 

A value that indicates to the Sun Match Engine how each field should be weighted. Each field included in the match string (the MatchingConfig section of the Match Field file) must have a match type corresponding to a match type defined in this file.

Match Size

size (column 2) 

The number of characters on which matching is performed, beginning with the first character. For example, to match on only the first four characters in a 10-digit field, the value of this column should be “4”. 

Null Field

null-field (column 3) 

An index that specifies how to calculate the total weight for null fields or fields that contain only spaces. You can specify any of the following values. The Configuration Editor value is given first, followed by the match configuration file value in parentheses.

  • Zero weight (0) - If one or both fields are empty, the weight used for the field is 0 (zero).

  • Full Combination weight (1) - If both fields are empty, the agreement weight is used; if only one field is empty, the disagreement weight is used.

  • Full Agreement weight (a1) - Specifies to use the full agreement weight if both fields are null.

  • 1/x of the Agreement weight (ax) - Specifies to use a fraction of the agreement weight if both fields are empty. The agreement weight is multiplied by the fraction 1/x to obtain the match weight for that field. When modifying the match configuration file directly, the default is “2” if no number is specified. You can specify any number from 1 through 10.

  • Full disagreement weight (d1) - Specifies to use the full disagreement weight if only one field is null.

  • 1/x of the disagreement weight (dx) - Specifies to use a fraction of the disagreement weight if only one field is empty. The disagreement weight is multiplied by the fraction 1/x to obtain the match weight for the field. When modifying the match configuration file directly, the default is “2” if no number is specified. You can specify any number from 1 through 10.

In the descriptions above, the agreement and disagreement weights are either specified in this file or calculated using a logarithmic formula based on the m-probabilities and u-probabilities, depending on the probability type.
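The null-field options can be sketched in Python as follows. This is an illustration of the rules described for this field, not the Sun Match Engine's own code. The function name and signature are hypothetical. The d branch follows the 1/x description, applying only when exactly one field is empty, and treats d1 as the x = 1 case. Returning 0 for combinations the descriptions do not cover (for example, an a option when only one field is empty) is an assumption.

```python
from typing import Optional


def null_field_weight(option: str, a_empty: bool, b_empty: bool,
                      agreement: float, disagreement: float) -> Optional[float]:
    """Weight for a field when at least one of the two values is empty.

    `option` is the match configuration file value: "0", "1", "a", "a1".."a10",
    "d", or "d1".."d10". Returns None when neither value is empty, because the
    null-field rule does not apply in that case.
    """
    empty_count = int(a_empty) + int(b_empty)
    if empty_count == 0:
        return None
    if not option:
        raise ValueError("null-field option must not be blank")
    if option == "0":
        # Zero weight: one or both fields empty.
        return 0.0
    if option == "1":
        # Full combination weight.
        return agreement if empty_count == 2 else disagreement

    kind, digits = option[0], option[1:]
    if kind not in ("a", "d"):
        raise ValueError(f"unknown null-field option: {option!r}")
    x = int(digits) if digits else 2  # default divisor is 2 when no number is given
    if not 1 <= x <= 10:
        raise ValueError("x must be from 1 through 10")

    if kind == "a" and empty_count == 2:
        return agreement / x      # agreement weight multiplied by 1/x
    if kind == "d" and empty_count == 1:
        return disagreement / x   # disagreement weight multiplied by 1/x
    return 0.0                    # assumption: combinations not described get zero weight
```

For example, with an agreement weight of 10 and a disagreement weight of -5, the option "a" yields 5.0 when both fields are empty, and "d1" yields -5.0 when only one field is empty.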

Function

function (column 4) 

The type of comparison to perform when weighting the field. For information about the available comparison functions, see Match Configuration Comparison Functions for Sun Match Engine (Repository), in Understanding the Sun Match Engine.

Agreement Weight

agreement-weight (column 7) 

The matching weight to be assigned to a field when the field values match between two records; that is, the maximum match weight for a field. This number can be between 0 and 100 and can have up to 16 decimal places. Only set this value if the Probability Type is set to use agreement and disagreement weights.

Disagreement Weight

disagreement-weight (column 8) 

The matching weight to be assigned to a field when the field values do not match between two records; that is, the minimum match weight for a field. This number can be between -100 and 0 and can have up to 16 decimal places. Only set this value if the Probability Type is set to use agreement and disagreement weights.

M-Probability

m-prob (column 5) 

The initial probability that the specified field in two records will match if the records match. The probability is a double value between 0 and 1, and can have up to 16 decimal places. Only set this value if the Probability Type is set to use probabilities.

U-Probability

u-prob (column 6) 

The initial probability that the specified field in two records will match if the records do not match. The probability is a double value between 0 and 1, and can have up to 16 decimal places. Only set this value if the Probability Type is set to use probabilities.
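When the Probability Type is set to use probabilities, the weights are derived from the m-probability and u-probability with a logarithmic formula. The exact formula is not given here, so the sketch below assumes the standard Fellegi-Sunter record-linkage weights with base-2 logarithms; the Sun Match Engine's calculation may differ in base or scaling. The function name is hypothetical.

```python
import math
from typing import Tuple


def weights_from_probabilities(m_prob: float, u_prob: float) -> Tuple[float, float]:
    """Return (agreement_weight, disagreement_weight) for one field.

    Assumed formulas (standard Fellegi-Sunter, not confirmed for this product):
        agreement    = log2(m / u)
        disagreement = log2((1 - m) / (1 - u))
    """
    # Both probabilities must lie strictly between 0 and 1 for the logs to be defined.
    if not (0.0 < m_prob < 1.0 and 0.0 < u_prob < 1.0):
        raise ValueError("m_prob and u_prob must be strictly between 0 and 1")
    agreement = math.log2(m_prob / u_prob)
    disagreement = math.log2((1.0 - m_prob) / (1.0 - u_prob))
    return agreement, disagreement
```

Under these assumptions, a field whose m-probability is higher than its u-probability gets a positive agreement weight and a negative disagreement weight, which is consistent with the ranges given for the Agreement Weight and Disagreement Weight fields.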

Extra Parameters

parameters (column 9) 

Parameters correspond to the comparison function specified in the Function field or column. Some comparison functions do not take any parameters and some take multiple parameters. For information about which functions take parameters and the parameters they take, see Match Configuration Comparison Functions for Sun Match Engine (Repository), in Understanding the Sun Match Engine.
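If you modify the match configuration file directly, it can help to check a row against the column numbers listed in the table. The sketch below maps one row to named fields in that column order. It assumes columns are separated by whitespace and that everything after column 8 belongs to the parameters column; neither detail is stated in this section. The class name, function name, and sample row are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComparatorRow:
    match_type: str             # column 1
    size: int                   # column 2
    null_field: str             # column 3
    function: str               # column 4
    m_prob: float               # column 5
    u_prob: float               # column 6
    agreement_weight: float     # column 7
    disagreement_weight: float  # column 8
    parameters: List[str]       # column 9 (may be empty)


def parse_comparator_row(line: str) -> ComparatorRow:
    """Split one row into the nine columns described in the table.

    Assumes whitespace-separated columns, which is not confirmed by this section.
    """
    cols = line.split()
    if len(cols) < 8:
        raise ValueError(f"expected at least 8 columns, found {len(cols)}")
    return ComparatorRow(
        match_type=cols[0],
        size=int(cols[1]),
        null_field=cols[2],
        function=cols[3],
        m_prob=float(cols[4]),
        u_prob=float(cols[5]),
        agreement_weight=float(cols[6]),
        disagreement_weight=float(cols[7]),
        parameters=cols[8:],
    )


# Hypothetical row; the values are for illustration only.
row = parse_comparator_row("FirstName 15 0 u 0.99 0.001 15 -5")
```

Keeping the parameters as a list reflects that some comparison functions take none and others take several.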