Understanding the Sun Match Engine

Searching and Matching in Sun Match Engine Applications (Repository)

When a new record is passed to the master index database, the master index application selects a subset of possible matches from the database. The master index application then uses the Sun Match Engine matching algorithm to assign a matching probability weight for each record in this subset (known as the candidate selection pool). To create the candidate selection pool, the master index application makes a series of query passes of the existing data, searching for matches on specific combinations of data. These combinations are defined by the blocking query, which is defined in the Candidate Select file and specified in the Threshold file.

Matching is performed on the fields included in the match string defined in the Match Field file. Each field is assigned a matching weight. The weights for each field are summed to determine the matching probability weight for the entire record (known as the composite weight). Before matching on some fields, such as the first name, the index might standardize the field based on information in the standardization files. You can customize how each field is weighted by modifying the match configuration file.