Oracle Java CAPS Master Index Match Engine Reference     Java CAPS Documentation


Master Index Match Engine Match Types

The default match configuration file, matchConfigFile.cfg, defines several rules that you can customize for the type of data being processed. Each rule is identified by a match type in the first column of its row. This value tells the match engine which type of matching to perform. In a master index application, you enter the match type for each field in the match string section of mefa.xml.
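As a rough illustration of the first-column convention described above, the following Java sketch pulls the match type out of each rule row. It assumes that columns are separated by whitespace. The class name, method name, and sample rows are hypothetical, and the placeholder text stands in for the remaining columns, which are not described here.

```java
import java.util.ArrayList;
import java.util.List;

class MatchTypeColumn {
    // Returns the first whitespace-delimited token of every non-blank row.
    // Assumption: columns in a rule row are separated by whitespace.
    static List<String> matchTypes(List<String> rows) {
        List<String> types = new ArrayList<>();
        for (String row : rows) {
            String trimmed = row.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            types.add(trimmed.split("\\s+")[0]);
        }
        return types;
    }

    public static void main(String[] args) {
        // Hypothetical rows: only the first column (the match type) matters here.
        List<String> rows = List.of(
            "FirstName    (remaining columns)",
            "",
            "LastName     (remaining columns)",
            "SSN          (remaining columns)");
        System.out.println(matchTypes(rows)); // prints [FirstName, LastName, SSN]
    }
}
```

The values returned are the same names that would be entered as match types for fields in the match string.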

The match configuration file appears under the Match Engine node of the master index project. For more information about the comparison functions used for each match type and how the weights are tuned, see "Customizing the Match Configuration" and "Master Index Match Engine Comparison Functions".

The following four tables list match types that are typically used in processing different data types: person data, address data, business names, and miscellaneous data.

The following match types are designed for matching on person data.

Table 12 Person Data Match Types

Match Type    Data Type Processed
----------    -------------------
FirstName     A first name field, including middle name, alias first name, and alias middle name fields.
LastName      A last name field, including alias last name fields.
SSN           A field containing a social security number.
Gender        A field containing a gender code.

The following match types are designed for matching on address data.

Table 13 Address Match Types

Match Type     Data Type Processed
----------     -------------------
StreetName     The parsed street name field of a street address.
HouseNumber    The parsed house number field of a street address.
StreetDir      The parsed street direction field of a street address.
StreetType     The parsed street type field of a street address.

The following match types are designed for matching on business names.

Table 14 Business Name Match Types

Match Type             Data Type Processed
----------             -------------------
PrimaryName            The parsed name field of a business name.
OrgTypeKeyword         The parsed organization type field of a business name.
AssocTypeKeyword       The parsed association type field of a business name.
LocationTypeKeyword    The parsed location type field of a business name.
AliasList              The parsed alias type field of a business name.
IndustrySectorList     The parsed industry sector field of a business name.
IndustryTypeKeyword    The parsed industry type field of a business name.
Url                    The parsed URL field of a business name.

Miscellaneous match types provide additional logic for matching on a variety of data types, such as date, numeric, string, and character fields.

Table 15 Miscellaneous Match Types

Match Type     Data Type Processed
----------     -------------------
Date           The year of a date field.
DateDays       The day, month, and year of a date field.
DateMonths     The month and year of a date field.
DateHours      The hour, day, month, and year of a date field.
DateMinutes    The minute, hour, day, month, and year of a date field.
DateSeconds    The seconds, minute, hour, day, month, and year of a date field.
String         A generic string field.
Unistring      A generic Unicode string field.
Integer        A field containing integers.
Real           A field containing real numbers.
Char           A field containing a single character.
pro            Any field on which you want the Master Index Match Engine to use prorated weights.
Exac           Any field you want the Master Index Match Engine to match character for character.
CSC            A generic string.
DOB            A date of birth in string rather than date format.
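The six date match types in Table 15 differ only in how much of a date field they take into account. The following Java sketch is one way to picture that difference: it maps each date match type to a format pattern covering the portion of the date listed in the table. The class, the method, and the patterns are assumptions made for illustration only. The sketch does not reproduce how the Master Index Match Engine compares dates or computes weights.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;

class DateMatchGranularity {
    // Assumed mapping from each date match type in Table 15 to the
    // portion of the date that the table says it processes.
    static final Map<String, String> PATTERNS = Map.of(
        "Date", "yyyy",
        "DateMonths", "yyyy-MM",
        "DateDays", "yyyy-MM-dd",
        "DateHours", "yyyy-MM-dd HH",
        "DateMinutes", "yyyy-MM-dd HH:mm",
        "DateSeconds", "yyyy-MM-dd HH:mm:ss");

    // Returns the portion of the value that the given match type considers.
    static String comparedPortion(String matchType, LocalDateTime value) {
        String pattern = PATTERNS.get(matchType);
        if (pattern == null) {
            throw new IllegalArgumentException("Not a date match type: " + matchType);
        }
        return value.format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        LocalDateTime value = LocalDateTime.of(2001, 3, 15, 10, 42, 7);
        System.out.println(comparedPortion("Date", value));       // prints 2001
        System.out.println(comparedPortion("DateMonths", value)); // prints 2001-03
        System.out.println(comparedPortion("DateHours", value));  // prints 2001-03-15 10
    }
}
```

Under this reading, two values recorded on different days of the same month show the same portion for Date and DateMonths but different portions for DateDays and the finer-grained types.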