Oracle Java CAPS Master Index Match Engine Reference
Master Index Match Engine Match String Fields
The MatchingConfig section of mefa.xml determines which fields are passed to the Master Index Match Engine for matching (the match string). The match types specified in this section help the match engine determine the algorithm and custom logic to use for matching on each field.
If you are matching on fields parsed from a free-form text field, define each individual parsed field you want to use for matching in the Master Index Wizard or Configuration Editor. The match types you can use for each field in this section are defined in the first column of the match configuration file (matchConfigFile.cfg). Make sure the match type you specify has the correct matching logic defined in the match configuration file. See Master Index Match Engine Match Types for more information.
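The check described above (every match type used in the match string should be defined in the match configuration file) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the first whitespace-separated token on each non-empty line of matchConfigFile.cfg is treated as a match type, and any header or comment lines in a real file would need to be filtered out separately. The sample rows are placeholders, not real configuration values.

```python
import xml.etree.ElementTree as ET


def defined_match_types(cfg_text):
    """Collect the first whitespace-separated token of each non-empty line.

    Assumption: the first column of the match configuration file holds the
    match type; header or comment lines are not handled here.
    """
    types = set()
    for line in cfg_text.splitlines():
        tokens = line.split()
        if tokens:
            types.add(tokens[0])
    return types


def undefined_match_types(match_string_xml, cfg_text):
    """Return match types used in the match string but not defined in the cfg."""
    known = defined_match_types(cfg_text)
    root = ET.fromstring(match_string_xml)
    used = [el.text.strip() for el in root.iter("match-type") if el.text]
    return [t for t in used if t not in known]


# Placeholder rows: only the first column matters for this check; the
# remaining columns are made-up stand-ins.
SAMPLE_CFG = """\
FirstName x x x
LastName x x x
"""

SAMPLE_MATCH_STRING = """\
<match-system-object>
  <object-name>Person</object-name>
  <match-columns>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.FirstName_Std</column-name>
      <match-type>FirstName</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.SSN</column-name>
      <match-type>SSN</match-type>
    </match-column>
  </match-columns>
</match-system-object>
"""

if __name__ == "__main__":
    # SSN is used in the match string but has no row in SAMPLE_CFG.
    print(undefined_match_types(SAMPLE_MATCH_STRING, SAMPLE_CFG))
```

An empty result means every match type in the match string has a row in the configuration text that was checked; it says nothing about whether the matching logic in that row is appropriate for the field.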
The following topics provide more information about matching on different types of data:
Person Data Match String Fields
Address Data Match String Fields
Business Name Match String Fields
Person Data Match String Fields
When matching on person data, you can include any field stored in the database in the match string. To configure the match string, follow the instructions under Defining the Master Index Match String in Oracle Java CAPS Master Index Configuration Guide. For the Master Index Match Engine, each data type has a different match type (specified by the match-type element in the matching configuration file). The FirstName, LastName, SSN, Gender, and DOB match types are specific to person matching. You can also specify any of the other match types defined in the match configuration file. For more information, see Master Index Match Engine Match Types.
A sample match string for person matching is shown below. This sample matches on first and last names, date of birth, social security number, gender, and the street name of the address.
<match-system-object>
  <object-name>Person</object-name>
  <match-columns>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.FirstName_Std</column-name>
      <match-type>FirstName</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.LastName_Std</column-name>
      <match-type>LastName</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.SSN</column-name>
      <match-type>SSN</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.DOB</column-name>
      <match-type>DateDays</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.Gender</column-name>
      <match-type>Char</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.Address.StreetName</column-name>
      <match-type>StreetName</match-type>
    </match-column>
  </match-columns>
</match-system-object>
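To review which match type is assigned to each field, the match string can be read with a small script. The following Python sketch is an illustration only: the function name match_columns is hypothetical, and the sample is an abbreviated excerpt of the person match string shown above. Whitespace around the element text is stripped so that padded column names compare cleanly.

```python
import xml.etree.ElementTree as ET


def match_columns(match_string_xml):
    """Return (column-name, match-type) pairs in document order."""
    root = ET.fromstring(match_string_xml)
    pairs = []
    for col in root.iter("match-column"):
        # findtext returns None when the child element is missing.
        name = (col.findtext("column-name") or "").strip()
        mtype = (col.findtext("match-type") or "").strip()
        pairs.append((name, mtype))
    return pairs


# Abbreviated excerpt of the person sample; note the padded column name.
SAMPLE = """\
<match-system-object>
  <object-name>Person</object-name>
  <match-columns>
    <match-column>
      <column-name> Enterprise.SystemSBR.Person.FirstName_Std </column-name>
      <match-type>FirstName</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.DOB</column-name>
      <match-type>DateDays</match-type>
    </match-column>
  </match-columns>
</match-system-object>
"""

if __name__ == "__main__":
    for name, mtype in match_columns(SAMPLE):
        print(f"{name} -> {mtype}")
```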
Address Data Match String Fields
For matching on street address fields, make sure the match string you specify in the MatchingConfig section of mefa.xml contains all or a subset of the fields that contain the standardized data (the original text in street address fields is generally too inconsistent to use for matching). You can include additional fields for matching, such as the city name or postal code.
To configure the match string, follow the instructions under Defining the Master Index Match String in Oracle Java CAPS Master Index Configuration Guide. For the Master Index Match Engine, each component of a street address has a different match type (specified by the match-type element in the matching configuration file). The default match types for addresses are StreetName, HouseNumber, StreetDir, and StreetType. You can specify any of the other match types defined in the match configuration file, as well. For more information, see Master Index Match Engine Match Types.
A sample match string for address matching is shown below.
<match-system-object>
  <object-name>Person</object-name>
  <match-columns>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.Address.StreetName</column-name>
      <match-type>StreetName</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.Address.HouseNumber</column-name>
      <match-type>HouseNumber</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.Address.StreetDir</column-name>
      <match-type>StreetDir</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Person.Address.StreetType</column-name>
      <match-type>StreetType</match-type>
    </match-column>
  </match-columns>
</match-system-object>
Business Name Match String Fields
For matching on business name fields, make sure the match string you specify in the MatchingConfig section of mefa.xml contains all or a subset of the fields that contain the standardized data (the unparsed business names are typically too inconsistent for matching). You can include additional fields for matching if required.
To configure the match string, follow the instructions under Defining the Master Index Match String in Oracle Java CAPS Master Index Configuration Guide. For the Master Index Match Engine, each data type has a different match type (specified by the match-type element of the matching configuration file). The PrimaryName, OrgTypeKeyword, AssocTypeKeyword, IndustrySectorList, IndustryTypeKeyword, and Url match types are specific to business name matching. You can specify any of the other match types defined in the match configuration file, as well. For more information, see Master Index Match Engine Match Types.
A sample match string for business name matching is shown below. This sample matches on the company name, the organization type, and the sector.
<match-system-object>
  <object-name>Company</object-name>
  <match-columns>
    <match-column>
      <column-name>Enterprise.SystemSBR.Company.Name_PrimaryName</column-name>
      <match-type>PrimaryName</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Company.Name_OrgType</column-name>
      <match-type>OrgTypeKeyword</match-type>
    </match-column>
    <match-column>
      <column-name>Enterprise.SystemSBR.Company.Name_Sector</column-name>
      <match-type>IndustryTypeKeyword</match-type>
    </match-column>
  </match-columns>
</match-system-object>
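One way to reduce the risk of malformed tags in a hand-edited match string is to generate the element from a list of field and match-type pairs. The Python sketch below is a minimal illustration, not part of the product: build_match_string is a hypothetical helper, and the generated text would still need to be placed in the MatchingConfig section of mefa.xml by following the Configuration Guide.

```python
import xml.etree.ElementTree as ET


def build_match_string(object_name, columns):
    """Build a match-system-object element as a string.

    columns is a sequence of (column-name, match-type) pairs.
    """
    root = ET.Element("match-system-object")
    ET.SubElement(root, "object-name").text = object_name
    cols = ET.SubElement(root, "match-columns")
    for column_name, match_type in columns:
        col = ET.SubElement(cols, "match-column")
        ET.SubElement(col, "column-name").text = column_name
        ET.SubElement(col, "match-type").text = match_type
    # The serializer emits matching open and close tags for every element.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_match_string("Company", [
        ("Enterprise.SystemSBR.Company.Name_PrimaryName", "PrimaryName"),
        ("Enterprise.SystemSBR.Company.Name_OrgType", "OrgTypeKeyword"),
    ]))
```

The output is a single unindented line; it is well-formed XML, but reformatting it for readability is left to the editor of your choice.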