Understanding Oracle Java CAPS Master Index Configuration Options (Repository)
In the Candidate Select file, you configure properties of the Query Builder, a class that uses defined criteria and options to generate queries against a master index database and return their results. The criteria must be fields that are defined in the Object Definition, and the options are key and value pairs that fine-tune the query operation. You can define the characteristics of both the searches performed from the Enterprise Data Manager (EDM) and the queries that the master index application uses to retrieve a candidate pool of potential matches for incoming records.
The following topics provide information about queries and the structure of the Candidate Select file:
The master index application runs queries in two situations: users perform manual queries from the EDM, and the master index application automatically performs queries before processing matches for an incoming record. Two types of queries, basic queries and blocking queries, are predefined in the Query Builder. By default, basic queries are defined for the EDM and blocking queries are defined for match processing, though this pairing is not required. You can also use a blocking query for the phonetic searches performed from the EDM. Both types of queries are configured in the Candidate Select file, and custom queries can be created and implemented with the master index application.
You can configure both basic and blocking queries to search on standardized or phonetic versions of the search criteria, and you can also specify that they search on exact values or a range of values. Basic queries can be configured to allow wildcard characters. For the blocking queries, you define the criteria to include in each block of query criteria.
The following topics provide additional information about the different types of queries:
By default, searches performed from the EDM follow the logic defined in the configured basic queries. You can specify which query type to use for each search defined for the EDM (this is specified in the Enterprise Data Manager file). These searches can be weighted, which means that the match engine calculates the likelihood that the search results match the actual search criteria and assigns a matching weight to each returned record. You can specify whether the search is performed on the original or phonetic version of the criteria.
The basic query uses all supplied search criteria to create a single SQL query. For this query, each field in the WHERE clause is joined by an AND operator, meaning that only records that match on all search fields are returned. This query has an option to allow wildcard characters in the search criteria, where a percent sign (%) stands for multiple unknown characters. When this option is set to true, the query uses the LIKE operator rather than the equals operator (=). This option allows you to search by criteria for which you have incomplete data.
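As a rough illustration of the wildcard option, a basic query definition in the Candidate Select file might look like the fragment below. The element and attribute names are taken from the sample file at the end of this section; the query name EXAMPLE-WILDCARD-SEARCH is hypothetical and used only for illustration.

```xml
<!-- Hypothetical basic query; the name is illustrative only. -->
<query-builder name="EXAMPLE-WILDCARD-SEARCH"
    class="com.stc.eindex.querybuilder.BasicQueryBuilder"
    parser-class="com.stc.eindex.configurator.impl.querybuilder.KeyValueConfiguration"
    standardize="true" phoneticize="false">
  <config>
    <!-- true: criteria are compared with LIKE, so a value such as Sm% can
         match any stored value that begins with Sm.
         false: criteria are compared for equality. -->
    <option key="UseWildcard" value="true"/>
  </config>
</query-builder>
```

With either setting, every supplied search field contributes one condition to the WHERE clause and the conditions are joined by AND, so adding criteria can only narrow the result set.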
The searches performed from the EDM can be further customized in the Enterprise Data Manager file (for more information, see Enterprise Data Manager Configuration).
When the master index application evaluates possible matches of records sent to the master index application from external systems and from the EDM, the index performs a set of predefined SQL queries to retrieve a subset of possible matches. These queries are known as blocking queries. The matching algorithm processes the input record against the profiles retrieved from the blocking query (known as the candidate pool) and assigns them matching probability weights.
In the Candidate Select file, you define the criteria and conditions for querying the database to retrieve the subset of possible matches to the incoming record, including Oracle hints and SQL Server OPTION hints. You can define multiple queries, known as blocks, for each blocking query. The master index application performs each of these queries in turn (each one is called a match pass) until sufficient records are retrieved. Using the default Query Builder, a block is only processed if the search criteria include all of the fields defined for that block. Each field in a block is joined by an AND operator in the WHERE clause, and each block is joined by a UNION operator. This type of search can also be used as a phonetic search in the EDM.
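A minimal sketch of a single block with a database hint is shown here, reusing the element names and field paths from the sample file at the end of this section. The query name EXAMPLE-BLOCKER is hypothetical.

```xml
<!-- Hypothetical blocking query containing one block. -->
<query-builder name="EXAMPLE-BLOCKER"
    class="com.stc.eindex.querybuilder.BlockerQueryBuilder"
    parser-class="com.stc.eindex.configurator.impl.blocker.BlockerConfig"
    standardize="true" phoneticize="true">
  <config>
    <block-definition number="ID000000">
      <!-- Optional database hint, written as in the sample file. -->
      <hint>ALL_ROWS</hint>
      <block-rule>
        <!-- The two conditions are joined by AND. With the default Query
             Builder, this block is processed only when the incoming record
             supplies both source fields. -->
        <equals>
          <field>Enterprise.SystemSBR.Person.FnamePhonetic</field>
          <source>Person.FnamePhoneticCode</source>
        </equals>
        <equals>
          <field>Enterprise.SystemSBR.Person.LnamePhonetic</field>
          <source>Person.LnamePhoneticCode</source>
        </equals>
      </block-rule>
    </block-definition>
  </config>
</query-builder>
```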
The blocking queries you define here are referenced in the Threshold file, which specifies which one of the defined blocking queries to use for match processing. They might also be referenced in the Enterprise Data Manager file if a blocking query is used for phonetic searches from the EDM. To enable extensive searching (that is, searching against additional tables, such as an alias table for a person index), you must add the fields from that table to the blocking query.
You can configure both basic queries and blocking queries to perform phonetic searches from the EDM. If you use a basic query, then all entered criteria must match existing records in order to return results from the search. If you use a blocking query, several queries are performed using different combinations of data until enough matching records are returned or until all defined combinations have been tried.
For example, if you use a basic query and enter first and last name, date of birth, gender, and SSN for criteria, the basic query might not return any matches if any one of those fields does not match the criteria. However, if you use a blocking query for the same example, it might search on SSN, then on first name and date of birth, and then on last name and gender. The query returns any matching records from any of the query passes.
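The blocking example above could be expressed with three block definitions along the following lines. The field paths are borrowed from the sample file at the end of this section. Blocking on phonetic name codes rather than raw names, and the use of exact matching on date of birth, are assumptions made for this sketch.

```xml
<config>
  <!-- Block 1: SSN only. -->
  <block-definition number="ID000000">
    <block-rule>
      <equals>
        <field>Enterprise.SystemSBR.Person.SSN</field>
        <source>Person.SSN</source>
      </equals>
    </block-rule>
  </block-definition>
  <!-- Block 2: first name (phonetic code) AND date of birth. -->
  <block-definition number="ID000001">
    <block-rule>
      <equals>
        <field>Enterprise.SystemSBR.Person.FnamePhonetic</field>
        <source>Person.FnamePhoneticCode</source>
      </equals>
      <equals>
        <field>Enterprise.SystemSBR.Person.DOB</field>
        <source>Person.DOB</source>
      </equals>
    </block-rule>
  </block-definition>
  <!-- Block 3: last name (phonetic code) AND gender. -->
  <block-definition number="ID000002">
    <block-rule>
      <equals>
        <field>Enterprise.SystemSBR.Person.LnamePhonetic</field>
        <source>Person.LnamePhoneticCode</source>
      </equals>
      <equals>
        <field>Enterprise.SystemSBR.Person.Gender</field>
        <source>Person.Gender</source>
      </equals>
    </block-rule>
  </block-definition>
</config>
```

A record with a mistyped SSN can still be found by the second or third block, which is what makes the blocking query more forgiving than a basic query over the same five fields.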
Both basic and blocking queries can be configured to perform exact searches or range searches. An exact search performs a query for the exact value entered into a field as search criteria; a range search performs a query on a range of values based on the value entered into a field as search criteria. The basic query supports standard range searching, where both the lower and upper limits of the range are supplied. The blocking query supports standard range searching plus two additional types that use predefined offset values or constants.
Offset values allow you to specify values to be added to or subtracted from the entered value to determine the range on which to search. Constants provide a default value to use as a range when no value is entered or when incomplete information is available.
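The two additional range types might be configured roughly as follows. The offset block repeats the form used in the sample file at the end of this section. The constant block is an assumption: the attribute value type="constant" and the date literals are placeholders chosen for illustration, so check the Candidate Select schema and Range Search Processing (Repository) for the accepted values and formats.

```xml
<config>
  <!-- Offset range, as in the sample file: the lower offset is negative and
       the upper offset is positive, so the search spans a window around the
       entered date of birth. The unit of the offset is not stated in the
       sample. -->
  <block-definition number="ID000000">
    <block-rule>
      <range>
        <field>Enterprise.SystemSBR.Person.DOB</field>
        <source>Person.DOB</source>
        <default>
          <lower-bound type="offset">-5</lower-bound>
          <upper-bound type="offset">5</upper-bound>
        </default>
      </range>
    </block-rule>
  </block-definition>
  <!-- ASSUMPTION: constant bounds used as defaults when no value, or only a
       partial value, is entered. The type value and the date format are
       illustrative and not confirmed by the sample file. -->
  <block-definition number="ID000001">
    <block-rule>
      <range>
        <field>Enterprise.SystemSBR.Person.DOB</field>
        <source>Person.DOB</source>
        <default>
          <lower-bound type="constant">1900-01-01</lower-bound>
          <upper-bound type="constant">2000-01-01</upper-bound>
        </default>
      </range>
    </block-rule>
  </block-definition>
</config>
```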
Range searching is configured in both the Enterprise Data Manager file and the Candidate Select file. The processing logic for different types of range searching is described in Range Search Processing (Repository).
The properties for the predefined queries are defined in the Candidate Select file in XML format. Some of the information entered into the default configuration file is based on the fields you specified for blocking in the wizard, and some is standard across all implementations. For most implementations, this file will require some customization.
The following topics provide information about working with the Candidate Select file:
You can modify the Candidate Select file at any time, but you must regenerate the application and redeploy the project after making any changes to the file. The properties of the blocking query used by the match process should not be modified after moving into production because doing so can cause unexpected matching weight results. The possible modifications to this file are restricted by the schema definition, so be sure to validate the file after making any changes. Most of the components in this file can be configured using the Configuration Editor, which simplifies the process of defining queries by providing a graphical interface to perform the required tasks.
Table 2 lists each element in the Candidate Select file, with a description of the element and any requirements or constraints that apply to it.
Table 2 Candidate Select File Structure
Below is a sample illustrating the elements in the Candidate Select file.
<QueryBuilderConfig module-name="QueryBuilder"
    parser-class="com.stc.eindex.configurator.impl.querybuilder.QueryBuilderConfiguration">
  <query-builder name="ALPHA-SEARCH"
      class="com.stc.eindex.querybuilder.BasicQueryBuilder"
      parser-class="com.stc.eindex.configurator.impl.querybuilder.KeyValueConfiguration"
      standardize="true" phoneticize="false">
    <config>
      <option key="UseWildcard" value="true"/>
    </config>
  </query-builder>
  <query-builder name="PHONETIC-SEARCH"
      class="com.stc.eindex.querybuilder.BasicQueryBuilder"
      parser-class="com.stc.eindex.configurator.impl.querybuilder.KeyValueConfiguration"
      standardize="true" phoneticize="true">
    <config>
      <option key="UseWildcard" value="false"/>
    </config>
  </query-builder>
  <query-builder name="BLOCKER-SEARCH"
      class="com.stc.eindex.querybuilder.BlockerQueryBuilder"
      parser-class="com.stc.eindex.configurator.impl.blocker.BlockerConfig"
      standardize="true" phoneticize="true">
    <config>
      <block-definition number="ID000000">
        <block-rule>
          <equals>
            <field>Enterprise.SystemSBR.Person.FnamePhonetic</field>
            <source>Person.FnamePhoneticCode</source>
          </equals>
          <equals>
            <field>Enterprise.SystemSBR.Person.LnamePhonetic</field>
            <source>Person.LnamePhoneticCode</source>
          </equals>
        </block-rule>
      </block-definition>
      <block-definition number="ID000001">
        <block-rule>
          <equals>
            <field>Enterprise.SystemSBR.Person.SSN</field>
            <source>Person.SSN</source>
          </equals>
        </block-rule>
      </block-definition>
      <block-definition number="ID000002">
        <hint>ALL_ROWS</hint>
        <block-rule>
          <equals>
            <field>Enterprise.SystemSBR.Person.FnamePhonetic</field>
            <source>Person.FnamePhoneticCode</source>
          </equals>
          <range>
            <field>Enterprise.SystemSBR.Person.DOB</field>
            <source>Person.DOB</source>
            <default>
              <lower-bound type="offset">-5</lower-bound>
              <upper-bound type="offset">5</upper-bound>
            </default>
          </range>
          <equals>
            <field>Enterprise.SystemSBR.Person.Gender</field>
            <source>Person.Gender</source>
          </equals>
        </block-rule>
      </block-definition>
    </config>
  </query-builder>
</QueryBuilderConfig>