Oracle Text (OT)
This topic provides information on the database objects, related tables, available services, OT matching job workflows as well as the limitations and key considerations for OT matching.
Database Objects and OT-Related Tables
All the OT-related tables are located in the entity resolution (ER) schema. The following is a list of the key tables:
Table 4-3 Key Tables and their Descriptions
| Table Name | Description |
|---|---|
| FCC_MR_RULESET | This table contains the OT match rule sets along with the relevant OT metadata. It is located in the ER schema. |
| FCC_ER_FULL_OT | This table is created from FCC_ER_FULL. It is a source table for OT matching. |
| FCC_MR_SELECTED_CANDIDATES | This table is located in the ER schema. It consists of the OT matching candidates for each source record prior to scoring |
| FCC_MR_MATCHED_RESULTS | This table is located in the ER schema. It consists of the selected candidates based on scoring output. |
| FCC_ER_OT_RUNSKEY_MAP | This table maintains the mapping between the ER runskey and the OT runskey for each batch run. |
| FCC_M_PIPELINE_BATCH_RUN | This table contains OT Batch specific details such as OT Runskey, BATCH_ID etc. It is used to track OT Batch executions. |
| FCC_MATCH_RULESET (Studio Schema) | This table is located in the ER schema. It is used to track the OT matching process execution. |
Services Architecture
For information on the services that are available for OT matching, see thePort Numbers for Application
OT Matching Integration in ER Jobs
Job 1: Rule set Synchronization & Setup
- The job synchronizes the rule sets from
FCC_MATCH_RULESET(Studio schema) toFCC_MR_RULESET(ER schema). - Only the OT rule sets that are not yet synced to the ER schema will be synced in the job execution.
- Ensure that all the OS rule sets are disabled before the Job 1 execution.
- Domain index creation for OT matching is handled in Job 1
- Synonym Matching: Thesaurus:
- In general terms, a thesaurus is referred to as a book or resource that lists words in groups of synonyms and related concepts.
- Similarly, Oracle text provides user ability to create, store & update virtual thesaurus into your database
-
Note:
- There can be only one thesaurus object in a Database server.
- If the thesaurus is owned by a different user and needs to be created by a new one, then the older user needs to remove the thesaurus first. The new user needs to set the
F_IS_RECENTLY_CHANGEDflag column toY, in theFCC_IDX_M_LOOKUPtable against the respective thesaurus record. - This step must be performed in case the thesaurus data is updated.
- Update the
V_SYNONYM_NAMEcolumn in theFCC_MR_M_ENTITY_ATTRIBUTEStable against the desired column / attribute to use synonym data.
Synonyms:
- A synonym is a word or phrase which has the exact same meaning, or a very similar meaning, to another word.
- Oracle Text enables the user to create case-sensitive or case-insensitive thesauruses that define synonyms and hierarchical relationships between words and phrases. You can create a thesaurus and load it into the system.
- CTX_THES Package: To maintain and browse your thesaurus programmatically, you can use the CTX_THES PL/SQL package. With this package, you can browse terms and hierarchical relationships, add and delete terms, add and remove thesaurus relations, and import and export thesauri in and out of the thesaurus tables.
- Thesaurus Operators: To expand query terms according to your loaded thesaurus, you can use the thesaurus operators in the CONTAINS clause. For example, use the SYN operator to expand a term such as user to its synonyms: 'syn(user)'
- Scenarios where it might be required:
- Data containing different names but having the same meaning.
- Data containing the same sounding names
- Steps to enable synonym matching:
- Set the
result.applySynonymsproperty to true, which is present in theapplications.propertiesfile ofmatching-service-otandscoring-engineservices. - For
Matching-service-ot, navigate to the following path:<COMPLIANCE_STUDIO_INSTALLATION_PATH>/deployed/matching-service-ot/conf
- For
scoring-engine, navigate to the following path:<COMPLIANCE_STUDIO_INSTALLATION_PATH>/deployed/scoring-engine/conf
- Set the
- The new table for OT,
FCC_ER_FULL_OT, is populated from theFCC_ER_FULLtable. TheFCC_ER_FULL_OTtable works as the source for OT matching
Job 2: OT Matching Execution
- For each OT batch run, including
RERUN, a new OT runskey is generated and mapped to the ER runskey. - Mapping is stored in the
FCC_ER_OT_RUNSKEY_MAPtable and is used for tracking in theFCC_M_PIPELINE_BATCH_RUNtable.
Limitations and Key Considerations
The following are the limitations and key considerations for OT matching
- Out-Of-the-Box (OOB) rule sets for OS and OT have different structures. Ensure that only the OT OOB rulesets are enabled for OT matching.
- Custom rule sets for OT must not have groups or sub-rules.