Oracle Text Application Developer's Guide
Release 9.0.1

Part Number A90122-01

Working With a Thesaurus, 4 of 5


Using a Thesaurus in a Query Application

Defining a custom thesaurus allows you to process queries more intelligently. Since users of your application might not know which words represent a topic, you can define synonyms or narrower terms for likely query terms. You can use the thesaurus operators to expand your query into your thesaurus terms.

There are two ways to enhance your query application with a custom thesaurus: load a custom thesaurus and issue thesaural queries, or augment the knowledge base with a custom thesaurus.

Each approach has its advantages and disadvantages.

Loading a Custom Thesaurus and Issuing Thesaural Queries

To build a custom thesaurus, follow these steps:

  1. Create your thesaurus. See "Defining Thesaural Terms" in this chapter.

  2. Load the thesaurus with ctxload. For example, the following command imports a thesaurus named tech_doc from an import file named tech_thesaurus.txt:

    ctxload -user jsmith/123abc -thes -name tech_doc -file tech_thesaurus.txt 
    
    
    
  3. Use the thesaurus operators to query. For example, you can find all documents that contain XML and its synonyms as defined in tech_doc:

    'SYN(XML, tech_doc)'
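
The three steps above can be sketched as a small shell script. This is an illustration only, not part of the original guide: the thesaurus entries, the table docs with columns id and text, and the script file name syn_query.sql are assumptions made for the example. Only the ctxload invocation and the SYN(XML, tech_doc) expression come from the steps above.

```shell
#!/bin/sh
set -e

# Step 1: write a thesaurus import file (the entries are illustrative).
# Each term starts in the first column; its relations are indented below it.
cat > tech_thesaurus.txt <<'EOF'
XML
 SYN Extensible Markup Language
 NT XSLT
EOF

# Step 2: load the file as tech_doc. The guard skips this step on
# machines where ctxload is not installed.
if command -v ctxload >/dev/null 2>&1; then
  ctxload -user jsmith/123abc -thes -name tech_doc -file tech_thesaurus.txt
fi

# Step 3: save a query that uses the SYN operator, for use in SQL*Plus.
# The table docs(id, text) is hypothetical and needs a CONTEXT index on text.
cat > syn_query.sql <<'EOF'
SELECT id FROM docs WHERE CONTAINS(text, 'SYN(XML, tech_doc)') > 0;
EOF
```

Because the thesaurus is consulted when the query runs, you can edit tech_thesaurus.txt and reload it later without rebuilding the index.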
    

Advantage

The advantage of using this method is that you can modify the thesaurus after indexing.

Limitations

This method requires you to use thesaurus expansion operators in your query. Long queries can cause extra overhead in the thesaurus expansion and slow your query down.

Augmenting Knowledge Base with Custom Thesaurus

You can add your custom thesaurus to a branch in the existing knowledge base. The knowledge base is a hierarchical tree of concepts used for theme indexing, ABOUT queries, and deriving themes for document services.

When you augment the existing knowledge base with your new thesaurus, you query with the ABOUT operator, which implicitly expands to synonyms and narrower terms. You do not query with the thesaurus operators.

To augment the existing knowledge base with your custom thesaurus, follow these steps:

  1. Create your custom thesaurus, linking new terms to existing knowledge base terms. See "Defining Thesaural Terms" and "Linking New Terms to Existing Terms".

  2. Load the thesaurus with ctxload. See "Loading a Thesaurus with ctxload".

  3. Compile the loaded thesaurus with the ctxkbtc compiler. See "Compiling a Loaded Thesaurus" later in this section.

  4. Index your documents. By default, the system adds a theme component to your index.

  5. Use the ABOUT operator to query. For example, to find all documents that are related to the term politics, including any synonyms or narrower terms as defined in the knowledge base, issue the query:

    'about(politics)'
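
Steps 4 and 5 might look as follows when run from SQL*Plus. This sketch assumes a hypothetical table docs(id, text), an index named docs_idx, and a script file named about_query.sql; the guide names none of these. The shell commands only write the SQL to a file so that you can review it before running it against a database in which the thesaurus has already been loaded and compiled.

```shell
#!/bin/sh
set -e

# Write the indexing statement and the ABOUT query to a script file.
# docs, docs_idx, and about_query.sql are placeholder names.
cat > about_query.sql <<'EOF'
-- Step 4: create a CONTEXT index on the document column.
-- ABOUT queries rely on the theme component of this index.
CREATE INDEX docs_idx ON docs(text) INDEXTYPE IS CTXSYS.CONTEXT;

-- Step 5: query by theme; no thesaurus operator is needed.
SELECT id FROM docs WHERE CONTAINS(text, 'about(politics)') > 0;
EOF
```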
    

Advantage

Compiling your custom thesaurus with the existing knowledge base before indexing allows for faster and simpler queries with the ABOUT operator. Document services can also take full advantage of the customized information for creating theme summaries and Gists.

Limitations

Use of the ABOUT operator requires a theme component in the index, which requires slightly more disk space. You must also define the thesaurus before indexing your documents. If you make any change to the thesaurus, you must recompile your thesaurus and re-index your documents.

Linking New Terms to Existing Terms

When adding terms to the knowledge base, Oracle recommends that new terms be linked to one of the categories in the knowledge base for best results in theme proving.

See Also:

Oracle Text Reference for more information about the supplied English knowledge base. 

If new terms are kept completely separate from existing categories, fewer themes from new terms will be proven. This results in poor precision and recall with ABOUT queries, as well as poor-quality gists and theme highlighting.

You link new terms to existing terms by making an existing term the broader term for the new terms.

Example: Linking New Terms to Existing Terms

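You purchase a medical thesaurus medthes containing a hierarchy of medical terms. The four top terms in the thesaurus are Anesthesia and Analgesia; Anti-Allergic and Respiratory System Agents; Anti-Inflammatory Agents, Antirheumatic Agents, and Inflammation Mediators; and Antineoplastic and Immunosuppressive Agents.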

To link these terms to the existing knowledge base, add the following entries to the medical thesaurus to map the new terms to the existing health and medicine branch:

health and medicine
 NT Anesthesia and Analgesia
 NT Anti-Allergic and Respiratory System Agents
 NT Anti-Inflammatory Agents, Antirheumatic Agents, and Inflammation Mediators
 NT Antineoplastic and Immunosuppressive Agents
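
One way to add these entries is to append them to the thesaurus import file before loading it. The following is a minimal sketch, assuming the purchased thesaurus is stored in a file named med.thes in the current directory (the file name used in the loading step later in this section); adjust the path for your installation.

```shell
#!/bin/sh
set -e

# Append the linking entries to the thesaurus import file.
# Making "health and medicine" the broader term attaches the four
# top terms to that existing knowledge base branch.
cat >> med.thes <<'EOF'
health and medicine
 NT Anesthesia and Analgesia
 NT Anti-Allergic and Respiratory System Agents
 NT Anti-Inflammatory Agents, Antirheumatic Agents, and Inflammation Mediators
 NT Antineoplastic and Immunosuppressive Agents
EOF
```

The spelling of each narrower term must match the term as it appears in the purchased thesaurus, otherwise the link creates a new, unrelated term.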

Loading a Thesaurus with ctxload

Assuming the medical thesaurus is in a file called med.thes, you load the thesaurus as medthes with ctxload as follows:

ctxload -thes -thescase y -name medthes -file med.thes -user ctxsys/ctxsys

Compiling a Loaded Thesaurus

To link the loaded thesaurus medthes to the knowledge base, use ctxkbtc as follows:

ctxkbtc -user ctxsys/ctxsys -name medthes 

Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.