Oracle8i interMedia Text Reference
Release 2 (8.1.6)

Part Number A77063-01

Executables, 4 of 4


Knowledge Base Extension Compiler (ctxkbtc)

The ctxkbtc compiler takes one or more specified thesauri and compiles them with the interMedia Text knowledge base to create an extended knowledge base. The extended information can be application-specific terms and relationships.

The extended knowledge base overrides any terms and relationships in the knowledge base where there is overlap. The extended knowledge base is accessed during tasks that use the knowledge base, such as theme indexing, processing ABOUT queries in English, and extracting document themes with document services.

See Also:

For more information about the knowledge base packaged with interMedia Text, see Appendix J, "Knowledge Base - Category Hierarchy".

For more information about the ABOUT operator, see ABOUT operator in Chapter 4.

For more information about document services, see Chapter 8, "CTX_DOC Package".

Syntax

ctxkbtc -user uname/passwd
       [-name thesname1 [thesname2 ... thesname16]]
       [-revert]
       [-verbose]
       [-log filename]

-user

Specify the username and password for the administrator creating an extended knowledge base.

-name

Specify the name(s) of the thesauri (up to 16) to be compiled with the knowledge base to create the extended knowledge base. The thesauri you specify must already be loaded with ctxload.

-revert

Reverts the extended knowledge base to the default knowledge base provided by interMedia Text.

-verbose

Displays all warnings and messages, including non-NLS messages, to the standard output.

-log

Specify the log file for storing all messages. When you specify a log file, no messages are reported to the standard output.
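
As a rough illustration of how these options combine, the following shell sketch prints ctxkbtc command lines rather than running them. The function names ctxkbtc_compile_cmd and ctxkbtc_revert_cmd are hypothetical helpers and not part of interMedia Text. The flags are the ones listed above, and the name-count check mirrors the limit of 16 thesauri stated for -name.

```shell
# Hypothetical helper: print (do not run) a ctxkbtc command that compiles
# the given thesauri. Usage: ctxkbtc_compile_cmd user/passwd thesname...
ctxkbtc_compile_cmd() {
  _user=$1
  shift
  # -name accepts between 1 and 16 thesaurus names.
  if [ "$#" -lt 1 ] || [ "$#" -gt 16 ]; then
    echo "error: expected 1 to 16 thesaurus names, got $#" >&2
    return 1
  fi
  # "$*" joins the remaining arguments with single spaces.
  echo "ctxkbtc -user $_user -name $* -verbose"
}

# Hypothetical helper: print the command that restores the default
# knowledge base.
ctxkbtc_revert_cmd() {
  echo "ctxkbtc -user $1 -revert"
}

ctxkbtc_compile_cmd ctxsys/ctxsys medthes
ctxkbtc_revert_cmd ctxsys/ctxsys
```

To capture messages in a file instead of the standard output, a -log option with a file name of your choice could be appended to the printed compile command.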

Usage Notes

Knowledge base extension cannot be performed when theme indexing is being performed.

In addition, any SQL sessions that are using interMedia Text functions must be exited and reopened to make use of the extended knowledge base.

There can be only one user extension per installation. Since a user extension affects all users at the installation, only administrators or terminology managers should extend the knowledge base.

Each run of ctxkbtc replaces the existing extension. Running ctxkbtc a second time therefore removes the extension created by the first run.

Before being compiled, each thesaurus must be loaded into interMedia Text as case sensitive, using the "-thescase Y" option in ctxload.

Constraints on Thesaurus Terms

Terms are case sensitive. If a thesaurus has a term in uppercase, for example, the same term present in lowercase form in a document will not be recognized.

The maximum length of a term is 80 characters.

Disambiguated homographs are not supported.
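
The 80-character limit can be checked before a thesaurus is loaded. The sketch below is one possible pre-check, not an Oracle-supplied tool. The function name check_term_lengths is hypothetical, and the script assumes a thesaurus file in which relation lines are indented and start with an uppercase relation keyword such as NT, as in the example later in this section. Note that awk measures length in characters or bytes depending on the implementation and locale.

```shell
# Hypothetical pre-check: report every term longer than 80 characters.
# Assumes relation lines look like "<whitespace>NT <term>".
check_term_lengths() {
  awk '
    {
      term = $0
      # Drop the leading indentation and relation keyword, if present.
      sub(/^[ \t]+[A-Z]+[0-9]*[ \t]+/, "", term)
      # Trim any remaining leading or trailing whitespace.
      sub(/^[ \t]+/, "", term)
      sub(/[ \t]+$/, "", term)
      if (length(term) > 80) {
        printf "line %d: term has %d characters (limit is 80)\n", NR, length(term)
        bad = 1
      }
    }
    # Exit status is 1 if any term was too long, otherwise 0.
    END { exit bad + 0 }
  ' "$1"
}
```

Calling check_term_lengths med.thes prints one message for each over-long term and returns a nonzero status if any were found, so it can gate a later load step in a script.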

Constraints on Thesaurus Relations

The following constraints apply to thesaurus relations:

Linking New Terms to Existing Terms

For best results in theme proving, Oracle recommends that, where appropriate, new terms be linked to one of the categories in the knowledge base.

See Also:

For more information about the knowledge base, see Appendix J, "Knowledge Base - Category Hierarchy" 

If new terms are kept completely disjoint from existing categories, fewer themes from new terms will be proven. The result is poorer precision and recall with ABOUT queries, as well as poorer quality of gists and theme highlighting.

You link new terms to existing terms by making an existing term the broader term for the new terms.
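
The linking step can be sketched as a small shell function that writes the broader-term entry followed by one NT line for each new top term. The function name link_terms is hypothetical, and the output layout (the existing term on its own line, then indented NT lines) is assumed to be the thesaurus file layout used in the example in this section.

```shell
# Hypothetical helper: print a thesaurus entry that makes an existing
# knowledge base term the broader term of each new term.
# Usage: link_terms "existing term" "new term 1" ["new term 2" ...]
link_terms() {
  _broader=$1
  shift
  printf '%s\n' "$_broader"
  for _term in "$@"; do
    # Each new term becomes a narrower term (NT) of the existing term.
    printf ' NT %s\n' "$_term"
  done
}

link_terms "health and medicine" \
  "Anesthesia and Analgesia" \
  "Antineoplastic and Immunosuppressive Agents"
```

The printed lines could be appended to the thesaurus file before it is loaded with ctxload.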

Example

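You purchase a medical thesaurus, medthes, containing a hierarchy of medical terms. The four top terms in the thesaurus are the following:

Anesthesia and Analgesia
Anti-Allergic and Respiratory System Agents
Anti-Inflammatory Agents, Antirheumatic Agents, and Inflammation Mediators
Antineoplastic and Immunosuppressive Agents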

To link these terms to the existing knowledge base, add the following entries to the medical thesaurus to map the new terms to the existing health and medicine branch:

health and medicine
 NT Anesthesia and Analgesia
 NT Anti-Allergic and Respiratory System Agents
 NT Anti-Inflammatory Agents, Antirheumatic Agents, and Inflammation Mediators
 NT Antineoplastic and Immunosuppressive Agents

Assuming the medical thesaurus is in a file called med.thes, you load the thesaurus as medthes with ctxload as follows:

ctxload -thes -thescase y -name medthes -file med.thes -user ctxsys/ctxsys

To link the loaded thesaurus medthes to the knowledge base, use ctxkbtc as follows:

ctxkbtc -user ctxsys/ctxsys -name medthes 

Order of Precedence for Multiple Thesauri

When multiple thesauri are to be compiled, precedence is determined by the order in which thesauri are listed in the arguments to the compiler (most preferred first). A user thesaurus always has precedence over the built-in knowledge base.
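
Because precedence follows argument order, a script that compiles several thesauri needs to keep the names in the intended order. The sketch below prints the load and compile commands for a list of name:file pairs given most preferred first. The function name kb_extension_plan and the second thesaurus (lawthes in law.thes) are hypothetical. The ctxload options are the ones used in the medthes example in this section.

```shell
# Hypothetical helper: print the ctxload and ctxkbtc commands for several
# thesauri. Arguments after the user are name:file pairs, listed most
# preferred first. Nothing is executed.
kb_extension_plan() {
  _user=$1
  shift
  _names=""
  for _pair in "$@"; do
    _name=${_pair%%:*}   # text before the first colon
    _file=${_pair#*:}    # text after the first colon
    # Each thesaurus must be loaded as case sensitive before compiling.
    echo "ctxload -thes -thescase y -name $_name -file $_file -user $_user"
    _names="$_names $_name"
  done
  # Names keep their argument order, which determines precedence.
  echo "ctxkbtc -user $_user -name$_names"
}

kb_extension_plan ctxsys/ctxsys medthes:med.thes lawthes:law.thes
```

In this sketch, medthes is listed first, so where both thesauri define a term, the medthes definition would be the preferred one.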

Size Limits

The following table lists the size limits associated with creating and compiling an extended knowledge base:

Description of Parameter                                                                Limit
Number of RTs (from + to) per term                                                      32
Number of terms per a single hierarchy (i.e., all narrower terms for a given top term)  64000
Number of new terms in an extended knowledge base                                       1 million
Number of separate thesauri that can be compiled into a user extension to the KB        16


Copyright © 1996-2000, Oracle Corporation.

All Rights Reserved.
