thesaurus

A thesaurus is a list of terms or phrases with relationships specified among them, such as synonym, broader term, and narrower term. When a user issues a search query, Oracle SES can expand the query so that the search results include matches for the related terms.

A thesaurus contains domain-specific knowledge. You can build a thesaurus, buy an industry-specific thesaurus, or use utilities to extract a thesaurus from a specific corpus of documents. The thesaurus must comply with both the ISO-2788 and ANSI Z39.19 (1993) standards.

A thesaurus must be loaded in Oracle SES for thesaurus-based query expansion. If no thesaurus is loaded, or if the specified term or phrase cannot be found in the loaded thesaurus, then query expansion is not possible and Oracle SES returns only documents containing the original term or phrase. The default expansion level is one.
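The following Python sketch illustrates the general idea of thesaurus-based expansion. It is not Oracle SES code: the `THESAURUS` dictionary and the `expand` function are hypothetical, and reading "expansion level" as the number of relationship hops to follow is an assumption made for illustration only.

```python
from collections import deque

# Hypothetical in-memory thesaurus: term -> {relation: [related terms]}.
THESAURUS = {
    "cat": {"SYN": ["feline"], "NT": ["domestic cat", "wild cat"], "BT": ["mammal"]},
    "domestic cat": {"NT": ["Persian cat", "Siamese cat"]},
    "wild cat": {"NT": ["tiger"]},
}


def expand(term, relation="NT", level=1):
    """Return the term plus terms reachable via `relation` within `level` hops.

    Assumption: "level" means hop count; the real engine may define it differently.
    """
    found = [term]
    frontier = deque([(term, 0)])
    while frontier:
        current, depth = frontier.popleft()
        if depth == level:
            continue  # do not follow relationships beyond the requested level
        for related in THESAURUS.get(current, {}).get(relation, []):
            if related not in found:
                found.append(related)
                frontier.append((related, depth + 1))
    return found


print(expand("cat"))           # narrower terms, one level
print(expand("cat", level=2))  # narrower terms, two levels
print(expand("unicorn"))       # term not in the thesaurus: no expansion
```

A term that is missing from the thesaurus comes back unchanged, which mirrors the behavior described above: only the original term is searched.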

The proper encoding of an XML document for thesaurus configuration is UTF-8, which is the Oracle SES default language setting. Ensure that the NLS_LANG environment variable setting is consistent with the XML document encoding.
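As a rough pre-flight check, a script along these lines could confirm that a thesaurus file is UTF-8 and that an `NLS_LANG` value names a UTF-8 character set. The function names are hypothetical, and treating `AL32UTF8` and `UTF8` as the Oracle character set names for UTF-8 is an assumption to verify against your Oracle client documentation.

```python
import re


def declared_encoding(xml_bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', xml_bytes)
    return match.group(1).decode("ascii") if match else None


def is_utf8_consistent(xml_bytes):
    """True if the bytes decode as UTF-8 and the declaration (if any) says UTF-8."""
    try:
        xml_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return False
    declared = declared_encoding(xml_bytes)
    return declared is None or declared.upper() == "UTF-8"


def nls_lang_is_utf8(nls_lang):
    """True if the charset part of LANGUAGE_TERRITORY.CHARSET names UTF-8.

    Assumption: AL32UTF8 and UTF8 are the relevant Oracle charset names.
    """
    _, _, charset = nls_lang.partition(".")
    return charset.upper() in {"AL32UTF8", "UTF8"}


sample = b'<?xml version="1.0" encoding="UTF-8"?><a/>'
print(is_utf8_consistent(sample))
print(nls_lang_is_utf8("AMERICAN_AMERICA.AL32UTF8"))
```

In practice the `NLS_LANG` value would come from the environment, for example `os.environ.get("NLS_LANG", "")`.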

Object Type

Creatable

Object Key

name

Object Key Command Syntax

--NAME=object_name

-n object_name

State Properties

None

Supported Operations

create
delete
export
getAllObjectKeys
update

Administration GUI Page

None

XML Description

The <search:thesauruses> element defines a thesaurus:

<search:thesauruses>
   <search:thesaurus>
      <search:name>
      <search:thesaurusContent>

Element Descriptions 

<search:thesauruses>

Contains a <search:thesaurus> element, which describes a thesaurus.

<search:thesaurus>

Describes a thesaurus. It contains these child elements:

<search:name>
<search:thesaurusContent>

<search:name>

The name of the thesaurus. This name must be DEFAULT. (Required)

<search:thesaurusContent>

The thesaurus content. (Required)

Enter each term on a separate line within a CDATA section. You can identify broader terms (BT), narrower terms (NT), and synonyms (SYN). Note the one-space indentation of the related terms:

dog
 BT mammal
 NT domestic dog
 NT wild dog
 SYN canine
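To make the line format concrete, here is a minimal Python sketch that reads such content into a dictionary. The `parse_thesaurus` function is hypothetical and reflects only the format shown above: unindented lines are terms, and indented lines carry a BT, NT, or SYN relation followed by the related term.

```python
def parse_thesaurus(text):
    """Parse term lines and indented 'REL term' lines into a nested dict."""
    entries = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line[0].isspace():
            # Indented line: a relation that belongs to the most recent term.
            if current is None:
                raise ValueError(f"relation before any term: {line!r}")
            relation, _, related = line.strip().partition(" ")
            if relation not in ("BT", "NT", "SYN") or not related.strip():
                raise ValueError(f"unrecognized relation line: {line!r}")
            entries[current].setdefault(relation, []).append(related.strip())
        else:
            # Unindented line: starts a new term entry.
            current = line.strip()
            entries.setdefault(current, {})
    return entries


content = "dog\n BT mammal\n NT domestic dog\n NT wild dog\n SYN canine\n"
print(parse_thesaurus(content))
```

Rejecting unknown relation codes keeps a mistyped line from being silently treated as a new term.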

Example

This XML document defines the default thesaurus:

<?xml version="1.0" encoding="UTF-8"?>
<search:config productVersion="11.1.2.0.0" xmlns:search="http://xmlns.oracle.com/search">
   <search:thesauruses>
      <search:thesaurus>
         <search:name>DEFAULT</search:name>
         <search:thesaurusContent>
<![CDATA[
cat
 SYN feline
 NT domestic cat
 NT wild cat
 BT mammal
mammal
 BT animal
domestic cat
 NT Persian cat
 NT Siamese cat
wild cat
 NT tiger
tiger
 NT Bengal tiger
dog
 BT mammal
 NT domestic dog
 NT wild dog
 SYN canine
domestic dog
 NT German Shepherd
wild dog
 NT Dingo
]]>
         </search:thesaurusContent>
      </search:thesaurus>
   </search:thesauruses>
</search:config>