Oracle ConText Option Application Developer's Guide, Rel. 2.3

Understanding Query Expressions


This chapter explains how to use ConText to create query expressions to find relevant text in documents.

About Query Expressions

A query expression defines the search criteria for retrieving documents using ConText. A query expression consists of query terms (words and phrases) and other components such as operators and special characters which allow users to specify exactly which documents are retrieved by ConText.

A query expression can also call stored query expressions (SQEs) to return stored query results or call PL/SQL functions to return values used in the query.

When a query is executed using any of the methods supported by ConText, one of the arguments included in the query is a query expression. ConText then returns a list of all the documents that satisfy the search criteria, as well as scores that measure the relevance of the document to the search criteria.

Query Terms

Query terms can consist of words and phrases. Query terms can also contain stopwords.

Words and Phrases

The words in a query expression are the individual tokens on which the query expression operators perform an action. If multiple words are contained in a query expression, separated only by blank spaces (no operators), the string of words is considered a phrase and the entire string is searched for during a query.

Stopwords

Stopwords are common words, such as and, the, of, and to, that are not considered significant query terms by themselves because they occur so often in text. However, stopwords can provide useful search information when combined with more significant terms.

For example, a query for documents containing the phrase peanut butter and jelly returns different results than a query for documents containing the terms peanut butter and jelly.

When you define a policy for a column, ConText lets you identify a list of stopwords. When stopwords are encountered in the documents in the column, they are not included as indexed terms in the text index; however, they are recorded.

As a result, stopwords cannot be searched for explicitly in text queries, but can be included as part of a phrase in a query expression.

See Also:

For more information about querying with stopwords, see "Querying with Stopwords" in this chapter.

 

Stoplists can be created in any language supported by ConText. ConText provides a default stoplist in English.


Note:

Stopwords do not have an effect on the theme indexes generated by ConText for your English-language documents.

 

Query Expression Components

In addition to query terms, a query expression may contain any or all of the following components:

Component                        Purpose
Operators                        Define the relationships between the terms in a query expression and specify the output returned by the query. The different types of operators are: logical, ranking, result set, proximity, expansion, and thesaurus.
Wildcard Characters              Expand query terms using pattern matching
Grouping Characters              Group terms and operators in a query expression
Stored Query Expressions (SQEs)  Return the results of a query that has been executed and the results stored in an SQE table
PL/SQL Functions                 Execute a function and use the results in a query expression

Base-Letter Queries

For languages that use an 8-bit character set, such as French and Spanish, ConText gives you the option of converting characters to their base-letter representation before text indexing. This means that words with accents, umlauts, and so on are converted to their base-letter representation before their tokens are placed in the text index.

When you specify a text index that has used base-letter conversion in a query, ConText converts the term in the query expression to match the base-letter representation before the query is processed. In addition, all expansion and stopword checking for the query is performed on the base-letter terms.


Note: The terms in a thesaural query are not converted to base-letter representation before look-up in the thesaurus. The base-letter conversion takes place after the thesaurus look-up and is performed on all the terms returned by the thesaurus.

For more information about creating an index that supports base-letter conversion, see Oracle ConText Option Administrator's Guide.

 

Query Expression Examples

The following example of a one-step query returns all articles that contain the word wine in the TEXTCOL column of the TEXTTAB table. The query expression consists only of the query term wine, surrounded by single quotes.

SELECT articles FROM texttab
WHERE CONTAINS(textcol, 'wine') > 0;

The following example of a one-step query returns all articles that contain the phrase wine and roses in the TEXTCOL column of the TEXTTAB table. The query expression consists of the query phrase wine and roses, enclosed in braces so that and is read as part of the phrase rather than as the AND operator, and surrounded by single quotes.

SELECT articles FROM texttab 
WHERE CONTAINS(textcol, '{wine and roses}') > 0;
 

See Also:

For more information about the CONTAINS function used in one-step queries, see CONTAINS in Chapter 9.

 

Logical Operators

Logical operators combine the terms in a query expression. All single words and phrases may be combined with logical operators. When query terms are combined, the number of spaces around the logical operator is not significant.

Logical operators link query terms together to produce scores that are based on the relationship of the terms to each other. The logical operators combine the scores of their operands up to a maximum value of 100. Operands can be any query terms, as well as other operators.

Operator      Syntax             Description
AND           term1&term2        Returns documents that contain term1 and term2. Returns the minimum score of its operands. All query terms must occur; lower score taken.
              term1 and term2
OR            term1|term2        Returns documents that contain term1 or term2. Returns the maximum score of its operands. At least one term must exist; higher score taken.
              term1 or term2
NOT           term1~term2        Returns documents that contain term1 and not term2.
              term1 not term2
EQUIVALENCE   term1=term2        Specifies that term2 is an acceptable substitution for term1.
              term1 equiv term2

AND Operator

Use the AND operator to search for documents that contain at least one occurrence of each of the query terms. For example, to obtain all the documents that contain the terms batman, robin, and penguin, issue the following query:

'batman & robin & penguin'

In an AND query, the score returned is the score of the lowest-scoring query term. In the example above, if the three individual scores for the terms batman, robin, and penguin are 10, 20, and 30 within a document, the document scores 10.

OR Operator

Use the OR operator to search for documents that contain at least one occurrence of any of the query terms. For example, to obtain the documents that contain the term cats or the term dogs, use one of the following:

'cats | dogs'
'cats OR dogs'

In an OR query, the score returned is the score of the highest-scoring query term. In the example above, if the scores for cats and dogs are 30 and 40 within a document, the document scores 40.
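
The AND and OR scoring rules described above can be sketched in a few lines of Python. This is only an illustrative model, not ConText code: the function names are hypothetical, and the sketch assumes that a score of 0 stands for a term that does not occur in the document.

```python
def and_score(*scores):
    """AND: every operand must match (score > 0); the result is the lowest score."""
    return min(scores) if all(s > 0 for s in scores) else 0


def or_score(*scores):
    """OR: the result is the highest operand score (0 if nothing matched)."""
    return max(scores)


# Scores taken from the batman/robin/penguin and cats/dogs examples.
print(and_score(10, 20, 30))  # 10
print(or_score(30, 40))       # 40
```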

NOT Operator

Use the NOT operator to search for documents that contain one query term and not another.

For example, to obtain the documents that contain the term animals but not dogs, use the following expression:

'animals ~ dogs'

Similarly, to obtain the documents that contain the term transportation but not automobiles or trains, use the following expression:

'transportation not (automobiles or trains)' 


Note:

The NOT operator does not affect the scoring produced by the other logical operators.

 

Equivalence Operator

Use the equivalence operator to specify an acceptable substitution for a word in a search. For example, if you want all the documents that contain the phrase alsatians are big dogs or labradors are big dogs, you can write:

'labradors=alsatians are big dogs'

ConText processes the above query faster and more efficiently than the same query written with the accumulate operator. For example, you could write the above query less efficiently and less concisely as follows:

'labradors are big dogs, alsatians are big dogs'

The savings you gain in using the equivalence operator over the accumulate operator is most significant when you have more than one equivalence operator in the query expression. For example, the following query

'labradors=alsatians are big canines=dogs'

is a more efficient, more concise form of:

'labradors are big dogs, 
alsatians are big dogs, 
alsatians are big canines, 
labradors are big canines'
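
The relationship between the two forms can be illustrated with a short Python sketch that enumerates every plain phrase an equivalence expression stands for. It is a simplified model under stated assumptions, not the ConText parser: the function name is hypothetical, words are separated by blanks, and alternatives are written with = and no surrounding spaces.

```python
from itertools import product


def expand_equivalence(expression):
    """List every plain phrase represented by a phrase containing '=' alternatives."""
    # Each blank-separated word becomes a list of its '='-separated alternatives.
    slots = [word.split("=") for word in expression.split()]
    # Pick one alternative per word, in every possible combination.
    return [" ".join(choice) for choice in product(*slots)]


for phrase in expand_equivalence("labradors=alsatians are big canines=dogs"):
    print(phrase)
```

Two equivalence operators with two alternatives each yield four phrases, which is why the accumulate form grows quickly as equivalences are added.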

Precedence of Equivalence Operator

The equivalence operator has higher precedence than all other operators except the unary operators (fuzzy, soundex, stem, and PL/SQL function calls).

WITHIN Operator

Use the WITHIN operator to narrow down a query into pre-defined document sections.

For example, in an HTML document set, you or your ConText administrator can define a section for all headings delimited with <HEAD> and </HEAD> and subsequently issue a query for a term in a heading across all documents.

See Also:

For more information about defining sections, see the Oracle ConText Option Administrator's Guide.

 

The syntax for the WITHIN operator is as follows:

Syntax                Description
term WITHIN section   Searches for term within the pre-defined section. The WITHIN operator has no effect on score.


Note:

The WITHIN operator requires you to know the name of the section you wish to search. A list of defined sections can be obtained using the CTX_ALL_SECTIONS or CTX_USER_SECTIONS views.

 

Examples

To find all the documents that contain the term San Francisco within the pre-defined section Headings, write your query as follows:

'San Francisco within Headings'

To find all the documents that contain the term sailing and contain the term San Francisco within the pre-defined section Headings, write your query as follows:

'(San Francisco within Headings) and sailing'

To find all documents that contain the terms dog and cat within the pre-defined section Headings, write your query as follows:

'dog and cat within Headings'

Note that the above query is logically different from:

'dog within Headings and cat within Headings'

which finds all documents that contain dog and cat where the terms dog and cat are in different Headings sections.

To find all documents in which dog is near cat within the section Headings, write your query as follows:

'dog near cat within Headings'

Limitations

The WITHIN operator has the following limitations:

Score-Changing Operators

Score-changing operators behave like logical operators in that they return documents given the terms you specify. However, these operators affect document scores differently and, as such, can be used to change a document's rank in a hitlist with respect to a query term. The following table describes these operators:

Operator     Syntax              Description
ACCUMULATE   term1,term2         Returns documents that contain term1 or term2. Calculates score by adding the score of each operand. Similar to OR, except that the returned score is the sum of all scores.
             term1 accum term2
MINUS        term1-term2         Returns documents that contain term1. Calculates score by subtracting occurrences of term2 from occurrences of term1.
             term1 minus term2
NEAR         term1;term2         Returns documents that contain term1 and term2. Calculates score based on how close term1 is to term2; a score of 100 means terms are adjacent to one another.
             term1 near term2
WEIGHT       term*n              Returns documents that contain term. Calculates score by multiplying the raw score of term by n, where n is a number from 0.1 to 10.

Accumulate Operator

Use the accumulate operator to search for documents that contain at least one occurrence of any of the query terms, where the documents that contain the most frequent occurrences of the query terms are given the highest score.

For example, to search for documents that contain either term Brazil or soccer and to have the highest scores attached to the documents that contain the most occurrences of these words, you can issue:

'soccer,Brazil'

Accumulate is similar to OR, in the sense that a document satisfies the query expression if any of the terms occur in the document; however, the scoring is different. OR returns a score based only on the query term that occurs most frequently in a document. Accumulate combines the scores for all the query terms that occur in a document, topping out at 100 when the sum exceeds 100. Thus documents that contain the most query terms are ranked the highest.

MINUS Operator

Use the MINUS operator to search for documents that contain a query term, and when you want the presence of a second query term to cause the document to be ranked lower.

The minus operator is useful for lowering the score of documents that contain "noise". For example, suppose a query on the term cars always returned high scoring documents about Ford cars. You can lower the scoring of the Ford documents by using the expression:

'cars - Ford'

In essence, this expression returns the documents that contain the term cars. However, the score returned for a document is the number of occurrences of cars minus the number of occurrences of Ford. When a returned document does not contain Ford, the occurrence of the term Ford is counted as zero.
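
As a rough illustration of the occurrence arithmetic just described, the following Python sketch counts whole-word matches and subtracts them. The function names are hypothetical, and case-insensitive whole-word matching is an assumption made for the sketch; how ConText treats a difference of zero or less is not described here, so the example avoids that case.

```python
import re


def occurrences(text, term):
    """Count whole-word, case-insensitive occurrences of term in text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def minus_score(text, term1, term2):
    """Model of 'term1 - term2'; None means the document lacks term1 and is not returned."""
    hits = occurrences(text, term1)
    if hits == 0:
        return None
    # A missing term2 simply counts as zero occurrences.
    return hits - occurrences(text, term2)


print(minus_score("Used cars and new cars for sale", "cars", "Ford"))  # 2
print(minus_score("Ford cars and other cars", "cars", "Ford"))         # 1
```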

Near Operator

Words or phrases that occur close together are considered to be more closely associated than those that are farther apart. The proximity operator calculates a score based on how close words are to each other rather than on how often the word or phrase appears in the document.

The score for a document is the highest score out of all the query terms that occur in proximity to each other. A score of 100 is returned when the query terms are adjacent. When the terms are not adjacent, ConText returns a score based on the following formula:

100 - (number of words between the two query terms)

When there are more than 100 words separating the terms, ConText returns 1.

For example, if the query expression is ice;cream, the phrase I love ice cream would score 100, while the phrase ice is colder than cream would score 97. If both phrases occurred in a document, ConText retrieves the document and scores it as 100.
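
The proximity formula can be checked against this example with a small Python sketch. It is a simplified model rather than the ConText implementation: the function name is hypothetical, text is split on blanks and lowercased, and both query terms are assumed to be single, distinct, lowercase words.

```python
def near_score(text, term1, term2):
    """Best proximity score for two single-word terms, or 0 if either is absent."""
    words = text.lower().split()
    positions1 = [i for i, w in enumerate(words) if w == term1]
    positions2 = [i for i, w in enumerate(words) if w == term2]
    best = 0
    for i in positions1:
        for j in positions2:
            if i == j:
                continue  # same word position; only relevant if the terms were equal
            between = abs(i - j) - 1  # number of words separating the two terms
            score = 100 - between if between <= 100 else 1
            best = max(best, score)  # the document keeps its highest pair score
    return best


print(near_score("I love ice cream", "ice", "cream"))          # 100
print(near_score("ice is colder than cream", "ice", "cream"))  # 97
```

A document containing both phrases scores 100, because the highest score among all pairs of occurrences is kept.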

Weight Operator

The weight operator multiplies the score by the given factor, topping out at 100 when the product exceeds 100. For example, 'cat, dog*2' sums the score of cat with twice the score of dog, topping out at 100 when the score is greater than 100.

In expressions that contain more than one query term, use the weight operator to adjust the relative scoring of the query terms. You can reduce the score of a query term by using the weight operator with a number less than 1; you can increase the score of a query term by using the weight operator with a number greater than 1, up to a maximum of 10.

The weight operator is useful in accumulate, OR, or AND queries when the expression has more than one query term. With no weighting on individual terms, the score cannot tell you which of the query terms occurs the most. If you are interested in documents that contain a particular query term more than another term, the overall ranking tells you nothing about which documents pertain to the term that you are most interested in.

Example

You have a collection of sports articles. You are interested in the articles about soccer, in particular Brazilian soccer. It turns out that a regular query on soccer, Brazil returns many high ranking articles on US soccer. To raise the ranking of the articles on Brazilian soccer, you can issue the following query:

'soccer, Brazil*3'

Table 3-1 illustrates how the weight operator can change the ranking of three hypothetical documents A, B, and C, which all contain information about soccer. The columns in the table show the total score of four different query expressions on the three documents.

Table 3-1
     soccer   Brazil   soccer,Brazil   soccer,Brazil*3
A    20       10       30              50
B    10       30       40              100
C    50       10       60              80

The score in the third column containing the query soccer, Brazil is the sum of the scores in the first two columns. The score in the fourth column containing the query soccer,Brazil*3 is the sum of the score of the first column soccer plus three times the score of the second, Brazil.

With the initial query of soccer,Brazil, the documents are ranked in the order C B A. With the query of soccer,Brazil*3, the documents are ranked B C A, which is the preferred ranking.
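
The two rankings can be reproduced with a short Python sketch that applies the accumulate and weight rules to the soccer and Brazil scores from Table 3-1. The function names are hypothetical and the code is only a model of the arithmetic described in this section, with both operators capped at 100.

```python
def accumulate(*scores):
    """ACCUMULATE: sum of the operand scores, capped at 100."""
    return min(100, sum(scores))


def weight(score, n):
    """WEIGHT: score multiplied by n, capped at 100."""
    return min(100, score * n)


# (soccer score, Brazil score) for each document, taken from Table 3-1.
docs = {"A": (20, 10), "B": (10, 30), "C": (50, 10)}


def rank(score_fn):
    """Return document names ordered from highest to lowest score."""
    scored = {name: score_fn(soccer, brazil) for name, (soccer, brazil) in docs.items()}
    return sorted(scored, key=scored.get, reverse=True)


print(rank(lambda s, b: accumulate(s, b)))             # ['C', 'B', 'A']
print(rank(lambda s, b: accumulate(s, weight(b, 3))))  # ['B', 'C', 'A']
```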

Result-Set Operators

Use the result-set operators to control what documents are returned from a query result set. The operands for these operators are expressions, which can be an individual query term or a logical combination of query terms that use other operators.

Because these operators manipulate a result set, they cannot be embedded within each other; they must be placed at the outermost level of the query expression.

Result set operators are typically used to exclude noise from the hitlist (irrelevant documents) and to retrieve documents out of a hitlist more efficiently. There are three result set operators:

Operator     Syntax           Description
THRESHOLD    expression>n     Returns only those documents in the result set that score above the threshold n.
             term>n           Within an expression, selects documents that contain the query term with score of at least n.
MAX          expression:n     Returns the first n highest scoring documents. For example, :20 means to return the top 20 documents in the hitlist. The value n must be an integer between 1 and 65535.
FIRST/NEXT   expression#m-n   Returns the specified number of documents as ordered in the hitlist range m to n.

Threshold Operator

You can use the threshold operator in two ways:

Expression level

Use the expression level threshold operator to eliminate documents in the result set that score below a threshold number. For example, to search for documents that contain relational databases and to return only documents that score greater than 75, use the following expression:

'relational databases > 75'

Query Term Level

Use the query term threshold operator in a query expression to select a document based on how a term scores in the document. For example, to select documents that have at least a score of 30 for lion and contain tiger, use:

'(lion > 30) and tiger' 

Max Operator

Use the max operator to retrieve a given number of the highest scoring documents. For example, to obtain the twenty highest scoring documents that contain the word dance, you can write:

'dance:20'

The max operator is particularly useful to prevent writing a large number of records to the hitlist table, which could result in performance degradation.


Note:

The max operator cannot be used with the CTX_QUERY.COUNT_HITS function or with in-memory queries.

 

First/Next Operator

Use the first/next operator to return a specified range of documents from the hitlist.


Note:

In a first/next query, the order of the returned documents is not based on score or textkey. ConText returns the documents based on the order in which it encounters the documents in the queried text column.

 

For example, to return the first 10 documents encountered by ConText that contain the term dog, use the following expression:

'dog#1-10'

You could then return the next 10 documents using the following expression:

'dog#11-20'

The first/next operator can be used to create an application interface in which query results (rows in the hitlist) are returned incrementally. Because the query results are returned incrementally, query response is generally faster. The application can display the hitlists in a more manageable size, and control can be returned to the user faster.


Note:

The first/next operator cannot be used with the CTX_QUERY.COUNT_HITS function or with in-memory queries.

 

Combined First/Next and Max Queries

You can use the first/next operator to extract chunks of a sorted hitlist returned by the max operator. For example, if you use the max operator to return only the highest scoring 50 documents that contain the term cat, you can extract the first 10 documents from the 50 as follows:

'cat:50#1-10' 


Note:

Placing the max operator inside the first/next operator as such is the only instance in which you can embed the max operator in a query expression.
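
The combined behavior can be pictured as a sort-and-truncate step followed by a slice. The Python sketch below is an illustrative model only: the function names and the sample (document id, score) pairs are hypothetical, and nothing here calls ConText.

```python
def max_hits(hitlist, n):
    """Model of expression:n, the n highest-scoring (doc_id, score) pairs, best first."""
    return sorted(hitlist, key=lambda hit: hit[1], reverse=True)[:n]


def first_next(hitlist, m, n):
    """Model of expression#m-n, rows m through n of the hitlist (1-based, inclusive)."""
    return hitlist[m - 1:n]


# Hypothetical hitlist: document ids 1 to 6 with made-up scores.
hits = list(enumerate([12, 87, 45, 99, 3, 60], start=1))

top4 = max_hits(hits, 4)        # analogous to 'cat:4'
print(first_next(top4, 1, 2))   # analogous to 'cat:4#1-2' -> [(4, 99), (2, 87)]
print(first_next(top4, 3, 4))   # analogous to 'cat:4#3-4' -> [(6, 60), (3, 45)]
```

An application can page through results this way, requesting one range at a time instead of the whole hitlist.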

 

Expansion Operators

The expansion operators expand a query expression to include variants of the query term supplied by the user. There are three kinds of expansion operators:

Operator   Syntax   Description
STEM       $term    Expands a query to include all terms having the same stem or root word as the specified term.
SOUNDEX    !term    Expands a query to include all terms that sound the same as the specified term (English-language text only).
FUZZY      ?term    Expands a query to include all terms with similar spellings as the specified term (English-language text only).

The expansion operators are unary operators. They may be used in combination with each other and with any other operators described in this chapter. In addition, searches can be broadened by performing an expansion on an expansion.

The methods used by the expansion operators to perform stemming, fuzzy matching, and soundex matching for a text column are determined by the Wordlist preference in the policy for the column.

See Also:

For more information about setting up preferences and policies, see Oracle ConText Option Administrator's Guide.

 

Stem Expansions

Use the STEM ($) operator to search for terms that have the same linguistic root as the query term. For example:

Input          Expands To
$scream        scream screaming screamed
$distinguish   distinguish distinguished distinguishes
$guitars       guitars guitar
$commit        commit committed
$cat           cat cats
$sing          sang sung sing

The ConText stemmer, licensed from Xerox Corporation's XSoft Division, supports the following languages: English, French, Spanish, Italian, German, and Dutch.


Note:

If STEM returns a stopword, the stopword is not included in the query or highlighted by CTX_QUERY.HIGHLIGHT.

 

Soundex Expansions

The soundex (!) operator enables searches on words that have similar sounds; that is, words that sound like other words. This function allows comparison of words that are spelled differently, but sound alike in English.

Soundex in ConText uses the same logic as the soundex function in SQL to search for words that have a similar sound. It returns all words in a text column that have the same soundex value.

The following example illustrates the results that could be returned for a one-step query that uses SOUNDEX:

SELECT ID, COMMENT FROM EMP_RESUME
WHERE CONTAINS (COMMENT, '!SMYTHE') > 0;

ID COMMENT 
-- ------------
23 Smith is a hard worker who..


Note:

SOUNDEX works best for languages that use a 7-bit character set, such as English. It can be used, with lesser effectiveness, for languages that use an 8-bit character set, such as many Western European languages.

For more information about the SOUNDEX function in SQL, see Oracle8 Server SQL Reference.

 

Fuzzy Expansions

Fuzzy (?) expansions generate words that are spelled similarly. This type of expansion is helpful for finding more accurate results when there are frequent misspellings in the documents in the database.

Unlike the stem expansion, the number of words generated by a fuzzy search depends on what is in the text index; results can vary significantly according to the contents of the database index.

For example:

Input     Expands To
?cat      cat cats calc case
?feline   feline defined filtering
?apply    apply apple applied April
?read     lead real


Note:

Fuzzy works best for languages that use a 7-bit character set, such as English. It can be used, with lesser effectiveness, for languages that use an 8-bit character set, such as many Western European languages. Also, the Japanese lexer provides limited fuzzy matching.

In addition, if fuzzy returns a stopword, the stopword is not included in the query or highlighted by CTX_QUERY.HIGHLIGHT.

 

Penetration in Expansion Operators

Penetration allows complex query expansions to be expressed in short concise notation. Penetration is a system of notation for query expressions and does not affect the meaning of the expansion operators or the order in which operations are performed; it is a tool to help you generate non-ambiguous queries using the expansion operators.

Penetration applies the expansion operators to each term within an explicit expression (i.e., an expression delimited by parentheses or braces). Any expansion operator outside an expression delimited by parentheses ( ) or braces { } is applied to each word or phrase inside the expression.

For example:

Query Before Penetration   Query After Penetration
?(dog, cat, mouse)         ?dog, ?cat, ?mouse
?(dog,!(cat & mouse))      ?dog, (!?cat & !?mouse)
?((cat=feline) meows)      (?cat=?feline) ?meows

In the first example, a fuzzy expansion is performed on each term.

In the second example, a fuzzy expansion is performed on each term and a soundex expansion is performed only on the terms cat and mouse, because cat and mouse are enclosed in a separate set of parentheses.

In the third example, a fuzzy expansion is performed on each term, including both equivalence terms.


Note:

Expansion operators do not penetrate expressions delimited by brackets [ ].

 

Base-letter Support

If you have base-letter conversion specified for a text column and the query expression contains a SOUNDEX or FUZZY operator, ConText operates on the base-letter form of the query.

The STEM operator does not support base-letter conversion.

Thesaurus Operators

The thesaurus operators expand a query for a single term (word or phrase) using a thesaurus that defines relationships between the user-specified term and other semantically related terms.

The thesaurus operators, listed in the following table, correspond to the types of relationships that can be defined in an ISO2788 standard thesaurus.

Operator             Syntax                     Description
SYNONYM              SYN(term[,thes])           Expands a query to include all the terms defined in the thesaurus as synonyms for term.
PREFERRED            PT(term[,thes])            Replaces the specified word in a query with the preferred term for term.
RELATED              RT(term[,thes])            Expands a query to include all the terms defined in the thesaurus as related terms for term.
TOP                  TT(term[,thes])            Replaces the specified word in a query with the top term in the standard hierarchy (BT, NT) for term.
NARROWER             NT(term[,level[,thes]])    Expands a query to include all the lower level terms defined in the thesaurus as narrower terms for term.
NARROWER GENERIC     NTG(term[,level[,thes]])   Expands a query to include all the lower level terms defined in the thesaurus as narrower generic terms for term.
NARROWER PARTITIVE   NTP(term[,level[,thes]])   Expands a query to include all the lower level terms defined in the thesaurus as narrower partitive terms for term.
NARROWER INSTANCE    NTI(term[,level[,thes]])   Expands a query to include all the lower level terms defined in the thesaurus as narrower instance terms for term.
BROADER              BT(term[,level[,thes]])    Expands a query to include the term defined in the thesaurus as a broader term for term.
BROADER GENERIC      BTG(term[,level[,thes]])   Expands a query to include all the terms defined in the thesaurus as broader generic terms for term.
BROADER PARTITIVE    BTP(term[,level[,thes]])   Expands a query to include all the terms defined in the thesaurus as broader partitive terms for term.
BROADER INSTANCE     BTI(term[,level[,thes]])   Expands a query to include all the terms defined in the thesaurus as broader instance terms for term.

Internally, ConText processes the expansion by enclosing each individual term returned by the expansion in braces and then combining the terms with the ACCUMULATE operator.

For example, if bird, birdy, and avian are all synonyms:

SYN(bird) is expanded to {bird},{avian},{birdy}.

If a term in a thesaural query does not have corresponding entries in the specified thesaurus, no expansion is produced and the term itself is used in the query.

See Also:

For more information about viewing thesaural expansions, see Chapter 5, "Query Expression Feedback".

For more information about thesaural relationships and creating thesauri, see Oracle ConText Option Administrator's Guide.

 

Limitations

The thesaurus operators can be used in conjunction with all the other query expression operators and special characters supported by ConText, with the exception of the near operator.

The maximum length of the expanded query is 32000 characters.

Thesaural operations cannot be nested. For example, the following query is not allowed.

'SYN(BT(bird))'

Thesaurus Arguments

The thesaurus operators are implemented in ConText as PL/SQL functions, and, as such, have arguments that must be specified with the operator. All of the notational conventions and usage rules for PL/SQL apply to the thesaurus operators.

The thesaurus operators have the following arguments:

term

Specify the operand for the thesaurus operator. You must specify a term when using the NT operator. For preferred term (PT) and top term (TT) queries, term is replaced by the preferred term/top term defined for the term in the specified thesaurus; however, if no PT or TT entries are defined for the term, the term is not replaced and is used in the query.

For all other thesaural queries, term is expanded to include the synonymous, related, broader, or narrower terms defined for the term in the specified thesaurus.

level

Specify the number of levels traversed in the thesaurus hierarchy to return the broader (BT, BTG, BTP, BTI) or narrower (NT, NTG, NTP, NTI) terms for the specified term. For example, a level of 1 in a BT query returns only the broader term, if one exists, for the specified term. A level of 2 returns the broader term for the specified term, as well as the broader term, if one exists, for the broader term.

The level argument is optional and has a default value of one (1). Zero or negative values for the level argument return only the original query term.
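
The effect of the level argument on a broader term query can be sketched in Python as a walk up a parent map. This is a toy model, not the ConText thesaurus tables: the BROADER dictionary and its entries are hypothetical, each term is assumed to have at most one broader term, and the original term is kept in the expansion.

```python
# Hypothetical thesaurus fragment: each term maps to its single broader term.
BROADER = {"robin": "songbird", "songbird": "bird", "bird": "animal"}


def bt(term, level=1, broader=BROADER):
    """Return term followed by up to `level` broader terms from the hierarchy."""
    terms = [term]
    current = term
    # Zero or negative levels run no iterations, leaving only the original term.
    for _ in range(max(level, 0)):
        current = broader.get(current)
        if current is None:
            break  # reached the top of the hierarchy
        terms.append(current)
    return terms


print(bt("robin"))      # ['robin', 'songbird']
print(bt("robin", 2))   # ['robin', 'songbird', 'bird']
print(bt("robin", 0))   # ['robin']
```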

thes

Specify the name of the thesaurus used to return the expansions for the specified term. The thes argument is optional and has a default value of DEFAULT. As a result, a thesaurus named DEFAULT must exist in the thesaurus tables before using any of the thesaurus operators.

Synonym Operator

Use the synonym operator (SYN) to expand a query to include all the terms that have been defined in a thesaurus as synonyms for a specified term.

The following query returns all documents that contain the term tutorial or any of the synonyms defined for tutorial in the DEFAULT thesaurus:

'SYN(tutorial)'

Compound Phrases in Synonym Operator

Compound phrases defined as synonyms for a term are expanded as AND conjunctions of their component words.

For example, the compound phrase temperature + measurement + instruments is defined in a thesaurus as a synonym for the term thermometer. In a synonym query for thermometer, the query is expanded to:

{thermometer},({temperature}&{measurement}&{instruments}) 
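
The shape of this expansion can be reproduced with a small Python sketch. It only builds the expanded query string and is not how ConText is implemented: the function name is hypothetical, synonyms are passed in as a plain list, and compound phrases are assumed to be written with + between their words as in the example above.

```python
def expand_synonym(term, synonyms):
    """Build the expanded query string for a synonym query on term."""
    parts = ["{%s}" % term]  # the original term is always part of the expansion
    for synonym in synonyms:
        words = [w.strip() for w in synonym.split("+")]
        if len(words) > 1:
            # Compound phrase: all component words must occur (AND).
            parts.append("(" + "&".join("{%s}" % w for w in words) + ")")
        else:
            parts.append("{%s}" % words[0])
    return ",".join(parts)  # terms are combined with the accumulate operator


print(expand_synonym("thermometer", ["temperature + measurement + instruments"]))
# {thermometer},({temperature}&{measurement}&{instruments})
```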


Note:

In a thesaurus, compound phrases can only be defined in synonym relationships for a term.

 

Preferred Term Operator

Use the preferred term operator (PT) to replace a term in a query with the preferred term that has been defined in a thesaurus for the term.

For example, the term building has a preferred term of construction in a thesaurus. A PT query for building returns all documents that contain the word construction. Documents that contain the word building are not returned.

Related Term Operator

Use the related term operator (RT) to expand a query to include all the terms that have been defined in a thesaurus as related terms for a specified term.

For example, the term dinosaur has a related term of paleontology. An RT query for dinosaur returns all documents that contain the word dinosaur or the word paleontology.

Narrower Term Operators

Use the narrower term operators (NT, NTG, NTP, NTI) to expand a query to include all the terms that have been defined in a thesaurus as the narrower or lower level terms for a specified term. They can also expand the query to include all of the narrower terms for each narrower term, and so on down through the thesaurus hierarchy.


Note:

The hierarchy can contain four separate branches, represented by the four narrower term operators. During a narrower term query, the specified operator only searches down the designated branch of the hierarchy.

 

The following query returns all documents that contain either the term tutorial or any of the NT terms defined for tutorial in the DEFAULT thesaurus:

'NT(tutorial)'

The following query returns all documents that contain either fairy tale or any of the narrower instance terms for fairy tale as defined in the DEFAULT thesaurus:

'NTI(fairy tale)'

That is, if the terms cinderella and snow white are defined as narrower term instances for fairy tale, ConText returns documents that contain fairy tale, cinderella, or snow white.

Broader Term Operators

Use the broader term operators (BT, BTG, BTP, BTI) to expand a query to include the term that has been defined in a thesaurus as the broader or higher level term for a specified term. They can also expand the query to include the broader term for the broader term and the broader term for that broader term, and so on up through the thesaurus hierarchy.


Note:

The hierarchy can contain four separate branches, represented by the four broader term operators. In a broader term query, the specified operator only searches up the designated branch of the hierarchy.

 

The following query returns all documents that contain the term tutorial or the BT term defined for tutorial in the DEFAULT thesaurus:

'BT(tutorial)'

Broader and Narrower Term Operator on Homographs

If a homograph (a word or phrase with multiple meanings, but the same spelling) appears in two or more nodes in the same hierarchy branch of a thesaurus, a qualifier is required for each occurrence of the term in the branch.

If the qualifier is not specified for a homograph in a broader or narrower term query, the query expands to include all of the broader/narrower terms for the homograph.

For example, if machine is a broader term for crane (building equipment) and bird is a broader term for crane (waterfowl):

BT(crane) expands to {crane},{machine},{bird}

If the qualifier for a homograph is specified in a broader or narrower term query, only the broader/narrower terms for the qualified homograph are returned.

Using the previous example:

BT(crane{(waterfowl)}) expands to {crane},{bird}
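The qualifier behavior can be pictured with a small Python sketch. It is an assumption-laden model, not the ConText implementation: the BROADER dictionary keyed by (term, qualifier) and the expand_bt function are hypothetical, and the qualifier is passed as a plain argument rather than parsed from the escaped notation.

```python
# Hypothetical thesaurus: broader terms keyed by (term, qualifier).
BROADER = {
    ("crane", "building equipment"): "machine",
    ("crane", "waterfowl"): "bird",
}

def expand_bt(term, qualifier=None, broader=BROADER):
    """Expand a BT query; without a qualifier, every sense of the term is followed."""
    expansion = ["{%s}" % term]
    for (name, qual), parent in broader.items():
        if name == term and (qualifier is None or qual == qualifier):
            expansion.append("{%s}" % parent)
    return ",".join(expansion)

print(expand_bt("crane"))                # {crane},{machine},{bird}
print(expand_bt("crane", "waterfowl"))   # {crane},{bird}
```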


Note:

When specifying a qualifier in a broader or narrower term query, the qualifier and its notation (parentheses) must be escaped, as is shown in this example.

 

Top Term Operator

Use the TOP TERM operator (TT) to replace a term in a query with the top term that has been defined for the term in the standard hierarchy (BT, NT) in a thesaurus. Top terms in the generic (BTG, NTG) and partitive (BTP, NTP) hierarchies are not returned.

For example, the term tutorial has a top term of learning systems in the standard hierarchy of a thesaurus. A TT query for tutorial returns all documents that contain the phrase learning systems. Documents that contain the word tutorial are not returned.

Thesaural Expansions and Case-Sensitivity

Thesaural expansions in text queries can differentiate between terms based on case.

For example, a case-sensitive thesaurus named thes1 is created and Mercury is defined as a narrower term for planets, while mercury is defined as a narrower term for metals.

During a query, the following expansions occur:

BT(mercury,1,thes1) expands to {MERCURY}, {METALS}

BT(Mercury,1,thes1) expands to {MERCURY}, {PLANETS}


Note:

There is no way to enable or disable case-sensitivity. ConText preserves the case of all entries entered in a thesaurus based on whether the thesaurus was specified during creation to be case-sensitive. Similarly, text queries use the cases of terms to perform the thesaural look-up based on the thesaurus specified for the term(s).

 

Limitations

Because text queries are case-insensitive, case-sensitive thesauri only affect the expansion of a term and not the terms actually used in the query.

For example:

BT(Mercury,1,thes1) expands to {MERCURY}, {PLANETS}

However, the query returns all documents in which the two terms occur, regardless of case. In other words, documents that contain mercury, Mercury, planets, Planets, or any other combinations of case for the two terms are all returned by the query.

Base-letter Support for Thesaural Queries

When ConText processes a query on a base-letter index and the expression contains a thesaurus operator, ConText looks up the query term in the thesaurus without converting the query to base-letter. The expansions obtained from the thesaurus are converted to base-letter and looked up subsequently within the index according to query rules.

This sequence of look-up enables base-letter queries to work independent of whether the thesaurus is in base-letter form. However, if the keys in the thesaurus are in base letter form, these keys will not match the corresponding non-base letter form query terms. When you have a base-letter thesaurus, you must specify the base-letter form in the query.

Wildcard Characters

Wildcard characters can be used in query expressions to expand word searches into pattern searches. The wildcard characters are:

Wildcard Character   Description
%                    The percent wildcard specifies that any characters can appear in multiple positions represented by the wildcard.
_                    The underscore wildcard specifies a single position in which any character can occur.

For example, the following abbreviated one-step query finds all terms beginning with the pattern scal in a column named text:

...contains(TEXT, 'scal%') > 0
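The expansion of a wildcard pattern against a word list can be approximated in Python as follows. This is a sketch under stated assumptions: wildcard_to_regex, expand_wildcard, and the max_terms cap are hypothetical names, the percent sign is assumed to match zero or more characters (as in SQL LIKE), and matching is assumed to be case-insensitive.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a % / _ pattern into a compiled regular expression."""
    pieces = []
    for ch in pattern:
        if ch == "%":
            pieces.append(".*")            # any characters, in any number of positions
        elif ch == "_":
            pieces.append(".")             # exactly one character
        else:
            pieces.append(re.escape(ch))   # everything else matches literally
    return re.compile("".join(pieces), re.IGNORECASE)

def expand_wildcard(pattern, word_list, max_terms=None):
    """Return the words that the pattern expands to, enforcing an optional cap."""
    regex = wildcard_to_regex(pattern)
    matches = [w for w in word_list if regex.fullmatch(w)]
    if max_terms is not None and len(matches) > max_terms:
        raise ValueError("wildcard expands to too many terms")
    return matches

words = ["scalar", "scale", "scaling", "scalp", "school"]
print(expand_wildcard("scal%", words))   # ['scalar', 'scale', 'scaling', 'scalp']
print(expand_wildcard("scal_", words))   # ['scale', 'scalp']
```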


Note:

To expand the wildcard query, ConText uses the word list for the text column and rewrites the query with these terms. When your wildcard query expands to a number of terms greater than the maximum allowed in a query, ConText returns an error.

In addition, if a wildcard expression translates to a stopword, the stopword is not included in the query or highlighted by CTX_QUERY.HIGHLIGHT.

 

Grouping Characters

The grouping characters control operator precedence by grouping query terms and operators in a query expression. The grouping characters are parentheses ( ) and brackets [ ].

The beginning of a group of terms and operators is indicated by an open character from one of the sets of grouping characters. The ending of a group is indicated by the occurrence of the appropriate close character for the open character that started the group. Between the two characters, other groups may occur.

For example, the open parenthesis indicates the beginning of a group. The first close parenthesis encountered is the end of the group. Any open parentheses encountered before the close parenthesis indicate nested groups.

Brackets perform the same function as the parentheses, but prevent penetration for the expansion operators.

Stored Query Expressions

You can store the results of a query expression as a stored query expression (SQE) and then call the SQE later in a query expression to return the stored results. To call a stored query expression, use the SQE operator.

Operator                  Syntax          Description
Stored Query Expression   SQE(SQE_name)   Returns the stored result of SQE_name.

The advantage of calling an SQE in a query expression, rather than specifying query terms, is that the results are typically returned faster, since ConText does not have to query the text table directly.

In addition, SQEs can be used to perform iterative queries, in which an initial query is refined using one or more additional queries.

Using Stored Query Expressions

The process for using stored query expressions is:

  1. Call CTX_QUERY.STORE_SQE to store the results for the text column or policy. With STORE_SQE, you specify a name for the SQE, a policy (which identifies the text column for the SQE), a query expression, and whether the SQE is a session or system SQE.
  2. Call the stored query expression in the query expression of a text (or theme) query. ConText returns the results of the SQE in the same way it returns the results of a regular query. If the results of the SQE are out-of-date, ConText automatically re-evaluates the SQE before returning the results.


Note:

Because ConText must first determine whether the results are out-of-date with respect to the document index, many changes to the index through inserting, deleting, and updating documents will slow down the retrieval of the stored query expression results.

 

Administration of stored query expressions can be performed using the REFRESH_SQE, REMOVE_SQE, and PURGE_SQE procedures in the CTX_QUERY PL/SQL package.

Example

To create a session SQE named PROG_LANG, use CTX_QUERY.STORE_SQE as follows:

exec ctx_query.store_sqe('emp_resumes', 'prog_lang', 'cobol', 'session');

This SQE queries the text column for the EMP_RESUMES policy (in this case, EMP.RESUMES) and returns all documents that contain the term cobol. It stores the results in the SQE table for the policy.

PROG_LANG can then be called within a query expression as follows:

select score, docid from emp 
where contains(resume, 'sqe(prog_lang)')>0 
order by score;

Session and System SQEs

When you initially create an SQE using CTX_QUERY.STORE_SQE, you can specify whether the SQE is for the current session or for all sessions (system SQE).

You can use session SQEs only in the current session. These SQEs are stored only for the duration of the session. When a session is terminated, all session SQEs created during the session are deleted from the SQE tables. If you want to use a session SQE in another session, you must recreate the SQE.

System SQEs can be used in all sessions, including concurrent sessions. When a session is terminated, system SQEs created during the session are not deleted from the SQE tables and can be used in future sessions.

Re-evaluation of Stored Query Expressions

If the text column referenced by a stored query expression has been modified since the stored query expression was created, the stored query expression results may be out-of-date. Before returning the results of a stored query expression in a query expression, ConText verifies that the results are current. If they are not current, ConText automatically evaluates the differences and updates the results.

ConText also verifies that any stored query expressions nested within a stored query expression have up-to-date results.


Note:

ConText does not verify whether PL/SQL functions in stored query expressions have been updated. If a PL/SQL function in a stored query expression has been updated, the stored query expression must be manually re-evaluated.

 

Result lists in stored query expression tables may get fragmented by consecutive re-evaluations. You can resolve fragmentation by calling CTX_QUERY.REFRESH_SQE.

Iterative Queries

Iterative queries are queries built on other queries to refine or add to the result set of the original query. Once you define a stored query expression, you can add additional search criteria in two ways:

Extending the Expression in the CONTAINS Procedure

Sometimes you might want to add a condition to a stored query expression to re-define your search criteria. You can do so by extending the query with additional operators when you call CTX_QUERY.CONTAINS. When you extend stored queries in this way, the response time is usually faster than an equivalent query without the SQE operator.

For example, you find that wildcard queries take a long time to process. You therefore define a wildcard query as a stored query expression, Q1, to return all documents indexed under policy pol that have words beginning with the letter z:

ctx_query.store_sqe('pol', 'Q1', 'z%', 'session');

You then extend the query by adding an OR condition, asking for all documents indexed under policy pol that contain words beginning with the letter z or contain the word cat:

ctx_query.contains('pol', 'SQE(Q1) | cat', 'ctx_temp');

Internally, ConText must still use the text index to find those documents that might have the word cat but not z%; however, the response time is generally much faster than the following equivalent query:

ctx_query.contains('pol', 'z% | cat', 'ctx_temp');

Nesting Stored Query Expressions

You can use stored query expressions to define other stored query expressions. This is useful when you want to refine the result set returned from a stored query expression.

For example, you define the stored query expression, Q1 as follows:

ctx_query.store_sqe('pol', 'Q1', 'lions | tigers', 'session');

You then want to reduce this hitlist by adding another condition, so you define Q2 as follows:

ctx_query.store_sqe('pol', 'Q2', 'SQE(Q1) and zoos', 'session');

You then execute Q2 as follows:

ctx_query.contains('pol', 'SQE(Q2)', 'ctx_temp');

This query searches for all documents that contain the term zoos together with either lions or tigers. It is generally faster than the following equivalent query:

ctx_query.contains('pol', '(lions | tigers) and zoos', 'ctx_temp');

SQE Tables

Each stored query expression is stored in two tables: a central or system table owned by CTXSYS and a text index table attached to the policy for which the stored query expression was created.

The table owned by CTXSYS is an internal table which stores the stored query expression definitions for all the stored query expressions that have been created for all existing policies. It cannot be accessed directly, but can be viewed through two views, CTX_SQES (users with CTXADMIN role) and CTX_USER_SQES (users with CTXAPP and CTXADMIN roles).

The table used to store the results of a stored query expression for a text column is one of the tables created automatically when the column is indexed; however, the SQR table is only populated when a stored query expression is created and updated when a stored query expression is re-evaluated.

The tablespace, storage clause, and other parameters used to create the SQR table are specified by the Engine preference in the policy for the text column of the stored query expression.

Note:

Similar to the other ConText index tables, the SQR table is an internal table that is accessed only by ConText when a stored query expression is processed in a query.

For more information about policies, preferences, text indexing, and the structure of the stored query expression tables and views, see Oracle ConText Option Administrator's Guide.

 

Using Operators in Stored Query Expressions

You can use all query expression operators in stored query expressions, with the following exceptions:

Stored query expressions also support all of the special characters and other components that can be used in a query expression, including PL/SQL functions and other stored query expressions.

PL/SQL in Query Expressions

In a query expression, you can call a PL/SQL function that returns a value. The syntax for the PL/SQL operator is as follows:

Syntax                                   Description
@owner_name.fname(arg1, arg2,...,argn)   Executes fname() where fname() returns a value. Return values that are not of type VARCHAR2 are cast into strings when possible. If fname() does not return a value, an exception is raised.
execute owner_name.fname()
exec owner_name.fname()

Example

Calling a PL/SQL function within a query is useful for converting words to alternate forms. For example, you can call a function that takes acronyms and returns the expanded string.

Suppose you, as user ctxuser, create a function named CONVERT that takes an acronym as input and returns the fully expanded version of the acronym. Then, to obtain all documents that contain either IBM or International Business Machines, you issue the following query:

'execute ctxuser.convert(IBM), IBM'

Likewise, you can call a PL/SQL function that translates words. For example, you can call a function french that converts an English word to its French equivalent. You can then search on the French word for cat by issuing the following query:

'@ctxuser.french(cat)'

Operator Precedence

Operator precedence is the order in which the components of a query expression are evaluated. ConText query operators can be divided into two sets of operators that have their own order of evaluation. These two groups are described below as Group 1 and Group 2.

In all cases, query expressions are evaluated in order from left to right according to the precedence of their operators. Operators with higher precedence are applied first. Operators of equal precedence are applied in order of their appearance in the expression from left to right.

Group 1

Within query expressions, the Group 1 operators have the following order of evaluation from highest precedence to lowest:

Operator            Symbol
EQUIV               =
NEAR                ;
Weight, Threshold   * >
MINUS               -
NOT                 ~
WITHIN              (none)
AND                 &
OR                  |
ACCUM               ,
Max                 :
First/Next          #

Group 2

Within query expressions, the Group 2 operators have the following order of evaluation from highest precedence to lowest:

Operator   Symbol
Wildcard   % _
Stem       $
Fuzzy      ?
Soundex    !

Procedural Operators

Other operators not listed under Group 1 or Group 2 are procedural. These operators have no sense of precedence attached to them. They include the SQE, PL/SQL, and thesaurus operators.

Precedence Examples

Query Expression            Order of Evaluation
w1 | w2 & w3                (w1) | (w2 & w3)
w1 & w2 | w3                (w1 & w2) | w3
?w1, w2 | w3 & w4           (?w1), (w2 | (w3 & w4))
abc = def ghi & jkl = mno   ((abc = def) ghi) & (jkl=mno)
dog and cat WITHIN body     (dog and cat) WITHIN body

In the first example, because AND has a higher precedence than OR, the query returns all documents that contain w1 and all documents that contain both w2 and w3.

In the second example, the query returns all documents that contain both w1 and w2 and all documents that contain w3.

In the third example, the fuzzy operator is first applied to w1, then the AND operator is applied to arguments w3 and w4, then the OR operator is applied to term w2 and the results of the AND operation, and finally, the score from the fuzzy operation on w1 is added to the score from the OR operation.

The fourth example shows that the equivalence operator has higher precedence than the AND operator.

The fifth example shows that the AND operator has higher precedence than the WITHIN operator.
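To make the ordering concrete, the following Python sketch adds explicit parentheses to an expression that uses only the AND (&), OR (|), and ACCUM (,) symbols. It is a simplified illustration, not the ConText parser: PRECEDENCE, tokenize, and parenthesize are hypothetical names, operands are assumed to be single words, and the other operators and grouping characters are not handled.

```python
import re

# Relative binding strength taken from the Group 1 ordering:
# AND binds tighter than OR, which binds tighter than ACCUM.
PRECEDENCE = {"&": 3, "|": 2, ",": 1}

def tokenize(expr):
    """Split an expression into operator symbols and single-word operands."""
    return re.findall(r"[&|,]|[^\s&|,]+", expr)

def parenthesize(expr):
    """Return the expression with the order of evaluation made explicit."""
    tokens = tokenize(expr)
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = tokens[pos]                  # an operand (a single word)
        pos += 1
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # Equal-precedence operators are applied left to right.
            right = parse(PRECEDENCE[op] + 1)
            left = "(%s %s %s)" % (left, op, right)
        return left

    return parse(1)

print(parenthesize("w1 | w2 & w3"))   # (w1 | (w2 & w3))
print(parenthesize("w1 & w2 | w3"))   # ((w1 & w2) | w3)
```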

Altering Precedence

Precedence is altered by grouping characters as follows:

Escaping Reserved Words and Characters

To query on words or symbols that have special meaning in query expressions, such as and, &, or, |, accum, and execute, you must escape them. There are two ways to escape characters in a query expression:

Escape Symbol   Meaning
{}              Use braces to escape a string of characters or symbols. Everything within a set of braces is considered part of the escape sequence.
\               Use the backslash character to escape an individual character or symbol. Only the character immediately following the backslash is escaped.

Example

In the following examples, an escape sequence is necessary because each expression contains a ConText operator or reserved symbol:

'AT\&T'
'{AT&T}'

'high\-voltage'
'{high-voltage}'
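An application that builds query expressions from user input might apply the backslash form automatically. The Python sketch below is one possible helper, not part of ConText: backslash_escape and RESERVED_CHARS are hypothetical names, and the character set is assembled from the reserved characters listed in this chapter plus the EQUIV (=) and NOT (~) symbols. Reserved words such as AND or EXECUTE are not handled here.

```python
# Reserved characters as listed in this chapter, plus the EQUIV (=) and
# NOT (~) symbols from the Group 1 precedence list.
RESERVED_CHARS = set("&|,-;$!?>*#:%_()[]{}\\@=~")

def backslash_escape(term):
    """Prefix every reserved character in `term` with a backslash."""
    return "".join("\\" + ch if ch in RESERVED_CHARS else ch for ch in term)

print(backslash_escape("AT&T"))           # AT\&T
print(backslash_escape("high-voltage"))   # high\-voltage
```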

Reserved Words

The following is a list of ConText reserved words and characters that must be escaped to be searched on:

Operator                       Reserved Word   Equivalent Reserved Character
And                            AND             &
Or                             OR              |
Accumulate                     ACCUM           ,
Minus                          MINUS           -
Near                           (none)          ;
Stem                           (none)          $
Soundex                        (none)          !
Fuzzy                          (none)          ?
Threshold                      (none)          >
Weight                         (none)          *
First/Next                     (none)          #
Max                            (none)          :
Wildcard (multiple)            (none)          %
Wildcard (single)              (none)          _
Grouping (parentheses)         (none)          ( )
Grouping (brackets)            (none)          [ ]
Escape (multiple characters)   (none)          { }
Escape (single character)      (none)          \
PL/SQL call                    EXECUTE, EXEC   @
Stored Query Expression        SQE             (none)
Synonym                        SYN             (none)
Preferred                      PT              (none)
Related                        RT              (none)
Top                            TT              (none)
Broader                        BT              (none)
Narrower                       NT              (none)
Broader Generic                BTG             (none)
Narrower Generic               NTG             (none)
Broader Partitive              BTP             (none)
Narrower Partitive             NTP             (none)

Querying Escape Characters

The open brace { signals the beginning of the escape sequence, and the close brace } indicates the end. Everything between the opening brace and the closing brace is part of the query expression (including any open brace characters). To include the close brace character in a query expression, use }}.

To escape the backslash escape character, use \\.
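The brace rules can be expressed as a pair of small Python helpers. These are hypothetical functions written for illustration (brace_escape and brace_unescape are not ConText calls); they only model the doubling of the close brace described above.

```python
def brace_escape(text):
    """Wrap text in braces, doubling any close brace so it is not read as the end."""
    return "{" + text.replace("}", "}}") + "}"

def brace_unescape(escaped):
    """Recover the literal text from a single brace escape sequence."""
    if not (escaped.startswith("{") and escaped.endswith("}")):
        raise ValueError("not a brace escape sequence")
    return escaped[1:-1].replace("}}", "}")

print(brace_escape("AT&T"))   # {AT&T}
print(brace_escape("a}b"))    # {a}}b}
```

An open brace inside the text needs no special treatment, because everything up to the closing brace is taken literally.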

Querying with Stopwords

Stopwords are words for which ConText does not create an index entry. They are usually common words that are unlikely to be searched on by themselves.

ConText is shipped with a default list of stopwords in English containing common words such as this and that. However, you or your ConText administrator can define your own stopwords.

See Also:

For more information about defining stopwords, see Oracle ConText Option Administrator's Guide.

 

Stopwords by Themselves

You cannot query on a stopword by itself or a phrase of only stopwords; whenever you attempt to query on a stopword by itself or a stopword-only phrase, the result is always no hits.

For example, you cannot issue a query to retrieve all documents that contain this if this is defined as a stopword, nor can you issue a query on a phrase of stopwords such as the who, if the words the and who are defined as stopwords.

However, you can query on phrases that contain stopwords as well as non-stopwords, such as this boy talks to that girl, where this and that are the only stopwords. This is possible because ConText records the position of stopwords even though it does not create an index entry for them.

Stopwords with Operators

When you use a stopword or a stopword-only phrase as an operand of a query operator, ConText rewrites the expression to eliminate the stopword or stopword-only phrase and then executes the query.

The following table describes some common stopword transformations. The Stopword Expression column describes the query expression or component of a query expression you enter, while the right-hand column describes the way ConText rewrites the query.

In these examples, a value of no_token for the rewritten expression means no hits are returned for the query.

Stopword Expression         Rewritten Expression
non_stopword AND stopword   non_stopword
stopword AND non_stopword   non_stopword
stopword AND stopword       no_token
non_stopword NOT stopword   non_stopword
stopword NOT non_stopword   no_token
stopword NOT stopword       no_token

For example, assuming that the word this is a stopword and that the word dog is a non-stopword, the query dog and this is rewritten to dog, applying the first transformation in the list.
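The six transformations listed above can be modeled with a short Python function. This is a sketch of the rewriting rules only, not of how ConText implements them: rewrite and NO_TOKEN are hypothetical names, the stopword set is supplied by the caller, and only the AND and NOT operators are covered.

```python
NO_TOKEN = "no_token"

def rewrite(left, op, right, stopwords):
    """Apply the AND / NOT stopword transformations to a two-operand expression."""
    left_stop = left.lower() in stopwords
    right_stop = right.lower() in stopwords
    op = op.upper()
    if op == "AND":
        if left_stop and right_stop:
            return NO_TOKEN
        if left_stop:
            return right
        if right_stop:
            return left
    elif op == "NOT":
        if left_stop:
            return NO_TOKEN        # stopword NOT anything
        if right_stop:
            return left            # non_stopword NOT stopword
    else:
        raise ValueError("only AND and NOT are covered by this sketch")
    # Neither operand is a stopword, so the expression is left unchanged.
    return "%s %s %s" % (left, op, right)

stopwords = {"this", "that"}
print(rewrite("dog", "and", "this", stopwords))    # dog
print(rewrite("this", "and", "that", stopwords))   # no_token
```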

See Also:

For a complete list of stopword transformations, see Appendix A, "Stopword Transformations".

To learn about how to examine stopword transformations, see Chapter 5, "Query Expression Feedback".

 

Querying with Special Characters

ConText indexes text by identifying tokens (words). For English and most European languages, it assumes that blank spaces delimit tokens. At index time, ConText must also know how to interpret punctuation characters and characters that occur within words and numbers. Such special characters must be defined in the BASIC LEXER preference. They are described as follows:

Type of Character   Description
Punctuations        Characters that delimit the end of sentences such as the period '.' and question mark '?' and those that occur next to words and numbers, such as the comma ',' and the dollar sign '$'. These characters are not indexed.
Continuation        Characters that indicate a word continues on the next line. An example is the hyphen '-'. These characters are not indexed.
Printjoins          Characters that join words together such as hyphen '-'. These characters are indexed.
Skipjoins           Characters that join words together such as hyphen '-'. These characters are not indexed.
Numjoin             Characters that occur in numbers such as the decimal point '.'. These characters are indexed.
Numgroup            Characters that group digits within a number such as the comma ','. These characters are indexed.
Startjoin           Non-alphanumeric characters that occur at the beginning of a token. For example, you can define < as a startjoin character for HTML tagged text. These characters are indexed.
Endjoin             Non-alphanumeric characters that occur at the end of a token. For example, you can define > as an endjoin character for HTML tagged text. These characters are indexed.

In the BASIC LEXER preference, ConText defines a default set of characters for each group.

The way you query on tokens that contain these characters depends on how ConText indexes the tokens containing these characters. This is because ConText tokenizes words at query time the same way it tokenizes words at index time. To query on words or numbers that contain special characters, you must know how these words are represented in the index.

See Also:

For more information about defining special characters for the BASIC LEXER preference, see Oracle ConText Option Administrator's Guide.

 

Querying with Punctuation and Continuation Characters

Punctuation and continuation characters are not indexed with the words they occur next to or with, and thus are ignored by ConText at query time. The following table shows how ConText strips punctuation characters at query time:

Query                             Equivalent Query
'John swims fast. Sharks eat.'    'John swims fast sharks eat'
'John swims. Fast sharks eat.'    'John swims fast sharks eat'
'{John swims, fast sharks eat}'   'John swims fast sharks eat'
'{SHAZAM!}'                       'SHAZAM'
'{$250}'                          '250'
'{#101}'                          '101'
'{phone#}'                        'phone'


Suggestion:

Because ConText strips punctuation characters at query time, leaving them out of the query expression and using the equivalent query might be a better approach, especially when the characters are reserved as in the last five examples.
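If an application wants to produce the equivalent query itself before calling ConText, the stripping step could look like the Python sketch below. The strip_punctuation function and the PUNCTUATION set are assumptions made for illustration; the actual punctuation characters depend on the BASIC LEXER preference, and the sketch treats each punctuation character as a token delimiter.

```python
# Assumed punctuation set; the real set is defined in the BASIC LEXER preference.
PUNCTUATION = set(".,?!$#")

def strip_punctuation(text, punctuation=PUNCTUATION):
    """Replace punctuation with spaces, then collapse runs of whitespace."""
    cleaned = "".join(" " if ch in punctuation else ch for ch in text)
    return " ".join(cleaned.split())

print(strip_punctuation("John swims, fast sharks eat"))   # John swims fast sharks eat
print(strip_punctuation("$250"))                          # 250
print(strip_punctuation("phone#"))                        # phone
```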

 

Querying with Printjoins and Skipjoins

Printjoins and skipjoins are characters such as hyphens that join words together.

When you define a character as a printjoin, such as a hyphen, you specify that the words on either side of the hyphen are to be indexed with the hyphen. For example, sister-in-law is indexed as the token sister-in-law.

When you define a character as a skipjoin, such as a hyphen, you specify that the two words on either side of the hyphen are to be indexed as one token without the hyphen. For example, sister-in-law is indexed as sisterinlaw.

To query on words that contain a join character, you must know if the character is defined as a skipjoin or printjoin in the BASIC LEXER preference.

For example, if the hyphen character is defined as a printjoin, you must write your query with the hyphen, since the indexed token contains the hyphen. Thus, to query on all the documents that contain the term sister-in-law, you must write your query as follows with the hyphen:

'{sister-in-law}' 


Note:

The '-' character must be escaped, or else ConText interprets it as the MINUS operator.

 

However, if the hyphen character is defined as a skipjoin, you must write your query without the hyphen. Thus, to query on all documents that contain sister-in-law, you must write your query as:

'sisterinlaw'

This query returns all documents that contain sisterinlaw or sister-in-law, provided the hyphen is defined as a skipjoin.
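The difference between the two kinds of join characters can be seen in a toy tokenizer. The Python sketch below is not the ConText lexer: tokenize and its printjoins and skipjoins parameters are hypothetical, and a token is assumed to be a run of ASCII letters, digits, and configured join characters.

```python
import re

def tokenize(text, printjoins="", skipjoins=""):
    """Split text into index tokens, honoring printjoin and skipjoin characters."""
    tokens = []
    joins = re.escape(printjoins + skipjoins)
    # A token is a run of letters, digits, and any configured join characters.
    for raw in re.findall(r"[A-Za-z0-9%s]+" % joins, text):
        # Skipjoin characters hold the token together but are dropped from it.
        token = "".join(ch for ch in raw if ch not in skipjoins)
        if token:
            tokens.append(token)
    return tokens

print(tokenize("my sister-in-law", printjoins="-"))   # ['my', 'sister-in-law']
print(tokenize("my sister-in-law", skipjoins="-"))    # ['my', 'sisterinlaw']
print(tokenize("my sister-in-law"))                   # ['my', 'sister', 'in', 'law']
```

Because the same function would be applied to the query string, a query on sisterinlaw and a query on sister-in-law produce the same token when the hyphen is a skipjoin.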

Querying with Numjoins and Numgroups

Numjoin and numgroup characters are characters that can appear in numbers, such as the decimal point and the comma.

Numjoin

A numjoin is a character that occurs once in a string of digits, such as a decimal point, and gets indexed with the number. (ConText defines the decimal as a default numjoin character for the BASIC LEXER preference.) For example, the number 3.14 is indexed as 3.14. Thus to query on 3.14 with the decimal point defined as a numjoin character, you write:

'3.14'

When you define the numjoin character to be NULL, ConText indexes 3.14 as the two separate numbers 3 and 14.


Note:

When a period follows a number such as at the end of a sentence, ConText knows to index the number without the decimal point. For example, the number fourteen in the following sentence gets indexed as 14 without the period:

The score was San Francisco 21, Dallas 14.

 

Numgroup

A numgroup is a character such as a comma that groups digits together in a number. Numgroup characters get indexed with the number. (ConText defines the comma as a default numgroup character for the BASIC LEXER preference.) For example, the number 6,344,555 gets indexed as 6,344,555.

To query on a number that contains numgroup characters, you must write the query with the numgroup character. For example, to query on 6,344,555, you write:

'{6,344,555}'

Note that the comma must be escaped.


Note:

When you have the comma defined as a numgroup character, you must query on numbers using the comma. That is, a query on {1,000} does not return documents that contain 1000 without the comma. A better query is with the equivalence operator:

'{1,000}=1000'

 

When you define the numgroup character as NULL, numbers such as 1,000 get indexed as 1 and 000.
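The numjoin and numgroup rules can be approximated with a regular expression, as in the Python sketch below. It is an illustration only: tokenize_numbers is a hypothetical function, an empty string stands in for a NULL setting, and the pattern is simplified in that it does not limit a numjoin character to a single occurrence per number.

```python
import re

def tokenize_numbers(text, numjoin=".", numgroup=","):
    """Return the number tokens in text, keeping join characters inside numbers."""
    inner = re.escape(numjoin + numgroup)
    if inner:
        # Digits, optionally followed by (join character + digits) groups. A trailing
        # join character, such as a sentence-ending period, stays out of the token.
        pattern = r"\d+(?:[%s]\d+)*" % inner
    else:
        pattern = r"\d+"
    return re.findall(pattern, text)

print(tokenize_numbers("pi is 3.14"))                    # ['3.14']
print(tokenize_numbers("pi is 3.14", numjoin=""))        # ['3', '14']
print(tokenize_numbers("a total of 6,344,555"))          # ['6,344,555']
print(tokenize_numbers("1,000 units", numgroup=""))      # ['1', '000']
```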

Querying with Startjoin and Endjoin Characters

Startjoin and endjoin characters are non-alphanumeric characters that start and end tokens. These characters are indexed with the token they occur with.

You or your ConText administrator typically define startjoin and endjoin characters when you index tagged text such as HTML. This makes it easy to define sections for section searching as well as to query on the tags themselves.

For example, to query on the tag <HEAD> with < defined as a startjoin and > defined as an endjoin, write your query as follows:

'{<HEAD>}'

In the query above, an escape sequence is necessary, since > is an operator.

See Also:

For more information about section searching, see "WITHIN Operator" in this chapter.

 




Copyright © 1997 Oracle Corporation.
All Rights Reserved.