WILDCARD_INDEX

The WILDCARD_INDEX element specifies that a wildcard index should be created for the text collection that contains it.

In addition, this element lets you control how the wildcard index is built. A wildcard index enables wildcard search: matching user query terms against fragments of words in the indexed text. These fragments are referred to as ngrams.

DTD

<!ELEMENT WILDCARD_INDEX EMPTY>
<!ATTLIST WILDCARD_INDEX
    MAX_NGRAM_LENGTH             CDATA         #IMPLIED
    DICTIONARY_WILDCARD          (TRUE|FALSE)  #IMPLIED
    DICTIONARY_MAX_NGRAM_LENGTH  CDATA         #IMPLIED
>

Attributes

The following sections describe the WILDCARD_INDEX element's attributes.

MAX_NGRAM_LENGTH

Specifies the maximum length of the ngrams that are indexed. All substrings of this length or shorter are indexed. If a wildcard search uses a string no longer than this length, exact results can be returned directly from the index. If the search string includes a substring longer than this length, the results returned from the index are post-processed to eliminate false positives.
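
As an illustration, the following fragment sets the maximum ngram length to 3, reusing the enclosing elements from the example at the end of this section. The value 3 is an assumption chosen for this sketch, not a documented default. Under this setting, a wildcard search on a three-character string such as abc could be answered directly from the index, while a search on a longer string such as abcd would go through post-processing.

<RECSEARCH_INDEX NAME="Description">
   <SEARCH_INDEX>
     <!-- 3 is an illustrative value, not a documented default -->
     <WILDCARD_INDEX MAX_NGRAM_LENGTH="3"/>
   </SEARCH_INDEX>
</RECSEARCH_INDEX>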

DICTIONARY_WILDCARD

Specifies whether a wildcard index is also generated for the dictionary of words in the text collection that contains this element. The default is TRUE when the containing element consists of off-line documents, and FALSE otherwise.

DICTIONARY_MAX_NGRAM_LENGTH

Specifies the maximum ngram length that should be indexed in the dictionary wildcard index.
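
A minimal sketch of the two dictionary attributes used together is shown below. The element is shown on its own, because the appropriate containing element depends on the text collection being configured. The value 4 is an assumption for illustration only.

<!-- Illustrative values; 4 is not a documented default -->
<WILDCARD_INDEX DICTIONARY_WILDCARD="TRUE"
                DICTIONARY_MAX_NGRAM_LENGTH="4"/>

Setting DICTIONARY_WILDCARD explicitly, as here, overrides whichever default would otherwise apply to the containing element.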

Sub-elements

The WILDCARD_INDEX element has no sub-elements.

Example

This example enables positional indexing and wildcard indexing for searching on the Description property.

<RECSEARCH_INDEX NAME="Description">
   <SEARCH_INDEX>
     <WILDCARD_INDEX/>
     <POSITIONAL_INDEX/>
   </SEARCH_INDEX>
</RECSEARCH_INDEX>