Hummingbird Stop File Setting


This setting specifies an operating system file that contains a list of words not to be indexed. Typically, these are words with little semantic value. Using a stop file can significantly reduce the size of indices by removing words that are not useful for searching. For example, prepositions and articles can be safely removed from indices in most cases.

The stop file is assumed to be in the directory where the table configuration is created, unless the stop file name is a fully qualified path name. The default value, an empty string, specifies that no stop file is used. SearchServer provides a stop file called FULTEXT.STP, which you can use by explicitly specifying it in this parameter. This default stop file contains the following words:

after, also, an, and, as, at, be, because, before, between, but, by, for, from, however, if, in, into, of, or, other, out, since, such, than, that, the, there, these, this, those, to, under, upon, when, where, whether, which, with, within, without
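To illustrate the effect of a stop file, the following Python sketch filters the default stop words listed above out of a sample phrase. It is an illustration only: the function name, the whitespace tokenization, and the lowercase comparison are assumptions, and SearchServer's own parsers work differently.

```python
# Default stop words as listed in this section (contents of FULTEXT.STP).
DEFAULT_STOP_WORDS = frozenset("""
after also an and as at be because before between but by for from however
if in into of or other out since such than that the there these this those
to under upon when where whether which with within without
""".split())


def indexable_terms(text, stop_words=DEFAULT_STOP_WORDS):
    """Return the lowercased words of `text` that are not stop words.

    Hypothetical helper: splitting on whitespace and lowercasing are
    simplifying assumptions, not SearchServer behavior.
    """
    return [word for word in text.lower().split() if word not in stop_words]


print(indexable_terms("The status of the order in the queue"))
# prints ['status', 'order', 'queue']
```

Of the eight words in the sample phrase, only three would be candidates for indexing, which is why a stop file can shrink an index noticeably.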

The stop file can contain a maximum of 1024 stop words totaling not more than 10,000 characters. The stop file is a text file that can be edited in Notepad, or any other plain-text editor.
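Before you save a customized stop file, it can help to check it against these limits. The following Python sketch is one possible way to do so. It assumes that the stop file holds whitespace-separated words, that the character total is the sum of the word lengths, and that the file is readable as UTF-8; none of these assumptions is confirmed by this guide, and the function names are hypothetical.

```python
MAX_STOP_WORDS = 1024     # maximum number of stop words stated in this guide
MAX_TOTAL_CHARS = 10_000  # maximum total characters stated in this guide


def check_stop_words(words):
    """Return a list of limit violations; an empty list means the words fit."""
    problems = []
    total_chars = sum(len(word) for word in words)
    if len(words) > MAX_STOP_WORDS:
        problems.append(
            f"{len(words)} stop words exceed the maximum of {MAX_STOP_WORDS}")
    if total_chars > MAX_TOTAL_CHARS:
        problems.append(
            f"{total_chars} characters exceed the maximum of {MAX_TOTAL_CHARS}")
    return problems


def read_stop_words(path):
    """Read a stop file, assuming whitespace-separated words (an assumption)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read().split()
```

For example, `check_stop_words(read_stop_words("fultext.stp"))` returns an empty list when the file is within both limits.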

To customize the stop file, open it in a text editor, directly modify it, then save it using the same name.

CAUTION: If you customize the stop file after you have created an index, you must regenerate all indices associated with that stop file. Also, if you support Mobile Client searching, you must remove the absolute path so that only the stop file name, fultext.stp, remains, and you must make sure that the stop file is in the index directory (index directory/siebelroot/search/{datasource}/index/).
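The Mobile Client requirement amounts to reducing the setting to a bare file name. As a minimal sketch, assuming you want to derive that value from an existing fully qualified setting, Python's standard pathlib module can strip the directory part. The function name is hypothetical.

```python
from pathlib import PureWindowsPath


def mobile_client_stop_file(value):
    """Return only the file name portion of a stop file setting.

    PureWindowsPath treats both backslashes and forward slashes as
    separators, so this works for Windows-style and UNIX-style values.
    """
    return PureWindowsPath(value).name


print(mobile_client_stop_file(r"C:\Program Files\HUMMINGBIRD\fultext\fultext.stp"))
# prints fultext.stp
```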

Usually, you specify a stop file that is appropriate for the language of the documents you are indexing. SearchServer provides several stop files that are listed in Table 30.

Table 30. Hummingbird SearchServer Stop Files

SearchServer Stop File  Description
----------------------  -----------
csource.stp             Used with the C-language Source Code text reader.
fulfra.stp              Uses the multilingual Unicode parser with default options, and contains French-language stop words.
fultext.stp             Uses the multilingual Unicode parser with default options, and contains English-language stop words.
ixkor.stp               Uses the InXight-based ixasian parser for Korean-language text.
ixjap.stp               Uses the InXight-based ixasian parser for Japanese-language text.
ixschi.stp              Uses the InXight-based ixasian parser for simplified Chinese-language text.
ixchi                   Uses the InXight-based ixasian parser for traditional Chinese-language text.
japan.stp               Included to support old collections. fultext.stp is used to support Japanese-language text. This file is empty.
korean.stp              Used for Korean-language text using n-grams. The Unicode parser is used with k=1 set to map Han characters to Hangul.
wspprox.stp             Used when indexing with support for Word, Sentence, and Paragraph Proximity.

Example

Property: Stop File

Value: C:\Program Files\HUMMINGBIRD\fultext\fultext.stp
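The lookup rule described earlier in this section (an empty value means no stop file, a fully qualified path is used as given, and a bare file name is looked up in the table configuration directory) can be sketched in Python as follows. This is an illustration of the rule, not SearchServer code; the function name and the `D:\idx` and `/export/home/idx` directories are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def resolve_stop_file(value, table_dir, path_cls=PureWindowsPath):
    """Return the path the Stop File value refers to, or None if unset.

    `table_dir` stands for the directory where the table configuration
    is created. Pass PurePosixPath as `path_cls` for UNIX-style paths.
    """
    if not value:
        return None  # default (empty string): no stop file is used
    path = path_cls(value)
    if path.is_absolute():
        return str(path)  # fully qualified path name: used as given
    return str(path_cls(table_dir) / path)


print(resolve_stop_file("fultext.stp", r"D:\idx"))
# prints D:\idx\fultext.stp
print(resolve_stop_file("fultext.stp", "/export/home/idx", PurePosixPath))
# prints /export/home/idx/fultext.stp
```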

To change the stop file location under Windows

  1. Navigate to Administration - Search > Index Settings.
  2. In the Index Setting Properties list, click in the Stop File row.
  3. From the drop-down list in the Value column, select the value that matches the stop file path, then step off the record to save it.

CAUTION: UNIX users: The provided sample database has a default stop file located at the following path: C:\PROGRAM FILES\HUMMINGBIRD\FULTEXT\FULTEXT.STP. This path is not valid on a UNIX system. You must change the stop file location to a path similar to the following example: /export/home/hummingbird/fultext/fultext.stp.
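If you script your deployment checks, a small test can flag a Stop File value that still carries the Windows default. The following Python sketch is a hypothetical helper, not part of Siebel or SearchServer; it treats any value with a drive letter or a backslash as Windows-style.

```python
from pathlib import PureWindowsPath


def looks_like_windows_path(value):
    """Return True if `value` has a drive (such as C:) or contains a backslash."""
    return bool(PureWindowsPath(value).drive) or "\\" in value


print(looks_like_windows_path(r"C:\PROGRAM FILES\HUMMINGBIRD\FULTEXT\FULTEXT.STP"))
# prints True
print(looks_like_windows_path("/export/home/hummingbird/fultext/fultext.stp"))
# prints False
```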

Siebel Search Administration Guide Copyright © 2009, Oracle and/or its affiliates. All rights reserved. Legal Notices.