Managing Stoplists

Every Ultra Search instance has a stoplist associated with it. A stoplist is a list of words that are ignored during indexing. These words are known as Stopwords. Stopwords are not indexed because they are deemed not useful for searching, or even detrimental to the performance and accuracy of the index.

Default Ultra Search Stoplist

During the installation process, a default stoplist is created for the Ultra Search product. Subsequently, when an Ultra Search instance is created, a copy of the default stoplist will be created for the Ultra Search instance.

The default stoplist is created under the WKSYS schema and is named "wk_stoplist". (This list is defined in the file $ORACLE_HOME/ultrasearch/admin/wk0pref.sql, which is run at installation time.)
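To see which Stopwords the default stoplist currently contains, a query along the following lines can be used. This is a minimal sketch, assuming you are connected as WKSYS and that the standard Oracle Text dictionary view CTX_USER_STOPWORDS is available in your release; the UPPER() call is a precaution in case the stoplist name is stored in a different case.

```sql
-- Run as WKSYS in SQL*Plus.
-- Lists the Stopwords currently defined in the default Ultra Search stoplist.
SELECT spw_word
  FROM ctx_user_stopwords
 WHERE UPPER(spw_stoplist) = 'WK_STOPLIST'
 ORDER BY spw_word;
```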

You can modify the default stoplist by adding or removing Stopwords from it. However, remember that these modifications will not affect existing Ultra Search instances. They will only affect Ultra Search instances that are created after the modifications are made.

Modifying Instance Stoplists Before Initial Crawling

Modifying instance stoplists should be done as a last resort. The preferred method is to do one of the following:

  1. Modify the default stoplist before creating the instance.
  2. Replace the instance stoplist immediately after creating the instance.

Modifications made to the default stoplist are reflected in the stoplists of all instances created after the modification.

Replacing the instance stoplist immediately after creating the instance affects only that instance. You will first need to create a user-defined stoplist.

In both cases, the Ultra Search instance stoplist is defined before initial crawling. This means that all documents collected by the Ultra Search Crawler are evaluated against the correct stoplist. It is important to modify the stoplist before initial crawling to avoid having to recrawl all documents.

Modifying Instance Stoplists After Initial Crawling

If necessary, you may alter an instance stoplist after initial crawling. You can choose one of the following methods:

  1. Add Stopwords to the instance stoplist.
  2. Define a new stoplist and replace the instance stoplist with the new stoplist.

Adding Stopwords to the instance stoplist does not affect documents that have already been crawled or indexed, and it is not an expensive operation.

Defining a new stoplist and replacing the instance stoplist with it will invalidate the entire index. If you choose this method, you must force the Ultra Search Crawler to recrawl all documents in the index. You can do this by selecting the "Process all documents" radio button in the Edit Schedule page. This is a very expensive operation. Therefore, this option should be the last resort.

Instructions on Modifying Instance Stoplists Before Initial Crawling

(1) Modifying the default stoplist before creating the instance

To add the Stopword "web" to the default stoplist, log in as user WKSYS through SQL*Plus and issue the following command:

exec ctx_ddl.add_stopword('wk_stoplist','web');

To remove the Stopword "web" from the default stoplist, log in as user WKSYS through SQL*Plus and issue the following command:

exec ctx_ddl.remove_stopword('wk_stoplist','web');

Subsequently, the stoplists of all new instances will reflect the modifications made to the default stoplist.
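When several Stopwords need to be added to the default stoplist at once, the single-word command can be wrapped in a short PL/SQL loop. The following is a sketch only: the words 'site' and 'page' are illustrative placeholders, not recommendations, and the block assumes none of the words is already in the stoplist (adding a duplicate may raise an error).

```sql
-- Run as WKSYS in SQL*Plus.
DECLARE
   -- Local collection type holding the Stopwords to add (example values only).
   TYPE word_list IS TABLE OF VARCHAR2(64);
   new_words word_list := word_list('web', 'site', 'page');
BEGIN
   FOR i IN 1 .. new_words.COUNT LOOP
      -- Same call as the single-word example, once per word.
      ctx_ddl.add_stopword('wk_stoplist', new_words(i));
   END LOOP;
END;
/
```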

(2) Replacing the instance stoplist immediately after creating the instance

First, you must create a new user-defined stoplist. To do so, log in as the owner of the instance through SQL*Plus and issue the following commands:

begin
   ctx_ddl.create_stoplist('example_stoplist');
   ctx_ddl.add_stopword('example_stoplist','example_stopword');
   -- add more stopwords by repeating the previous line with new stopwords
end;
/
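Before using the new stoplist, it can be worth confirming that it was created with the expected contents. The queries below are a sketch, assuming the standard Oracle Text views CTX_USER_STOPLISTS and CTX_USER_STOPWORDS are available and that you are still connected as the owner of the instance.

```sql
-- Confirm that the stoplist exists and see how many Stopwords it holds.
SELECT spl_name, spl_count
  FROM ctx_user_stoplists
 WHERE UPPER(spl_name) = 'EXAMPLE_STOPLIST';

-- List the individual Stopwords in the new stoplist.
SELECT spw_word
  FROM ctx_user_stopwords
 WHERE UPPER(spw_stoplist) = 'EXAMPLE_STOPLIST'
 ORDER BY spw_word;
```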

To replace an instance stoplist with this new stoplist, log in as the owner of the instance through SQL*Plus and issue the following command:

ALTER INDEX wk$doc_path_idx rebuild parameters('replace stoplist example_stoplist');
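To check which Stopwords the index is using after the replacement, the index metadata can be queried. This sketch assumes the Oracle Text view CTX_USER_INDEX_VALUES is available and that stoplist entries are reported under the class 'STOPLIST'; verify both against your release before relying on the output.

```sql
-- Run as the owner of the instance.
-- Shows the stoplist-related values recorded for the instance index.
SELECT ixv_attribute, ixv_value
  FROM ctx_user_index_values
 WHERE ixv_index_name = 'WK$DOC_PATH_IDX'
   AND ixv_class = 'STOPLIST'
 ORDER BY ixv_value;
```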

Instructions on Modifying Instance Stoplists After Initial Crawling

(1) Add Stopwords to the instance stoplist

To add the Stopword "web" to the instance stoplist, log in as the owner of the instance through SQL*Plus and issue the following command:

alter index wk$doc_path_idx rebuild parameters('add stopword web');
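To add several Stopwords to the instance stoplist in one pass, the same statement can be issued from a PL/SQL loop using dynamic SQL. This is a sketch under stated assumptions: the extra words are placeholders, and the block assumes plain words with no quote characters, since each word is concatenated directly into the statement text.

```sql
-- Run as the owner of the instance.
DECLARE
   -- Example Stopwords only; replace with the words you actually need.
   TYPE word_list IS TABLE OF VARCHAR2(64);
   new_words word_list := word_list('web', 'site', 'page');
BEGIN
   FOR i IN 1 .. new_words.COUNT LOOP
      -- Issues: alter index wk$doc_path_idx rebuild parameters('add stopword <word>')
      EXECUTE IMMEDIATE
         'alter index wk$doc_path_idx rebuild parameters(''add stopword '
         || new_words(i) || ''')';
   END LOOP;
END;
/
```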

(2) Replace the instance stoplist after initial crawling

The method for replacing the instance stoplist after initial crawling is the same as for replacing it before initial crawling. However, if you choose this method, you must force the Ultra Search Crawler to recrawl all documents in the index, which is a very expensive operation. Therefore, this method should be the last resort.