12 Using the Text Search Indexer

This chapter describes the text search indexer and how to use it to generate index files.

12.1 Overview of Text Search Indexer

A Java-based text search indexer is included with Oracle Help for Java. The indexer generates the .idx files used for text searches within Oracle Help. Two versions of the Text Search Indexer are provided, one for Japanese content and another for non-Japanese content.

12.2 Java Requirements

The Text Search Indexer requires Java SE 5 or later. Performance improves considerably if you leave the Java just-in-time (JIT) compiler enabled. Also increase the maximum heap size of the Java Virtual Machine, for example with the -mx64m option used in the examples in this chapter.
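To confirm that a heap-size option has taken effect, the running JVM can be asked for its maximum heap. The following is a minimal sketch; the class name HeapCheck is hypothetical and is not part of Oracle Help for Java. It relies only on the standard Runtime.maxMemory() method.

```java
// Hypothetical helper for illustration; not shipped with Oracle Help for Java.
final class HeapCheck {

    /** Returns the maximum heap the running JVM will attempt to use, in whole megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with the same heap option you plan to pass to the indexer.
        System.out.println("Maximum heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running this class with the same heap option that you intend to pass to the indexer shows the limit the JVM actually applies, which may differ slightly from the requested value.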

12.3 Running the Indexer

Follow these steps to run the indexer:

  1. Include the OHJ Indexer JAR file helpindexer-version.jar in your CLASSPATH.

  2. Run the indexer from the command prompt. The indexer supports the following command-line arguments:

    [-l=locale] [-e=charset] dirname indexfilename

    where,

    Argument         Description
    -l=locale        Optional but recommended. The locale, specified using a two-letter ISO 639 language code and an ISO 3166 country code. The format is language_COUNTRY or language_COUNTRY_VARIANT. If the locale is not supplied, the system default locale is used.
    -e=charset       Optional but recommended. The name of the Java-supported character set encoding of the HTML files being indexed. If the encoding is not supplied, the default character set encoding of the current system default locale is used. If supplied, the value must be one of the Java-supported character set encoding names; for Java SE, see http://download.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html.
    dirname          The base directory that contains the HTML files you want to index. The indexer indexes all of the files under this directory and any subdirectories.
    indexfilename    The name of the index file to be generated.

    For example:

    java -mx64m oracle.help.tools.index.Indexer -l=en_US -e=8859_1 D:\MyHTMLFiles myIndex.idx

    This command sets the locale to English (United States) and the encoding to 8859_1, then indexes the D:\MyHTMLFiles directory and creates the myIndex.idx file.
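Because a mistyped locale or encoding name is easy to miss, it can help to check both values before starting a long indexing run. The following is a minimal sketch; the class IndexerArgCheck and its methods are hypothetical and are not part of the indexer. It assumes that the indexer resolves encoding names the same way java.nio.charset.Charset does, which the source does not confirm, and it uses the Locale constructor so that it stays compatible with older Java releases (newer JDKs mark that constructor as deprecated).

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical pre-flight check for the -l and -e values; not part of the indexer.
final class IndexerArgCheck {

    /** Parses "language_COUNTRY" or "language_COUNTRY_VARIANT" into a Locale. */
    static Locale parseLocale(String value) {
        // Limit the split to 3 parts so that a variant containing "_" stays intact.
        String[] parts = value.split("_", 3);
        if (parts.length < 2 || parts[0].length() == 0 || parts[1].length() == 0) {
            throw new IllegalArgumentException(
                "Expected language_COUNTRY or language_COUNTRY_VARIANT: " + value);
        }
        String variant = parts.length == 3 ? parts[2] : "";
        return new Locale(parts[0], parts[1], variant);
    }

    /** Returns true if this JVM recognizes the given character set name or alias. */
    static boolean isSupportedCharset(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or syntactically illegal charset names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Values taken from the example command in this section.
        System.out.println("Locale: " + parseLocale("en_US"));
        System.out.println("8859_1 supported: " + isSupportedCharset("8859_1"));
    }
}
```

If isSupportedCharset returns false for a name listed on the Java supported encodings page, the JVM in use may lack the extended character sets, so the result should be treated as a hint rather than a definitive answer.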

12.4 Running the JapaneseIndexer

Follow these steps to run the Japanese indexer:

  1. Include the OHJ Indexer JAR file helpindexer-version.jar in your CLASSPATH.

  2. Run the indexer from the command prompt. The indexer supports the following command-line arguments:

    [-e=charset] dirname indexfilename

    where,

    Argument         Description
    -e=charset       Optional but recommended. The name of the Java-supported character set encoding of the HTML files being indexed. If the encoding is not supplied, the default character set encoding of the current system default locale is used. If supplied, the value must be one of the Java-supported character set encoding names; for Java SE, see http://download.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html.
    dirname          The base directory that contains the HTML files you want to index. The indexer indexes all of the files under this directory and any subdirectories.
    indexfilename    The name of the index file to be generated.

    For example:

    java -mx64m oracle.help.tools.index.JapaneseIndexer -e=MS932 D:\MyHTMLFiles myIndex.idx

    This command runs the JapaneseIndexer with the encoding set to MS932, then indexes the D:\MyHTMLFiles directory and creates the myIndex.idx file.
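When indexing is part of a build, the two command lines described in this chapter can be assembled programmatically. The following is a minimal sketch; the class IndexerCommand, its build method, and its parameter names are hypothetical. Only the two indexer class names, the -l and -e arguments, and the -mx64m option come from this chapter. The sketch passes the JAR through the standard -cp launcher option instead of the CLASSPATH environment variable, and it omits -l for the Japanese indexer because that indexer does not list a locale argument.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that assembles an indexer command line; it does not run it.
final class IndexerCommand {
    static final String INDEXER = "oracle.help.tools.index.Indexer";
    static final String JAPANESE_INDEXER = "oracle.help.tools.index.JapaneseIndexer";

    /**
     * Builds the argument list for either indexer.
     * Pass null for locale or charset to fall back to the system defaults.
     */
    static List<String> build(String jarPath, boolean japanese, String locale,
                              String charset, String dirname, String indexFilename) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-mx64m");                 // heap option used in this chapter's examples
        cmd.add("-cp");
        cmd.add(jarPath);                  // path to the OHJ Indexer JAR file
        cmd.add(japanese ? JAPANESE_INDEXER : INDEXER);
        if (!japanese && locale != null) {
            cmd.add("-l=" + locale);       // only the non-Japanese indexer takes -l
        }
        if (charset != null) {
            cmd.add("-e=" + charset);
        }
        cmd.add(dirname);
        cmd.add(indexFilename);
        return cmd;
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder to start the indexer as a separate process. Keeping the arguments as a list rather than a single string avoids quoting problems when the directory name contains spaces.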