iPlanet Web Server, Enterprise Edition Administrator's Guide



The Search Tab

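The Search tab allows you to search the contents and attributes of documents on the server. The Search tab contains the following pages:

  • The Search State Page

  • The New Collection Page

  • The Configure Collection Page

  • The Update Collection Page

  • The Maintain Collection Page

  • The Schedule Collection Maintenance Page

  • The Remove Scheduled Collection Maintenance Page

  • The Search Configuration Page

  • The Search Pattern Files Page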



The Search State Page

From the Search State page you can enable or disable the search capabilities for your server. If users do not use the search feature or if web traffic is heavy, turning search off will improve the server's performance. For more information, see Turning Search On or Off.

The following elements are displayed:

Search State. Specifies whether the search function is on or off.

OK. Saves your entries.

Reset. Erases your changes and resets the elements in the page to the values they contained before your changes.

Help. Displays online help.



The New Collection Page



The New Collection page allows you to create a collection that indexes the content of all or some of the files in a directory. You can define collections that contain only one kind of file, or you can create a collection of documents in various formats that are automatically converted to HTML during indexing. When you define a multiple-format collection (with the auto-convert option), the indexer first converts the documents into HTML and then indexes the contents of the HTML documents. The converted HTML documents are put into the html_doc directory in the server's search collections folder.

For more information, see Creating a New Collection.

The following elements are displayed:

Directory to Index. Specifies the currently defined document directory and provides a drop-down list of additional document directories. You can select any of the items in the drop-down list as a starting point for finding the directory you want to index.

View. If you want to index a different subdirectory, click View to see a list of resources. You can index any directory that is listed, or you can view the subdirectories in a listed directory and index one of those instead. Once you click the index link for a directory, you return to the New Collection page and the directory name appears in the Directory to Index field.

Documents Matching. Specifies the wildcard expression your server will match to restrict indexing. You can index all HTML files in the chosen directory by leaving the default *.html pattern in the "Documents matching" field or you can define your own wildcard expression to restrict indexing to documents that match that pattern.

For example, you could use the pattern *.html to index only the content in documents with the .html extension, or you could use either of these patterns (complete with parentheses) to index documents with either the .htm or the .html extension:

(*.htm|*.html)

or

*(.htm|.html)

You can define multiple wildcards in an expression.
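The two patterns above can be checked against sample file names with a small sketch. It is not the server's own matcher: the `wildcard_to_regex` and `matches` helpers are hypothetical, and they handle only the `*` and `(a|b)` constructs shown here by translating them into a Python regular expression.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified wildcard pattern into a compiled regex.

    Assumption: only '*' (any run of characters) and '(a|b)' alternation
    are supported; every other character is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch in "(|)":
            parts.append(ch)  # pass grouping and alternation through unchanged
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def matches(pattern: str, filename: str) -> bool:
    """Return True if the whole file name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(filename) is not None

if __name__ == "__main__":
    for name in ["index.html", "old.htm", "notes.txt"]:
        print(name, matches("(*.htm|*.html)", name), matches("*(.htm|.html)", name))
```

In this sketch both patterns accept the same file names, which is why either form can be used to cover the two HTML extensions.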



Note: You cannot index a file that includes a semicolon (;) in its name. You must rename such files before you can index them.



Include Subdirectories. Specifies whether the server indexes the subdirectories within the specified directory to index.

Collection Name. Specifies the name for your collection. The collection name is used for collection maintenance. It is also used as the collection's physical file name, so it must follow the standard directory-naming conventions for your operating system.

You can use any characters up to a maximum of 128 characters. Spaces are converted to underscores.



Note: Do not use accented characters in the collection name. If you need accented characters, exclude the accents from the collection name, but use accented characters in the label. The label is what is displayed to the user from the search interface.
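The naming rules above (at most 128 characters, spaces converted to underscores, no accented characters) can be illustrated with a short sketch. The `normalize_collection_name` function is hypothetical and treats "accented" as "non-ASCII", which is an assumption; it does not check operating-system naming conventions.

```python
MAX_NAME_LENGTH = 128  # maximum length stated for collection names

def normalize_collection_name(name: str) -> str:
    """Apply the documented collection-name rules to a proposed name.

    Assumption: any non-ASCII character is treated as an accented
    character and rejected; accents belong in the collection label.
    """
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"collection name exceeds {MAX_NAME_LENGTH} characters")
    if not name.isascii():
        raise ValueError("use unaccented characters in the name and accents in the label")
    return name.replace(" ", "_")  # spaces are converted to underscores

if __name__ == "__main__":
    print(normalize_collection_name("French help files"))
```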



Collection Label. Specifies a user-defined name for your collection. This is what users see when they use the text search interface. Collection labels should be as descriptive and relevant as possible. You can use any characters except single or double quotation marks, up to a maximum of 128 characters.

Description. Specifies a description for your collection. The description can have a maximum of 1024 characters. This description appears in the collection contents page.

Collection Contains. Specifies the type of files the collection is to contain:

  • HTML

  • ASCII

  • News

  • E-mail

  • PDF

The kind of file format you choose indicates which default attributes are used in the collection and which, if any, automatic HTML conversion of the content is done as part of indexing.

If you choose HTML as the file type and also try to index non-HTML files, the server creates the collection with the HTML set of default attributes and does not attempt to convert any non-HTML file it indexes. If you index HTML files into an ASCII collection, even the HTML markup tags are indexed as part of the file's contents and when you display the files, the contents appear as raw text. Regardless of the file type chosen, the content of the file is always indexed.

Extract Metatags. Extracts META-tagged attributes from HTML files during indexing. If you extract these attributes, you can search on their values. You can index on a maximum of 30 different user-defined META tags in a document. You can only use this option for HTML collections. Select No to tell the server not to extract META-tagged attributes from HTML files.
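To make the idea of META-tagged attributes concrete, the following sketch pulls name/content pairs out of an HTML document with Python's standard `html.parser` module. It illustrates the concept only and is not how the server's indexer is implemented; the `MetaExtractor` class and `extract_meta` function are hypothetical, and the limit of 30 is taken from the description above.

```python
from html.parser import HTMLParser

MAX_META_TAGS = 30  # documented maximum of user-defined META tags per document

class MetaExtractor(HTMLParser):
    """Collect <meta name=... content=...> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)  # the parser lowercases attribute names
        name = attr_map.get("name")
        content = attr_map.get("content")
        if name and content is not None and len(self.attributes) < MAX_META_TAGS:
            # Keep the first value seen for each attribute name.
            self.attributes.setdefault(name.lower(), content)

def extract_meta(html: str) -> dict:
    parser = MetaExtractor()
    parser.feed(html)
    return parser.attributes

if __name__ == "__main__":
    page = '<html><head><META NAME="Author" CONTENT="Jane Doe"></head></html>'
    print(extract_meta(page))
```

Once extracted, each pair could be stored alongside the document's index entry so that a search can match on the attribute value.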

Documents are in. Specifies the collection's language. The default is English, labeled "English (ISO-8859-1)."

OK. Saves your entries.

Reset. Erases your changes and resets the elements in the page to the values they contained before your changes.

Help. Displays online help.



The Configure Collection Page



Once you have created a collection, you can use the Configure Collection page to configure the collection by:

  • Modifying its description

  • Changing its label

  • Defining a different URL for its documents

  • Defining how to highlight documents

  • Defining which pattern files to use

  • Defining how to format dates

For more information, see Configuring a Collection.

The following elements are displayed:

Choose Collection. Use the drop-down list to specify the collection that you are configuring.

Document Root. Specifies the primary document directory of the collection.

File Format. Specifies the format of the files in the collection.

Language. Specifies the language of the files in the collection.

Description. Specifies a description for your collection. The description can be up to 1024 characters.

Label. Specifies a user-defined name for your collection. This is what users see when they use the text search interface. Collection labels should be as descriptive and relevant as possible. You can use any characters except single or double quotation marks, up to a maximum of 128 characters.

URL for Documents. Specifies the new URL mapping for the collection's documents if it has changed.

For example, if you originally indexed the directory of files that corresponded to those defined by the URL mapping /publisher/help, and you have changed that mapping to the simpler /helpFiles, you would replace the URL of /publisher/help with the /helpFiles in this field.
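The effect of changing the URL mapping can be sketched as a prefix substitution on each document's URL. The `remap_url` function and the `intro.html` file name are hypothetical; the two prefixes come from the example above.

```python
def remap_url(url: str, old_prefix: str, new_prefix: str) -> str:
    """Replace old_prefix with new_prefix when url falls under old_prefix."""
    old = old_prefix.rstrip("/")
    new = new_prefix.rstrip("/")
    # Match only on whole path segments, so /publisher/helpdesk is untouched.
    if url == old or url.startswith(old + "/"):
        return new + url[len(old):]
    return url

if __name__ == "__main__":
    print(remap_url("/publisher/help/intro.html", "/publisher/help", "/helpFiles"))
```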

Highlight begin. Specifies the opening HTML tags the server places before a search query word or phrase when highlighting it in a document. The default is bold, using the <b> tag, but you can add to this or change it. For example, you could add <blink><FONT COLOR="#FF0000"> here and the corresponding </FONT></blink> in the Highlight end field to highlight with blinking bold red text.

Highlight end. Specifies the closing HTML tags the server places after a highlighted search query word or phrase in a document. The default is </b>, which closes the default <b> tag in the Highlight begin field. If you add tags in the Highlight begin field, close them here in reverse order. For example, </FONT></blink> closes <blink><FONT COLOR="#FF0000">.
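The way the begin and end tags wrap a query term can be sketched as follows. This is an illustration only: the `highlight` function is hypothetical, it matches the query case-insensitively (an assumption), and it makes no attempt to skip text inside existing HTML tags.

```python
import re

def highlight(text: str, query: str, begin: str = "<b>", end: str = "</b>") -> str:
    """Wrap every occurrence of query in text with the begin and end tags."""
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    # Reuse the matched text so the document's original capitalization is kept.
    return pattern.sub(lambda m: f"{begin}{m.group(0)}{end}", text)

if __name__ == "__main__":
    print(highlight("Search the server", "search"))
    print(highlight("Search the server", "server",
                    begin='<b><FONT COLOR="#FF0000">', end="</FONT></b>"))
```

Note that the end tags close the begin tags in reverse order, which keeps the resulting HTML properly nested.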

You can define different default pattern files for displaying the search results. These files control how the header, footer, and list entry line of the search results are formatted, respectively. Initially, the pattern files are in the server_root\plugins\search\ui\text directory.

Input Date Format. Specifies how you want input dates to be interpreted when using this collection:

  • MM/DD/YY

  • DD/MM/YY

  • YY/MM/DD


Pattern Files for Displaying the Search Results

Header Pattern File. Specifies the header pattern file used when displaying the search results. Pattern files are HTML files that define the layout of the text search interface. You can associate a pattern file with a search function and a set of pattern variables to create a specific portion of the interface. In the pattern file, you define the look, feel, and function of the text search interface. Pattern files use pattern variables that you can use to customize background color, help text, banners, and so on. In some cases, the values are paths to the files that contain the actual text and graphics that these variables represent; in other cases, the values represent text and HTML.

Footer Pattern File. Specifies the footer pattern file used when displaying the search results.

Record Pattern File. Specifies the record pattern file used to format each list entry line when displaying the search results.


Pattern File for Displaying the Highlighted Document

Result Pattern File. Specifies the name of the pattern file you want to use when displaying a single highlighted document from the list of search results.

OK. Saves your entries.

Reset. Erases your changes and resets the elements in the page to the values they contained before your changes.

Help. Displays online help.



The Update Collection Page



Once you have created a collection, you can use the Update Collection page to add or remove files from the collection. For more information, see Updating a Collection.

The following elements are displayed:

Choose Collection. Use the drop-down list to specify the collection that you want to update.

Selected Collection. Displays the following information for the collection you select:

  • Label

  • Document Root

  • File Format

  • Language

Collection contains (number of) documents. Up to 100 documents that have index entries in the currently selected collection are listed.

Prev and Next. Use the Prev and Next buttons to view the previous or next set of 100 files in collections that contain more than 100 files.

Documents Matching. Specifies the file names that you want to add to or remove from the selected collection. You can use either a single file name or wildcards to specify the type of files you want added or removed. If you enter a wildcard such as *.html, only files with this extension are affected. You can indicate files within a subdirectory by typing in the path as it appears in the list of files. For example, you could remove all the HTML files in the /frenchDocs directory from the collection by typing in (no slash before the directory name): frenchDocs/*.html



Note: Be careful how you construct wildcard expressions. For example, if you type in index.html, you add or remove only the index file at the top level of the current collection. If instead you type in the expression */index.html, you add or remove all index.html files in the collection.
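The difference between a bare file name and a wildcard path can be tried out with Python's standard `fnmatch` module. This sketch only approximates the behavior described in the note: the file list and the `select` helper are hypothetical, and the server's own matcher may treat edge cases (such as whether */index.html also covers a top-level index.html) differently.

```python
from fnmatch import fnmatchcase

# Hypothetical collection entries, written as they might appear in the file list.
FILES = [
    "index.html",
    "frenchDocs/index.html",
    "frenchDocs/intro.html",
    "help/index.html",
]

def select(pattern: str, paths: list) -> list:
    """Return the paths matched by the pattern (case-sensitive)."""
    return [p for p in paths if fnmatchcase(p, pattern)]

if __name__ == "__main__":
    print(select("index.html", FILES))         # only the top-level index file
    print(select("*/index.html", FILES))       # index files inside directories
    print(select("frenchDocs/*.html", FILES))  # every HTML file in frenchDocs
```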



Include Subdirectories. Specifies whether the server should index and add all matching documents in the subdirectories of the document directory that was originally defined for the collection. Matching documents are those documents that match the pattern in the Documents Matching field. For example, if the Yes button is selected and the collection originally indexed the /publisher directory, this option looks for documents matching the new pattern in all the subdirectories of /publisher. This option does not apply when you remove documents.

Add Docs. Adds the indicated files and subdirectories to the collection.

Remove Docs. Removes the entries for the indicated files from the collection. The original documents are not affected.

Help. Displays online help.



The Maintain Collection Page



Once you have created a collection, you can use the Maintain Collection page to optimize, reindex, or remove the collection. For more information, see Maintaining a Collection.

The following elements are displayed:

Choose Collection. Use the drop-down list to specify the collection that you want to maintain.

Selected Collection. Displays the following information for the collection you select:

  • Label

  • Document Root

  • File Format

  • Language

Optimize. Improves a collection's performance if you frequently add, delete, or update documents or directories from it. Optimizing a collection is similar to defragmenting a hard drive. A collection is automatically optimized whenever you reindex or update it, so you should not need to do additional optimizing. You might want to optimize a collection before publishing it to another site or before putting it onto a read-only CD-ROM.

Reindex. Locates each file that already has an entry in the collection and reindexes its attributes and contents, extracting the META-tagged attributes if that option was selected when the files were originally indexed into the collection. This does not return to the original criteria for creating the collection, say *.html, and add any new documents that fit the original criteria. This option also removes collection entries when the source documents have been deleted and can no longer be found.
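The reindex behavior described above (refresh existing entries, drop entries whose source files are gone, add nothing new) can be modeled with two dictionaries. This is a conceptual sketch rather than the server's implementation; the `reindex` function and the sample file names are hypothetical.

```python
def reindex(collection: dict, files_on_disk: dict) -> dict:
    """Return the collection after a reindex.

    collection maps a document path to its previously indexed content;
    files_on_disk maps a path to the document's current content.
    """
    return {
        path: files_on_disk[path]   # re-read the current content
        for path in collection      # only paths that already have an entry
        if path in files_on_disk    # entries for deleted files are dropped
    }

if __name__ == "__main__":
    collection = {"a.html": "old text", "gone.html": "stale"}
    disk = {"a.html": "new text", "b.html": "never indexed"}
    print(reindex(collection, disk))
```

In the sample run, b.html matches *.html but is not picked up, because reindexing does not return to the original criteria for creating the collection.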

Remove. Removes the collection, not the original source documents.

Help. Displays online help.



The Schedule Collection Maintenance Page



You can use the Schedule Collection Maintenance page to reindex, optimize, or update a collection at a designated time. For more information, see Scheduling Regular Maintenance.

The following elements are displayed:

Choose Collection. Use the drop-down list to specify the collection on which scheduled maintenance will be performed.

Choose Action. Use the drop-down list to choose one of these actions:

  • Reindex. Locates each file that already has an entry in the collection and reindexes its attributes and contents, extracting the META-tagged attributes if that option was selected when the files were originally indexed into the collection. This does not return to the original criteria for creating the collection, say *.html, and add any new documents that fit the original criteria. This option also removes collection entries when the source documents have been deleted and can no longer be found.

  • Optimize. Improves a collection's performance if you frequently add, delete, or update documents or directories from it. Optimizing a collection is similar to defragmenting a hard drive. A collection is automatically optimized whenever you reindex or update it, so you should not need to do additional optimizing. You might want to optimize a collection before publishing it to another site or before putting it onto a read-only CD-ROM.

  • Update. Updates the files in a collection. If you are adding documents, the files' contents are indexed (and converted if necessary), when their entries are added to the collection. If you are removing documents, the entries for the files are removed from the collection along with their metadata. This function does not affect the original documents, only their entries in the collection.

Schedule Time. Specifies the time of day at which the scheduled maintenance takes place. Enter the time in 24-hour format (HH:MM), where HH is less than 24 and MM is less than 60. Scheduled maintenance does not take place unless this field contains a value.

Schedule Day(s) of the Week. Check which days of the week to perform maintenance. You can select all the days, but must select at least one day.
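The constraints on the schedule fields can be summarized in a short validation sketch. The `validate_schedule` function and the three-letter day names are hypothetical, and requiring exactly two digits for the hour is an assumption based on the HH:MM notation.

```python
import re

DAYS = ("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")  # hypothetical day labels

def validate_schedule(time_str: str, days: list) -> tuple:
    """Check a schedule time (HH:MM) and a day selection; return (hour, minute)."""
    match = re.fullmatch(r"(\d{2}):(\d{2})", time_str)
    if match is None:
        raise ValueError("time must be in HH:MM format")
    hour, minute = int(match.group(1)), int(match.group(2))
    if hour >= 24 or minute >= 60:
        raise ValueError("HH must be less than 24 and MM must be less than 60")
    if not days:
        raise ValueError("select at least one day of the week")
    unknown = set(days) - set(DAYS)
    if unknown:
        raise ValueError(f"unknown day(s): {sorted(unknown)}")
    return hour, minute

if __name__ == "__main__":
    print(validate_schedule("23:30", ["Mon", "Thu"]))
```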

OK. Saves your entries.

Reset. Erases your changes and resets the elements in the page to the values they contained before your changes.

Help. Displays online help.



The Remove Scheduled Collection Maintenance Page



You can use the Remove Scheduled Collection Maintenance page to remove scheduled maintenance for your server. Scheduled maintenance includes optimizing, reindexing, or updating a collection at a designated time. For more information, see Removing Scheduled Collection Maintenance.

The following elements are displayed:

Choose Collection. Use the drop-down list to specify the collection for which you will remove scheduled maintenance.

Choose Action. Use the drop-down list to choose one of these actions:

  • Reindex. Locates each file that already has an entry in the collection and reindexes its attributes and contents, extracting the META-tagged attributes if that option was selected when the files were originally indexed into the collection. This does not return to the original criteria for creating the collection, say *.html, and add any new documents that fit the original criteria. This option also removes collection entries when the source documents have been deleted and can no longer be found.

  • Optimize. Improves a collection's performance if you frequently add, delete, or update documents or directories from it. Optimizing a collection is similar to defragmenting a hard drive. A collection is automatically optimized whenever you reindex or update it, so you should not need to do additional optimizing. You might want to optimize a collection before publishing it to another site or before putting it onto a read-only CD-ROM.

  • Update. Updates the files in a collection. If you are adding documents, the files' contents are indexed (and converted if necessary), when their entries are added to the collection. If you are removing documents, the entries for the files are removed from the collection along with their metadata. This function does not affect the original documents, only their entries in the collection.

OK. Saves your entries.

Reset. Erases your changes and resets the elements in the page to the values they contained before your changes.

Help. Displays online help.



The Search Configuration Page



The Search Configuration page allows you to set the default parameters that govern what users see when they get search results. For more information, see Configuring the Search Parameters.

The following elements are displayed:

Default Result Set Size. Specifies the default maximum number of search result items displayed to users at a time. This value cannot be larger than the value for the largest possible result set size that you enter into the next field. The default value is 20.

Largest Possible Result Set Size. Specifies the maximum number of items in a result set. The default is 5000. For example, if this field contains the number 250, and there were 1000 documents that match the search criteria, users would only be able to see the first 250 or the 250 top-ranked documents (for searches that rank their results).
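The relationship between the two size settings can be worked through with the numbers from the example above. In this sketch the `visible_results` function is hypothetical, and grouping the capped results into consecutive sets of the default size is an assumption about how "displayed at a time" behaves.

```python
def visible_results(ranked_matches: list, default_size: int = 20,
                    largest_size: int = 5000) -> list:
    """Cap the matches at largest_size, then split them into sets of default_size."""
    if default_size > largest_size:
        raise ValueError("default size cannot exceed the largest possible size")
    capped = ranked_matches[:largest_size]  # users never see beyond this point
    return [capped[i:i + default_size] for i in range(0, len(capped), default_size)]

if __name__ == "__main__":
    matches = list(range(1000))  # 1000 matching documents, best-ranked first
    sets = visible_results(matches, default_size=20, largest_size=250)
    print(len(sets), sum(len(s) for s in sets))  # 13 sets holding 250 documents
```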

Date/Time string. Specifies the format of the date/time string, using POSIX format symbols. This format controls how dates and times appear to users in the search results page. For example, the format %b-%d-%y %H:%M produces Oct-01-97 14:24. You can use the symbols listed in Table 17-2.


Table 17-2    Common POSIX date and time formats

Format    Displayed result (example)

%a        Abbreviated week day (for example, Wed)
%A        Full week day (for example, Wednesday)
%b        Abbreviated month (for example, Oct)
%B        Full month (for example, October)
%c        Date and time formatted for current locale
%d        Day of the month as a decimal number (for example, 01-31)
%H        Hour as a decimal number, 24-hr military format (for example, 00-23)
%m        Month as a decimal number (for example, 01-12)
%M        Minute as a decimal number (for example, 00-59)
%x        Date
%X        Time
%y        Year without century (for example, 00-99)
%Y        Year with century (for example, 1999)
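Python's `strftime` accepts the same format symbols, so it offers a quick way to preview a date/time string before entering it. This is only a preview aid, not part of the server; the sample date is arbitrary, and the month and day names shown assume the default C locale.

```python
from datetime import datetime

# Arbitrary sample timestamp: 1 October 1997, 14:24.
STAMP = datetime(1997, 10, 1, 14, 24)

def preview(fmt: str) -> str:
    """Render the sample timestamp with the given format string."""
    return STAMP.strftime(fmt)

if __name__ == "__main__":
    print(preview("%b-%d-%y %H:%M"))   # Oct-01-97 14:24
    print(preview("%A, %B %d, %Y"))    # Wednesday, October 01, 1997
    print(preview("%m/%d/%y"))         # 10/01/97
```

Note that %d is zero-padded, so the first day of the month is rendered as 01.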

Default HTML title. Specifies a default title for the HTML document. This title is used if the document's author has not included a title as part of the document, tagged with the HTML Title tag. The typical default is (Untitled), which appears in the search results page for HTML files.

Check access permissions on collection root before doing a search. Checks access permission of a server user before performing a search on the collection chosen. If proper authorization is not entered, the requested search is not performed.

Check access permissions on search results. Checks the user's access permission before displaying the search results. If you click Yes, the server checks the user's access privileges for each file before displaying the documents found as a result of the search. Only the documents that the user has permission to view are displayed. Select No to tell the server not to check the user's access permission before displaying the search results.

OK. Saves your entries.

Reset. Erases your changes and resets the elements in the page to the values they contained before your changes.

Help. Displays online help.



The Search Pattern Files Page



The Search Pattern Files page allows you to define the layout of the text search interface by configuring pattern files.

The following elements are displayed:

Pattern File Directory. Specifies the absolute path for the directory where you store your pattern files. The default start (header), end (footer), and query page pattern files are located in this directory.

Default Start Pattern File. Specifies the relative path for the default pattern file to use for the top of the search results page when a collection has no defined header file or when more than one collection is being searched. The path must be relative to the pattern file directory.

Default End Pattern File. Specifies the relative path for the default pattern file to use for the footer of the search results page when a collection has no defined footer file or when more than one collection is being searched. The path must be relative to the pattern file directory.

Pattern File for Query Page. Specifies the relative path for the pattern file you want to use for the search query page that appears when you start up the search function. Specify the path relative to the pattern file directory.
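The split between an absolute pattern file directory and relative pattern file paths can be illustrated with a small path-joining sketch. The `resolve_pattern_file` function, the directory, and the file name are all hypothetical, and POSIX-style paths are used purely for illustration.

```python
from pathlib import PurePosixPath

def resolve_pattern_file(pattern_dir: str, relative_path: str) -> str:
    """Join a relative pattern file path onto the absolute pattern file directory."""
    base = PurePosixPath(pattern_dir)
    rel = PurePosixPath(relative_path)
    if not base.is_absolute():
        raise ValueError("the pattern file directory must be an absolute path")
    if rel.is_absolute():
        raise ValueError("pattern file paths must be relative to the directory")
    return str(base / rel)

if __name__ == "__main__":
    # Hypothetical directory and file name, for illustration only.
    print(resolve_pattern_file("/opt/server/patterns", "text/start.html"))
```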

OK. Saves your entries.

Reset. Erases your changes and resets the elements in the page to the values they contained before your changes.

Help. Displays online help.


Copyright © 2001 Sun Microsystems, Inc. Some preexisting portions Copyright © 2001 Netscape Communications Corp. All rights reserved.

Last Updated May 09, 2002