Administrator's Guide: Using Search


Chapter 16 Using Search

The iPlanet Web Server search function allows you to search the contents and attributes of documents on the server. As the server administrator, you can create a customized text search interface tailored to your user community.

Note. The Search function is not available on Linux platforms.

This chapter contains the following sections:


About Search
Server documents can be in a variety of formats, such as HTML, Microsoft Excel, Adobe PDF, and WordPerfect, provided that a conversion filter is available for the file format. Using the filters, the server converts the documents into HTML as it indexes them, so you can view the documents found by a search in your web browser. For more information, see About Collections.

Users can search through server documents for a specific word or attribute value, obtaining a set of search results that list all documents that match the query. They can then select a document from the list to browse it in its entirety. This provides easy access to server content.

As the server administrator, you can restrict which users and groups are authorized to use text search and which documents they can access, you can modify the configuration files that govern how text search operates, and you can customize the search query and results pages.

To enable searching on your server, begin by identifying the special configuration needs of your server and entering them in the search configuration windows. Then identify the directory or directories of documents that you want prepared for searching, and index the document information into a searchable database, called a collection. The next several sections discuss the details of configuring search and indexing collections.

Note.  Search cannot work if the Web Publishing collection does not yet exist or has been deleted. If search does not work, restart the server with the web publishing function turned on (the default), and try searching again.

If Search is turned on before Web Publishing, the default Web Publishing collection is not created until a forced index is performed. This happens only if Web Publishing is enabled after Search. The Web Publishing collection does not show up in search because it had not yet been created when search initialization ran. If you restart the server, the collection shows up correctly.


Configuring Text Search
You can configure several aspects of the search function for your specific server. Some settings are collection-specific; others apply across all collections during a search. Collection-specific settings affect how documents are indexed into a particular collection, so you must define them before creating the collection. The other settings can be defined at any time because they affect only the searches themselves.

Collection-specific configuration actions:

Configurations that affect all collections:

This section includes the following topics:

Controlling Search Access
The search function accesses the ACL database that is the default for your server. You can restrict access to the documents and directories on your server by defining explicit access control list (ACL) rules or you can rely on the default access control definitions. You can add users to your server's access control database through the Administration Server's Users & Groups function. For more information about setting access control, see Controlling Access to Your Server.

You can set your server to check access permissions before displaying search results (by choosing Search and clicking the Search link) as described in Configuring the Search Parameters. When this option is set, before returning the results of a search query, the server checks a user's access privileges and challenges the user to identify themselves before displaying any results.

Mapping URLs
When users search through a collection's files, the documents returned as search results are identified by a partial URL, called a URI (Uniform Resource Identifier). This is a security feature that prevents users from knowing the complete physical pathname for a file. A URI is set up by mapping a URL to an additional document directory.

For example, if the path for a file is server_root/Docs/marketing/bizplans/planB.doc, you could set up a mapping that hides the directory path from users by defining a URL prefix of plans and mapping it to server_root/Docs/marketing/bizplans. From then on, users need only type /plans/planB.doc to locate the file. For more information, see Managing Server Content.
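
The prefix mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the server's implementation: the URL_MAPPINGS table and the resolve function are hypothetical names, and the only mapping shown is the /plans example from the text.

```python
import posixpath

# Hypothetical mapping table: URL prefix -> physical directory.
# The single entry mirrors the /plans example in the text.
URL_MAPPINGS = {
    "/plans": "server_root/Docs/marketing/bizplans",
}

def resolve(url_path, mappings=URL_MAPPINGS):
    """Return the physical path for a URL path, or None if no prefix matches."""
    for prefix, directory in mappings.items():
        if url_path == prefix or url_path.startswith(prefix + "/"):
            remainder = url_path[len(prefix):].lstrip("/")
            return posixpath.join(directory, remainder) if remainder else directory
    return None

print(resolve("/plans/planB.doc"))
```

Users see and type only the /plans form; the physical directory stays on the server side of the mapping.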

For information on how to add a doc root for software virtual servers, see Adding a Doc Root for Software Virtual Servers.

Note. By default, URLs that are redirected are always escaped. To prevent this, add escape="no". For example:

The iPlanet Web Server provides five default mappings:

When you create a collection, you must specify which document directory to index. You can only choose a directory that has a URL mapping or a subdirectory within such a mapped directory. You can create your own mappings to define specific directories. To do this, follow these steps:

  1. From the Server Manager, choose Content Management.
  2. Click the Additional Document Directories link.
  3. Type in a nickname that maps the URL to the additional document directory you want to define.
  4. Type the absolute physical path of the directory you want the URL mapping to map to.
  5. If you want to apply a style to the directory, select the style in the Apply Style drop-down list.
  6. Click OK to create the additional document directory.

Note. Once you create a collection based on an additional document directory, do not change the URL mapping. If you do, the collection's entries map the URL to the wrong physical file location.

Deciding Which Words Not to Search
You can specify words the search engine should not index or search against. These words are sometimes referred to as stop words or drop words and typically include articles, conjunctions, and prepositions such as at, and, be, for, and the.

To specify stop words, you need to edit the file named style.stp. This file resides in each of the subdirectories html, pdf, mail, and news (one for each collection type) in the directory server_root\plugins\search\common\style. Each style.stp file controls stop words for that collection type; for example, the style.stp file in server_root\plugins\search\common\style\html controls stop words for HTML files in the collection.

Add the stop words to style.stp, one per line and left justified. You can use operators such as square brackets ([]) to indicate character classes, periods (.) to indicate any character, and plus notation (+) to indicate repeats. For example, the style.stp file might contain the following lines:

In this example, the first line of periods (in the file by default) indicates that words with 40 or more characters are not to be indexed. The words at, and, and be are also not indexed. [0-9a-zA-Z] indicates that all one-character words (a single letter or digit) are not to be indexed. [0-9][0-9][0-9][0-9]+ indicates that all integers with 4 or more digits are not to be indexed.

The words you specify are case sensitive, so if you want to stop all the case variations of a word, you need to enter them all. For instance, for the word the, you might enter the, THE, and The.
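
To get a feel for which words such a stop list would drop, the patterns described above can be approximated with Python regular expressions. This is a rough sketch under stated assumptions: it assumes each style.stp entry must match a whole word, and it uses the Python shorthand .{40,} in place of the default line of periods. The is_stop_word helper is hypothetical and is not part of the server.

```python
import re

# Approximation of the stop-word entries described in the text.
STOP_PATTERNS = [
    r".{40,}",                  # assumption: stands in for the default line of periods
    r"at",
    r"and",
    r"be",
    r"[0-9a-zA-Z]",             # any one-character word
    r"[0-9][0-9][0-9][0-9]+",   # integers with 4 or more digits
]
COMPILED = [re.compile(p) for p in STOP_PATTERNS]

def is_stop_word(word):
    """True if the whole word matches any stop pattern (case sensitive)."""
    return any(p.fullmatch(word) for p in COMPILED)

for word in ["and", "And", "a", "123", "1999", "server"]:
    print(word, is_stop_word(word))
```

Because matching is case sensitive, and is dropped while And is kept, which is why every case variation must be listed separately.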

Make sure you have the stop list you want before you create a collection. If you need to change the stop list after a collection has been created, you need to delete the collection, change the stop list for the collection type, recreate the collection, and reindex all the documents in the collection.

Turning Search On or Off
You can turn search capabilities on and off for your server. Turning search off for a server where users do not use this function can improve server performance. You may also want to turn off the search function at certain times when you know the server will have heavy traffic, reserving this function for times when traffic is lighter.

If you turn search off, the search plug-in is not loaded when the HTTP server starts up. The default is for search to be turned off.

Note.  If search is turned off, the Find Broken Links function in Web Publisher is not available because it executes a search as part of its operation.

To turn off the search function, use the Search State page in the Server Manager.

Configuring the Search Parameters
As server administrator, you can set the default parameters that govern what users see when they get search results.

To configure search parameters:

  1. From the Server Manager, choose Search.
  2. Click the Search Configuration link.
  3. Type the default maximum number of search result items displayed to users at a time.
  4. Type the maximum number of items in a result set.
  5. Type the format of the date/time string in POSIX format.
  6. Type a default title for the document that is to be used if the document's author has not included a title as part of the document, tagged with the HTML Title tag.
  7. If you want the user's access permission to be checked on a collection before displaying the search results, click Yes under the label Check access permissions on collection root before doing a search?
  8. Click OK to set your new search configuration.

Table 16.1 Common POSIX date and time formats

Format   Displayed result (example)
%a       Abbreviated week day (for example, Wed)
%A       Full week day (for example, Wednesday)
%b       Abbreviated month (for example, Oct)
%B       Full month (for example, October)
%c       Date and time formatted for current locale
%d       Day of the month as a decimal number (for example, 01-31)
%H       Hour as a decimal number, 24-hr military format (for example, 00-23)
%m       Month as a decimal number (for example, 01-12)
%M       Minute as a decimal number (for example, 00-59)
%x       Date
%X       Time
%y       Year without century (for example, 00-99)
%Y       Year with century (for example, 1999)
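
One way to preview a format string before typing it into the Search Configuration page is to try it in any language that follows the same strftime conventions. The sketch below uses Python's datetime.strftime, which accepts the codes in Table 16.1. The sample date and the preview helper are illustrative assumptions, and the server's own output for locale-dependent codes such as %c, %x, and %X may differ.

```python
from datetime import datetime

# Arbitrary sample moment: Wednesday, 6 October 1999, 14:05.
SAMPLE = datetime(1999, 10, 6, 14, 5)

def preview(fmt, when=SAMPLE):
    """Render a strftime-style format string against a sample date."""
    return when.strftime(fmt)

print(preview("%d/%m/%Y %H:%M"))   # numeric codes only, locale independent
print(preview("%a %b %d, %Y"))     # names depend on the current locale
```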

Configuring Your Pattern Files
Pattern files are HTML files that define the layout of the text search interface. You can associate a pattern file with a search function and a set of pattern variables to create a specific portion of the interface. In the pattern file, you define the look, feel, and function of the text search interface. Pattern files use pattern variables that you can use to customize background color, help text, banners, and so on. In some cases, the values are pathnames to the files that contain the actual text and graphics that these variables represent; in other cases, the values represent text and HTML.

You can use the default pattern files, or you can create your own customized set of files and point to them from here. For more information about how to change the user interface, see Customizing the Search Interface.

To define where the search function is to look for default pattern files associated with a particular search request, you have to specify the paths for the files.

To configure pattern files, perform the following steps:

  1. From the Server Manager, choose Search.
  2. Click the Search Pattern Files link.
  3. Type the absolute path for the directory where you store your pattern files.
  4. Type in the relative pathname for the default pattern file you want to use for the top of the search results page when a collection has no defined header file or when more than one collection is being searched.
  5. Type in the relative pathname for the default pattern file you want to use for the footer of the search results page when a collection has no defined footer file or when more than one collection is being searched.
  6. Type in the relative pathname for the pattern file you want to use for the search query page that appears when you start up the search function.
  7. Click OK to configure your search pattern files.
Configuring Manually
The search function examines several configuration files to determine how search is configured on your server. These files define system settings, user-defined variables, and information about your search collections. You normally change this information through the iPlanet Web Server's Search pages, but you can also modify the files manually with your own text editor. Some of the implications of changing the configuration files in order to customize the user interface are discussed in Customizing the Search Interface.

Note. It is not recommended that you make any manual modifications to your configuration files, but if you do, you must restart the server for your modifications to take effect.

This section includes the following topics:

The Configuration Files
The configuration files that govern searching are described in the following list:

Adjusting the Maximum Number of Attributes
Collections have different sets of default attributes that depend on which file format they are. For example, HTML files have Title and SourceType. You can also define META-tagged HTML attributes in your HTML files. Some file formats, such as PDF, have a great many default attributes. For more information about the attributes for each format, see About Collection Attributes and Table 16.2.

You can use the Add Custom Property window to add additional properties for the Web Publishing collection. These are the default maximum settings:

You can change the maximum settings for these in the webpub.conf file, although larger sets of attributes impact the performance of your server. You cannot set the maximums beyond 100 for text and 50 for dates and numbers.

To do this, you need to manually edit the [NS-loader] section of the webpub.conf file to define maximum numbers of attributes. For example, to change all three values, you could use these lines:

Note. You cannot use the additional attributes in existing collections, only in subsequently created collections. To use them in a search collection, you must use the Maintain Collection window (choose Search and click the Maintain Collection link) to remove the collection and then use the New Collection window (click the New Collection link) to create a new collection. If you want to use the new attributes in the web publishing collection, you must use your file system to remove both the web_htm and link_mgr collection files from the search collections directory and then restart your server.

Restricting Memory for Indexing
You can set a limit on the amount of RAM available for indexing operations. To do this, you need to manually edit the [NS-loader] section of the webpub.conf file to add a line defining a maximum memory amount. For example:

The default is for the server to use all of the available memory that the system can offer. Most typically, you need to limit the RAM used for indexing in these two cases:

Restricting Your Index File Size
You can limit how much disk space an index file can consume. To do this, you need to manually edit the [NS-loader] section of the webpub.conf file to define a maximum index file size. For example,

Typically, an indexing operation requires approximately 1.5MB per file, and since there are two files, one of which is temporary, you may need as much as 3MB of disk space for indexing. Setting the file size to 1.5MB per file puts a cap on how large each file can become.

Removing Access to the Web Publishing Collection
Web Publishing appears in the Search In field of the user's standard search query page. To remove the Web Publishing collection from this field, you need to edit the dblist.ini file as follows:

  1. In the "[web_htm]" section, change "NS-display-select=YES" to "NS-display-select=NO".
  2. Restart the server.
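
If you prefer to script the edit in step 1, the change amounts to rewriting one key inside one section. The following Python sketch assumes that dblist.ini uses plain [section] headers and key=value lines, as the step above suggests. The set_display_select function and the sample text (including the [other] section) are hypothetical; in practice you would read and write the real file and then restart the server.

```python
def set_display_select(ini_text, section, value):
    """Return ini_text with NS-display-select set to value inside [section] only."""
    out = []
    in_section = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track which section the following lines belong to.
            in_section = stripped[1:-1] == section
        elif in_section and stripped.split("=", 1)[0].strip() == "NS-display-select":
            line = f"NS-display-select={value}"
        out.append(line)
    return "\n".join(out) + "\n"

# Hypothetical file contents for illustration only.
sample = "[web_htm]\nNS-display-select=YES\n[other]\nNS-display-select=YES\n"
print(set_display_select(sample, "web_htm", "NO"))
```

The function works on text rather than through a configuration parser so that all other lines in the file are passed through unchanged.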

Indexing Your Documents
Before users can execute searches, they need a database of searchable data to search against. You provide this by creating a database, called a collection, that indexes and stores information about the documents, such as their content and file properties.

Once the documents are indexed, their contents and file properties, such as their titles, creation dates, and authors, are available for searching.

You can add documents to or delete documents from a collection, and you can optimize, update, and manage your collections as needed.

Note. Search cannot work if the Web Publishing collection does not yet exist or has been deleted. If search does not work, restart the server with the web publishing function turned on (the default), and try searching again.

This section includes the following topics:

About Collections
When your server administrator indexes all or some of a server's documents, information about the documents is stored in a collection. Collections contain such information as the format of the documents, the language they are in, their searchable attributes, the number of documents in the collection, the collection's status, and a brief description of the collection. For more details, see Displaying Collection Contents.

When you create a collection, you indicate the type of files that it contains: HTML, ASCII, news, email, or PDF. You can index all the files in a directory or only those with a specific extension—for example, all the HTML or PDF documents.

A collection has records with information about each document that has been indexed. If the document is deleted from the collection, only the collection's entry for that document is removed. The original document is not deleted.

When you have multiple server instances, the collection you create is only associated with the server instance on which the collection was created. Therefore, users can only search collections for that server instance.

About Collection Attributes
Certain file formats have a default set of attributes that are indexed for files of that type, as shown in Table 16.2.

Table 16.2 The default attributes indexed for each file format

File format  Attribute             Type     Description
ASCII        (none)                -        -
HTML         Title                 text     The user-defined title of the file.
             SourceType            text     The original format of the document. Used by the web publishing and other multi-format collections.
NEWS         From                  text     The source userID of the news item.
             Subject               text     The text from the subject field of the news item.
             Keywords              text     Any keywords defined for the news item.
             Date                  date     The date the news item was created.
EMAIL        From                  text     The source userID of the email.
             To                    text     The destination userID of the email.
             Subject               text     The text from the email's subject field.
             Date                  date     The date the email was created.
PDF          InstanceID            text     An internal ID number.
             PermanentID           text     An internal ID number.
             NumPages              integer  The number of pages in the document.
             DirID                 text     The directory where the PDF file exists.
             FTS_ModificationDate  date     The document's last modification date.
             FTS_CreationDate      date     The document's creation date.
             WXEVersion            integer  The version of Adobe Word Finder used to extract the text from the PDF document.
             FileName              text     The Adobe filename specification.
             FTS_Title             text     The document's title.
             FTS_Subject           text     The document's subject.
             FTS_Author            text     The document's author.
             FTS_Creator           text     The document's creator.
             FTS_Producer          text     The document's producer.
             FTS_Keywords          text     The document's keywords.
             PageMap               text     The page map, describing the word instances for the page.

By default, HTML collections have Title and SourceType attributes, but they can be indexed to permit searching and sorting by up to 30 file attributes tagged with the HTML <META> tag. You can change the maximum settings for file attributes in webpub.conf, as discussed in Adjusting the Maximum Number of Attributes.

For example, a document could have these lines of HTML code:

If this document was indexed with its META tags extracted, you could search it for specific values in the writer or product fields. For example, you could enter this query: Writer <contains> Hunter or Song <contains> Blue.
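
As a rough illustration of what extracting META-tagged attributes involves, the Python sketch below collects name/content pairs from an HTML document and runs a simple "contains" check against them. It is not the server's indexer. The sample HTML, the MetaExtractor class, and the contains helper are all hypothetical; the Writer and Song values are chosen only to line up with the query shown above.

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect <META NAME=... CONTENT=...> pairs as text attributes."""

    def __init__(self):
        super().__init__()
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        fields = dict(attrs)
        name, content = fields.get("name"), fields.get("content")
        if name and content is not None:
            self.attributes[name] = content

def extract_meta(html_text):
    parser = MetaExtractor()
    parser.feed(html_text)
    return parser.attributes

def contains(attributes, field, value):
    """Rough stand-in for a '<contains>' query on one attribute."""
    return value in attributes.get(field, "")

# Hypothetical document for illustration only.
sample_html = """<html><head>
<META NAME="Writer" CONTENT="Hunter">
<META NAME="Song" CONTENT="Blue Moon">
<title>Example</title></head><body>text</body></html>"""

attrs = extract_meta(sample_html)
print(attrs)
print(contains(attrs, "Writer", "Hunter") or contains(attrs, "Song", "Blue"))
```

Every extracted value is a plain string, which matches the restriction that META-tagged dates and numbers are treated as text.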

Note. Any attribute values in META-tagged fields are text strings only, which means that dates and numbers are sorted as text, not as dates or numbers. Also, illegal HTML characters in a META-tagged attribute are replaced with a hyphen. You can use the Add Custom Property window (choose Web Publishing and click the Add Custom Property link) to redefine the text-formatted dates and numbers so that you can perform searches based on actual dates and numbers for data in the Web Publishing collection.

Creating a New Collection
You can create a collection that indexes the content of all or some of the files in a directory. You can define collections that contain only one kind of file or you can create a collection of documents in various formats that are automatically converted to HTML during indexing. When you define a multiple format collection (with the auto-convert option), the indexer first converts the documents into HTML and then indexes the contents of the HTML documents. The converted HTML documents are put into the html_doc directory in the server's search collections folder.

You can have at most 12 collections on your server; a server that uses web publishing is limited to 10 user-defined collections. If you want to add a 13th collection, you must first remove one of your existing collections (choose Search and click the Maintain Collection link). Do not remove the web publishing collection if one exists for your server.

You can only have entries for a maximum of 16 million documents in your collections. A document that is indexed in multiple collections counts as multiple documents. It is best to create new collections of over 10,000 documents at low-traffic times, or the indexing operation may affect your system's performance.

Note. You need to have at least 3MB of available disk space on your system to create a collection. For information on how you can restrict the size of the index files, see Restricting Your Index File Size.

To create a new collection, perform the following steps:

  1. From the Server Manager, choose Search.
  2. Click the New Collection link.

    The web server displays the Create a Collection window. The Directory to Index field displays the currently defined document directory and provides a drop-down list of all the additional document directories defined for the server. For more information about additional document directories, see Mapping URLs.

  3. You can select any of the items in the drop-down list as a starting point for finding the directory you want to index.

    If you want to index a different subdirectory, click the View button to see a list of resources. You can index any directory that is listed or you can view the subdirectories in a listed directory and index one of those instead. Once you click the index link for a directory, you return to the Create Collection window and the directory name appears in the Directory to Index field.

  4. You can index all HTML files in the chosen directory by leaving the default *.html pattern in the Documents matching field or you can define your own wildcard expression to restrict indexing to documents that match that pattern.

    For example, you could enter *.html to only index the content in documents with the .html extension, or you could use either of these patterns (complete with parentheses) to index all HTML documents:

    (*.htm|*.html) or *(.htm|.html)

    You can define multiple wildcards in an expression. For details of the syntax for wildcard patterns, see Using Wildcards.

    Note.  You cannot index a file that includes a semi-colon (;) in its name. You must rename such files before you can index them.

  5. To index the subdirectories within the specified directory, click Include Subdirectories.
  6. Type a name for your collection in the Collection Name field.

    The collection name is used for collection maintenance. This is the physical file name for the collection, so follow the standard directory-naming conventions for your operating system. You can use any characters up to a maximum of 128 characters. Spaces are converted to underscores.

    Note.  Do not use accented characters in the collection name. If you need accented characters, exclude the accents from the collection name, but use accented characters in the label. The label is what is displayed to the user from the search interface.

  7. Type a user-defined name for your collection in the optional Collection Label field.

    This name is what users see when they use the text search interface. Make your collection's label as descriptive and relevant as possible. You can use any characters except single or double quotation marks, up to a maximum of 128 characters.

  8. Type a description for your collection (up to a maximum of 1024 characters) in the optional Description field.

    This description is displayed in the collection contents page.

  9. Select the type of files the collection is to contain: ASCII, HTML, news, email, or PDF.

    The kind of file format you choose indicates which default attributes are used in the collection and which, if any, automatic HTML conversion of the content is done as part of indexing. For information about the attributes for each format, see Table 16.2 and About Collection Attributes.

    If you choose HTML as the file type and also try to index non-HTML files, the server creates the collection with the HTML set of default attributes and does not attempt to convert any non-HTML file it indexes. If you index HTML files into an ASCII collection, even the HTML markup tags are indexed as part of the file's contents and when you display the files, the contents are displayed as raw text. Regardless of the file type chosen, the content of the file is always indexed.

    Complex PDF files, such as those that are password protected or that contain graphical navigation elements, cannot be correctly converted when they are indexed as part of a multi-format collection. The file data converts correctly when they are part of a PDF-only collection. Graphic elements are not converted.

  10. Select whether or not to extract META-tagged attributes from HTML files during indexing.

    If you extract these attributes, you can search on their values. You can index on a maximum of thirty (30) different user-defined META tags in a document. You can only use this option for HTML collections.

  11. Select the collection's language from the drop-down list.

    The default is English, labeled "English (ISO-8859-1)." For more information on character sets, see Managing Server Content.

  12. Click OK to create a new collection.

Note. Once you begin indexing a collection, you cannot stop the process until either the indexing is complete or you reboot the system. Shutting down your server does not kill the process.
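
To check what a Documents matching expression would select before you start an indexing run that cannot be interrupted, you can emulate the two constructs used in the steps above. The Python sketch below translates only the asterisk wildcard and the parenthesized (a|b) alternation into a regular expression. It assumes those constructs behave as the examples suggest and ignores the rest of the syntax described in Using Wildcards. The function names are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Translate '*' and '(a|b)' alternation into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any run of characters
        elif ch in "(|)":
            parts.append(ch)             # alternation maps directly onto regex syntax
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts))

def matches(pattern, filename):
    """True if the whole filename matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(filename) is not None

for name in ["index.htm", "index.html", "guide.pdf"]:
    print(name, matches("(*.htm|*.html)", name), matches("*(.htm|.html)", name))
```

Under these assumptions both parenthesized forms select the same files, while plain *.html skips files ending in .htm.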

Configuring a Collection
After you have initially created a collection, you can modify some of the initial settings for the collection. This data resides in the collection information file, dblist.ini, and when you reconfigure a collection, the dblist.ini file is updated to reflect your changes. For more information about the configuration files, see Configuring Manually. You can revise the description, change its label, define a different URL for its documents, and define how to indicate highlighting in displayed documents, which pattern files to use, and how to format dates.

Note.  This window allows you to modify some of the settings for the web publishing default collection, web_htm, because you are not changing actual collection data. Avoid making unnecessary changes to this collection's settings.

To configure a collection, perform the following steps:

  1. From the Server Manager, choose Search.
  2. Click the Configure Collection link.

    The web server displays the Configure Collection window.

  3. In the optional Description field, you can type a description for your collection up to a maximum of 1024 characters.
  4. In the optional Collection Label field, you can type a user-defined name for your collection.

    This is what users see when they use the text search interface. Make your collection's label as descriptive and relevant as possible. You can use any characters except single or double quotation marks, up to a maximum of 128 characters.

  5. In the URL for Documents field, you can type in the new URL mapping for the collection's documents if that has changed.

    That is, if you originally indexed the directory of files that corresponded to those defined by the URL mapping /publisher/help, and you have changed that mapping to the simpler /helpFiles, you would replace the URL of /publisher/help with /helpFiles in this field. For more information about additional document directories, see Mapping URLs.

  6. In the Highlight Begin and Highlight End fields, you can type in the HTML tagging you want the server to use when highlighting a search query word or phrase in a displayed document.

    The default is to use bold, with the <b> and </b> tags, but you can add to this or change it. For example, you could add <blink><FONT COLOR = #FF0000> and the corresponding </FONT></blink> to highlight with blinking bold red text.

  7. You can define different default pattern files for displaying the search results: how the search result's header, footer, and list entry line are formatted, respectively.

    Initially, the pattern files are in server_root\plugins\search\ui\text.

  8. In the Result Pattern File field, you can enter the name of the pattern file you want to use when displaying a single highlighted document from the list of search results.
  9. In the Date Format field, you can specify how you want input dates to be interpreted when using this collection: MM/DD/YY, DD/MM/YY, or YY/MM/DD.
  10. Click OK to change the collection configuration.

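
The effect of the Highlight Begin and Highlight End settings can be pictured as wrapping each occurrence of the query term in the configured tags. The Python sketch below is an approximation only. The highlight function is hypothetical, and case-insensitive matching is an assumption rather than documented server behavior.

```python
import re

def highlight(text, term, begin="<b>", end="</b>"):
    """Wrap every occurrence of term in the begin/end tags (case-insensitive)."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    # A function replacement keeps the original casing and avoids escape issues.
    return pattern.sub(lambda m: f"{begin}{m.group(0)}{end}", text)

print(highlight("Plan B is the backup plan", "plan"))
print(highlight("red alert", "alert", "<FONT COLOR=#FF0000>", "</FONT>"))
```

Whatever tags you choose, the closing tags should mirror the opening tags in reverse order so that the displayed document remains well-formed HTML.
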
Updating a Collection
After you have initially created a collection, you may want to add or remove files. If you are adding documents, the files' contents are indexed (and converted if necessary), when their entries are added to the collection. If you are removing documents, the entries for the files are removed from the collection along with their metadata. This function does not affect the original documents, only their entries in the collection.

Note. If you selected the Extract Metatags option when you created this collection, then the META-tagged HTML attributes are indexed whenever you add new documents to this collection.

To update a collection, perform the following steps:

  1. From the Server Manager, choose Search.
  2. Click the Update Collection link.
     The web server displays the Update Collection window.

  3. Select the collection you want to update from the drop-down list.
     The list of documents in the center of the form shows you what documents have index entries in the currently selected collection. The list holds 100 records, and the Prev and Next buttons get the previous (or next) set of 100 files for collections that have more than 100 files in them.

  4. In the Documents Matching field, you can type in a single filename or you can use wildcards to specify the type of files you want added to or removed from the collection.
     If you enter a wildcard such as *.html, only files with this extension are affected. You can indicate files within a subdirectory by typing in the pathname as it appears in the list of files. For example, you could delete all the HTML files in the /frenchDocs directory by typing in (no slash before the directory name): frenchDocs/*.html

     Note. Be careful how you construct wildcard expressions. For example, if you type in index.html, you can add or remove the index file from the current collection. If instead you type in the expression */index.html, you can add or remove all index.html files in the collection.

  5. Select whether to index and add all matching documents from the subdirectories of the document directory that was originally defined for the collection.
     That is, if the collection originally indexed the /publisher directory, this option looks for documents matching the new pattern within all the subdirectories within /publisher. This does not apply for removing documents.

  6. Click AddDocs to add the indicated files and subdirectories.
  7. Click RemoveDocs to remove the indicated files.
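
To get a feel for how much a wildcard expression selects before you click AddDocs or RemoveDocs, you can try the pattern against a list of paths. The following Python sketch uses the standard fnmatch module as a stand-in; the server's own matching rules may differ in detail, and the file list is invented for illustration.

```python
from fnmatch import fnmatchcase

def select(paths, pattern):
    """Return the entries whose path matches the wildcard pattern."""
    return [p for p in paths if fnmatchcase(p, pattern)]

# Hypothetical collection entries, written as they would appear in the list of files.
entries = [
    "index.html",
    "frenchDocs/index.html",
    "frenchDocs/guide.html",
    "frenchDocs/logo.gif",
    "germanDocs/index.html",
]

if __name__ == "__main__":
    print(select(entries, "frenchDocs/*.html"))  # HTML files in one subdirectory
    print(select(entries, "index.html"))         # a single named file
    print(select(entries, "*/index.html"))       # index files under any directory
```

The difference between the last two patterns is the point of the note above: a plain filename touches one entry, while a leading */ reaches into every directory.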
Maintaining a Collection
Periodically, you may want to maintain your collections. With normal usage, these tasks may not be necessary, but if you do a great deal of indexing and updating of collections, you may want to use some of these functions occasionally. You can perform the following collection management tasks:

Note. Do not use your local file manager to remove collections, especially not the web publishing collections. If by chance you do, when you try to execute a search before restarting your server again, the search will fail even if it doesn't use the web publishing collection. Once you restart your server, a new web publishing collection will be automatically created for you, so your search can execute.

To perform any of the collection management tasks, use The Maintain Collection Page in the Server Manager.

Scheduling Regular Maintenance
You can schedule collection maintenance at regular intervals. You can set up separate maintenance schedules for optimizing and reindexing. With normal usage, these tasks may not be necessary, but if you do a great deal of indexing and updating of collections, you may want to use some of these functions occasionally. For example, some very active web sites may require frequent reindexing if new documents are added on a daily basis.

A common combination of tasks is to set up a pair of regularly scheduled reindex and update operations to clean out deleted entries and to add entries for new documents matching your collection criteria.

You can optimize a collection to improve performance if you frequently add, delete, or update documents or directories in your collections. An analogy is defragmenting your hard drive. Optimizing is not done automatically, so you must manually optimize after you reindex or update a collection. One situation when you might want to optimize a collection is just before publishing it to another site or before putting it onto a read-only CD-ROM.

You can reindex a collection, which locates each file that has an entry in the collection and reindexes its attributes and contents, extracting the META-tagged attributes if that option was selected when the files were originally indexed into the collection. This does not add entries for new documents but cleans up the collection by removing entries for files that have been deleted.

You can update a collection by entering new indexing criteria for the collection, such as *.html, which adds any new documents that match the criteria.

To optimize, reindex, or update your collection, perform the following steps:

  1. From the Server Manager, choose Search.
  2. Click the Schedule Collection Maintenance link.
     The web server displays the Schedule Collection Maintenance window.

  3. Choose a collection from the drop-down list.
     This lists all the collections that you have created.

  4. Choose an action from the drop-down list: Reindex, Optimize, or Update.
     You can set up different schedules for different operations on the same collection.

  5. If you choose to update your collection, two extra fields are displayed for entering the document matching criteria and for including documents found in subdirectories that match your criteria.
  6. In the Schedule Time field, type in the time of day when you want the scheduled maintenance to take place.
     Use 24-hour format (HH:MM). HH must be less than 24 and MM must be less than 60. You must enter a time.

  7. In the section labeled Schedule Day(s) of the Week, check one or more of the day checkboxes.
     You can select all days. You must select at least one day.

  8. Click OK to schedule the maintenance.
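
The constraints on the Schedule Time and day fields can be summarized in a small Python sketch. This is only an illustration of the stated rules (a time in HH:MM form with HH less than 24 and MM less than 60, and at least one day selected); the function name, the day abbreviations, and the messages are hypothetical and do not come from the server.

```python
import re

# Hypothetical day labels; the form itself uses checkboxes.
DAYS = {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"}

def validate_schedule(time_text, days):
    """Return a list of problems; an empty list means the input is acceptable."""
    problems = []
    match = re.fullmatch(r"(\d{1,2}):(\d{2})", time_text.strip())
    if not match:
        problems.append("time must be entered as HH:MM")
    else:
        hours, minutes = int(match.group(1)), int(match.group(2))
        if hours >= 24:
            problems.append("HH must be less than 24")
        if minutes >= 60:
            problems.append("MM must be less than 60")
    if not days:
        problems.append("select at least one day")
    elif not set(days) <= DAYS:
        problems.append("unknown day name")
    return problems

if __name__ == "__main__":
    print(validate_schedule("23:30", ["Sat", "Sun"]))
    print(validate_schedule("24:00", []))
```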
For Unix/Linux users, to make your newly scheduled maintenance take effect, you must restart the ns-cron process from the Administration Server.

To restart the ns-cron process, perform the following steps:

  1. From the Administration Server, choose Global Settings.
  2. Click the Cron Control link.
  3. If ns-cron is already on, click Restart to restart it. If ns-cron is not on, click Start to start it up.
     In either case, your regularly scheduled maintenance will now be able to take place.

Unscheduling Collection Maintenance
If you have scheduled regular reindexing or optimizing of a collection, you can remove the scheduled maintenance when you no longer want the collection to be maintained at regular intervals.

To unschedule collection maintenance, perform the following steps:

  1. From the Server Manager, choose Search.
  2. Click the Remove Scheduled Collection Maintenance link.
     The web server displays the Remove Scheduled Collection Maintenance window.

  3. Choose a collection from the drop-down list for Choose Collection.
     This lists all your collections for which you have set up regular maintenance.

  4. Choose an action from the drop-down list: Reindex or Optimize.
     In the lower part of the frame, you can see the time and days of the week when the scheduled maintenance is currently scheduled to take place.
  5. Click OK to remove the scheduled maintenance.
For Unix/Linux users, to make the removal of the scheduled maintenance take effect, you must restart the ns-cron process.

To restart the ns-cron process, perform the following steps:

  1. From the Administration Server, choose Global Settings.
  2. Click the Cron Control link.
  3. If ns-cron is already on, click Restart to restart it. If ns-cron is not on, click Start to start it up.
     In either case, the maintenance you unscheduled will no longer take place.


Performing a Search: The Basics
Users are primarily concerned with asking questions of the data in the search collections and getting a list of documents in return. When you install the iPlanet Web Server, a default set of search query and result forms is included. These forms give users a simple way to access the search function.

There are four parts to text searching:

Note. If the search function is turned off, these query forms are not available.

Search Home Page
The search home page (see: http://serverid:port/search) provides individual links to each of the three search query interfaces as well as an online QuickStart tutorial on customizing the interface. The tutorial discusses the various pattern files and gives examples of how they can be changed to produce different results.

A Search Query
The default installation of iPlanet Web Server includes three search query pages: standard and advanced HTML queries and a Java-based guided query.

On the standard search query, you select a collection to search against and type in a word or phrase to search for using the query language operators.

On the guided Java-based search interface, you can use the many drop-down lists to easily construct a query. You can only obtain this interface when Java is enabled for your browser.

On the advanced HTML page, you have the additional options of selecting multiple collections to search through, establishing a sort sequence for the results, and defining how many documents are to be displayed on a page at a time (clicking the Prev and Next arrows moves you through the pages of results).

Note. You can only execute date and number comparison searches against HTML META attribute values in the web publishing collection provided you have redefined them as date or number properties through the Web Publishing | Add Custom Property form.

To perform a standard search, perform the following steps:

  1. Type the following URL in the location field in your web browser:
  2. In the search query page that appears, choose the collection you want to search through from the drop-down list in the Search In field.
  3. Enter the word or phrase for your search query in the For field. You can create complex queries by combining operators. For details about the search operators, see Using the Query Operators.
  4. Click the Search button to execute your query.
Guided Search
You can choose to use the Java-based guided search interface, which helps you construct the query. This is especially useful if you want to build a query that has several parts, say searching for a word in the documents' content as well as a specific attribute value.

Note. Make sure Java is enabled for your browser. To do this, use the Languages option preferences menu command.

Note. The attributes for Version Control and Link Management are no longer used in iPlanet Web Server. However, note that if you perform a guided search, iPlanet Web Server may still return them; consequently, do not use these variables.

There are two ways to obtain the guided search page: through the Search home page or through the standard search query page.

To access the guided search interface through the Search home page, perform the following steps:

  1. Type the Search home page URL in the location field in your web browser: http://serverid:port/search
  2. Click the Guided Search link on the home page.
To access guided search through the standard search query page, perform the following steps:

  1. Go to the standard search query page by typing the following URL in the location field in your web browser:
  2. Click Guided Search on the standard search page and the guided Java-based query page is displayed.
  3. Choose the collection you want to search through from the drop-down list in the Search In field.
  4. Use the For drop-down list to select the type of element you wish to search for. In this example, choose Words.
  5. In the blank text field, type in the word you want to search for. For details about the search operators, see Using the Query Operators.
  6. Click Add Line to add the first part of the query. The word appears in the large text display box at the bottom of the form.
  7. To add to your query, choose another element from the drop-down list. In this example, choose Attribute.
  8. A new drop-down list appears on the right side of the form, listing all attributes that are available for the chosen collection. Choose the attribute you want to search against.
  9. From the drop-down list above the text input field, choose a query operator (Contains, Starts, Ends, Matches, Has a substring) or logical operator (=, <, >, <=, >=) for your query.
  10. In the blank text field, type in the attribute value you want to search for.
  11. Click Add Line to add another line for your query. You can click Undo Line to remove the last line you added or Clear to remove the entire query.
  12. Click the Search button to execute the search.
Advanced Search
You can choose to use the advanced HTML search interface, which helps you construct the query. This is especially useful if you want to create a query that searches through more than one collection or that produces results sorted by a specific attribute value.

There are two ways to obtain the advanced HTML search page: through the Search home page or through the standard search query page.

To access advanced HTML through the Search home page, perform the following steps:

  1. Type the Search home page URL in the location field in your web browser: http://serverid:port/search
  2. Click the Advanced HTML Search link on the home page.
To access advanced HTML search through the standard search query page, perform the following steps:

  1. Go to the standard search query page by typing the following URL in the location field in your web browser:
  2. Disable Java for your browser. To do this, use the Languages option Preferences menu command.
  3. Click Guided Search on the standard search page and the web server displays the advanced HTML query page.
  4. In the For field, type in the word or phrase you want to search for. You can create complex queries by combining operators. For details about the search operators, see Using the Query Operators.
  5. You can type in one or more attributes to sort the results by. The default is an ascending sort order, but you can indicate a descending sort order with a minus. For more information about sorting, see Sorting the Results.
  6. Depending on how many fields are listed for each document in the search results page or how many you want to see at a time, you can expand or limit the number of matching documents you want the search to return at a time. The Prev and Next buttons allow you access to additional pages of documents if there are too many to fit on a page at once.
  7. Use the drop-down list in the Search In field to choose the collection you want to search through. You can select more than one collection by holding down the Ctrl key as you click on another collection. All collections in a query must be in the same language, and the web publishing collection cannot be used in a multi-collection search.
  8. Click the Search button to execute your query.
The Search Results
There are two standard types of search results: a list of all documents that match the search criteria and the text of a single document that you selected from the list of matching documents.

Your access permissions are checked at several points during the search process:

Listing Matched Documents
In the default installation of the iPlanet Web Server, when you execute a search from either the simple or advanced search query pages, you obtain a list of the documents that match your search criteria. The list gives some standard information about each file, depending on the collection's format. For example, the default results page for email collections gives subject, to, from, and date for each entry, and the one for news collections gives subject, from, and date for each entry.

The kind of file format in the collection indicates which default attributes are available for searching. For information about the attributes for each format, see About Collection Attributes.

For entries resulting from a search that checks for comparative proximity of words to each other or for the exactness of the match, the file's ranking can be provided by showing a score.

If there are more matching documents than can fit on a page, click Next to see the next batch. You can always execute a new search by entering new query data and clicking Search.

Sorting the Results
By default, or if you don't enter anything in the Sort By field on the advanced HTML query page, all documents matching the search are output according to their relevance ranking (for queries that consider this) or their position in the server file database (for other queries).

If you enter an attribute name in the Sort By field, the documents are displayed in an ascending sort sequence. You can list the documents in a descending sort sequence by adding a minus sign (-) prefix to the attribute, as in -keywords or -title. You can sort on multiple fields by typing in more than one attribute, as in Author,-PubDate.

In a short query, sort order usually isn't critical, but in queries that result in a great many matches, you may want to set a sort value in order to obtain useful search results. Note, however, that using a special sort sequence may slow the search.

Note. Attribute values in META-tagged fields are text strings, which means that dates and numbers are sorted as text, not as dates or numbers. To convert the value into a date or number, you can create a new property in the Add Custom Property page from the Web Publishing tab and check the box that marks this property as a META-tagged attribute.
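
The following Python sketch illustrates the Sort By conventions described in this section. It is an illustration only, not the server's implementation: the function names and sample records are hypothetical. It parses a value such as Author,-PubDate into sort keys and applies them to attribute values held as text, which also shows why text-typed numbers and dates can sort in an unexpected order.

```python
def parse_sort_by(text):
    """Split a Sort By value into (attribute, descending) pairs."""
    keys = []
    for part in text.split(","):
        part = part.strip()
        if part:
            keys.append((part.lstrip("-"), part.startswith("-")))
    return keys

def sort_results(docs, sort_by):
    """Sort records of text attributes by the given Sort By value.

    Keys are applied last-to-first; because Python's sort is stable,
    the first attribute in the Sort By value ends up as the primary key.
    """
    ordered = list(docs)
    for attribute, descending in reversed(parse_sort_by(sort_by)):
        ordered.sort(key=lambda d: d.get(attribute, ""), reverse=descending)
    return ordered

# Hypothetical search results; every attribute value is a text string.
docs = [
    {"Author": "Lee", "PubDate": "9"},
    {"Author": "Lee", "PubDate": "10"},
    {"Author": "Adams", "PubDate": "7"},
]

if __name__ == "__main__":
    for doc in sort_results(docs, "Author,-PubDate"):
        print(doc["Author"], doc["PubDate"])
```

In the descending PubDate order for Lee, the value 9 comes before 10 because the values are compared as text, character by character. Redefining the attribute as a number or date property avoids this.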

Displaying a Highlighted Document
In the default installation of iPlanet Web Server, when you obtain a list of the documents that match your search criteria, you can select a single document to view in your web browser. Depending on how the pattern files are set up, the word you entered as your original search query can be highlighted in the displayed document with color, boldface text, or blinking.

To view a highlighted document, you click on the document's entry in the search results. The field you use to access the highlighted document depends on how your search interface has been designed, but in the default installation, you click the icon shown next to the document's listing. When you click it, there is additional code defined behind the icon's link to format the displayed document with the search query highlighted.

In the default search results page, if you click the file's URL you open the file in your browser without any special highlighting.

In the case of documents that have been converted into HTML, the URL points you to the original document. To get to the converted HTML document, click the document's title.

Displaying Collection Contents
You can display the contents of your collection database to see which attributes are set for each collection. The default installation of iPlanet Web Server uses the HTML-description.pat file to display information about each of your collections that have been defined as displayable (NS-display-select = YES) in the dblist.ini file. The collection contents typically include these items:

To display your collection database contents, type this line in the web browser's URL location field:


Using the Query Operators
To perform an effective search, you need to know how to use the query operators. You can only do Boolean searches, so all the subsequent information is based on Boolean search rules.

Note. The query language is not case-sensitive. The examples use uppercase for clarity only.

The search engine interprets the search query based on a set of syntax rules. For example, by entering the word region, the actual word region and all its stemmed variations (such as regions and regional) are found. The search results are ranked for "importance," which means how close the matched word comes to the originally input search criteria. In the example above, region would rank higher than any of the stemmed variants.

Not all queries rank their results. Only those queries that can have varying degrees of matching can be ranked. For example, <CONTAINS> queries either do or do not contain the given string, but <NEAR> queries can be ranked according to how close the words are to each other: words closer together are listed at the top of the search results, while those that are far apart are put at the bottom of the results.

Default Assumptions
The search query language has some implicit defaults and assumptions that dictate how it interprets your input. In some cases, you can circumvent the defaults, but here is how the search engine decides what you want as the search results:

Search Rules
To create complex searches, you can combine query operators, manipulate the query syntax, and include wildcard characters.

Angle Brackets
With the exception of AND, OR, NOT, and the date and numeric comparison operators, you need to enclose query operators in angle brackets, as in <CONTAINS> and <WILDCARD>.

Combining Operators
You can combine several query operators into a single query to obtain precise results. For example, you can input the following query to limit your search to those documents that have Bay and Monterey but exclude those that also mention Aquarium:

You can achieve even greater precision by including some implicit phrases, as in the following query that finds documents that refer to the Monterey Bay Aquarium by its full name and also mention otters but do not refer to sharks:

Using Query Operators as Search Words
You can use any of the query operators as a search word, but you must enclose the word in quotation marks. For example, you could search for documents about the ebb and flow of the tides with the following query:

Canceling Stemming
You can cancel the implicit stemming by using quotation marks around a word. For example, you can be exact by using a query such as this:

This search only results in documents that contain the exact word plan. It ignores documents with plans or planning.
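
The formatting rules described so far (bare AND, OR, NOT, and comparison operators; angle brackets around the other operators; quotation marks to force a literal word) can be summarized in a short Python sketch. The helper names are hypothetical, and the queries it prints are illustrations assembled from those rules and from the operator tables in this chapter, not queries taken from the server documentation.

```python
# Operators that are written without angle brackets.
BARE_OPERATORS = {"AND", "OR", "NOT", "=", ">", ">=", "<", "<="}

def operator(name):
    """Format an operator: bare for AND/OR/NOT and comparisons, else <NAME>."""
    upper = name.upper()
    return upper if upper in BARE_OPERATORS else "<" + upper + ">"

def literal(word):
    """Quote a word so it is read literally (no stemming, not an operator)."""
    return '"' + word + '"'

def build_query(*parts):
    """Join already formatted words and operators into one query string."""
    return " ".join(parts)

if __name__ == "__main__":
    # Combining operators: both words required, a third word excluded.
    print(build_query("Bay", operator("and"), "Monterey",
                      operator("and"), operator("not"), "Aquarium"))
    # An operator that needs angle brackets.
    print(build_query("Title", operator("contains"), "higher profit"))
    # An operator word used as an ordinary search word.
    print(build_query("ebb", literal("and"), "flow"))
    # Canceling stemming for a single word.
    print(literal("plan"))
```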

Modifying Operators
You can use AND, OR, and NOT to modify other operators. For example, you may want to exclude documents with titles that contain the phrase theme park. A query such as this would solve this problem:

Determining Which Operators To Use
Use the following reference to help determine which operators to use. Note that the query language is not case-sensitive, so <starts> and <STARTS> are equivalent. This document uses uppercase for clarity only.

Table 16.3 Deciding which operator to use

Type of Search: Finding documents by date or numeric value comparison.
Valid Operators: is equal to (=), greater than (>), greater than or equal to (>=), less than (<), less than or equal to (<=)
Example: DATE >= 06-30-96
Finds documents created on or after June 30, 1996.

Type of Search: Finding words or phrases in specific document fields or in specific locations in the field.
Valid Operators: <STARTS>, <CONTAINS>, <ENDS>, is equal to (=)
Example: Title <STARTS> Help
Finds documents with titles that start with Help.

Type of Search: Finding two or more words in a document.
Valid Operators: AND, <NEAR/1>
Example: specifications AND review
Finds documents that contain both specifications and review.

The following table describes some commonly used operators and provides examples of how to use each one. All are relevance ranked except where explicitly noted.

Table 16.4 Query language operators
Operator
Description
Examples
AND

Adds mandatory criteria to the search. Finds documents that have all of the specified words.
Antarctica AND mountain climb

Finds only documents containing both Antarctica and mountain climb plus all the stemmed variants, such as mountain climbing.
<CONTAINS>

Finds documents containing the specified words in a document field. The words must be in the exact same sequential and contiguous order.

You can use wildcards. Only alphanumeric values.

Does not rank documents for relevance.
Title <CONTAINS> higher profit

Finds documents containing the phrase higher profit in the title. Ignores documents with profits higher in the title.
<ENDS>

Finds documents in which a document field ends with a certain string of characters.

Does not rank documents for relevance.
Title <ENDS> draft

Finds documents with titles ending in draft.
equals (=)
Finds documents in which a document field matches a specific date or numeric value.
Created = 6-30-96

Finds documents created on June 30, 1996.
greater than (>)
Finds documents in which a document field is greater than a specific date or numeric value.
Created > 6-30-96

Finds documents created after June 30, 1996.
greater than or equal to (>=)
Finds documents in which a document field is greater than or equal to a specific date or numeric value.
Created >= 6-30-96

Finds documents created on or after June 30, 1996.
less than (<)
Finds documents in which a document field is less than a specific date or numeric value.
Created < 6-30-96

Finds documents created before June 30, 1996.
less than or equal to (<=)
Finds documents in which a document field is less than or equal to a specific date or numeric value.
Created <= 6-30-96

Finds documents created on or before June 30, 1996.
<MATCHES>

Finds documents in which a string in a document field matches the character string you specify.

Ignores documents that contain partial matches.

Does not rank documents for relevance.
<MATCHES> employee

Finds documents containing employee or any of its stemmed variants such as employees.
<NEAR>

Finds documents that contain the specified words. The closer the terms are to each other in the document, the higher the document's score.
stock <NEAR> purchase

Finds any document containing both stock and purchase, but gives a higher score to a document that has stock purchase than to one that has purchase supplies and stock up.
<NEAR/N>

Finds documents in which two or more specified words are within N number of words from each other. N can be an integer up to 1000. Also ranks the documents for relevance based on the words' proximity to each other.
stock <NEAR/1> purchase

Finds documents containing the phrases stock purchase and purchase stock.

Ignores documents containing phrases like purchase supplies and stock up because stock and purchase do not appear next to each other.

When N is 2 or greater, finds documents that contain the words within the range and gives a higher score for documents which have the words closer together.
NOT

Finds documents that do not contain a specific word or phrase.

Note: You can use NOT to modify the OR or the AND operator.
surf AND NOT beach

Finds documents containing the word surf but not the word beach.
OR

Adds optional criteria to the search. Finds any document that contains at least one of the search values.
apples OR oranges

Finds documents containing either apples or oranges.
<PHRASE>

Finds documents that contain the specified phrase. A phrase is a grouping of two or more words that occur in a specific order.
<PHRASE> (rise "and" fall)

Finds documents that include the entire phrase rise and fall. The and is in quotes to force the search to interpret it as a literal, not as an operator.
<STARTS>

Finds documents in which a document field starts with a certain string of characters.

Does not rank documents for relevance.
Title <STARTS> Corp

Finds documents with titles starting with Corp, such as Corporate and Corporation.
<STEM>
(English only)
Finds documents that contain the specified word and its variants.
<STEM> plan

Finds documents that contain plan, plans, planned, planning, and other variants with the same meaning stem. Ignores similarly spelled words such as planet and plane that don't come from the same stem.
<SUBSTRING>

Finds documents in which part or all of a string in a document field matches the character string you specify.

Similar to <MATCHES>, but can match on a partial string.

Does not work with wildcards.

Does not rank documents for relevance.
<SUBSTRING> employ

Finds documents that can match on all or part of employ, so it can succeed with ploy.

Note: This works with literals only. If you input web*, the asterisk does not work as a wildcard, so the search succeeds only with the exact "web*" string.
<WILDCARD>

Finds documents that contain the wildcard characters in the search string. You can use this to get words that have some similar spellings but which would not be found by stemming the word.

Some characters, such as * and ?, automatically indicate a wildcard-based search, so you don't have to include the word <WILDCARD>.
<WILDCARD> plan*


Finds documents that contain plan, plane, and planet as well as any word that begins with plan, such as planned, plans, and planetopolis.

See the next section for more details and examples.

<WORD>

Finds documents that contain the specified word.
<WORD> theme

Finds documents that contain theme, thematic, themes, and other words that stem from theme.
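
The proximity test behind <NEAR/N> can be pictured with a small Python sketch. It is a simplified model, not the search engine's algorithm: it assumes that distance is the difference between word positions, it ignores stemming, and it returns only a yes or no answer rather than a relevance score. The function names are hypothetical.

```python
import re

def words(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near(text, first, second, n):
    """True if some occurrence of `first` lies within n words of `second`."""
    tokens = words(text)
    first_pos = [i for i, w in enumerate(tokens) if w == first.lower()]
    second_pos = [i for i, w in enumerate(tokens) if w == second.lower()]
    return any(abs(a - b) <= n for a in first_pos for b in second_pos)

if __name__ == "__main__":
    print(near("a stock purchase plan", "stock", "purchase", 1))
    print(near("purchase supplies and stock up", "stock", "purchase", 1))
    print(near("purchase supplies and stock up", "stock", "purchase", 3))
```

Under this model, the two words in "purchase supplies and stock up" are three positions apart, so the text fails a <NEAR/1> test but would satisfy a wider range.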

Using Wildcards
You can use wildcards to obtain special results. For example, you can find documents that contain words that have similar spellings but are not stemmed variants. For example, plan stems into plans and planning but not plane or planet. With wildcards, you can find all of these words.

Some characters, such as * and ?, automatically indicate a wildcard-based search and do not require you to use the <WILDCARD> operator as part of the expression.

Table 16.5 Wildcard operators
Character
Description
*
Specifies 0 or more alphanumeric characters. For example, air* finds documents that contain air, airline, and airhead.

Cannot use this wildcard as the first character in an expression.

This wildcard is ignored in a set of ([ ]) or in an alternative pattern ({ }).

With this wildcard, the <WILDCARD> operator is implicit.
?
Specifies a single alphanumeric character, although you can use more than one ? to indicate multiple characters. For example, ?at finds documents that contain cat and hat, while ??at finds documents that contain that and chat.

This wildcard is ignored in a set of ([ ]) or in an alternative pattern ({ }).

With this wildcard, the <WILDCARD> operator is implicit.
{}
An alternative pattern that specifies a series of patterns, separated by commas. For example,
<WILDCARD> `Chat{s,ting,ty}`
finds documents that contain chats, chatting, and chatty.


You must enclose the entire string in back quotes and you cannot have any embedded spaces.
[ ]
A set that specifies a series of characters that can be used to find a match. For example,
<WILDCARD> `[chp]at`
finds documents that contain cat, hat, and pat.


You must enclose the entire string in back quotes and you cannot have any embedded spaces.

^
Specifies one or more characters to exclude from a set. For example, <WILDCARD> `C[^io]t` finds documents that contain cat and cut, but not cot.

The caret (^) must be the first character after the left bracket.
-
Specifies a range of characters in a set. For example, <WILDCARD> `Ch[a-j]t` finds documents that contain any four-letter word from chat to chjt.
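
One way to check your reading of the wildcard characters in Table 16.5 is to translate an expression into a regular expression and test it against sample words. The Python sketch below does this under several assumptions: matching is against a whole word, it is case-insensitive like the query language, and the enclosing back quotes are left off. It models the table's descriptions and is not the search engine's own matcher; the function names are hypothetical.

```python
import re

ALNUM = "[A-Za-z0-9]"

def wildcard_to_regex(pattern):
    """Translate a wildcard expression into a compiled, case-insensitive regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            out.append(ALNUM + "*")      # zero or more alphanumeric characters
        elif ch == "?":
            out.append(ALNUM)            # exactly one alphanumeric character
        elif ch == "{":
            end = pattern.index("}", i)  # alternative pattern: {a,b,c}
            options = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(o.strip()) for o in options) + ")")
            i = end
        elif ch == "[":
            end = pattern.index("]", i)  # set, optionally negated with ^
            body = pattern[i + 1:end]
            negate = body.startswith("^")
            members = body[1:] if negate else body
            # Escape everything except '-' so that ranges such as a-j survive.
            safe = "".join(c if c == "-" else re.escape(c) for c in members)
            out.append("[" + ("^" if negate else "") + safe + "]")
            i = end
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out), re.IGNORECASE)

def matches(pattern, word):
    """True if the whole word matches the wildcard expression."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None

if __name__ == "__main__":
    for pattern, word in [("air*", "airline"), ("??at", "chat"),
                          ("Chat{s,ting,ty}", "chatty"), ("C[^io]t", "cot"),
                          ("Ch[a-j]t", "chat")]:
        print(pattern, word, matches(pattern, word))
```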

Non-alphanumeric Characters
You can only search for non-alphanumeric characters if the style.lex file used to create the collection is set up to recognize them. This file is in the HTML, news, and mail subdirectories in the server_root\plugins\common\ directory.

Wildcards as Literals
Sometimes you may want to search on characters that are normally used as wildcards, such as * or ?. To use a wildcard as a literal, you must precede it with a backslash. In the case of asterisks, you must use two backslashes. For example, to search on a magazine with a title of Zine***, you would type the following string:
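
The backslash rule stated above can be applied mechanically. The following Python sketch is an illustration only: it assumes the rule is applied character by character (two backslashes before each asterisk, one before each question mark), and the function name is hypothetical.

```python
def escape_wildcards(text):
    """Escape * with two backslashes and ? with one; leave other characters alone."""
    out = []
    for ch in text:
        if ch == "*":
            out.append("\\\\*")  # two backslashes, then the asterisk
        elif ch == "?":
            out.append("\\?")    # one backslash, then the question mark
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    print(escape_wildcards("Zine***"))
```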

Several characters have special meaning for the search engine and require you to use back quotes to be interpreted as literals. The special search characters are listed here:

For example, to search for the string "a{b", you would type the following string:

For another example, if you wanted to search on the string "c`t", which contains a back quote, you would type the following string:


Customizing the Search Interface
As server administrator, you can customize the search interface to meet specific user requirements. All of the HTML-based forms that the user sees are defined through a set of pattern files that set up display formats for the search results page header and footer as well as each search result record listed in response to a query. There is a set of pattern variables that you can use to construct the forms used for search input and output. Many of the variables are defined in the system and user configuration files (userdefs.ini, webpub.conf, and dblist.ini, which are discussed in Configuring Manually).

Note. The search home page, at http://serverid:port/search also provides an introduction to the search interface as well as an online QuickStart tutorial on customizing the interface. The tutorial discusses the various pattern files and gives examples of how they can be changed to produce different results.

This section includes the following topics:

Dynamically Generated Headers and Footers
You can specify dynamically generated headers and footers. To accomplish this, add the add-headers and add-footers directives to your obj.conf file as Service functions. These directives take either a path or uri parameter. Use the path parameter to specify a static file as the header or footer. For example:

Use the uri parameter to specify a dynamically generated file, such as a CGI program, as the header or footer. For example:

These Service functions should precede the actual Service function that will answer the request, such as send-file or send-cgi.

HTML Pattern Files
A good place to begin customizing the interface is by modifying the existing pattern files. After you see how they work and you understand pattern variables, you can create your own pattern files and change the configuration files and other pattern files to point to them. In the default installation of iPlanet Web Server, the pattern files are in this directory: server_root\plugins\search\ui\text. (Make copies of your original pattern files so you can restore them afterwards.)

There are pattern files for different kinds of collections: email, news, ASCII, PDF, and HTML, as well as one for the web publishing collection. (The web publishing pattern file is a special case that uses a great many collection-specific attributes as variables in the dblist.ini file.) There are several general types of pattern files, each of which has a particular use. A file name prefix designates the type of file that the pattern file applies to, for example, ASCII-record.pat or EMAIL-record.pat. The following list describes the general pattern file types:

The pattern files contain HTML formatting instructions, which define how elements look, and HTML search arguments and variables, which define the text label or value that is displayed.

There are three kinds of pattern variables (discussed further in Using Pattern Variables):

To see how these work together, here are some lines from the standard query pattern file, NS-query.pat:

Each line contains standard HTML tags and one or more variables with the $$ or $$NS- prefix. Examining each line more closely requires looking at the configuration files mentioned in Configuring Manually.

Search Function Syntax
The search function uses standard URL syntax with a series of name-value pairs for the search arguments. This is the basic syntax:

As you use the HTML search query and results pages, you can see search functions and arguments displayed in the URL field of your browser. When entered directly into the URL field, these are sometimes called decorated URLs. You can also embed them in your pattern files with the HREF tag.
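
A decorated URL of this kind can also be assembled programmatically. The Python sketch below joins a base search URL and a list of name-value pairs using the standard library. The host name, port, and the function name build_search_url are hypothetical; the NS-search-page=contents argument is the one this chapter uses for the collection contents page.

```python
from urllib.parse import urlencode


def build_search_url(base_url, arguments):
    """Join a base search URL and name-value pairs into one decorated URL.

    urlencode percent-encodes reserved characters in names and values
    and converts blanks to the + symbol.
    """
    return f"{base_url}?{urlencode(arguments)}"


if __name__ == "__main__":
    # Hypothetical server address; substitute your own serverid and port.
    url = build_search_url(
        "http://myserver:8080/search",
        [("NS-search-page", "contents")],
    )
    print(url)  # http://myserver:8080/search?NS-search-page=contents
```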

You can create a complete search function as an HREF element within a pattern file. The example given is from the HTML-descriptions.pat file, which defines how collection information is displayed. The following lines produce a heading for each collection from the label ("Collection:") and provide a link to the actual collection file through the collection's label (NS-collection-alias) that was defined in the dblist.ini file.

The HREF contains a complete search function by using the following elements:

You can set up a search to use a variable conditionally so that if there is no value associated with the variable, nothing is displayed. The syntax is as follows:

For example, you could request that the document's title be output if it exists. If there is no title for this document, not even the label "Title:" is to be displayed. To do this, you would use code like this:

URL Encodings
When you construct HTML instructions, whether in decorated URLs or within a pattern file, you need to follow the rules for URL encoding. Any character that might be misunderstood as part of a URL should be encoded with a code in the format of %nn, where nn is a hexadecimal code. Blanks are converted to the + symbol (plus sign) in queries or to %20 in output. The following table shows the most commonly used URL codes.

Table 16.6 Common URL encodings

Character    Description      Code
(space)      Space            %20
;            Semicolon        %3B
/            Slash            %2F
?            Question mark    %3F
:            Colon            %3A
@            At sign          %40
=            Equal sign       %3D
&            Ampersand        %26
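
The codes in the table above can be reproduced with Python's standard library, which is one convenient way to encode values before placing them in a decorated URL. This is only a sketch: the helper name encode_char is hypothetical, and the search engine itself is not involved.

```python
from urllib.parse import quote, quote_plus

# Characters and codes as listed in Table 16.6.
TABLE_16_6 = {
    " ": "%20", ";": "%3B", "/": "%2F", "?": "%3F",
    ":": "%3A", "@": "%40", "=": "%3D", "&": "%26",
}


def encode_char(ch):
    """Percent-encode one character, treating no character as safe."""
    return quote(ch, safe="")


if __name__ == "__main__":
    for char, code in TABLE_16_6.items():
        print(repr(char), encode_char(char), encode_char(char) == code)

    # In query values, blanks become the + symbol instead of %20.
    print(quote_plus("jet engines"))  # jet+engines
```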

Required Search Arguments
Although you can customize almost every aspect of query and result pages, there are some arguments required for search functions to display the different types of search pages. These arguments are required whether the search function is in a decorated URL or embedded as an HREF in a pattern file.

Search functions that display the search query page require these arguments:

Search functions that display the search results page require these arguments:

Search functions that display a highlighted document require these arguments:

Search functions that display the collection contents require only this argument:

Using Pattern Variables
By using pattern variables, you can customize the search text interface and eliminate the need to update the actual HTML pages as user requirements change. For example, if the interface has graphics or text elements that change periodically, you can define a pattern variable that points to a pathname where that graphic or text is maintained and stored.

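There are three categories of pattern variables: user-defined pattern variables, configuration file variables, and macros and generated pattern variables.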

User-defined Pattern Variables
You can create any number of your own user-defined pattern variables in the user definitions file, userdefs.ini, or you can modify existing definitions. When one of these variables is used in a pattern file, the $$ prefix is added to it. Variable names can have up to 32 characters or digits, or combinations of both. Characters can be letters A-Z in upper or lower case, hyphens (-), and underscores (_). Names are case sensitive.
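
The naming rules above can be checked mechanically. The following Python sketch validates a candidate name against them; the function name is hypothetical, and treating an empty name as invalid is an assumption, since the text gives only the upper limit.

```python
import re

# Letters, digits, hyphens, and underscores; at most 32 characters.
# Assumption: a name must contain at least one character.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,32}")


def is_valid_variable_name(name):
    """Return True if name follows the rules for user-defined variables."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("searchButtonLabel", "l10nicondir", "bad name", "x" * 33):
        print(candidate, is_valid_variable_name(candidate))
```

Because names are case sensitive, searchButtonLabel and searchbuttonlabel both pass this check but refer to different variables.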

The default userdefs.ini file included with iPlanet Web Server contains variables that are used to define the search query page (labeled [query] in the file), the results listing (labeled [toc]), the document display page (labeled [record]), and the collection contents page (labeled [contents]). Each line begins with a variable name and is followed by a definition for that variable. Many are labels for screen elements, some are paths to other files, and some have more complex contents. For example, the following lines are from the query section of that file.

[query]
NS-character-set=iso-8859-1
uidir = $$NS-server-url/search-ui
icondir = $$uidir/icons
l10nicondir = $$uidir/icons
htmldir = $$uidir/text
logo = <img src="$$icondir/magnifier.jpg" border=0 align=absmiddle><b><font size=+2>N</font><font size=+1>etscape&nbsp;</font><font size=+2>S</font><font size=+1>earch</font></b>
sitename = $$NS-host
help = /help/5search.htm
title = Sample Search Interface
searchButtonLabel = Search
searchNote = To search, choose a collection, then enter words and phrases, separated by commas<br>(e.g., search, jet engines, basketball).

advSearchNote = To search, choose collections, then enter words and phrases, separated by commas<br>(e.g., search, jet engines, basketball).<p>Sorting is done on any defined attributes. Use '-' to specify descending order sort<br>(e.g., Title,-Author,+Date)
queryLabel = For:
queryLabelSJIS = $$queryLabel
queryLabelEUC = $$queryLabel
queryLabelJIS7 = $$queryLabel
collectionLabel = Search&nbsp;in:
booleanLabel = Boolean
sortByLabel = Sort&nbsp;by:
sortByLabelSJIS = $$sortByLabel
sortByLabelEUC = $$sortByLabel
sortByLabelJIS7 = $$sortByLabel
freetextLabel = Freetext (unavailable)
maxDocumentsLabel = Documents to return:
maxDocumentsLabelSJIS = $$maxDocumentsLabel
maxDocumentsLabelEUC = $$maxDocumentsLabel
maxDocumentsLabelJIS7 = $$maxDocumentsLabel
copyright = Copyright &#169; 1997 Netscape Communications Corporation. All Rights Reserved.
advancedButtonLabel = Advanced Button Label
helpButtonLabel = Help Button Label

The file also includes references to search macros, such as $$NS-server-url, and can also refer to other user-defined variables, as in the following lines:

Search macros are described further in Macros and Generated Pattern Variables.
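
To make the chaining of definitions concrete, the Python sketch below parses a few lines taken from the [query] sample and expands $$ references, first against other user-defined variables and then against a table of macros. It illustrates the idea only and is not a description of how the server resolves variables; the function names and the value given for NS-server-url are hypothetical.

```python
import re

# A $$ reference followed by a name made of letters, digits, - or _.
VAR_REF = re.compile(r"\$\$([A-Za-z0-9_-]+)")

SAMPLE = """\
[query]
uidir = $$NS-server-url/search-ui
icondir = $$uidir/icons
htmldir = $$uidir/text
"""


def parse_variables(text):
    """Collect 'name = value' lines, skipping blanks and [section] labels."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        variables[name.strip()] = value.strip()
    return variables


def expand(value, variables, macros, depth=0):
    """Replace $$name references recursively; unknown names are left as-is."""
    if depth > 10:
        raise ValueError("definitions appear to refer to each other in a cycle")

    def substitute(match):
        name = match.group(1)
        if name in variables:
            return expand(variables[name], variables, macros, depth + 1)
        return macros.get(name, match.group(0))

    return VAR_REF.sub(substitute, value)


if __name__ == "__main__":
    user_vars = parse_variables(SAMPLE)
    macros = {"NS-server-url": "http://myserver:8080"}  # hypothetical value
    print(expand(user_vars["icondir"], user_vars, macros))
    # http://myserver:8080/search-ui/icons
```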

You can use any supported HTML character entity in your variable definitions. You can use entity names that are defined in the &name; format as well as those defined with the three-digit code in the &#nnn; format. In the userdefs.ini code sample, the entity &nbsp; inserts a nonbreaking space and &#169; inserts a copyright symbol. Some of the more commonly used entities are in the following table:

Table 16.7 Common HTML character entities

Numeric code    Entity name    Description
&#032;                         Space
&#034;          &quot;         Quotation mark
&#036;          $              Dollar sign
&#058;          -              Colon
&#060;          &lt;           Less than
&#062;          &gt;           Greater than
&#153;          -              Trademark symbol
&#160;          &nbsp;         Nonbreaking space
&#169;          &copy;         Copyright symbol
&#174;          &reg;          Registered trademark
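
To preview what a definition containing entities will look like once a browser renders it, you can decode the entities outside the server. The Python sketch below uses the standard html module for this; the function name decode_entities is hypothetical, and the two sample strings come from the userdefs.ini listing earlier in this section.

```python
import html


def decode_entities(text):
    """Replace named and numeric character entities with their characters."""
    return html.unescape(text)


if __name__ == "__main__":
    print(decode_entities("Copyright &#169; 1997"))     # Copyright © 1997
    print(repr(decode_entities("Search&nbsp;in:")))     # 'Search\xa0in:'
    # The named and numeric forms decode to the same character.
    print(decode_entities("&copy;") == decode_entities("&#169;"))  # True
```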

Configuration File Variables
Some variables are defined in the system configuration and the collection configuration files. These use a prefix of NS- in the configuration file to differentiate them from other markup tags in an HTML page. To use these variables as arguments to the search function, you add another prefix $$ to the variable, as in $$NS-date-time and $$NS-max-records.

Variables that define defaults for all searches on a server are defined in the system configuration file, webpub.conf. For example, the default installation of iPlanet Web Server includes the following variables in the webpub.conf file:

Although installations may vary depending on how each server is configured, the most commonly found variables from the webpub.conf file are listed in the following table:

Table 16.8 Commonly found variables defined in webpub.conf
Variable
Description
NS-default-html-title
The name given to HTML documents that do not contain a user-defined title. Typically set to "(Untitled)."
NS-date-time
The date and time format to use when displaying results.
NS-date-input-format
The format for inputting dates (the default is MMDDYY).
NS-HTML-descriptions-pat
The pattern file to use when displaying the contents of the collections.
NS-largest-set
The maximum number of records that can be handled as matching the search criteria. The records are displayed in groups of NS-max-records.
NS-max-records
The maximum size of the result set displayed at one time.
NS-ms-tocend
The pattern file to use for the footer at the bottom of the search results page when searching multiple collections.
NS-ms-tocstart
The pattern file to use for the header at the top of the search results page when searching multiple collections.
NS-query-pat
The query pattern file used when creating a query page.
NS-search-type
The type of search to perform. Only Boolean is permitted.

Collection-specific variables are defined in the dblist.ini file. For example, the default installation of iPlanet Web Server includes variables for the web publishing collection. Among the variables defined there are:

The variables in your dblist.ini file may differ according to the type of collections you are using. Table 16.9 contains some of the more commonly found collection-specific variables.

Table 16.9 Commonly found variables in dblist.ini
Variable
Description
NS-collection-alias
The collection's label. Can be specified more than once to search multiple collections.
NS-doc-root
The root directory for the documents in the collection.
NS-display-select
This indicates whether the collection is displayed as part of the collection information listing, when NS-search-page=contents. The default is YES.
NS-highlight-start
Begin highlighting at this point in the displayed document. Typically this highlights the search query criteria.
NS-highlight-end
End highlighting at this point in the displayed document.
NS-language
The language of the documents in the collection.
NS-record-pat
The pattern file to use when displaying a highlighted document page.
NS-tocend-pat
The footer pattern file associated with a collection to be used when formatting the search results.
NS-tocrec-pat
The record pattern file associated with a collection to be used when formatting the search results.
NS-tocstart-pat
The header pattern file associated with a collection to be used when formatting the search results.
NS-url-base
The base URL used when constructing the link used to locate the file.

Macros and Generated Pattern Variables
There are some search macros that you can use in your pattern files or decorated URLs, and the search function itself generates some pattern variables that you can use in subsequent search requests to define how the later output is to be displayed. These macros and variables have a prefix of $$NS- to indicate their use.

For example, after doing an initial search query that results in 24 documents on the results page, you can reuse the search-generated $$NS-docs-matched and the $$NS-doc-number variables to help define a document page displaying one of the documents in detail. In this way, you can tell the user that this document is number 3 of 24 documents returned for the original search.
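
The "number 3 of 24" message can be pictured as a simple substitution of generated values into a line of text. The Python sketch below imitates that substitution; the sample line, the function name render, and the values are hypothetical, and the server performs the real substitution itself when it processes a pattern file.

```python
import re

# A generated variable reference: $$NS- followed by letters, digits, or hyphens.
GENERATED_VAR = re.compile(r"\$\$(NS-[A-Za-z0-9-]+)")


def render(line, values):
    """Substitute known $$NS- variables; leave unknown references untouched."""
    return GENERATED_VAR.sub(
        lambda match: str(values.get(match.group(1), match.group(0))),
        line,
    )


if __name__ == "__main__":
    line = "Document $$NS-doc-number of $$NS-docs-matched"
    print(render(line, {"NS-doc-number": 3, "NS-docs-matched": 24}))
    # Document 3 of 24
```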

The search macros and the generated variables that you can use in a subsequent pattern file or decorated URL are listed in the following table:

Table 16.10 Macros and generated pattern variables
Variable
Description
$$NS-collection-list
An HTML multiple select list of all the collections in dblist.ini where NS-display-select is set to YES.
$$NS-collection-list-dropdown
An HTML drop-down list version of NS-collection-list.
$$NS-collections-searched
The number of collections searched for this request.
$$NS-display-query
The HTML-displayable version of the query that is generated for a results page.
$$NS-doc-href
The HTML HREF tag for the document. This provides a URL to the original source document. For email, this is in the form mailbox:/boxname?id-messageID and for news, it is in the form news:messageID.
$$NS-doc-name
The document's name.
$$NS-doc-number
The sequence number of the document in the results page list.
$$NS-doc-path
The absolute path to the document.
$$NS-doc-score
The ranked score of the document (ranges 0 to 100).
$$NS-doc-score-div10
The ranked score of the document (ranges 0 to 10).
$$NS-doc-score-div5
The ranked score of the document (ranges 0 to 5).
$$NS-doc-time
The creation time for a document in the results list. To obtain this value, you must set NS-use-system-stat = YES in the webpub.conf file. By default it is set to NO, since system statistics are expensive.
$$NS-doc-size
The size of the document rounded to the nearest K. To obtain this value, you must set NS-use-system-stat = YES in the webpub.conf file. By default it is set to NO, since system statistics are expensive.
$$NS-docs-found
The actual number of documents that the search engine found for this request.
$$NS-docs-matched
The number of documents returned from the search (up to NS-max-records) for this request.
$$NS-docs-searched
The number of documents searched through for this request.
$$NS-get-highlighted-doc
This provides the URL for a highlighted document in order to be able to display the document as HTML text with highlights.
$$NS-get-next
This variable gets the next set of search results to be displayed. The set is equal to NS-max-records and is positioned by using NS-search-offset.
$$NS-get-prev
This variable gets the previous set of search results that has been displayed. The set is equal to NS-max-records and is positioned by using NS-search-offset.
$$NS-host
The host name.
$$NS-insert-doc
A placeholder used in the NS-record-pat pattern files for HTML to indicate where the source document is to be inserted.
$$NS-rel-doc-name
The relative name of the document to display when creating a document page.
$$NS-search-offset
The offset into the set of records returned as search results. Used to determine which set of records is displayed when you use NS-get-next and NS-get-prev.
$$NS-server-url
The URL for the server.
$$NS-sort-by
The sort sequence for the items on the results page. You can select one or more of the available attributes for the collection. The default is an ascending sort.
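
The relationship between NS-search-offset, NS-max-records, and the next and previous result sets can be sketched as simple arithmetic. The Python example below is an illustration under stated assumptions, not the server's implementation: it assumes that offsets start at zero and that the number of documents found has already been limited by NS-largest-set. The function name page_offsets is hypothetical.

```python
def page_offsets(offset, max_records, docs_found):
    """Return (previous_offset, next_offset) for the current result set.

    Each value is None when there is no such set to display.
    Assumptions: offsets are zero-based and docs_found is the total
    number of records available for paging.
    """
    if max_records <= 0:
        raise ValueError("max_records must be positive")
    previous_offset = max(offset - max_records, 0) if offset > 0 else None
    next_offset = offset + max_records
    if next_offset >= docs_found:
        next_offset = None
    return previous_offset, next_offset


if __name__ == "__main__":
    # 24 documents displayed in groups of 10.
    for current in (0, 10, 20):
        print(current, page_offsets(current, 10, 24))
```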


 

Copyright © 2000 Sun Microsystems, Inc. Some preexisting portions Copyright © 2000 Netscape Communications Corp. All rights reserved.