The key to producing optimal XHTML documents is specifying the right set of properties – omitting properties that you don’t need, and including the ones that you do. The smaller your XHTML documents are, the smaller your index will be; a smaller index uses less memory and can be searched more quickly than a larger one. Of course, an index that’s missing important data is not very useful, so you need to make tradeoffs based on the needs of your environment. This section provides guidelines to help you determine which properties to include as text properties and which ones to include as metadata.

In addition to omitting unnecessary properties, you can reduce the size of your XHTML documents by applying various property value filters to the repository data. For information about these filters and how to use them, see the Using Property Value Filters section of the Customizing the XHTML Output chapter. Also, see the IndexingOutputConfig Analysis section for tools to help pinpoint unnecessary properties in your index.

Guidelines for Text Properties

Some guidelines for determining which properties to include as text properties, and which ones to omit:

Keep in mind that these are just guidelines, and you may need to deviate from them depending on the requirements of your environment. For example, you may want to include a Boolean property as a text property if you translate true and false into searchable strings. (See Translating Property Values.) Or there may be certain numeric properties (e.g., product codes) that you want to make available for searching.
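To illustrate the Boolean case, the following is a minimal sketch of how true and false values might be turned into searchable words. It is not the product's translation mechanism (see Translating Property Values for that). The class BooleanTextTranslator, the method toSearchableText, and the property names onSale and inStock are all hypothetical, chosen only for this example.

```java
import java.util.Map;

// Hypothetical helper, not part of any product API: maps Boolean property
// values to words a user might plausibly type into a search box.
final class BooleanTextTranslator {

    // Assumed property names and search terms, for illustration only.
    // Index 0 holds the text for true, index 1 the text for false.
    private static final Map<String, String[]> TERMS = Map.of(
        "onSale", new String[] {"on sale", "regular price"},
        "inStock", new String[] {"in stock", "out of stock"});

    private BooleanTextTranslator() {
    }

    // Returns the searchable text for a Boolean property value. Properties
    // without a mapping fall back to the literal "true" or "false", which is
    // rarely useful as search text and is the reason such properties are
    // usually better left out of the text properties.
    static String toSearchableText(String propertyName, boolean value) {
        String[] terms = TERMS.get(propertyName);
        if (terms == null) {
            return Boolean.toString(value);
        }
        return value ? terms[0] : terms[1];
    }
}
```

With a mapping like this, a Boolean property contributes words such as "on sale" to the indexed text. Without one, it contributes only "true" or "false", which adds to the size of the index without helping anyone find the document.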

Guidelines for Metadata Properties

Some guidelines for determining which properties to include as metadata properties, and which ones to omit: