In some cases, it is useful to filter a set of property values before outputting an XHTML document. For example, suppose your document-level item has many child items, and each child item has a color property that can have only a few possible values. Rather than outputting the color of every child item, you could include each color in the document just once, by using a filter that removes duplicate property values.

The PropertyValuesFilter interface defines a method for filtering property values. The atg.repository.search.indexing.filter package includes three implementations of this interface: UniqueFilter, ConcatFilter, and UniqueWordFilter.

Each of these filters can be useful for reducing the size of your XHTML documents. This section provides information about what these filters do and when they’re appropriate.

In the definition file, you can specify property filters by using the filter attribute. Note that you can use multiple filters on the same property. The value of the filter attribute is a comma-separated list of Nucleus components. The component names must be absolute pathnames.

To simplify coding of the definition file, you can map PropertyValuesFilter Nucleus components to simple names, and use those names as the values of filter attributes. You can perform this mapping by setting the filterMap property of the IndexingOutputConfig component. This property is a Map in which the keys are the names and the values are the PropertyValuesFilter Nucleus components that the names represent.
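The name lookup described above can be pictured with a small sketch. This is not ATG code and does not use any ATG API. The class FilterNameSketch, the method resolve, the simple name trim, and both component paths are hypothetical, and the handling of whitespace around commas is an assumption. It shows only the idea that each entry in a filter attribute is either a mapped simple name or an absolute pathname.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; not how ATG itself resolves filter names.
class FilterNameSketch {

    // Resolves a comma-separated filter attribute value into component paths.
    // Entries found in filterMap are replaced by the mapped path; any other
    // entry must already be an absolute pathname (starting with "/").
    static List<String> resolve(String filterAttribute, Map<String, String> filterMap) {
        List<String> resolved = new ArrayList<>();
        for (String entry : filterAttribute.split(",")) {
            String name = entry.trim(); // assumption: surrounding whitespace is ignored
            if (name.isEmpty()) {
                continue; // skip empty entries caused by stray commas
            }
            String mapped = filterMap.get(name);
            if (mapped != null) {
                resolved.add(mapped);
            } else if (name.startsWith("/")) {
                resolved.add(name);
            } else {
                throw new IllegalArgumentException(
                        "Not a mapped name or an absolute pathname: " + name);
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> filterMap = new LinkedHashMap<>();
        // Hypothetical mapping: simple name -> hypothetical component path.
        filterMap.put("trim", "/myapp/search/filter/TrimFilter");

        System.out.println(resolve("trim, /myapp/search/filter/OtherFilter", filterMap));
    }
}
```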

Note, however, that you do not need to perform this mapping to use the UniqueFilter, ConcatFilter, or UniqueWordFilter class. These classes are mapped by default to the following names:

Filter Class        Name
UniqueFilter        unique
ConcatFilter        concat
UniqueWordFilter    uniqueword

So, for example, you can specify UniqueFilter like this:

<property name="color" filter="unique"/>

UniqueFilter

You may be able to reduce the size of your index by filtering the property values to remove redundant entries. For example, suppose each XHTML document represents a product with several child SKUs. You might include the SKUs’ salePrice property in the index as a metadata property, so it can be used for faceting. Depending on the product, many of the SKUs may have the same value for salePrice. So the resulting entries in an XHTML document might look something like this:

<meta name="atg:float:childSKUs.salePrice" content="190.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>
<meta name="atg:float:childSKUs.salePrice" content="190.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>

By filtering out redundant entries, you can reduce this to:

<meta name="atg:float:childSKUs.salePrice" content="190.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>

To automatically perform this filtering, specify the UniqueFilter class in the XML definition file:

<property name="salePrice" filter="unique"/>

As a general rule, specify the unique filter for a property if multiple items in an XHTML document may have identical values for that property. If every value of the property in a document is already unique (or if only one item with that property appears in the document), the unique filter has no effect on the resulting XHTML. However, executing the filter adds processing time when the document is created, so specify it only for properties that will benefit from it.
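As a rough illustration of the effect of the unique filter, the following sketch removes duplicate values from the salePrice example. It is plain Java, not the UniqueFilter source, and does not implement the PropertyValuesFilter interface. The class and method names are hypothetical, and keeping the first occurrence of each value in its original order is an assumption chosen to match the output shown above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Illustrative model of value-level de-duplication; not ATG code.
class UniqueSketch {

    // Removes duplicate values, keeping the first occurrence of each in order.
    static List<String> unique(List<String> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        List<String> salePrices = List.of("190.0", "205.0", "190.0", "205.0", "205.0");
        System.out.println(unique(salePrices)); // prints [190.0, 205.0]
    }
}
```

If every input value is already distinct, the method returns the list unchanged, which mirrors the case where the filter costs processing time without shrinking the document.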

ConcatFilter

You may also be able to reduce the size of your index by concatenating the values of text properties. For example, suppose a product’s SKUs have a color property, whose values are red, green, blue, and yellow. The resulting entries in the XHTML document will be:

<div class="atg:role:childSKUs.color" id="1">red</div>
<div class="atg:role:childSKUs.color" id="2">green</div>
<div class="atg:role:childSKUs.color" id="3">blue</div>
<div class="atg:role:childSKUs.color" id="4">yellow</div>

By concatenating the values, you can reduce this to:

<div class="atg:role:childSKUs.color" id="1">red green blue yellow</div>

To combine these values into a single tag, specify the ConcatFilter class in the XML definition file:

<property name="color" filter="concat"/>

This setting invokes an instance of the atg.repository.search.indexing.filter.ConcatFilter class. Note that you do not need to create a Nucleus component to use this filter.
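The effect of concat on the color example can be sketched in a few lines of plain Java. This is not the ConcatFilter source. The class and method names are hypothetical, and the single-space separator is an assumption based on the output shown above.

```java
import java.util.List;

// Illustrative model of concatenating property values; not ATG code.
class ConcatSketch {

    // Joins all values into one string, separated by single spaces.
    static String concat(List<String> values) {
        return String.join(" ", values);
    }

    public static void main(String[] args) {
        List<String> colors = List.of("red", "green", "blue", "yellow");
        System.out.println(concat(colors)); // prints red green blue yellow
    }
}
```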

Some guidelines for using concat:

UniqueWordFilter

The atg.repository.search.indexing.filter.UniqueWordFilter class concatenates the values of a property into a single string, and then removes all duplicate words. For example, suppose a product’s SKUs have a size property, and the resulting entries in the XHTML document are:

<div class="atg:role:childSKUs.size" id="1">medium</div>
<div class="atg:role:childSKUs.size" id="2">large</div>
<div class="atg:role:childSKUs.size" id="3">x large</div>
<div class="atg:role:childSKUs.size" id="4">xx large</div>

By applying UniqueWordFilter, you can reduce this to:

<div class="atg:role:childSKUs.size" id="1">medium large x xx</div>

Note that UniqueWordFilter converts all Strings to lowercase, so that redundant words are eliminated even if they don’t have identical case.

You can specify UniqueWordFilter in the XML definition file like this:

<property name="size" filter="uniqueword"/>

You do not need to create a Nucleus component to use this filter.

Although UniqueWordFilter concatenates values and removes redundancies, it is not equivalent to using UniqueFilter and ConcatFilter together. UniqueFilter considers the entire string when it eliminates redundant values, not individual words. In this example, each complete string is unique, so UniqueFilter would not eliminate any values, and the result of applying both filters would be:

<div class="atg:role:childSKUs.size" id="1">medium large x large xx large</div>

Note: You should use UniqueWordFilter carefully, as under certain circumstances it can have undesirable effects. If you use a custom dictionary that includes multi-word terms, searches for those terms may not return the expected results, because the filter may rearrange the order of the words in the index.
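The difference between word-level and value-level filtering described in this section can be sketched as follows. This is plain Java, not the UniqueWordFilter source. The class and method names are hypothetical, and splitting on whitespace and keeping words in first-occurrence order are assumptions chosen to match the output shown above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative comparison of word-level and value-level filtering; not ATG code.
class UniqueWordSketch {

    // Word level: lowercase every value, split it into words, and keep only
    // the first occurrence of each word.
    static String uniqueWords(List<String> values) {
        Set<String> words = new LinkedHashSet<>();
        for (String value : values) {
            for (String word : value.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return String.join(" ", words);
    }

    // Value level: drop duplicate whole strings, then join what remains.
    static String uniqueThenConcat(List<String> values) {
        return String.join(" ", new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        List<String> sizes = List.of("medium", "large", "x large", "xx large");
        System.out.println(uniqueWords(sizes));      // prints medium large x xx
        System.out.println(uniqueThenConcat(sizes)); // prints medium large x large xx large
    }
}
```

In the word-level output, "x" and "large" are no longer adjacent, which is the kind of reordering that can affect multi-word dictionary terms.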

 