In some cases, it is useful to filter a set of property values before outputting an XHTML document. For example, suppose your document-level item has many child items, and each child item has a color property that can have only a few possible values. Rather than outputting the color of every child item, you could include each color in the document just once, by using a filter that removes duplicate property values.
The PropertyValuesFilter interface defines a method for filtering property values. The atg.repository.search.indexing.filter package includes three implementations of this interface:
UniqueFilter removes duplicate property values, returning only the unique values.
ConcatFilter concatenates all of the property values into a single string.
UniqueWordFilter concatenates all of the property values into a single string, and then removes all duplicate words.
Each of these filters can be useful for reducing the size of your XHTML documents. This section provides information about what these filters do and when they’re appropriate.
In the definition file, you can specify property filters by using the filter attribute. Note that you can use multiple filters on the same property. The value of the filter attribute is a comma-separated list of Nucleus components. The component names must be absolute pathnames.
To simplify coding of the definition file, you can map PropertyValuesFilter Nucleus components to simple names, and use those names as the values of filter attributes. You can perform this mapping by setting the filterMap property of the IndexingOutputConfig component. This property is a Map in which the keys are the names and the values are the PropertyValuesFilter Nucleus components that the names represent.
Note, however, that you do not need to perform this mapping to use the UniqueFilter, ConcatFilter, or UniqueWordFilter class. These classes are mapped by default to the following names:
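Filter Class | Name |
---|---|
atg.repository.search.indexing.filter.UniqueFilter | unique |
atg.repository.search.indexing.filter.ConcatFilter | concat |
atg.repository.search.indexing.filter.UniqueWordFilter | uniqueword |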
So, for example, you can specify UniqueFilter like this:
<property name="color" filter="unique"/>
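The lookup described above can be sketched in plain Java. This is only an illustrative model of how a comma-separated filter attribute might be resolved, not ATG's implementation: the class FilterNameResolverSketch, its resolve method, and the /example/... component paths are all hypothetical, and rejecting an unrecognized name with an exception is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of resolving a filter attribute value; not an ATG class.
class FilterNameResolverSketch {

    /**
     * Splits a comma-separated filter attribute value and resolves each entry.
     * A simple name is looked up in filterMap (simple name -> component path);
     * an entry that starts with "/" is treated as an absolute component pathname.
     * The order of the entries is preserved.
     */
    static List<String> resolve(String filterAttribute, Map<String, String> filterMap) {
        List<String> resolved = new ArrayList<>();
        for (String token : filterAttribute.split(",")) {
            String name = token.trim();
            if (name.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            if (filterMap.containsKey(name)) {
                resolved.add(filterMap.get(name));
            } else if (name.startsWith("/")) {
                resolved.add(name);
            } else {
                // Assumption: an unmapped, non-absolute name is treated as an error.
                throw new IllegalArgumentException("Unknown filter name: " + name);
            }
        }
        return resolved;
    }
}
```

With a map that contains only unique, calling resolve("unique, /example/filters/MyFilter", map) yields the mapped path for unique followed by the absolute pathname unchanged.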
UniqueFilter
You may be able to reduce the size of your index by filtering the property values to remove redundant entries. For example, suppose each XHTML document represents a product with several child SKUs. You might include the SKUs’ salePrice property in the index as a metadata property, so it can be used for faceting. Depending on the product, many of the SKUs may have the same value for salePrice. So the resulting entries in an XHTML document might look something like this:
<meta name="atg:float:childSKUs.salePrice" content="190.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>
<meta name="atg:float:childSKUs.salePrice" content="190.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>
By filtering out redundant entries, you can reduce this to:
<meta name="atg:float:childSKUs.salePrice" content="190.0"/>
<meta name="atg:float:childSKUs.salePrice" content="205.0"/>
To automatically perform this filtering, specify the UniqueFilter class in the XML definition file:
<property name="salePrice" filter="unique"/>
As a general rule, it is a good idea to specify the unique filter for a property if multiple items in an XHTML document may have identical values for that property. If you specify this filter for a property and every value of that property in an XHTML document is unique (or if only one item with that property appears in the document), the unique filter will have no effect on the resulting XHTML. However, executing the filter increases the processing time needed to create the document, so it is a good idea to specify it only for properties that will benefit from it.
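The effect of the unique filter on the salePrice values above can be modeled in a few lines of Java. This is a minimal sketch of the behavior only, not the UniqueFilter source: the class and method names are hypothetical, and keeping the first occurrence of each value in its original order is an assumption based on the example output.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical model of duplicate removal; not the ATG UniqueFilter class.
class UniqueValuesSketch {

    /** Returns the distinct values, keeping the first occurrence of each in order. */
    static List<String> unique(List<String> values) {
        // LinkedHashSet drops duplicates while remembering insertion order.
        return new ArrayList<>(new LinkedHashSet<>(values));
    }
}
```

Applied to the five salePrice values shown earlier (190.0, 205.0, 190.0, 205.0, 205.0), this model leaves two entries, 190.0 and 205.0. When every value is already distinct, the list comes back unchanged, which mirrors the "no effect" case described above.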
ConcatFilter
You may also be able to reduce the size of your index by concatenating the values of text properties. For example, suppose a product’s SKUs have a color property, whose values are red, green, blue, and yellow. The resulting entries in the XHTML document will be:
<div class="atg:role:childSKUs.color" id="1">red</div>
<div class="atg:role:childSKUs.color" id="2">green</div>
<div class="atg:role:childSKUs.color" id="3">blue</div>
<div class="atg:role:childSKUs.color" id="4">yellow</div>
By concatenating the values, you can reduce this to:
<div class="atg:role:childSKUs.color" id="1">red green blue yellow</div>
To combine these values into a single tag, specify the ConcatFilter class in the XML definition file:
<property name="color" filter="concat"/>
This setting invokes an instance of the atg.repository.search.indexing.filter.ConcatFilter class. Note that you do not need to create a Nucleus component to use this filter.
Some guidelines for using concat:
Do not use the concat filter for metadata properties. This will almost always produce undesirable results.
Concatenating property values may have some slight effects on the search results. It is probably best to use the concat filter only for properties with short text values, such as the color property shown above. Concatenating properties with long text values, such as longDescription, can have negative effects.
You can use both the unique and the concat filters on the same text property, by setting the value of the filter attribute to a comma-separated list. The filters are invoked in the order that they are listed, so it is important to put the unique filter first for it to have an effect. For example:
<property name="color" filter="unique,concat"/>
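To see why the order of the two filters matters, the following Java sketch models both of them as simple list transformations and applies them in each order. It is an illustration under stated assumptions rather than ATG code: the class and method names are hypothetical, and the single-space separator is inferred from the concatenated color example above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical models of the unique and concat filters; not ATG classes.
class FilterOrderSketch {

    /** Removes duplicate values, keeping first occurrences in order. */
    static List<String> unique(List<String> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    /** Joins all values into one space-separated string (separator is an assumption). */
    static List<String> concat(List<String> values) {
        if (values.isEmpty()) {
            return Collections.emptyList();
        }
        return Collections.singletonList(String.join(" ", values));
    }
}
```

For the values red, green, red, blue, applying unique and then concat in this model produces the single value "red green blue". Applying concat first produces "red green red blue", and unique then has nothing to remove because only one value remains.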
UniqueWordFilter
The atg.repository.search.indexing.filter.UniqueWordFilter class concatenates the values of a property into a single string, and then removes all duplicate words. For example, suppose a product’s SKUs have a size property, and the resulting entries in the XHTML document are:
<div class="atg:role:childSKUs.size" id="1">medium</div>
<div class="atg:role:childSKUs.size" id="2">large</div>
<div class="atg:role:childSKUs.size" id="3">x large</div>
<div class="atg:role:childSKUs.size" id="4">xx large</div>
By applying UniqueWordFilter, you can reduce this to:
<div class="atg:role:childSKUs.size" id="1">medium large x xx</div>
Note that UniqueWordFilter converts all Strings to lowercase, so that redundant words are eliminated even if they don’t have identical case.
You can specify UniqueWordFilter in the XML definition file like this:
<property name="size" filter="uniqueword"/>
You do not need to create a Nucleus component to use this filter.
Although UniqueWordFilter concatenates values and removes redundancies, it is not equivalent to using UniqueFilter and ConcatFilter. UniqueFilter considers the entire string when it eliminates redundant values, not individual words. In this example, each complete string is unique, so UniqueFilter would not actually eliminate any values, and the result would be:
<div class="atg:role:childSKUs.size" id="1">medium large x large xx large</div>
Note: You should use UniqueWordFilter carefully, as under certain circumstances it can have undesirable effects. If you use a custom dictionary that includes multi-word terms, searches for those terms may not return the expected results, because the filter may rearrange the order of the words in the index.
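The word-level behavior, including the reordering that the note warns about, can be modeled with a short Java sketch. This is not the UniqueWordFilter source: the class and method names are hypothetical, and splitting on whitespace and keeping the first occurrence of each word are assumptions chosen to match the size example above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of word-level duplicate removal; not the ATG UniqueWordFilter class.
class UniqueWordSketch {

    /** Lowercases every value, splits it into words, and keeps each word once. */
    static String uniqueWords(List<String> values) {
        Set<String> words = new LinkedHashSet<>();
        for (String value : values) {
            // Assumption: words are separated by runs of whitespace.
            for (String word : value.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return String.join(" ", words);
    }
}
```

For the values medium, large, x large, and xx large, this model returns "medium large x xx". The terms "x large" and "xx large" no longer appear as adjacent words, which illustrates how multi-word dictionary terms can be broken apart by this kind of filtering.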