The atg.repository.search.indexing.filter.UniqueWordFilter class concatenates the values of a property into a single string, and then removes all duplicate words. For example, suppose a product’s SKUs have a size property, and the resulting entries in the XHTML document are:

<div class="atg:role:childSKUs.size" id="1">medium</div>
<div class="atg:role:childSKUs.size" id="2">large</div>
<div class="atg:role:childSKUs.size" id="3">x large</div>
<div class="atg:role:childSKUs.size" id="4">xx large</div>

By applying UniqueWordFilter, you can reduce this to:

<div class="atg:role:childSKUs.size" id="1">medium large x xx</div>

Note that UniqueWordFilter converts all Strings to lowercase, so that redundant words are eliminated even if they don’t have identical case.
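
The behavior described above can be approximated in a few lines of plain Java. This is only an illustrative sketch, not the actual UniqueWordFilter implementation: the class name UniqueWordSketch and the method uniqueWords are hypothetical, and it assumes that words are separated by whitespace and that the first occurrence of each word is kept in its original position.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the filter's behavior; not part of the ATG API.
public class UniqueWordSketch {

    // Lowercases every value, splits it on whitespace, and keeps only the
    // first occurrence of each word, in the order the words are encountered.
    static String uniqueWords(List<String> values) {
        Set<String> seen = new LinkedHashSet<>();
        for (String value : values) {
            for (String word : value.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    seen.add(word);
                }
            }
        }
        return String.join(" ", seen);
    }

    public static void main(String[] args) {
        List<String> sizes = Arrays.asList("medium", "large", "x large", "xx large");
        System.out.println(uniqueWords(sizes)); // prints "medium large x xx"
    }
}
```

Because the sketch lowercases before comparing, values such as "Large" and "large" collapse into a single word, which mirrors the case handling noted above.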

You can specify UniqueWordFilter in the XML definition file like this:

<property name="size" filter="uniqueword"/>

You do not need to create a Nucleus component to use this filter.

Although UniqueWordFilter concatenates values and removes redundancies, it is not equivalent to using UniqueFilter and ConcatFilter. UniqueFilter considers the entire string when it eliminates redundant values, not individual words. In this example, each complete string is unique, so UniqueFilter would not actually eliminate any values, and the result would be:

<div class="atg:role:childSKUs.size" id="1">
   medium large x large xx large
</div>
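
For contrast, the following sketch mimics removing duplicate whole values and then concatenating what remains. It is an assumption-laden illustration rather than the real UniqueFilter or ConcatFilter code: the class WholeValueUniqueSketch and the method uniqueThenConcat are hypothetical, and it assumes values are compared exactly as written and joined with a single space.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of whole-value deduplication followed by
// concatenation; not part of the ATG API.
public class WholeValueUniqueSketch {

    // Drops values that are identical as complete strings, then joins the
    // remaining values with a single space.
    static String uniqueThenConcat(List<String> values) {
        Set<String> distinct = new LinkedHashSet<>(values);
        return String.join(" ", distinct);
    }

    public static void main(String[] args) {
        List<String> sizes = Arrays.asList("medium", "large", "x large", "xx large");
        // Every value is distinct as a whole string, so nothing is removed.
        System.out.println(uniqueThenConcat(sizes)); // prints "medium large x large xx large"
    }
}
```

The word "large" survives three times here because deduplication happens at the level of complete values, not individual words.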

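Note: Use UniqueWordFilter with care, because in some circumstances it can have undesirable effects. If you use a custom dictionary that includes multi-word terms, searches for those terms may not return the expected results, because the filter may rearrange the order of the words in the index. In the example above, the filtered text is "medium large x xx", so the words "x" and "large" are no longer adjacent, and a multi-word term such as "x large" may no longer match.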


Copyright © 1997, 2013 Oracle and/or its affiliates. All rights reserved.