The atg.repository.search.indexing.filter.UniqueWordFilter class removes any duplicate words in the property values, and then concatenates the results into a single string. For example, suppose a product’s SKUs have a size property, and the resulting entries in a record are:

<PROP NAME="sku.size">
  <PVAL>medium</PVAL>
  <PVAL>large</PVAL>
  <PVAL>x large</PVAL>
  <PVAL>xx large</PVAL>
</PROP>

By applying UniqueWordFilter, you can reduce this to:

<PROP NAME="sku.size">
  <PVAL>medium large x xx</PVAL>
</PROP>

Note that UniqueWordFilter converts all strings to lowercase, so redundant words are eliminated even if they differ in case.
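The behavior described above can be approximated with the following sketch. It is not the ATG implementation. The class name UniqueWordSketch and the method uniqueWords are hypothetical, and the sketch assumes that words are separated by whitespace and that the first occurrence of each word determines its position in the output, which matches the sku.size example but is not confirmed for the real filter.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the ATG API.
final class UniqueWordSketch {

    // Lowercases each value, splits it into words on whitespace (an assumption),
    // keeps the first occurrence of each word, and joins the words with spaces.
    static String uniqueWords(List<String> values) {
        Set<String> seen = new LinkedHashSet<>(); // preserves insertion order
        for (String value : values) {
            for (String word : value.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!word.isEmpty()) {
                    seen.add(word);
                }
            }
        }
        return String.join(" ", seen);
    }

    public static void main(String[] args) {
        List<String> sizes = Arrays.asList("medium", "large", "x large", "xx large");
        System.out.println(uniqueWords(sizes)); // prints: medium large x xx
    }
}
```

Because of the lowercase step, values such as "Large" and "X LARGE" would reduce to "large x" in this sketch.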

You can specify UniqueWordFilter in the XML definition file like this:

<property name="size" filter="uniqueword"/>

You do not need to create a Nucleus component to use this filter.

Although UniqueWordFilter removes redundancies and concatenates values, it is not equivalent to using a combination of UniqueFilter and ConcatFilter. UniqueFilter compares entire strings, not individual words, when it eliminates redundant values. In this example, each complete string is unique, so UniqueFilter would not eliminate any values, and ConcatFilter would then produce:

<PROP NAME="sku.size">
  <PVAL>medium large x large xx large</PVAL>
</PROP>
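For comparison, the following sketch approximates the combined effect of UniqueFilter and ConcatFilter on the same values. It is an illustration only. The class name UniqueThenConcatSketch and the method uniqueThenConcat are hypothetical, and the sketch assumes that whole values are compared exactly and joined with a single space, details that the real filters may handle differently.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of the ATG API.
final class UniqueThenConcatSketch {

    // Removes duplicate whole values (exact comparison is an assumption),
    // then joins the remaining values with spaces.
    static String uniqueThenConcat(List<String> values) {
        Set<String> distinctValues = new LinkedHashSet<>(values); // keeps first occurrences in order
        return String.join(" ", distinctValues);
    }

    public static void main(String[] args) {
        List<String> sizes = Arrays.asList("medium", "large", "x large", "xx large");
        // No whole value repeats, so nothing is removed before concatenation.
        System.out.println(uniqueThenConcat(sizes)); // prints: medium large x large xx large
    }
}
```

The word "large" survives three times here because deduplication operates on complete values, whereas a word-level approach would keep it once.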

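Note: Use UniqueWordFilter carefully, because under certain circumstances it can have undesirable effects. If you use a dictionary that includes multi-word terms, searches for those terms may not return the expected results, because the filter may rearrange the order of the words in the index. For example, given the behavior described above, the values "navy blue" and "light blue" would be reduced to "navy blue light", in which the term "light blue" no longer appears as a phrase.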


Copyright © 1997, 2015 Oracle and/or its affiliates. All rights reserved.