In some cases, it is useful to filter a set of property values before outputting a record. For example, suppose each record represents a product whose SKUs all have the same display name. Rather than outputting the displayName property value of each SKU, you could include displayName in the record just once, by using a filter that removes duplicate property values.
The PropertyValuesFilter interface defines a method for filtering property values. The atg.repository.search.indexing.filter package includes several implementations of this interface:
- UniqueFilter removes duplicate property values, returning only the unique values.
- ConcatFilter concatenates all of the property values into a single string.
- UniqueWordFilter removes any duplicate words in the property values, and then concatenates the results into a single string.
- HtmlFilter removes any HTML markup from the property values.
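The behavior of the four filters can be illustrated with a small standalone sketch. It does not use the ATG classes: the `ValuesFilter` interface, the `FilterSketch` class, the space separator, and the regex-based markup stripping are all assumptions made for illustration, and the real implementations may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a property values filter; NOT the ATG PropertyValuesFilter API.
interface ValuesFilter {
    List<String> filter(List<String> values);
}

final class FilterSketch {
    // Like UniqueFilter: drop duplicate values, keeping first-seen order.
    static final ValuesFilter UNIQUE =
        values -> new ArrayList<>(new LinkedHashSet<>(values));

    // Like ConcatFilter: join all values into one string (space separator is an assumption).
    static final ValuesFilter CONCAT =
        values -> List.of(String.join(" ", values));

    // Like UniqueWordFilter: drop duplicate words across all values, then concatenate.
    static final ValuesFilter UNIQUE_WORD = values -> {
        Set<String> words = new LinkedHashSet<>();
        for (String v : values) {
            for (String w : v.trim().split("\\s+")) {
                if (!w.isEmpty()) {
                    words.add(w);
                }
            }
        }
        return List.of(String.join(" ", words));
    };

    // Like HtmlFilter: strip tags with a naive regex (an assumption; real markup needs a parser).
    static final ValuesFilter HTML = values -> {
        List<String> out = new ArrayList<>();
        for (String v : values) {
            out.add(v.replaceAll("<[^>]*>", ""));
        }
        return out;
    };

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Red Shirt", "Red Shirt", "Blue Shirt");
        System.out.println(UNIQUE.filter(names));      // [Red Shirt, Blue Shirt]
        System.out.println(CONCAT.filter(names));      // [Red Shirt Red Shirt Blue Shirt]
        System.out.println(UNIQUE_WORD.filter(names)); // [Red Shirt Blue]
        System.out.println(HTML.filter(List.of("<b>Soft</b> cotton"))); // [Soft cotton]
    }
}
```

In the product example from the start of this section, the duplicate-removing filter is the one that reduces the repeated SKU display names to a single value.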
This section provides information about what these filters do and when they’re appropriate.
In an EndecaIndexingOutputConfig definition file, you can specify property filters by using the filter attribute. Note that you can use multiple filters on the same property. The value of the filter attribute is a comma-separated list of Nucleus components. The component names must be absolute pathnames.
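To make the effect of listing several filters on one property concrete, here is a minimal sketch that applies a list of filters in sequence. It assumes the filters run in the order they are listed, each receiving the previous filter's output. This section does not state the order, so treat that as an assumption. `FilterChainSketch` and the two lambda filters are hypothetical stand-ins, not ATG components.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.UnaryOperator;

final class FilterChainSketch {
    // Feed each filter the output of the previous one (left-to-right order is an assumption).
    static List<String> applyAll(List<UnaryOperator<List<String>>> filters, List<String> values) {
        List<String> current = values;
        for (UnaryOperator<List<String>> f : filters) {
            current = f.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for two filters: de-duplicate, then concatenate.
        UnaryOperator<List<String>> unique = v -> new ArrayList<>(new LinkedHashSet<>(v));
        UnaryOperator<List<String>> concat = v -> List.of(String.join(" ", v));

        // Prints [red blue]
        System.out.println(applyAll(List.of(unique, concat), List.of("red", "red", "blue")));
    }
}
```

Under this assumption the order matters: concatenating first and de-duplicating afterward would leave the repeated word in the single resulting string.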
To simplify coding of the definition file, you can map PropertyValuesFilter Nucleus components to simple names, and use those names as the values of filter attributes. You can perform this mapping by setting the filterMap property of the IndexingOutputConfig component. This property is a Map in which the keys are the names and the values are the PropertyValuesFilter Nucleus components that the names represent.
Note, however, that you do not need to perform this mapping to use the UniqueFilter, ConcatFilter, UniqueWordFilter, or HtmlFilter class. These classes are mapped by default to the following names:
| Filter Class | Name |
|---|---|
| UniqueFilter | unique |
| ConcatFilter | |
| UniqueWordFilter | |
| HtmlFilter | |
So, for example, you can specify UniqueFilter like this:
<property name="color" filter="unique"/>
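The name lookup described above can be sketched as follows. This is only an illustration of the idea, not the product's code: `FilterNameSketch`, the `/example/filters/...` component paths, and the rule that a mapped name is checked before an absolute path are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FilterNameSketch {
    // Resolve each entry of a comma-separated filter attribute to a component path.
    // Entries found in filterMap are treated as short names; anything else must be
    // an absolute path (lookup order is an assumption).
    static List<String> resolve(String filterAttr, Map<String, String> filterMap) {
        List<String> paths = new ArrayList<>();
        for (String token : filterAttr.split(",")) {
            String name = token.trim();
            if (name.isEmpty()) {
                continue; // tolerate stray commas
            }
            String mapped = filterMap.get(name);
            if (mapped != null) {
                paths.add(mapped);
            } else if (name.startsWith("/")) {
                paths.add(name);
            } else {
                throw new IllegalArgumentException("Not a mapped name or absolute path: " + name);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        Map<String, String> filterMap = new LinkedHashMap<>();
        filterMap.put("unique", "/example/filters/UniqueFilter"); // hypothetical component path

        // Prints [/example/filters/UniqueFilter, /example/filters/CustomFilter]
        System.out.println(resolve("unique, /example/filters/CustomFilter", filterMap));
    }
}
```

A value that is neither a mapped name nor an absolute path is rejected in this sketch, mirroring the requirement that unmapped component names be absolute pathnames.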

