In some cases, it is useful to filter a set of property values before outputting a record. For example, suppose each record represents a product whose SKUs all have the same display name. Rather than outputting the displayName property value of each SKU, you could include displayName in the record only once by using a filter that removes duplicate property values.
The PropertyValuesFilter interface defines a method for filtering property values. The atg.repository.search.indexing.filter package includes several implementations of this interface:
- UniqueFilter removes duplicate property values, returning only the unique values.
- ConcatFilter concatenates all of the property values into a single string.
- UniqueWordFilter removes any duplicate words in the property values, and then concatenates the results into a single string.
- HtmlFilter removes any HTML markup from the property values.
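The behavior of the four filters can be pictured with a small standalone sketch. This is not the ATG implementation and does not use the PropertyValuesFilter interface; the class name FilterSketch, the method names, the space separator, the first-seen ordering, and the regex-based tag stripping are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative stand-ins for the four filter behaviors; NOT the ATG classes. */
public class FilterSketch {

    /** Like UniqueFilter: drop duplicate values (keeping first-seen order is an assumption). */
    static List<String> unique(List<String> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    /** Like ConcatFilter: join all values into one string (the space separator is an assumption). */
    static List<String> concat(List<String> values) {
        return List.of(String.join(" ", values));
    }

    /** Like UniqueWordFilter: drop words already seen in any value, then join what remains. */
    static List<String> uniqueWords(List<String> values) {
        Set<String> words = new LinkedHashSet<>();
        for (String value : values) {
            for (String word : value.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return List.of(String.join(" ", words));
    }

    /** Like HtmlFilter: strip tags with a naive regex (assumption; real markup needs a parser). */
    static List<String> stripHtml(List<String> values) {
        List<String> out = new ArrayList<>();
        for (String value : values) {
            out.add(value.replaceAll("<[^>]*>", ""));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Red Shirt", "Red Shirt", "Blue Shirt");
        System.out.println(unique(names));
        System.out.println(concat(unique(names)));
        System.out.println(uniqueWords(names));
        System.out.println(stripHtml(List.of("<b>Soft</b> cotton")));
    }
}
```

In the display-name scenario above, the unique behavior is the one that reduces the repeated SKU values to a single entry.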
This section provides information about what these filters do and when they’re appropriate.
In an EndecaIndexingOutputConfig definition file, you can specify property filters by using the filter attribute. Note that you can use multiple filters on the same property. The value of the filter attribute is a comma-separated list of Nucleus components. The component names must be absolute pathnames.
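One way to picture several filters on the same property is as a chain in which each filter receives the output of the previous one. The sketch below assumes the filters run left to right in the order listed, which this section does not state; the class FilterChainSketch and its members are hypothetical and do not use any ATG API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical model of applying a list of filters to one property's values. */
public class FilterChainSketch {

    // Stand-in for a duplicate-removing filter (first-seen order is an assumption).
    static final UnaryOperator<List<String>> UNIQUE =
            values -> new ArrayList<>(new LinkedHashSet<>(values));

    // Stand-in for a concatenating filter (the space separator is an assumption).
    static final UnaryOperator<List<String>> CONCAT =
            values -> List.of(String.join(" ", values));

    /** Applies each filter in list order, feeding every result into the next filter. */
    static List<String> applyAll(List<UnaryOperator<List<String>>> filters,
                                 List<String> values) {
        List<String> current = values;
        for (UnaryOperator<List<String>> filter : filters) {
            current = filter.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<String> colors = List.of("Red", "Red", "Blue");
        // Duplicates are removed first, then the survivors are joined into one string.
        System.out.println(applyAll(List.of(UNIQUE, CONCAT), colors));
    }
}
```

Under this assumed ordering, reversing the two filters would change the result, because concatenating first leaves only one value for the duplicate check to examine.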
To simplify coding of the definition file, you can map PropertyValuesFilter Nucleus components to simple names, and use those names as the values of filter attributes. You can perform this mapping by setting the filterMap property of the EndecaIndexingOutputConfig component. This property is a Map in which the keys are the names and the values are the PropertyValuesFilter Nucleus components that the names represent.
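The name-to-component lookup can be sketched as a plain map resolution. Everything here is hypothetical: the class FilterMapSketch, the component paths, the name swatch, and the rule that a token starting with a slash is passed through as an absolute pathname are assumptions for illustration, not documented ATG behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical model of resolving a filter attribute against a name-to-component map. */
public class FilterMapSketch {

    /**
     * Splits a comma-separated filter attribute and resolves each entry.
     * Entries beginning with "/" are treated as absolute component paths (assumption);
     * all other entries are looked up as simple names in the map.
     */
    static List<String> resolve(String filterAttribute, Map<String, String> filterMap) {
        List<String> paths = new ArrayList<>();
        for (String token : filterAttribute.split(",")) {
            String name = token.trim();
            if (name.isEmpty()) {
                continue; // ignore stray commas
            }
            if (name.startsWith("/")) {
                paths.add(name);
            } else {
                String path = filterMap.get(name);
                if (path == null) {
                    throw new IllegalArgumentException("No filter mapped to name: " + name);
                }
                paths.add(path);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Hypothetical mapping of a simple name to a hypothetical component path.
        Map<String, String> filterMap = Map.of("swatch", "/example/filters/SwatchFilter");
        System.out.println(resolve("swatch, /example/filters/OtherFilter", filterMap));
    }
}
```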
Note, however, that you do not need to perform this mapping to use the UniqueFilter, ConcatFilter, UniqueWordFilter, or HtmlFilter class. These classes are mapped by default to the following names:
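Filter Class | Name |
---|---|
atg.repository.search.indexing.filter.UniqueFilter | unique |
atg.repository.search.indexing.filter.ConcatFilter | |
atg.repository.search.indexing.filter.UniqueWordFilter | |
atg.repository.search.indexing.filter.HtmlFilter | |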
So, for example, you can specify UniqueFilter like this:
<property name="color" filter="unique"/>