Oracle Commerce Guided Search - Relevance ranking

Relevance ranking

Relevance ranking can impose a significant computational cost on the search operations for which it is enabled.

The set of modules that will provide acceptable performance depends heavily on the size and characteristics of the application data set.

In general, Oracle recommends testing the set of modules used for relevance ranking in a staging environment before using it in production, because the characteristics of the data set can affect relevance ranking performance in unexpected ways. The following characteristics of the data set may negatively affect performance:

The data set is too large to fit into RAM
It contains large file content used in search
It uses stemming or thesaurus heavily
It has many dimensions or properties per record
It frequently produces large result set sizes

Minimizing the performance impact of relevance ranking

You can minimize the performance impact of relevance ranking in your implementation by substituting cheaper modules where appropriate, and by placing the modules you select in a sensible order within your relevance ranking strategy.

Making module substitutions

Because the cost of relevance ranking grows linearly with the size of the result set, the per-record cost of each module is multiplied across the records it ranks, so the total cost depends heavily on the set of ranking modules used. In general, modules that do not perform text evaluation are significantly cheaper than text-matching-oriented modules.

Although the relative cost of the various ranking modules is dependent on the nature of your data and the number of records, the modules can be roughly grouped into four tiers:

Exact is very computationally expensive.
Proximity, Phrase (with the Subphrase or Query Expansion option specified), and First are all high-cost modules, listed in order of decreasing cost.
WFreq can also be costly in some situations.
The remaining modules (Static, Phrase with no options specified, Freq, Spell, Glom, Nterms, Interp, Numfields, Maxfields and Field) are generally relatively cheap.

To maximize the performance of your relevance ranking strategy, look for a less expensive module that produces similar results. For example, replacing Exact with Phrase may improve performance with relatively little impact on results.
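As a rough illustration of the tiers above, the following Python sketch flags the modules in a strategy that are candidates for substitution. It is not part of the product: the function names, the lowercase module keys, and the representation of a strategy as a list of (module, options specified) pairs are all assumptions made for this example. Only the tier assignments come from the grouping above.

```python
# Cost tiers taken from the grouping above: 1 = most expensive, 4 = cheapest.
# The names below are hypothetical helpers, not an Oracle API.
CHEAP_MODULES = {
    "static", "freq", "spell", "glom", "nterms",
    "interp", "numfields", "maxfields", "field",
}


def cost_tier(module, has_options=False):
    """Return the cost tier (1-4) for a ranking module name.

    has_options only matters for Phrase: True means the Subphrase or
    Query Expansion option is specified.
    """
    name = module.lower()
    if name == "exact":
        return 1
    if name in {"proximity", "first"} or (name == "phrase" and has_options):
        return 2
    if name == "wfreq":
        return 3
    if name == "phrase" or name in CHEAP_MODULES:
        return 4
    raise ValueError(f"unknown ranking module: {module!r}")


def substitution_candidates(strategy):
    """List the modules in a strategy that fall in the two most expensive tiers."""
    return [module for module, opts in strategy if cost_tier(module, opts) <= 2]


# Hypothetical strategy: (module name, options specified?) in priority order.
strategy = [("Nterms", False), ("Exact", False), ("Phrase", True), ("Static", False)]
print(substitution_candidates(strategy))
```

A check like this only identifies where to look. Whether a cheaper module gives acceptable results still has to be confirmed against your own data.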

Note

Take particular care in choosing the set of modules used for relevance ranking when the data set is large or contains large file content that is used for search operations.

Ordering modules sensibly

Relevance ranking modules are evaluated only as needed. When higher-priority modules fully determine the order of records, lower-priority modules do not need to be calculated. This can have a dramatic impact on performance when higher-cost modules have a lower priority than lower-cost modules.

To optimize performance, make sure that the cheaper modules are placed before the more expensive ones in your strategy.
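The effect of ordering can be sketched with a small Python model of as-needed evaluation. This is a simplified illustration, not the engine's actual algorithm: the rank function, the record fields, and the two scoring functions are hypothetical stand-ins. Each module scores only the records that the higher-priority modules left tied.

```python
from itertools import groupby


def rank(records, modules):
    """Rank records by modules in priority order, scoring only to break ties.

    modules is a list of (name, score_fn) pairs, highest priority first.
    Returns (ranked_records, calls), where calls counts how many records
    each module had to score.
    """
    calls = {name: 0 for name, _ in modules}

    def refine(group, depth):
        # A lone record, or an exhausted module list, needs no more scoring.
        if len(group) <= 1 or depth == len(modules):
            return list(group)
        name, score_fn = modules[depth]
        scored = []
        for rec in group:
            calls[name] += 1
            scored.append((score_fn(rec), rec))
        # Higher score ranks first; Python's sort is stable for equal scores.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        ranked = []
        for _, tied in groupby(scored, key=lambda pair: pair[0]):
            ranked.extend(refine([rec for _, rec in tied], depth + 1))
        return ranked

    return refine(list(records), 0), calls


# Hypothetical records and scoring functions standing in for real modules.
records = [
    {"id": 1, "pop": 5, "title": "red shoes"},
    {"id": 2, "pop": 3, "title": "red"},
    {"id": 3, "pop": 3, "title": "blue shoes"},
    {"id": 4, "pop": 1, "title": "red shoes sale"},
]
cheap = ("static", lambda r: r["pop"])
costly = ("exact", lambda r: int(r["title"] == "red shoes"))

_, cheap_first = rank(records, [cheap, costly])
_, costly_first = rank(records, [costly, cheap])
print(cheap_first)
print(costly_first)
```

With the cheap module first, the costly module scores only the two records that tie on popularity. With the costly module first, it scores all four. Keep in mind that module order also sets ranking priority, so reordering can change the order of results as well as the cost.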
