The Phrase module states that results containing the user’s query as an exact phrase, or a subset of the exact phrase, should be considered more relevant than matches simply containing the user’s search terms scattered throughout the text.

Records that have the phrase are ranked higher than records which do not contain the phrase.

The Phrase module has a variety of options that you use to customize its behavior.

The Phrase options are:

When you add the Phrase module in the Relevance Ranking Modules editor, you are presented with the following editor that allows you to set these options.

The three configuration settings for the Phrase module can be used in a variety of combinations for different effects.

The following matrix describes the behavior of each combination.

Subphrase

Approximate

Expansion

Description

Off

Off

Off

Default. Ranks results into two strata: those that match the user’s query as a whole phrase, and those that do not.

Off

Off

On

Ranks results into two strata: those that match the original, or an extended version, of the query as a whole phrase, and those that do not.

Off

On

Off

Ranks results into two strata: those that match the original query as a whole phrase, and those that do not. Look only at the first possible phrase match within each record.

Off

On

On

Ranks results into two strata: those that match the original, or an extended version, of the query as a whole phrase, and those that do not. Look only at the first possible phrase match within each record.

On

Off

Off

Ranks results into N strata where N equals the length of the query and each result’s score equals the length of its matched subphrase.

On

Off

On

Ranks results into N strata where N equals the length of the query and each result’s score equals the length of its matched subphrase. Extend subphrases to facilitate matching but rank based on the length of the original subphrase (before extension).

Note

This combination can have a negative performance impact on query throughput.

On

On

Off

Ranks results into N strata where N equals the length of the query and each result’s score equals the length of its matched subphrase. Look only at the first possible phrase match within each record.

On

On

On

Ranks results into N strata where N equals the length of the query and each result’s score equals the length of its matched subphrase. Expand the query to facilitate matching but rank based on the length of the original subphrase (before extension). Look only at the first possible phrase match within each record.

The Phrase module translates each wildcard in a query into a generic placeholder for a single term.

For example, the query “sparkling w* wine” becomes “sparkling * wine” during phrase relevance ranking, where “*” indicates a single term. This generic wildcard replacement causes slightly different behavior depending on whether subphrasing is enabled.

When subphrasing is not enabled, all results that match the generic version of the wildcard phrase exactly are still placed into the first stratum. It is important, however, to understand what constitutes a matching result from the Phrase module’s point of view.

Consider the search query “sparkling w* wine” with the MatchAny mode enabled. In MatchAny mode, search results only need to contain one of the requested terms to be valid, so a list of search results for this query could contain phrases that look like this:

sparkling white wine
sparkling refreshing wine
sparkling wet wine
sparkling soda
wine cooler

When phrase relevance ranking is applied to these search results, the Phrase module looks for matches to “sparkling * wine” not “sparkling w* wine.” Therefore, there are three results—”sparkling white wine,” “sparkling refreshing wine,” and “sparkling wet wine”—that are considered phrase matches for the purposes of ranking. These results are placed in the first stratum. The other two results are placed in the second stratum.

When subphrasing is enabled, the behavior becomes a bit more complex. Again, we have to remember that wildcards become generic placeholders and match any single term in a result. This means that any subphrase that is adjacent to a wildcard will, by definition, match at least one additional term (the wildcard). Because of this behavior, subphrases break down differently. The subphrases for “cold sparkling w* wine” break down into the following (note that w* changes to *):

cold
sparkling *
* wine
cold sparkling *
sparkling * wine
cold sparkling * wine

Notice that the subphrases “sparkling,” “wine,” and “cold sparkling” are not included in this list. Because these subphrases are adjacent to the wildcard, we know that the subphrases will match at least one additional term. Therefore, these subphrases are subsumed by the “sparkling *”, “* wine”, and “cold sparkling *” subphrases.

Like regular subphrase, stratification is based on the number of terms in the subphrase, and the wildcard placeholders are counted toward the length of the subphrase. To continue the example above, results that contain “cold” get a score of one, results that contain “sparkling *” get a score of two, and so on. Again, this is the case even if the matching result phrases are different, for example, “sparkling white” and “sparkling soda.”

Finally, it is important to note that, while the wildcard can be replaced by any term, a term must still exist. In other words, search results that contain the phrase “sparkling wine” are not acceptable matches for the phrase “sparkling * wine” because there is no term to substitute for the wildcard. Conversely, the phrase “sparkling cold white wine” is also not a match because each wildcard can be replaced by one, and only one, term. Even when wildcards are present, results must contain the correct number of terms, in the correct order, for them to be considered phrase matches by the Phrase module.


Copyright © Legal Notices