The proximity module ranks how close the query terms are to each other in a document by counting the number of intervening words. It is designed primarily for use with unstructured data.

Like the First module, the Proximity module groups its results into variable-sized strata, because the difference in significance between an interval of one word and an interval of two words is usually greater than the difference between an interval of 21 words and one of 22. If no terms match, the document is placed in the lowest stratum.
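The idea of variable-sized strata can be illustrated with a minimal sketch. The boundary values and the function name `stratum_for_gap` below are hypothetical and chosen only for illustration; the actual stratum boundaries used by the module are not stated here.

```python
import bisect

# Hypothetical upper bounds on the number of intervening words per stratum.
# Strata are narrow near zero and wider further out, so small gaps are
# distinguished more finely than large ones.
STRATUM_UPPER_BOUNDS = [0, 1, 2, 4, 8, 16, 32]


def stratum_for_gap(gap):
    """Return a stratum index for a gap; 0 is the best stratum.

    A gap of None stands for "no query terms matched" and maps to the
    lowest (worst) stratum.
    """
    if gap is None:
        return len(STRATUM_UPPER_BOUNDS) + 1
    # Index of the first boundary that is >= gap.
    return bisect.bisect_left(STRATUM_UPPER_BOUNDS, gap)
```

With these assumed boundaries, gaps of one and two words fall into different strata, while gaps of 21 and 22 words share a stratum.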

Single words and phrases are assigned to the best stratum because they have no intervening words. When the query has multiple terms, proximity behaves as follows:

Under query expansion (that is, stemming and the thesaurus), the expanded terms are treated as if they were in the query, so the proximity metric is computed using the locations of the expanded terms in the matching document.

For example, if a user searches for “big cats” and a document contains the sentence, “Big Bird likes his cat” (stemming takes cats to cat), then the proximity metric is computed just as if the sentence were, “Big Bird likes his cats.”

The proximity module scores partially matched queries as if the query contained only the matching terms. For example, if a user searches for “cat dog fish” and a partially matching document contains only cat and fish, then the document is scored as if the query “cat fish” had been entered.
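The two examples above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the module's implementation: it assumes the metric is taken over the smallest window of text that contains one occurrence of each matched term, and the names `tokenize`, `intervening_words`, and the `expansions` mapping are hypothetical.

```python
import string


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    strip_chars = string.punctuation + "“”"
    return [w.strip(strip_chars).lower() for w in text.split()]


def intervening_words(query_terms, doc_text, expansions=None):
    """Count intervening words in the smallest window covering the matched terms.

    expansions maps a query term to extra forms (stems, thesaurus entries)
    that count as occurrences of that term. Terms that do not occur in the
    document are dropped, so a partial match is scored as if the query held
    only the matching terms. Returns None when no term matches.
    """
    expansions = expansions or {}
    words = tokenize(doc_text)

    # Record the word positions of every query term, including expanded forms.
    positions = {}
    for term in query_terms:
        key = term.lower()
        forms = {key} | set(expansions.get(key, ()))
        hits = [i for i, w in enumerate(words) if w in forms]
        if hits:
            positions[key] = hits
    if not positions:
        return None

    # Assumption: slide a window over the sorted occurrences to find the
    # shortest span that includes every matched term at least once.
    events = sorted((i, t) for t, hits in positions.items() for i in hits)
    needed = len(positions)
    counts = {}
    best = None
    left = 0
    for right_pos, term in events:
        counts[term] = counts.get(term, 0) + 1
        while len(counts) == needed:
            left_pos, left_term = events[left]
            span = right_pos - left_pos + 1
            if best is None or span < best:
                best = span
            counts[left_term] -= 1
            if counts[left_term] == 0:
                del counts[left_term]
            left += 1

    # Words in the window that are not themselves matched terms.
    return max(best - needed, 0)
```

Under these assumptions, `intervening_words(["big", "cats"], "Big Bird likes his cat", {"cats": ["cat"]})` gives the same result as scoring the sentence “Big Bird likes his cats.” directly, and the query “cat dog fish” scores the same as “cat fish” against a document that contains no occurrence of dog.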

Note: The proximity module does not work with Boolean searches, cross-field matching, or wildcard search. It assigns all such matches a score of zero.


Copyright © 1997, 2016 Oracle and/or its affiliates. All rights reserved.