Designed primarily for use with unstructured data, the First module ranks documents by how close the query terms are to the beginning of the document.
The First module groups its results into variably sized strata. The strata differ in size because, while the first word of a document is probably more relevant than the tenth, the 301st word is probably not much more relevant than the 310th. This module takes advantage of the fact that the closer something is to the beginning of a document, the more likely it is to be relevant.
The First module works as follows:
When the query has a single term, First's behavior is straightforward: it retrieves the first absolute position of the term in the document, then calculates which stratum contains that position. The score for this document is based upon that stratum; earlier strata are better than later strata.
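The single-term case can be illustrated with a minimal sketch. The stratum boundaries below are hypothetical (this document does not state the actual ones), and the names `STRATUM_BOUNDS`, `first_position`, and `stratum_for_position` are invented for illustration only. The sketch assumes 0-based word positions and strata that widen with distance from the start of the document.

```python
from bisect import bisect_right

# Hypothetical exclusive upper bounds for each stratum, in word positions.
# The real boundaries used by the First module are not given in this document.
STRATUM_BOUNDS = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]


def first_position(term, words):
    """Return the 0-based position of the first occurrence of term, or None."""
    try:
        return words.index(term)
    except ValueError:
        return None


def stratum_for_position(position, bounds=STRATUM_BOUNDS):
    """Return the 0-based stratum index that contains the given position.

    Lower indexes are earlier (better) strata. Positions past the last
    bound all fall into one final stratum.
    """
    return bisect_right(bounds, position)
```

With these assumed boundaries, the first word (position 0) and the tenth word (position 9) land in different strata, while the 301st and 310th words (positions 300 and 309) share one stratum, which mirrors the reasoning behind the unequal stratum sizes.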
When the query has multiple terms, First behaves as follows: The first absolute position of each query term is determined, and then the median of these positions is calculated. This median is treated as the position of the query in the document and is used for stratification as described in the single-term case.
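As a rough sketch of the multi-term case, the function below collects the first position of each query term and takes the median. The name `query_position` is hypothetical, and the sketch assumes every query term occurs in the document. This document does not say how the median is chosen when the number of terms is even; using `statistics.median_low` (the lower of the two middle values) is an assumption made here so that the result is always an actual word position.

```python
from statistics import median_low


def query_position(query_terms, words):
    """Return the median of the first 0-based positions of the query terms.

    Assumes every term in query_terms occurs in words. For an even number
    of terms, the lower middle value is used (an assumption of this sketch).
    """
    positions = [words.index(term) for term in query_terms]
    return median_low(positions)


words = "the quick brown fox jumps over the lazy dog".split()
# First positions: fox -> 3, dog -> 8, quick -> 1; the median is 3.
print(query_position(["fox", "dog", "quick"], words))
```

The resulting position would then be assigned to a stratum in the same way as a single term's position.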
With query expansion (using stemming, spelling correction, or the thesaurus), the First module treats expanded terms as if they occurred in the source query. For example, the phrase glucose intolerence would be corrected to glucose intolerance (with intolerence spell-corrected to intolerance). First then continues as it does in the non-expansion case. The first position of each term is computed and the median of these is taken.
In a partially matched query, where only some of the query terms cause a document to match, First behaves as if the intersection of terms that occur in the document and terms that occur in the original query were the entire query. For example, if the query cat bird dog is partially matched to a document on the terms cat and bird, then the document is scored as if the query were cat bird. If no terms match, then the document is scored in the lowest stratum.
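The partial-match rule can be sketched as a simple filtering step. The function name `effective_query` is hypothetical and is not part of any product API; the sketch only shows which terms would take part in scoring, with an empty result standing for the case in which the document is placed in the lowest stratum.

```python
def effective_query(query_terms, words):
    """Return the query terms that occur in the document, in query order.

    An empty list means no term matched; in that case the document
    would be scored in the lowest stratum.
    """
    present = set(words)
    return [term for term in query_terms if term in present]


document = "a bird chased the cat".split()
# "dog" does not occur in the document, so it is dropped.
print(effective_query(["cat", "bird", "dog"], document))
```

The surviving terms would then be scored as if they were the entire query, using the median of their first positions.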
The First relevance ranking module is supported for wildcard queries.
Note
The First module does not work with Boolean searches or cross-field matching. It assigns all such matches a score of zero.