Oracle Commerce Guided Search - All-terms destination property

All-terms destination property

The ALL_TERMS_OUTPUT_PROP_NAME pass-through specifies the property that receives all terms on a record that pass corpus-level filtering.

The ALL_TERMS_OUTPUT_PROP_NAME pass-through is mandatory. The property it names receives every term on a record that passes corpus-level filtering, whether or not the term also passes record-level filtering.

If the property does not already exist in the implementation, the term extractor creates it. When the all-terms property is mapped to a Guided Search property, you can perform record searches against it (assuming it is configured to be searchable).

The all-terms property is used for search purposes. For each record, the term extractor finds all of the terms in the corpus-wide vocabulary that occur in that document (regardless of their relevance for that document) and puts them in the all-terms property. Searching against this property lets you match terms regardless of which stemmed variant appears in the original text.

For example, if the terms "search engine" and "search engines" appear in the corpus, they will be normalized to the dominant form (e.g., "search engine"). If you want to find all records that contain either variant, you cannot use Phrase search against the body text, because Phrase search does not locate stemmed variants. Instead, Term Discovery ensures that both the dimension value name and the term stored in the all-terms property are the dominant form, so a search for the dominant form against the all-terms property matches records that contain either variant.

The terms in the all-terms property are delimited by a separator based on the string sep. The term extractor builds the separator by repeating sep one or more times, that is (sep)+, until the resulting string is not a substring of any term in the corpus. Therefore, the separator may be sep, sepsep, sepsepsep, and so on. For example, an article on Aldera, Spain might produce the following all-terms property (named P_AllTerms) on the record (in the example, sepsep is the separator for the property):

P_AllTerms: district sepsep Spain sepsep south coast sepsep
     coast sepsep town sepsep Aldera sepsep province sepsep
     Romans sepsep decline sepsep station sepsep hills sepsep
     Heracles sepsep temple sepsep colonies sepsep Tiberius
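
The separator rule described above can be sketched in a few lines of Python. This is only an illustration of the (sep)+ rule, not the term extractor's actual code: the function names choose_separator and build_all_terms_value are hypothetical, and surrounding the separator with single spaces is an assumption based on the example output.

```python
def choose_separator(terms, base="sep"):
    """Repeat `base` until the result is not a substring of any term."""
    separator = base
    while any(separator in term for term in terms):
        separator += base
    return separator


def build_all_terms_value(terms, base="sep"):
    """Join terms with the chosen separator (space padding is an assumption)."""
    separator = choose_separator(terms, base)
    return f" {separator} ".join(terms)


if __name__ == "__main__":
    # "sep" is itself a term here, so the separator grows to "sepsep".
    print(build_all_terms_value(["district", "Spain", "sep", "south coast"]))
```

With a corpus in which no term contains sep, the same sketch would return sep itself as the separator.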

Because of the widespread use of this separator, you should add it to the stop word list. (Note that this is the application’s stop word list, not the term exclude list.) Before doing so, first determine which form the separator takes: run the corpus at least once to find the separator, and then set that separator as a stop word. For example, if "sep" is a valid term in the corpus, then it is likely that sepsep will be the separator. Thus, you would add "sepsep" (but not "sep") to the stop word list. Then, periodically monitor the corpus to make sure the separator has not changed.
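
One way to find the separator after a run, and to recheck it later, is to inspect a sample all-terms value. The following Python sketch is an assumption-laden illustration, not a product tool: detect_separator is a hypothetical helper, and it assumes that you have the property value as a plain string and that the separator appears as its own whitespace-delimited token. Because the separator is never a substring of a term, the longest token made up only of repetitions of sep is taken to be the separator.

```python
import re


def detect_separator(all_terms_value, base="sep"):
    """Return the longest whitespace-delimited token of the form (base)+, or None."""
    pattern = re.compile(rf"(?:{re.escape(base)})+")
    candidates = [
        token for token in all_terms_value.split() if pattern.fullmatch(token)
    ]
    return max(candidates, key=len) if candidates else None


if __name__ == "__main__":
    sample = "district sepsep Spain sepsep sep sepsep south coast"
    configured_stop_word = "sepsep"  # hypothetical value from your stop word list
    detected = detect_separator(sample)
    if detected != configured_stop_word:
        print(f"Separator is now {detected!r}; update the stop word list.")
```

A value that holds a single term contains no separator, so the helper returns None for it; check several records before concluding that the separator has changed.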

Guided Search Platform Services Content Acquisition System Relationship Discovery Guide