About word-break analysis

Word-break analysis allows the Spelling Correction feature to consider alternate queries computed by changing the word divisions in the user’s query.

For example, if the query is Back Street Boys, word-break analysis could instruct the Oracle Endeca Server to consider the alternate Backstreet Boys.

The following statements describe how word-break analysis works in the Dgraph process of the Oracle Endeca Server:

It is enabled by default.
As part of the word-break analysis, the Dgraph process removes breaks from the original term, or adds breaks to the original term if needed.
The maximum number of word breaks that the Dgraph adds to or removes from a query is one.
The minimum length for a new term created by word-break analysis is two characters; the Dgraph does not create terms shorter than that. For example, it does not correct anear to a near, because a is only one character long. It could correct anear to an ear if the data corpus contains terms that match both an and ear.
Word-break analysis requires that the substrings into which a term is broken appear in succession in the data. For example, starting with the query box17, word-break analysis would find box 17, as well as box-17, assuming that the hyphen (-) has not been specified as a search character. However, it would not find 17 old boxes, because the target terms do not appear in order.
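
The rules above can be illustrated with a short sketch. This is not the Dgraph implementation, only a minimal model of the stated rules: at most one break is added or removed, every new term has at least two characters, and a candidate is kept only if its terms appear in succession in the data. All names (MIN_TERM_LENGTH, word_break_candidates, supported_candidates) are hypothetical, the tokenizer assumes that punctuation such as the hyphen is not a search character, and applying the succession check to the whole candidate query is a simplification.

```python
import re

MIN_TERM_LENGTH = 2  # assumed from the two-character minimum described above


def tokenize(text):
    # Assumption: anything that is not a letter or digit separates terms,
    # so "box-17" yields the same terms as "box 17".
    return re.findall(r"[a-z0-9]+", text.lower())


def word_break_candidates(query):
    """Return alternate queries that differ from the query by exactly one break."""
    terms = tokenize(query)
    candidates = set()

    # Remove one break: join two adjacent terms.
    for i in range(len(terms) - 1):
        merged = terms[:i] + [terms[i] + terms[i + 1]] + terms[i + 2:]
        candidates.add(" ".join(merged))

    # Add one break: split one term so that both parts meet the minimum length.
    for i, term in enumerate(terms):
        for cut in range(MIN_TERM_LENGTH, len(term) - MIN_TERM_LENGTH + 1):
            split = terms[:i] + [term[:cut], term[cut:]] + terms[i + 1:]
            candidates.add(" ".join(split))

    return candidates


def appears_in_succession(candidate_terms, record_terms):
    # True if the candidate terms occur consecutively and in order in the record.
    n = len(candidate_terms)
    return any(
        record_terms[i:i + n] == candidate_terms
        for i in range(len(record_terms) - n + 1)
    )


def supported_candidates(query, records):
    """Keep only the candidates whose terms appear in succession in some record."""
    tokenized_records = [tokenize(record) for record in records]
    return {
        candidate
        for candidate in word_break_candidates(query)
        if any(appears_in_succession(candidate.split(), record)
               for record in tokenized_records)
    }


print(sorted(word_break_candidates("anear")))           # ['an ear', 'ane ar']
print(supported_candidates("box17", ["box-17"]))        # {'box 17'}
print(supported_candidates("box17", ["17 old boxes"]))  # set()
```

In this sketch, anear never produces a near because the split positions start at the minimum term length, and box17 matches box-17 but not 17 old boxes because the succession check compares consecutive terms only.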