As described in the Result Groups section of the Standard Query chapter, ATG Search performs grouping on the raw result list in order to avoid duplicate results. The settings that control grouping, along with other controls on the selection of the final results, are described in this section.

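The values of these controls can be changed in two ways: through settings in the AEConfig.xml file, or through Query XML attributes, which override the AEConfig.xml settings.

In the examples in this section, each setting is written as the parameter name prepended to its numeric value. For example, pool200 sets the pool parameter to 200.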

Maximum Results by Type

Each statement result has a type attribute that specifies the source of the statement. For unstructured statement results, the type is SENTENCE. For structured statement results, the type is either the field name, such as ROLE:GOAL, or simply SOLUTION. ATG Search controls how many results of each type can be returned, using the following parameters:

f2,o10,s10,t50

The parameters are:

The o parameter is the default value for any structured statement result type. Individual fields in structured documents can be defined separately, as shown below:

role:goal10,role:symptom10,role:fact2
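
The fallback from a specific field to the o default can be sketched as follows. This is a minimal illustration, not ATG Search code: the function names parse_limits and limit_for_field are hypothetical, and the sketch assumes that field names are compared case-insensitively and that a name never ends in a digit.

```python
import re


def parse_limits(spec):
    """Parse a setting string such as 'f2,o10,role:goal10' into {name: limit}."""
    limits = {}
    for item in spec.split(","):
        # The name is everything before the trailing run of digits.
        match = re.fullmatch(r"(.+?)(\d+)", item.strip())
        if match is None:
            raise ValueError(f"cannot parse setting: {item!r}")
        limits[match.group(1).lower()] = int(match.group(2))
    return limits


def limit_for_field(field_type, limits):
    """Return the limit for a structured field type.

    Falls back to the 'o' default when the field has no entry of its own.
    Assumption: names are matched case-insensitively.
    """
    return limits.get(field_type.lower(), limits["o"])


limits = parse_limits("f2,o10,s10,t50,role:goal10,role:symptom10,role:fact2")
```

With these settings, limit_for_field("ROLE:FACT", limits) gives the field-specific value 2, while a field with no entry of its own, such as a hypothetical role:cause, falls back to the o value of 10.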

Maximum Result Pool

The result grouping process operates on the pool of final results, the size of which is controlled by this parameter:

pool200

The pool parameter is the maximum number of results of each major type (unstructured statements and structured statements) to collect for grouping.

Similar Statement Text Threshold

The group-by-statement feature (see Grouping by Statement in the Standard Query chapter) groups statement results by similar statement text. This feature uses a similarity metric to compute a value that quantifies how similar two statements are; this value is then compared to a numeric threshold, which is set by the following parameter:

sim1

The sim parameter sets the value of the numeric threshold, and can be any integer from 0 to 100.

The similarity metric computes what percentage one statement is of the other, based on a strict sub-string match, which ignores case, white space, and punctuation. For example, consider these two statements:

If the installation failed, you probably have the wrong version.
The installation failed.

The first statement is 53 characters long, excluding white space and punctuation. The second statement is 21 characters long, and it is a sub-string of the first. So the similarity metric is 21/53, or approximately 40%. If this value is greater than the threshold set by the sim parameter, then these statements are deemed similar for grouping.

A sim value of 0 means that any size sub-string will be considered similar. A value of 100 means that only identical strings are considered similar.
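
The calculation described above can be sketched in a few lines. This is an illustration of the description only, not the ATG Search implementation: the function names are hypothetical, statements that are not sub-strings of one another are assumed to score 0, and the comparison follows the "greater than" wording literally (the text does not say how a score exactly equal to the threshold is treated).

```python
def normalize(text):
    """Lowercase the text and drop white space and punctuation."""
    return "".join(ch.lower() for ch in text if ch.isalnum())


def similarity_percent(a, b):
    """Return the shorter statement's length as a percentage of the longer one.

    Assumption: returns 0.0 when the shorter statement is not a strict
    sub-string of the longer one, or when both are empty.
    """
    shorter, longer = sorted((normalize(a), normalize(b)), key=len)
    if not longer or shorter not in longer:
        return 0.0
    return 100.0 * len(shorter) / len(longer)


def is_similar(a, b, sim):
    """Compare the metric with the sim threshold (an integer from 0 to 100)."""
    return similarity_percent(a, b) > sim


first = "If the installation failed, you probably have the wrong version."
second = "The installation failed."
score = similarity_percent(first, second)
```

For the two example statements, the normalized lengths are 53 and 21, so score is 21/53 of 100, or roughly 39.6. With sim1 the statements are grouped together; with a threshold of 40 they are not.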

Altering Weight by Result Type

Normally, all statement results receive the same treatment in the relevancy calculation. However, at times it may be useful for certain statement types to be weighted higher or lower. For example, two identical statements from similar documents usually receive nearly identical relevancy. However, if the statements are from two different text fields (such as role:goal and role:fact), and there is reason to consider one field of more interest to users, varying the relevancy would be valuable. ATG Search supports these weighting factors with the following parameters:

f*1.0,o*1.0,s*1.0,ROLE:ID*2.0

The parameters are:

A weight of 1.0 means the original (pure) relevancy is used. The ROLE:ID field is double weighted, since a text search on a particular id strongly prefers the single document with that id, rather than other documents which might refer to it.

The o* parameter is the default value for any structured statement result type. Individual structured types (or fields) can be defined separately, as shown below:

role:goal*1.2,role:symptom*1.1,role:fact*0.5
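
A rough sketch of how such weights might be applied is given below. It is not ATG Search code: parse_weights and weighted_relevancy are hypothetical names, the weight is assumed to be a simple multiplier on the original relevancy (consistent with 1.0 leaving relevancy unchanged), and field names are assumed to be matched case-insensitively.

```python
def parse_weights(spec):
    """Parse a setting string such as 'o*1.0,role:goal*1.2' into {name: weight}."""
    weights = {}
    for item in spec.split(","):
        # Split on the last '*': name on the left, numeric weight on the right.
        name, sep, value = item.strip().rpartition("*")
        if not sep or not name:
            raise ValueError(f"cannot parse setting: {item!r}")
        weights[name.lower()] = float(value)
    return weights


def weighted_relevancy(relevancy, field_type, weights):
    """Scale a relevancy score by the weight for a structured field type.

    Falls back to the 'o' default when the field has no entry of its own.
    Assumption: the weight is a plain multiplier.
    """
    return relevancy * weights.get(field_type.lower(), weights["o"])


weights = parse_weights(
    "f*1.0,o*1.0,s*1.0,ROLE:ID*2.0,role:goal*1.2,role:symptom*1.1,role:fact*0.5"
)
```

Under these assumptions, a role:fact result with an original relevancy of 0.5 would drop to 0.25, while a field with no explicit weight keeps its original relevancy through the o default.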

Returning Whole Fields in Result Text

Normally, the result text is the matching statement text plus some additional context for small sentences (if using the Extending Statement Result Text option). However, for structured content, which contains potentially multi-sentence fields of text, applications might want the entire text of the field returned as the result text. This behavior is controlled by the following parameter:

field0

The field parameter holds a Boolean value which, if non-zero, means that the entire field text associated with the matching statement is returned.

Displaying Document Summary Text

Normally, the result text is the matching statement text plus some additional context for small sentences. However, some applications may not want to display this text, but simply display the static summary of the retrieved document. For example, a commerce shopping site might always want to display the product description rather than a matching field of text. This behavior is controlled by the following parameter:

sum0

The sum parameter holds a Boolean value which, if non-zero, means that each result will contain no matching result text and the summary text should be used in its place.

Returning One Result per Document or Solution

Normally, ATG Search returns matching statements that may come from the same retrieved document, especially if the documents are large or have repetitive content. The group-by-document feature (see Grouping by Document in the Standard Query chapter) can collect results from the same document, so that the display appears to show a single result per document. However, some applications need a different grouping algorithm while still returning a single result per indexed item. For example, commerce applications might want to sort by metadata, but restrict the results to one per item. This behavior is controlled by the following parameter:

onePerDoc0

The onePerDoc parameter holds a Boolean value which, if non-zero, means that there will be only one result per document or item.

Similarly, the onePerSol parameter holds a Boolean value which, if non-zero, means that there will be only one result for a structured document (such as a Knowledge solution or Commerce item).

onePerSol0

This feature takes effect before the final result pool is constructed, so more unique retrieved documents may be collected.
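
The effect of restricting results to one per document can be pictured with the sketch below. It is illustrative only: the (document id, text) tuples and the function name are hypothetical, and the sketch assumes the first result seen for a document is the one that is kept, which the text above does not specify.

```python
def one_per_document(results):
    """Keep only the first result seen for each document id, preserving order.

    `results` is a list of (doc_id, text) tuples, assumed to be already
    ordered by preference.
    """
    seen = set()
    kept = []
    for doc_id, text in results:
        if doc_id not in seen:
            seen.add(doc_id)
            kept.append((doc_id, text))
    return kept


raw = [
    ("doc1", "The installation failed."),
    ("doc1", "Check the installed version."),
    ("doc2", "Reinstall the product."),
]
unique = one_per_document(raw)
```

Because duplicates from doc1 are dropped before the pool limit is reached, slots that would have been taken by repeated documents remain available for other documents.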
