These controls are expressed as top-level attributes in the query request XML. Many of these controls enable functionality described elsewhere in this document.
Minimum Relevancy Threshold
ATG Search uses a relevancy score to rank results (see the Relevancy Calculation section of the Query Concepts and Processes chapter). The relevancy score is calculated based on how well the statement matches the query, plus how related the retrieved index item of that statement is to the query.
During the collection of the final results, before grouping and secondary sorting, ATG Search applies a minimum threshold on the relevancy score, using the following attribute:
<query minScore="min"
The min value must range from 0 to 1000, and defaults to 0. Results that do not meet the minimum threshold are discarded.
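The effect of the threshold can be pictured with a minimal Python sketch. It is not ATG code: the `(label, score)` result pairs and the `apply_min_score` helper are hypothetical, and treating a score equal to the threshold as meeting it is an assumption.

```python
def apply_min_score(results, min_score=0):
    """Drop results whose relevancy score is below min_score.

    `results` is a hypothetical list of (label, score) pairs; the real
    engine applies the threshold internally during result collection.
    """
    if not 0 <= min_score <= 1000:
        raise ValueError("minScore must be between 0 and 1000")
    # Assumption: a score equal to the threshold meets it.
    return [(label, score) for label, score in results if score >= min_score]


results = [("a", 820), ("b", 410), ("c", 95)]
kept = apply_min_score(results, min_score=400)
```

With the default of 0, every result passes, which matches the attribute's default behavior.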
Using Grouping Modes
As described in the Result Groups section of the Standard Query chapter, ATG Search has four algorithms to group the final statement results: group-by-document, group-by-statement, group-by-field and group-by-property. The algorithm is controlled by the settings described in the ResponseNumber Settings section (above).
The query XML uses the following attribute to determine which grouping mode to invoke:
<query sorting="mode"
The mode value can be document, text (for grouping by statement), field, or property.
Secondary Sorting
As described in the Standard Query chapter, ATG Search returns a list of result groups in its query response. Normally, the result groups are sorted in relevance order, but you may want to allow users to sort the final results by some secondary criteria, such as date or source or format. This secondary sort does not affect what results are in the result groups, just the order of the returned groups. Secondary sorting is performed before paging, and is controlled by the following attributes:
<query docSort="mode" docSortOrder="order" docSortProp="prop"
docSortPropDefault="def" docSortPred="predicate"
The mode value specifies how the result groups will be sorted, and can be one of the following:
relevance – The default value; returns groups in relevance order.
alpha – Sort groups by filename of the index item (for instance, index.htm).
address – Sort groups by web site address of the index item (for example, http://www.mycorp.com).
url – Sort groups by full URL of the index item.
date – Sort groups by last modified date of the index item.
strprop – Sort groups by a metadata string property; requires the docSortProp attribute.
numprop – Sort groups by a metadata number property; requires the docSortProp attribute.
title – Sort groups by title of the index item.
type – Sort groups by the type of the index item, such as HTML or PDF.
docset – Sort groups by document set.
predicate – Sort groups by a combination of the modes specified in the docSortPred attribute (see below).
Note that all the sort modes use the index item of the first result in the group. For group-by-document, all results have the same index item, so this is not important. However, for group-by-statement, the results will have different index items, and only the first (most relevant) result is significant.
The order value determines the direction of the sort; the value can be either ascending or descending.
The prop value specifies the property name to use for the strprop or numprop modes. The property name must be a valid property of the given type; that is, for strprop, either string or enum, and for numprop, either integer, float, boolean or date. Index items that don't have this property will be excluded from the sort. To prevent exclusion, the def value can specify the default property value to use for these exceptional cases. The def value should agree with the type of the property.
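As an illustration of how a property sort with a default might behave, here is a minimal Python sketch. The `groups` structure, its `props` mapping, and the `sort_by_property` helper are hypothetical, and reading "excluded from the sort" as "dropped from the sorted list" is an assumption.

```python
def sort_by_property(groups, prop, descending=False, default=None):
    """Sort result groups by a metadata property of each group's first result.

    `groups` is a hypothetical list of dicts whose "props" mapping comes
    from the index item of the group's first result. Groups lacking the
    property are left out unless a default is supplied.
    """
    keyed = []
    for group in groups:
        value = group["props"].get(prop, default)
        if value is None:
            continue  # no property and no default: excluded from the sort
        keyed.append((value, group))
    keyed.sort(key=lambda pair: pair[0], reverse=descending)
    return [group for _, group in keyed]


groups = [
    {"name": "g1", "props": {"popularity": 7}},
    {"name": "g2", "props": {}},
    {"name": "g3", "props": {"popularity": 3}},
]
without_default = [g["name"] for g in sort_by_property(groups, "popularity")]
with_default = [g["name"] for g in sort_by_property(groups, "popularity", default=0)]
```

In this sketch, supplying a default of 0 keeps the group that has no popularity value and sorts it as if its value were 0.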
The predicate value specifies a sequence of sorting modes and orders to apply when the mode="predicate", forming a complex sort criterion. The value has the following form:
docSortPred="mode:order:prop:def|…"
docSortPred="numprop:descending:popularity:0|numprop:ascending:cost"
docSortOrder="ascending"
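To make the mode:order:prop:def layout concrete, the following Python sketch splits a docSortPred value into its sort keys. The `parse_doc_sort_pred` helper is hypothetical, and it assumes that property names and defaults never contain ":" or "|".

```python
def parse_doc_sort_pred(pred):
    """Split a docSortPred value into a list of sort-key dicts.

    Each '|'-separated entry has the form mode:order:prop:def, where
    trailing fields may be omitted.
    """
    keys = []
    for entry in pred.split("|"):
        fields = entry.split(":")
        fields += [None] * (4 - len(fields))  # pad missing trailing fields
        mode, order, prop, default = fields[:4]
        keys.append({"mode": mode, "order": order, "prop": prop, "default": default})
    return keys


keys = parse_doc_sort_pred("numprop:descending:popularity:0|numprop:ascending:cost")
```

The first key sorts by popularity in descending order with a default of "0"; the second sorts by cost in ascending order and has no default.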
Paging
ATG Search results paging is controlled by the following attributes:
<query pageNum="num" pageSize="size"
The pageSize attribute controls how many results are returned at one time. If pageSize is empty, no paging is performed. If paging is used, and the results do not fit on a single page, resubmit the query with pageNum="1", pageNum="2", etc. to access the additional results (the first result page is page 0).
ATG strongly recommends that you do not use a pageSize value greater than 100, as this can lead to performance problems or even Search engine failure.
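Because the first page is page 0, the pageNum values for a result set run from 0 to one less than the page count. A minimal Python sketch of that arithmetic follows; the `page_numbers` and `page_slice` helpers are hypothetical and only model the numbering, not the engine.

```python
import math


def page_numbers(total_results, page_size):
    """Return the zero-based pageNum values needed to cover all results."""
    if page_size <= 0:
        raise ValueError("pageSize must be positive")
    return list(range(math.ceil(total_results / page_size)))


def page_slice(results, page_num, page_size):
    """Return the results that fall on one zero-based page."""
    start = page_num * page_size
    return results[start:start + page_size]


results = list(range(45))  # 45 hypothetical results
pages = page_numbers(len(results), 20)
last_page = page_slice(results, 2, 20)
```

Here 45 results with a page size of 20 need three requests, pageNum 0 through 2, and the last page holds the remaining five results.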
Spelling Correction
As described in the Spelling Feedback section of the Standard Query chapter, ATG Search performs spelling correction and returns suggestions as part of the response.
Spelling suggestion feedback is always returned, but you can configure whether or not to automatically correct any mistakes before issuing the query, using the following attribute:
<query autospell="bool"
The bool value must be either true or false, and defaults to true.
Query Refinement
Facet sets allow users to refine their query by searching within an existing result set. For example, an end-user conducts a search for luggage on a commerce site, then refines their search by color, price, or material. ATG Search returns refinement results based on settings in a refineConfig.xml file. This file defines which properties of the indexed products to return as possible facets. Before returning results, ATG Search retrieves the possible values for the properties configured in refineConfig.xml. Thus, the existing query can be re-submitted with an additional constraint that limits the results to one of the enumerated property values.
Administrators use Search Administration to specify which attributes to use for refinement opportunities in search results presentation. See the “Facet Sets” chapter of the ATG Search Administration Guide for information.
At query time, you can control which configuration ATG Search uses, and set global parameters for the calculation, using the following attributes:
<query refineConfig="name" refineMax="max" refineTop="top" refineMin="min"
name – Must be a valid name of a facet set loaded into the index. If no value is given, no calculation is made.
max – Maximum number of facet properties to return, even if the facet set could generate more. The default value is 0, which means no calculation is made.
top – Maximum number of facet property values (per property). The values are selected in sort order, which is usually by the number of index items that have each value. The default value is 5.
min – Minimum size of a facet property value, in terms of the number of index items with that value. The default value is 0.
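The interaction of the top and min limits can be sketched in Python as follows. This is an illustration only: the `select_facet_values` helper and the sample data are hypothetical, ranking by item count is the "usual" order mentioned above, and applying the minimum size before the top cutoff is an assumption.

```python
from collections import Counter


def select_facet_values(values, top=5, minimum=0):
    """Pick the facet values to offer for one property.

    `values` is a hypothetical list holding the property value of each
    matching index item. Values are ranked by item count, values with
    fewer than `minimum` items are dropped, and at most `top` are kept.
    """
    counts = Counter(values)
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(value, n) for value, n in ranked if n >= minimum][:top]


colors = ["red", "red", "blue", "black", "black", "black", "green"]
facets = select_facet_values(colors, top=2, minimum=2)
```

With these inputs only black (three items) and red (two items) are offered; blue and green fall below the minimum size.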
Query Mode
As described in the Query Concepts and Processes chapter, ATG Search handles natural language and Boolean queries. Complex Boolean expressions require a special mode. Furthermore, ATG Search can support simple keyword search behavior in several additional modes. These modes are controlled by this attribute:
<query mode="mode"
The mode value can be:
nlp – Natural language and simple Boolean queries. This is the default value.
boolean – Complex Boolean expressions; see below for details. This mode operates at the statement level, and should therefore not be used for ATG Commerce integrations; for example, "red" and "shoes" would be indexed as separate statements, and a search that includes both would return no results.
keyword – Handles natural language queries in a simplistic keyword search model. ATG Search parses the query as normal, but each query term is double-quoted and required to appear in the index items of the results. For example, a query of install procedures in keyword mode would be interpreted as ++"install" ++"procedures".
and – Handles natural language queries in an expanded keyword search model. ATG Search parses the query as normal, but each query term is required to appear in the index items of the results. This is similar to the keyword mode, but without the double-quotes, which means the query terms could match morphological variants and use term expansions. For example, a query of install procedures in and mode would be interpreted as ++install ++procedures. ATG Commerce uses the and mode by default.
matchall – Handles natural language queries as a Boolean AND of terms, as opposed to the default Boolean OR. ATG Search parses the query as normal, but each query term is required to appear in the result statements. For example, a query of install procedures in matchall mode would be interpreted as +install +procedures.
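The three keyword-style interpretations can be compared side by side with a small Python sketch. The `rewrite_for_mode` helper is hypothetical and splits on whitespace only, which is a simplification of the real query parsing.

```python
def rewrite_for_mode(query, mode):
    """Show the operator form that each keyword-style mode implies."""
    terms = query.split()
    if mode == "keyword":
        return " ".join(f'++"{t}"' for t in terms)
    if mode == "and":
        return " ".join(f"++{t}" for t in terms)
    if mode == "matchall":
        return " ".join(f"+{t}" for t in terms)
    return query  # nlp and boolean are not rewritten in this sketch


keyword_form = rewrite_for_mode("install procedures", "keyword")
and_form = rewrite_for_mode("install procedures", "and")
matchall_form = rewrite_for_mode("install procedures", "matchall")
```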
As described in the Required Terms and Excluded Terms sections earlier in this chapter, the required term and excluded term operators represent simple approximations of true Boolean operators. In addition to these simple operators, ATG Search supports a special query syntax for Boolean expressions. The Boolean syntax is shown here in Backus-Naur Form:
expr := <expr> AND <expr>
expr := <expr> OR <expr>
expr := NOT <expr>
expr := ( <expr> )
expr := [']["]term["][']
expr := [']wildcard[']
expr := i..j
The first three statements show the syntax for the three Boolean operator expressions, whose operands can themselves be other expressions. The precedence for these operators is: NOT, AND, OR. The fourth statement shows that parentheses can be used to delimit an expression in order to override operator precedence. For example, x AND y OR z is interpreted by default as (x AND y) OR z, but using explicit parentheses, it could be interpreted as x AND (y OR z).
The last three statements show the three types of simple term expressions: normal term, with optional quote operators; wildcard pattern, with optional single quote operator; and a number range pattern. Thus, full Boolean expressions can utilize all the simple query operators described in this section except for the simple Boolean operators (+, !, ++, !!, +|).
ATG Search handles full Boolean expressions specially, but it shares much of the natural language query handling. ATG Search parses the Boolean expression, building up an operator-operand tree. During this parse, it processes the terms in the same way as a natural language query, including tokenization, morphology and term expansion. In addition, it also has to process the special query operators that may modify the terms. At the end of this process, ATG Search has a vector of query items that can execute normally, plus a Boolean expression tree that can filter the retrieved sentence results.
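The operator-operand tree and the NOT, AND, OR precedence can be illustrated with a small recursive-descent parser in Python. This is a sketch under simplifying assumptions, not the engine's parser: terms are kept as opaque strings, and the quote operators, wildcards, and number ranges are not interpreted. The `parse_boolean` name and the tuple tree layout are hypothetical.

```python
import re

TOKEN = re.compile(r"\(|\)|[^\s()]+")


def parse_boolean(text):
    """Parse a Boolean expression into a nested tuple tree.

    Precedence, highest first: NOT, AND, OR.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_not())
        return node

    def parse_not():
        if peek() == "NOT":
            take()
            return ("NOT", parse_not())
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if token is None or token in (")", "AND", "OR"):
            raise ValueError(f"unexpected token: {token!r}")
        return token  # a plain term, kept as an opaque string

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return tree


default_tree = parse_boolean("x AND y OR z")
grouped_tree = parse_boolean("x AND (y OR z)")
```

The first call yields ("OR", ("AND", "x", "y"), "z") and the second yields ("AND", "x", ("OR", "y", "z")), mirroring the precedence example above.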
RQText
This attribute is deprecated and should not be used.
Search Strategies
As described elsewhere in this chapter, ATG Search has a large number of parameters that control searching. To simplify the adjustment of these settings, ATG Search provides five search strategies that are implemented as sets of parameter values. The search strategy is selected by the following attribute:
<query strategy="strat"
The strat value can be one of the following five values:
everything – Try unlimited search, without any parameter values that can restrict the search algorithm; also try all term expansions during document candidate retrieval.
expand – Try an expanded search, increasing the default parameter values that can restrict the search algorithm.
normal – Default system settings, optimized for fast search with good search quality; this is the default value.
restrict – Try a restricted search, decreasing the default parameter values, which will further restrict the search algorithm; disable non-equal term expansions; adjust relevancy calculation to prefer literal matches.
exact – Try an exact search, which is the same as restrict plus a heavy increase in the /exactWgt setting, which will force results to contain the literal query string.
Metadata Property Controls
ATG Search returns the metadata properties associated with the index item of each statement result. These returned properties can be used for user interface functionality, such as customized result pages. By default, ATG Search returns all stored metadata properties, but the list of returned properties can be controlled by this attribute:
<query docProps="all"
<query docProps="prop,prop,..."
The first form is the default, and indicates that all properties are returned. The second form lists the property names to return in a comma-delimited list.
ATG Search allows a third value for this parameter for even more control of the properties, as shown:
<query docProps="config;testprop|val,...,val|retprop,...;testprop|val,...,val|retprop,...;..."
This syntax allows for conditional selection of returned properties, depending on another property of the index item. The config; prefix denotes the new special syntax. The testprop is the name of a property that will be tested for one of the values in the list that follows it. If an index item has one of the values for the testprop, then the list of retprop properties is used to control what properties to return just for that index item. If not, the next testprop sequence is used, and so on until no more configuration data is left. If testprop is empty, then the sequence is unconditional and all index items will satisfy it. This is useful as the last item of the sequence to denote the default properties for any index item that does not satisfy the specific property tests. If the val list is empty, then any value satisfies the test. If the retprop list is empty, then no properties will be returned for index items satisfying the given test.
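The evaluation order of the conditional syntax can be sketched in Python. This is an interpretation of the rules above rather than the engine's implementation: the `select_returned_props` helper and the property names in the example (type, title, author, url) are hypothetical. Two points are assumptions: an empty val list still requires the item to have the test property, and `None` is returned when no sequence matches, a case the text does not specify.

```python
def select_returned_props(doc_props, item):
    """Return the property names to send back for one index item.

    `item` is a hypothetical dict of the item's metadata properties.
    """
    prefix = "config;"
    if not doc_props.startswith(prefix):
        raise ValueError("not a conditional docProps value")
    for sequence in doc_props[len(prefix):].split(";"):
        test_prop, values, ret_props = sequence.split("|")
        allowed = [v for v in values.split(",") if v]
        returned = [p for p in ret_props.split(",") if p]
        if not test_prop:
            return returned  # unconditional sequence: every item satisfies it
        # Assumption: an empty value list matches any value the item has.
        if test_prop in item and (not allowed or str(item[test_prop]) in allowed):
            return returned
    return None  # no sequence matched; behavior here is not specified


config = "config;type|pdf,doc|title,author;type||title;||url"
pdf_props = select_returned_props(config, {"type": "pdf"})
html_props = select_returned_props(config, {"type": "html"})
untyped_props = select_returned_props(config, {})
```

In this example a PDF item returns title and author, any other typed item returns only title, and an item with no type falls through to the unconditional last sequence.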
Automatic Category Constraints
As described in the Categorize Query chapter, ATG Search applies rules to decide what categories of a taxonomy are relevant to the user queries. One use of this functionality is to automatically add the most relevant categories as constraints on the query itself, thus narrowing the search to the more appropriate content. These automatic constraints are controlled by the following attribute:
<query autocat="max"
<query autocat="maxp"
The max value is the maximum number of categories to add as constraints. Multiple categories are added as a Boolean OR of document set constraints, joined (that is, AND'ed) to the pre-existing constraints. If the max value is appended with a p, then the optional taxonomy pruning post-processing algorithm is used during categorization (see the Taxonomy Pruning section in the Categorize Query chapter for information on this feature).
Suggested Categories
ATG Search applies rules to determine what categories of a taxonomy are relevant to the user queries. This allows the end-user to manually refine the search. This categorization feedback is controlled by the following attribute:
<query suggestcat="max" suggestcatPrune="prune"
<query suggestcat="maxp"
The max value is the maximum number of categories to return in the feedback. If the max value is appended with a p or the suggestCatPrune value is true, then the optional taxonomy pruning post-processing algorithm is used during categorization (see the Taxonomy Pruning section in the Categorize Query chapter for information on this feature).
Categorization Tree
As described in the Categorization Feedback section of the Standard Query chapter, ATG Search can return categorization feedback about the returned results in the form of a tree. This functionality is controlled by the following attribute:
<query docSetSort="mode"
The mode value can be:
none – No categorization feedback tree is constructed. This is the default.
Fulltree – A full categorization tree is returned, with all intervening levels, even if they have no direct connection to the results.
Sparsetree – A categorization tree is returned, but intervening levels that have no direct connection to the results are omitted.
Query Term Feedback
ATG Search returns feedback about related terms and phrases for the query. This functionality is enabled by the following attribute:
<query feedback="bool"
The bool value must be either true or false, and defaults to false.
Search Context
Normally, ATG Search treats each user query as a separate isolated request, with no pre-existing state or context. However, many user searches are interrelated, and you may want to provide context. ATG Search captures several types of search context in its query request XML, controlled by the following special element and attribute:
<query requestMode="mode"
<priorInput>context</priorInput>
The mode value specifies how the context string in the priorInput element is interpreted, and can be one of the following values:
normal – The default value, no search context processing.
subtractDoc – Using the context as a preliminary query, eliminate from the current search results any that are from index items also returned by the context query.
subtractAns – Using the context as a preliminary query, eliminate from the current search results any that are statements also returned by the context query.
penalizeDoc – Using the context as a preliminary query, penalize any current search results that are from index items also returned by the context query. If the penalty exceeds the relevancy, the result is eliminated.
penalizeAns – Using the context as a preliminary query, penalize any current search results that are statements also returned by the context query. If the penalty exceeds the relevancy, the result is eliminated.
withinDoc – Using the context as a preliminary query, restrict the current search results to index items also returned by the context query.
withinAns – Using the context as a preliminary query, restrict the current search results to statements also returned by the context query.
The subtract modes represent the search scenario known as not like this, where the end-user does a search that returns relevant but poor results, and then directs the system to find results not like the poor results.
The penalize modes represent the search scenario known as less like this, where the end-user does a search that returns relevant but poor results, and then directs the system to find results less like the poor results, but not necessarily eliminating them.
The within modes represent the search scenario known as search within, where the end-user does a search that returns generally relevant results, and then directs the system to find results of a new query within those initial results.
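The six context modes reduce to set operations between the current results and the context query's results, which the following Python sketch models. It is illustrative only: results are hypothetical (doc, statement, score) tuples, a statement is identified by its (doc, statement) pair, and `penalty` is an invented fixed amount because the text does not say how the real penalty is computed. The sketch also does not re-sort results after penalizing.

```python
def apply_context(mode, current, context, penalty=100):
    """Combine current results with the results of the context query."""
    if mode == "normal":
        return list(current)
    kind, scope = mode[:-3], mode[-3:]
    if kind not in ("subtract", "penalize", "within") or scope not in ("Doc", "Ans"):
        raise ValueError(f"unknown requestMode: {mode}")
    by_doc = scope == "Doc"
    # Keys of everything the context query returned, per item or per statement.
    seen = {doc if by_doc else (doc, stmt) for doc, stmt, _ in context}

    out = []
    for doc, stmt, score in current:
        hit = (doc if by_doc else (doc, stmt)) in seen
        if kind == "subtract":
            if not hit:
                out.append((doc, stmt, score))
        elif kind == "within":
            if hit:
                out.append((doc, stmt, score))
        elif not hit:
            out.append((doc, stmt, score))
        elif penalty <= score:
            # Eliminated only when the penalty exceeds the relevancy.
            out.append((doc, stmt, score - penalty))
    return out


current = [("d1", "s1", 500), ("d1", "s2", 300), ("d2", "s3", 80)]
context = [("d1", "s1", 900), ("d2", "s9", 700)]
not_like_this = apply_context("subtractAns", current, context)
less_like_this = apply_context("penalizeDoc", current, context)
search_within = apply_context("withinAns", current, context)
```

In this example, penalizeDoc lowers both d1 results by the assumed penalty and eliminates the d2 result, whose score of 80 is smaller than the penalty.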
Query Analysis Mode
ATG Search includes a query module that analyzes user queries and executes special actions which can modify the search behavior. This functionality is controlled by the following parameter:
<query ruleMode="mode"
The mode value must be one of the following:
ignore – No query analysis will be performed.
display – Perform the query analysis, but simply return feedback about the results in the response.
exec – Perform the query analysis and execute the actions, returning feedback about what was executed.
The default value is display, although only an index that contains query rules will return results.
Related Set Controls
As described in the Standard Query Results section, ATG Search returns the information associated with the index item of each statement result. This information includes related item (document) sets of the index item. By default, ATG Search returns all related item sets, but the number and type of the returned item sets can be controlled by these attributes:
<query maxRelatedSets="max" relatedSets="path,path,..."
The max value is the maximum number of related sets to return. A value of 0 means no related item set information is returned in the response. The default is 1000. The path values are item set paths (for example, /Topics/Product) which act as constraints on what type of related sets to return. Only related sets that are descendants of one of the path values are returned. The default value is an empty string, which means that the related sets are unconstrained.
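A minimal Python sketch of the path constraint follows. The `filter_related_sets` helper and the sample paths are hypothetical, and treating "descendant" as strict (the constraint path itself does not match) is an assumption.

```python
def filter_related_sets(related, paths="", max_sets=1000):
    """Keep related sets that descend from one of the given paths.

    `related` is a hypothetical list of item set paths for one index item.
    """
    if max_sets == 0:
        return []  # no related item set information is returned
    roots = [p.rstrip("/") for p in paths.split(",") if p]
    if roots:
        # Assumption: only strict descendants of a root path qualify.
        related = [s for s in related
                   if any(s.startswith(root + "/") for root in roots)]
    return related[:max_sets]


related = ["/Topics/Product/Cameras", "/Topics/Support/FAQ", "/Topics/Product"]
product_only = filter_related_sets(related, "/Topics/Product")
unconstrained = filter_related_sets(related)
```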