Top-Level Query Attributes

These controls are expressed as top-level attributes in the query request XML. Many of these controls enable functionality described elsewhere in this document.

Minimum Relevancy Threshold

ATG Search uses a relevancy score to rank results (see the Relevancy Calculation section of the Query Concepts and Processes chapter). The relevancy score is calculated based on how well the statement matches the query, plus how related the retrieved index item of that statement is to the query.

During the collection of the final results, before grouping and secondary sorting, ATG Search applies a minimum threshold on the relevancy score, using the following attribute:

<query minScore="min"

The min value must range from 0 to 1000, and defaults to 0. Results that do not meet the minimum threshold are discarded.
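The thresholding step can be sketched in a few lines of Python. This is only an illustration, not ATG Search code: the function name `apply_min_score` and the result dictionaries are hypothetical, and the sketch assumes that a result "meets" the threshold when its score is greater than or equal to the min value.

```python
def apply_min_score(results, min_score=0):
    """Keep only results whose relevancy score meets the threshold.

    Hypothetical helper; assumes 'meets' means score >= min_score.
    """
    if not 0 <= min_score <= 1000:
        raise ValueError("minScore must be between 0 and 1000")
    return [r for r in results if r["score"] >= min_score]


# Invented sample data for illustration only.
results = [
    {"id": "a", "score": 820},
    {"id": "b", "score": 415},
    {"id": "c", "score": 90},
]
kept = apply_min_score(results, min_score=400)
print([r["id"] for r in kept])
```

With the default of 0, every result passes, which matches the behavior of an unset minScore.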

As described in the Result Groups section of the Standard Query chapter, ATG Search has four algorithms to group the final statement results: group-by-document, group-by-statement, group-by-field and group-by-property. The algorithm is controlled by the settings described in the ResponseNumber Settings section (above).

The query XML uses the following attribute to determine which grouping mode to invoke:

<query sorting="mode"

The mode value can be document, text (for grouping by statement), field, or property.

Grouping Type	Description
Group-by-document	Groups the raw search results by document, returning up to some maximum number of groups of a certain size, as defined by these parameters, with the default values shown: `doc10,perDoc3,perSol1,` The `doc` parameter is the maximum number of document result groups to return, the `perDoc` parameter is the maximum size of a group from an unstructured index item, and the `perSol` parameter is the maximum size of a group from a structured index item. Note: An additional mode, `docrank`, is the same as `document`, but it uses the relevancy of the document instead of the relevancy of the statement to rank results.
Group-by-statement	Groups the raw search results by similar statement text, returning up to some maximum number of groups of a certain size, as defined by these parameters, with the default values shown: `ans10,perAns1,` The `ans` parameter is the maximum number of statement result groups to return, and the `perAns` parameter sets the maximum size of a group.
Group-by-property	Groups the raw search results by a metadata property value, returning up to some maximum number of groups of a certain size, as defined by these parameters, with the default values shown: `prop10,perProp3,` The `prop` parameter is the maximum number of property result groups to return, and the `perProp` parameter sets the maximum size of a group. To group by property, the mode value requires a `sortProp` attribute with the type, name and default value for the grouping property. The type is one of the six valid property types: `enum`, `string`, `integer`, `float`, `boolean`, and `date`. The name is a valid property name of the given type. The default is a valid value for the property of the given type; this value is used for results that do not contain the given property.
Group-by-field	Groups the raw results by the statement fields, returning up to some maximum number of groups of a certain size, as defined by these parameters, with the default values shown: `field10,perField3,` The `field` parameter is the maximum number of field result groups to return, and the `perField` parameter sets the maximum size of a group.
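As a rough illustration of how the group-by-document limits interact, the following Python sketch groups invented results using the doc, perDoc and perSol limits from the table. The function and field names are hypothetical, and the sketch assumes groups are formed in order of their most relevant member; the actual engine algorithm is not described here.

```python
def group_by_document(results, doc=10, per_doc=3, per_sol=1):
    """Group results by document, honoring doc / perDoc / perSol limits.

    Hypothetical sketch: assumes groups are created in order of their
    best-scoring member and that lower-scoring overflow is dropped.
    """
    groups = {}  # dicts preserve insertion order
    for r in sorted(results, key=lambda r: r["score"], reverse=True):
        # Structured index items use perSol, unstructured ones use perDoc.
        limit = per_sol if r["structured"] else per_doc
        members = groups.get(r["doc"])
        if members is None:
            if len(groups) >= doc:
                continue  # already have the maximum number of groups
            members = groups[r["doc"]] = []
        if len(members) < limit:
            members.append(r)
    return list(groups.values())


# Invented sample data for illustration only.
raw = [
    {"doc": "d1", "score": 900, "structured": False},
    {"doc": "d2", "score": 850, "structured": True},
    {"doc": "d1", "score": 800, "structured": False},
    {"doc": "d2", "score": 700, "structured": True},
    {"doc": "d3", "score": 600, "structured": False},
]
groups = group_by_document(raw, doc=2, per_doc=3, per_sol=1)
print([[r["score"] for r in g] for g in groups])
```

Here the second d2 result is dropped by the perSol limit, and d3 is dropped because only two groups are allowed.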

Secondary Sorting

As described in the Standard Query chapter, ATG Search returns a list of result groups in its query response. Normally, the result groups are sorted in relevance order, but you may want to allow users to sort the final results by some secondary criteria, such as date or source or format. This secondary sort does not affect what results are in the result groups, just the order of the returned groups. Secondary sorting is performed before paging, and is controlled by the following attributes:

<query docSort="mode" docSortOrder="order" docSortProp="prop"
docSortPropDefault="def" docSortPred="predicate"

The mode value specifies how the result groups will be sorted, and can be one of the following:

relevance – The default value, returns groups in relevance order.
alpha – Sort groups by filename of index item (for instance, index.htm).
address – Sort groups by web site address of the index item (for example, http://www.mycorp.com).
url – Sort groups by full URL of the index item.
date – Sort groups by last modified date of the index item.
strprop – Sort groups by a metadata string property, requires docSortProp attribute.
numprop – Sort groups by a metadata number property, requires docSortProp attribute.
title – Sort groups by title of the index item.
type – Sort groups by the type of the index item, such as HTML or PDF.
docset – Sort groups by document set.
predicate – Sort groups by combination of the modes specified in the docSortPred attribute (see below).

Note that all the sort modes use the index item of the first result in the group. For group-by-document, all results have the same index item, so this is not important. However, for group-by-statement, the results will have different index items, and only the first (most relevant) result is significant.

The order value determines the direction of the sort; the value can be either ascending or descending.

The prop value specifies the property name to use for the strprop or numprop modes. The property name must be a valid property of the given type: for strprop, either string or enum, and for numprop, either integer, float, boolean or date. Index items that don’t have this property are excluded from the sort. To prevent exclusion, use the def value to specify a default property value for these cases. The def value should agree with the type of the property.

The predicate value specifies a sequence of sorting modes and orders to apply when the mode value is predicate, forming a complex sort criterion. The value has the following form:

docSortPred="mode:order:prop:def|…"

For example:

docSortPred="numprop:descending:popularity:0|numprop:ascending:cost"
docSortOrder="ascending"
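A client application might assemble the docSortPred value from a list of criteria rather than by hand. The Python sketch below is one way to do that; `build_doc_sort_pred` and the dictionary keys are hypothetical names, and the sketch assumes that the prop and def fields can simply be omitted when a criterion does not need them (the source only shows def being omitted).

```python
def build_doc_sort_pred(criteria):
    """Join sort criteria into the mode:order:prop:def|... form.

    Hypothetical helper; assumes trailing fields may be left off.
    """
    parts = []
    for c in criteria:
        fields = [c["mode"], c["order"]]
        if "prop" in c:
            fields.append(c["prop"])
            if "default" in c:
                fields.append(str(c["default"]))
        parts.append(":".join(fields))
    return "|".join(parts)


pred = build_doc_sort_pred([
    {"mode": "numprop", "order": "descending", "prop": "popularity", "default": 0},
    {"mode": "numprop", "order": "ascending", "prop": "cost"},
])
print(pred)
```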

Paging

ATG Search results paging is controlled by the following attributes:

<query pageNum="num" pageSize="size"

The pageSize attribute controls how many results are returned at one time. If pageSize is empty, no paging is performed. If paging is used, and the results do not fit on a single page, resubmit the query with pageNum="1", pageNum="2", etc. to access the additional results (the first result page is page 0).

ATG strongly recommends that you do not use a pageSize value greater than 100, as this can lead to performance problems or even Search engine failure.
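The zero-based page numbering is easy to get wrong in client code, so the following Python sketch shows the arithmetic. The helper names are hypothetical, and the slicing is only a model of what the engine returns for a given pageNum and pageSize.

```python
def page_slice(results, page_num=0, page_size=None):
    """Return the results for one page; the first page is page 0.

    Hypothetical model: an empty pageSize (None here) means no paging.
    """
    if page_size is None:
        return list(results)
    start = page_num * page_size
    return results[start:start + page_size]


def page_count(total, page_size):
    """Number of pages needed for `total` results (ceiling division)."""
    return -(-total // page_size)


results = list(range(25))              # stand-in for 25 result groups
print(page_slice(results, 0, 10))      # first page
print(page_slice(results, 2, 10))      # third and last page
print(page_count(len(results), 10))
```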

Spelling Correction

As described in the Spelling Feedback section of the Standard Query chapter, ATG Search performs spelling correction and returns suggestions as part of the response.

Spelling suggestion feedback is always returned, but you can configure whether or not to automatically correct any mistakes before issuing the query, using the following attribute:

<query autospell="bool"

The bool value must be either true or false, and defaults to true.

Query Refinement

Facet sets allow users to refine their query by searching within an existing result set. For example, an end-user conducts a search for luggage on a commerce site, then refines their search by color, price, or material. ATG Search returns refinement results based on settings in a refineConfig.xml file. This file defines which properties of the indexed products to return as possible facets. Before returning results, ATG Search retrieves the possible values for the properties configured in refineConfig.xml. Thus, the existing query can be re-submitted with an additional constraint that limits the results to one of the enumerated property values.

Administrators use Search Administration to specify which attributes to use for refinement opportunities in search results presentation. See the “Facet Sets” chapter of the ATG Search Administration Guide for information.

At query time, ATG Search can control which configuration to use and global parameters for the calculation, using the following attributes:

<query refineConfig="name" refineMax="max" refineTop="top" refineMin="min"

name – Must be a valid name of a facet set loaded into the index. If no value is given, no calculation is made.
max – Maximum number of facet properties to return, even if the facet set could generate more. The default value is 0, which means no calculation is made.
top – Maximum number of facet property values (per property). The values are selected in sort order, which usually is in terms of the number of index items that have each value. The default value is 5.
min – Minimum size of a facet property value, in terms of the number of index items with that value. The default value is 0.
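To make the interplay of the max, top and min values concrete, here is a small Python sketch that computes facet values over invented index items. The function name and data are hypothetical, and two assumptions are made: properties are taken in configuration order up to the max limit, and values are sorted by item count.

```python
from collections import Counter


def select_facets(items, properties, refine_max=0, refine_top=5, refine_min=0):
    """Pick facet values per property, honoring the three limits.

    Hypothetical sketch; assumes config order for properties and
    count order for values.
    """
    if refine_max == 0:
        return {}  # default: no refinement calculation
    facets = {}
    for prop in properties[:refine_max]:
        counts = Counter(item[prop] for item in items if prop in item)
        values = [(v, n) for v, n in counts.most_common() if n >= refine_min]
        facets[prop] = values[:refine_top]
    return facets


# Invented sample data for illustration only.
items = [
    {"color": "red", "material": "leather"},
    {"color": "red", "material": "nylon"},
    {"color": "blue", "material": "nylon"},
    {"color": "red"},
]
print(select_facets(items, ["color", "material"], refine_max=1, refine_min=2))
```

In this example only the color property is considered (max of 1), and blue is dropped because a single item falls below the min of 2.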

Query Mode

As described in the Query Concepts and Processes chapter, ATG Search handles natural language and Boolean queries. Complex Boolean expressions require a special mode. Furthermore, ATG Search can support simple keyword search behavior in several additional modes. These modes are controlled by this attribute:

<query mode="mode"

The mode value can be:

nlp – Natural language and simple Boolean queries. This is the default value.
boolean – Complex Boolean expressions; see below for details. This mode operates at the statement level, and should therefore not be used for ATG Commerce integrations; for example, “red” and “shoes” would be indexed as separate statements, and a search that includes both would return no results.
keyword – Handles natural language queries in a simplistic keyword search model. ATG Search parses the query as normal, but each query term is double-quoted and required to appear in the index items of the results. For example, a query of install procedures in keyword mode would be interpreted as ++"install" ++"procedures".
and – Handles natural language queries in an expanded keyword search model. ATG Search parses the query as normal, but each query term is required to appear in the index items of the results. This is similar to the keyword mode, but without the double-quotes, which means the query terms could match morphological variants and use term expansions. For example, a query of install procedures in and mode would be interpreted as ++install ++procedures.
ATG Commerce uses the and mode by default.
matchall – Handles natural language queries as a Boolean AND of terms, as opposed to the default Boolean OR. ATG Search parses the query as normal, but each query term is required to appear in the result statements. For example, a query of install procedures in matchall mode would be interpreted as +install +procedures.
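The interpretations given for the keyword, and and matchall modes can be mimicked with a short Python sketch. It is purely illustrative: `rewrite_for_mode` is a hypothetical name, and splitting on whitespace stands in for the engine's real query parsing.

```python
def rewrite_for_mode(query, mode="nlp"):
    """Show how a query would be interpreted in each keyword-style mode.

    Hypothetical sketch; whitespace splitting replaces real tokenization.
    """
    terms = query.split()
    if mode == "keyword":
        return " ".join(f'++"{t}"' for t in terms)  # quoted and required
    if mode == "and":
        return " ".join(f"++{t}" for t in terms)    # required in index item
    if mode == "matchall":
        return " ".join(f"+{t}" for t in terms)     # required in statement
    return query  # nlp and boolean queries are left untouched here


for mode in ("keyword", "and", "matchall"):
    print(mode, "->", rewrite_for_mode("install procedures", mode))
```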

As described in the Required Terms and Excluded Terms sections earlier in this chapter, the required term and excluded term operators represent simple approximations of true Boolean operators. In addition to these simple operators, ATG Search supports a special query syntax for Boolean expressions. The Boolean syntax is shown here in Backus Naur Form:

expr := <expr> AND <expr>
expr := <expr> OR <expr>
expr := NOT <expr>
expr := ( <expr> )
expr := [']["]term["][']
expr := [']wildcard[']
expr := i..j

The first three statements show the syntax for the three Boolean operator expressions, whose operands can themselves be other expressions. The precedence for these operators is: NOT, AND, OR. The fourth statement shows that parentheses can be used to delimit an expression in order to override operator precedence. For example, x AND y OR z is interpreted by default as (x AND y) OR z, but using explicit parentheses, it could be interpreted as x AND (y OR z).

The last three statements show the three types of simple term expressions: normal term, with optional quote operators; wildcard pattern, with optional single quote operator; and a number range pattern. Thus, full Boolean expressions can utilize all the simple query operators described in this section except for the simple Boolean operators (+, !, ++, !!, +|).

ATG Search handles full Boolean expressions specially, but it shares much of the natural language query handling. ATG Search parses the Boolean expression, building up an operator-operand tree. During this parse, it processes the terms in the same way as a natural language query, including tokenization, morphology and term expansion. In addition, it also has to process the special query operators that may modify the terms. At the end of this process, ATG Search has a vector of query items that can execute normally, plus a Boolean expression tree that can filter the retrieved sentence results.
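The operator-operand tree and the precedence rules (NOT, then AND, then OR) can be illustrated with a small recursive-descent parser in Python. This is a sketch under simplifying assumptions, not the engine's parser: terms are bare words only (no quotes, wildcards or number ranges), and the names `parse` and `matches` are hypothetical.

```python
import re


def parse(text):
    """Parse a Boolean expression into nested tuples (sketch only)."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():          # lowest precedence
        node = parse_and()
        while peek() == "OR":
            advance()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "AND":
            advance()
            node = ("AND", node, parse_not())
        return node

    def parse_not():         # highest precedence
        if peek() == "NOT":
            advance()
            return ("NOT", parse_not())
        return parse_atom()

    def parse_atom():
        tok = peek()
        if tok is None or tok in (")", "AND", "OR"):
            raise SyntaxError(f"unexpected token: {tok!r}")
        advance()
        if tok == "(":
            node = parse_or()
            if peek() != ")":
                raise SyntaxError("missing closing parenthesis")
            advance()
            return node
        return tok           # a plain term

    tree = parse_or()
    if peek() is not None:
        raise SyntaxError(f"unexpected token: {peek()!r}")
    return tree


def matches(tree, terms):
    """Filter step: does a statement's set of terms satisfy the tree?"""
    if isinstance(tree, str):
        return tree in terms
    if tree[0] == "NOT":
        return not matches(tree[1], terms)
    if tree[0] == "AND":
        return matches(tree[1], terms) and matches(tree[2], terms)
    return matches(tree[1], terms) or matches(tree[2], terms)


print(parse("x AND y OR z"))
print(parse("x AND (y OR z)"))
```

The two printed trees correspond to the (x AND y) OR z and x AND (y OR z) readings discussed above.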

RQText

This attribute is deprecated and should not be used.

Search Strategies

As described elsewhere in this chapter, ATG Search has a large number of parameters that control searching. To simplify the adjustment of these settings, ATG Search provides five search strategies that are implemented as sets of parameter values. The search strategy is selected by the following attribute:

<query strategy="strat"

The strat can be one of the following five values:

everything – Try unlimited search, without any parameter values that can restrict the search algorithm; also try all term expansions during document candidate retrieval.
expand – Try an expanded search, increasing the default parameter values that can restrict the search algorithm.
normal – Default system settings, optimized for fast search with good search quality; this is the default value.
restrict – Try a restricted search, decreasing the default parameter values which will further restrict the search algorithm; disable non-equal term expansions; adjust relevancy calculation to prefer literal matches.
exact – Try an exact search, which is the same as restrict plus a heavy increase in the /exactWgt setting, which will force results to contain the literal query string.

Metadata Property Controls

ATG Search returns the metadata properties associated with the index item of each statement result. These returned properties can be used for user interface functionality, such as customized result pages. By default, ATG Search returns all stored metadata properties, but the list of returned properties can be controlled by this attribute:

<query docProps="all"

<query docProps="prop,prop,..."

The first example is the default, and indicates that all properties are returned. The second form lists the property names to return in a comma-delimited list.

ATG Search allows a third value for this parameter for even more control of the properties, as shown:

<query docProps="config;testprop|val,...,val|retprop,...;testprop|val,...,val|retprop,...;..."

This syntax allows for conditional selection of returned properties, depending on another property of the index item. The config; prefix denotes the new special syntax. The testprop is a name of a property that will be tested for one of the values in the list that follow it. If an index item has one of the values for the testprop, then the list of retprop properties is used to control what properties to return just for that index item. If not, the next testprop sequence is used, and so on until no more configuration data is left. If testprop is empty, then the sequence is unconditional and all index items will satisfy it. This is useful as the last item of the sequence to denote the default properties for any index item that does not satisfy the specific property tests. If the val list is empty, then any value satisfies the test. If the retprop list is empty, then no properties will be returned for index items satisfying the given test.
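The conditional selection described above can be modeled in Python as follows. The function names and sample property names are hypothetical, property values are treated as plain strings, and two points the text leaves open are resolved by assumption: an empty val list matches only index items that actually have the testprop, and an item that satisfies no sequence gets no properties.

```python
def parse_doc_props_config(value):
    """Split a config-style docProps value into (testprop, vals, retprops)."""
    prefix = "config;"
    if not value.startswith(prefix):
        raise ValueError("not a config-style docProps value")
    rules = []
    for chunk in value[len(prefix):].split(";"):
        test_prop, vals, ret_props = chunk.split("|")
        rules.append((
            test_prop,
            [v for v in vals.split(",") if v],
            [p for p in ret_props.split(",") if p],
        ))
    return rules


def properties_to_return(item, rules):
    """Apply the first matching rule to one index item's properties."""
    for test_prop, vals, ret_props in rules:
        if test_prop == "":
            matched = True                      # unconditional sequence
        elif test_prop in item:
            matched = not vals or item[test_prop] in vals
        else:
            matched = False
        if matched:
            return {p: item[p] for p in ret_props if p in item}
    return {}  # assumption: no rule matched, so return nothing


# Invented configuration and items for illustration only.
rules = parse_doc_props_config("config;type|pdf,doc|title,author;||title")
pdf_item = {"type": "pdf", "title": "Guide", "author": "Ann", "size": "10"}
html_item = {"type": "html", "title": "Home"}
print(properties_to_return(pdf_item, rules))
print(properties_to_return(html_item, rules))
```

The trailing `||title` sequence has an empty testprop, so it acts as the default for items that fail the type test.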

Automatic Category Constraints

As described in the Categorize Query chapter, ATG Search applies rules to decide what categories of a taxonomy are relevant to the user queries. One use of this functionality is to automatically add the most relevant categories as constraints on the query itself, thus narrowing the search to the more appropriate content. These automatic constraints are controlled by the following attribute:

<query autocat="max"

<query autocat="maxp"

The max value is the maximum number of categories to add as constraints. Multiple categories are added as a Boolean OR of document set constraints, joined (that is, AND’ed) to the pre-existing constraints. If the max value is appended with a p, then the optional taxonomy pruning post-processing algorithm is used during categorization (see the Taxonomy Pruning section in the Categorize Query chapter for information on this feature).
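The following Python sketch shows one way to read the autocat value and to picture how the category constraints combine with existing ones. All names are hypothetical, the tuple form is just a convenient way to display the Boolean structure, and the category paths are invented.

```python
def parse_autocat(value):
    """Split an autocat value such as '3' or '3p' into (max, prune)."""
    prune = value.endswith("p")
    digits = value[:-1] if prune else value
    return int(digits), prune


def add_category_constraints(existing, categories, max_cats):
    """OR the top categories together, then AND them to existing constraints.

    Hypothetical sketch; `categories` is assumed to be ordered by relevance.
    """
    top = categories[:max_cats]
    if not top:
        return existing
    cats = top[0] if len(top) == 1 else ("OR", *top)
    return cats if existing is None else ("AND", existing, cats)


max_cats, prune = parse_autocat("2p")
combined = add_category_constraints(
    "language=en", ["/Topics/Install", "/Topics/Upgrade", "/Topics/FAQ"], max_cats
)
print(max_cats, prune)
print(combined)
```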

Suggested Categories

ATG Search applies rules to determine what categories of a taxonomy are relevant to the user queries. This allows the end-user to manually refine the search. This categorization feedback is controlled by the following attribute:

<query suggestcat="max" suggestcatPrune="prune"

<query suggestcat="maxp"

The max value is the maximum number of categories to return in the feedback. If the max value is appended with a p or the suggestCatPrune value is true, then the optional taxonomy pruning post-processing algorithm is used during categorization (see the Taxonomy Pruning section in the Categorize Query chapter for information on this feature).

Categorization Tree

As described in the Categorization Feedback section of the Standard Query chapter, ATG Search can return categorization feedback about the returned results in the form of a tree. This functionality is controlled by the following attribute:

<query docSetSort="mode"

The mode value can be:

none – No categorization feedback tree is constructed. This is the default.
fulltree – A full categorization tree is returned, with all intervening levels, even if they have no direct connection to the results.
sparsetree – A categorization tree is returned, but intervening levels that have no direct connection to the results are omitted.

Query Term Feedback

ATG Search returns feedback about related terms and phrases for the query. This functionality is enabled by the following attribute:

<query feedback="bool"

The bool value must be either true or false, and defaults to false.

Search Context

Normally, ATG Search treats each user query as a separate isolated request, with no pre-existing state or context. However, many user searches are interrelated, and you may want to provide context. ATG Search captures several types of search context in its query request XML, and it is controlled by the following special element and attribute:

<query requestMode="mode"

<priorInput>context</priorInput>

The mode value specifies how the context string in the priorInput element is interpreted, and can be one of the following values:

normal – The default value, no search context processing.
subtractDoc – Using the context as a preliminary query, eliminate from the current search results any that are from index items also returned by the context query.
subtractAns – Using the context as a preliminary query, eliminate from the current search results any that are statements also returned by the context query.
penalizeDoc – Using the context as a preliminary query, penalize any current search results that are from index items also returned by the context query. If the penalty exceeds the relevancy, the result is eliminated.
penalizeAns – Using the context as a preliminary query, penalize any current search results that are statements also returned by the context query. If the penalty exceeds the relevancy, the result is eliminated.
withinDoc – Using the context as a preliminary query, restrict the current search results to index items also returned by the context query.
withinAns – Using the context as a preliminary query, restrict the current search results to statements also returned by the context query.

The subtract modes represent the search scenario known as not like this, where the end-user does a search that returns relevant but poor results, and then directs the system to find results not like the poor results.

The penalize modes represent the search scenario known as less like this, where the end-user does a search that returns relevant but poor results, and then directs the system to find results less like the poor results, but not necessarily eliminating them.

The within modes represent the search scenario known as search within, where the end-user does a search that returns generally relevant results, and then directs the system to find results of a new query within those initial results.
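The three families of context modes can be compared side by side with a Python sketch over invented result lists. The function and field names are hypothetical, and the fixed penalty amount is an assumption made only for illustration; the source does not say how the penalty is computed.

```python
def apply_context(current, prior, mode="normal", penalty=100):
    """Model the requestMode values against a prior (context) result list.

    Hypothetical sketch; `penalty` is an invented fixed amount.
    """
    if mode == "normal":
        return list(current)
    # Doc modes compare index items; Ans modes compare statements.
    key = "doc" if mode.endswith("Doc") else "stmt"
    seen = {r[key] for r in prior}
    if mode.startswith("subtract"):
        return [r for r in current if r[key] not in seen]
    if mode.startswith("within"):
        return [r for r in current if r[key] in seen]
    if mode.startswith("penalize"):
        out = []
        for r in current:
            score = r["score"] - penalty if r[key] in seen else r["score"]
            if score >= 0:  # eliminated only when the penalty exceeds the score
                out.append({**r, "score": score})
        return sorted(out, key=lambda r: r["score"], reverse=True)
    raise ValueError(f"unknown requestMode: {mode}")


# Invented sample data for illustration only.
current = [
    {"doc": "d1", "stmt": "s1", "score": 500},
    {"doc": "d2", "stmt": "s2", "score": 80},
    {"doc": "d3", "stmt": "s3", "score": 300},
]
prior = [{"doc": "d1", "stmt": "s9"}, {"doc": "d2", "stmt": "s2"}]
for mode in ("subtractDoc", "withinAns", "penalizeDoc"):
    print(mode, [(r["doc"], r["score"]) for r in apply_context(current, prior, mode)])
```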

Query Analysis Mode

ATG Search includes a query module that analyzes user queries and executes special actions which can modify the search behavior. This functionality is controlled by the following parameter:

<query ruleMode="mode"

The mode value must be one of the following:

ignore – No query analysis will be performed.
display – Perform the query analysis, but simply return feedback about the results in the response.
exec – Perform the query analysis and execute the actions, returning feedback about what was executed.

The default value is display, although only an index that contains query rules will return results.

Related Set Controls

As described in the Standard Query Results section, ATG Search returns the information associated with the index item of each statement result. This information includes related item (document) sets of the index item. By default, ATG Search returns all related item sets, but the number and type of the returned item sets can be controlled by this attribute:

<query maxRelatedSets="max" relatedSets="path,path,..."

The max value is the maximum number of related sets to return. A value of 0 means no related item set information is returned in the response. The default is 1000. The path values are item set paths (for example, /Topics/Product) which act as constraints on what type of related sets to return. Only related sets that are descendants of one of the path values are returned. The default value is an empty string, which means that the related sets are unconstrained.
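A minimal Python sketch of the two controls follows. The function name and the sample paths are hypothetical, and the sketch assumes that "descendant" means strictly below the given path, so the path itself is not returned.

```python
def filter_related_sets(sets, max_related=1000, paths=()):
    """Apply relatedSets path constraints, then the maxRelatedSets limit.

    Hypothetical sketch; treats only strict descendants as matches.
    """
    def is_descendant(item_set, path):
        return item_set.startswith(path.rstrip("/") + "/")

    if paths:  # an empty list means the related sets are unconstrained
        sets = [s for s in sets if any(is_descendant(s, p) for p in paths)]
    return list(sets)[:max_related]


# Invented sample data for illustration only.
related = [
    "/Topics/Product/Phones",
    "/Topics/Support",
    "/Topics/Product",
    "/Regions/EU",
]
print(filter_related_sets(related, paths=["/Topics/Product"]))
print(filter_related_sets(related, max_related=0))
```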

ATG Search Query Reference Guide