The relQuestSettings attribute represents low-level numeric variables that control the search and relevancy processing. These settings can be declared in the XML using the format:
relQuestSettings="/param=value;/param=value;..."
They can also be changed in the <RelQuestSettings> tag in the global ATG Search configuration file <ATG10dir>\Search10.1.2\SearchEngine\platform\bin\AEConfig.xml:
<RelQuestSettings>/param=value;/param=value;...</RelQuestSettings>
Query XML attributes override settings in the AEConfig.xml file. The param string is the name of the parameter, and the value is an appropriate value for that parameter. Some parameters take a list of values, separated by commas. The remainder of this section describes the parameters. See also the strategy attribute, which allows you to set a number of parameters simultaneously.
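The settings string format and the override rule can be illustrated with a minimal Python sketch. The function names `parse_settings` and `effective_settings` are hypothetical, and the sketch assumes that a query-level value replaces the AEConfig.xml value for the same parameter name; it does not model parameters that accumulate across repeated entries, such as /link.

```python
def parse_settings(text):
    """Split '/param=value;/param=value;' into an ordered list of (param, value) pairs."""
    pairs = []
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate the trailing ';'
        name, _, value = item.lstrip("/").partition("=")
        pairs.append((name, value))
    return pairs


def effective_settings(global_text, query_text):
    """Apply the global settings first, then the query settings, so the query wins.

    Assumption: overriding happens per parameter name.
    """
    merged = {}
    for name, value in parse_settings(global_text) + parse_settings(query_text):
        merged[name] = value
    return merged


global_settings = "/implicitAndSize=4;/activeSentenceZones=doc;"
query_settings = "/implicitAndSize=0;"
print(effective_settings(global_settings, query_settings))
```

Values are kept as strings here; a list-valued parameter can be split on commas after parsing.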
Matching Statement Parameters
Oracle ATG Web Commerce Search constructs a candidate list of matching statements, sorted by an estimated relevancy metric. From the candidate list, the top candidates are matched in detail and have their final relevancy computed. The parameters described in this section apply to these candidate statements.
Note: The defaults are optimized to balance processing speed with result quality. Be cautious in making changes.
Parameter Name | Syntax and Default | Parameter Description |
---|---|---|
Statement matching maximum | | Limits the number of top candidates. |
Statement matching cut-off | | Detailed matching will end before the maximum of top candidates is reached if the number of relevant statements reaches this parameter value. In this context, a relevant statement is one whose relevancy exceeds the Statement Minimum Relevance (see next row). The value of this parameter should be less than or equal to the |
| Candidates that have a relevancy score less than this parameter are eliminated from the results. The | |
| Candidates that have a relevancy score less than a percentage of the most relevant statement's score are eliminated from the results; the percentage is controlled by this parameter. For example, if the highest relevancy score is 80 and the relevancy cut-off percentage is 70%, then all candidates that have a score less than 56 are eliminated. The default is 0, which disables this mechanism. | |
| Only candidates that have an estimated relevancy that is greater than or equal to this parameter are matched in more detail. Normally, this value is less than or equal to the Statement Minimum Relevance threshold. | |
| The number of total candidates is limited by this parameter. Since this parameter takes effect before the statements are sorted, the candidate statements are collected on a first come, first served basis. However, the query terms are processed in inverse frequency order, guaranteeing the most highly weighted terms will fill in the statement candidates first. |
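The relevancy cut-off percentage described in the table above can be checked with a short Python sketch. The function name `apply_relative_cutoff` is hypothetical; the sketch only reproduces the arithmetic of the example (80 × 70% = 56) and is not the engine's implementation.

```python
def apply_relative_cutoff(scores, cutoff_percent):
    """Drop scores below cutoff_percent of the highest score.

    A cutoff_percent of 0 disables the mechanism, matching the documented default.
    """
    if cutoff_percent == 0 or not scores:
        return list(scores)
    threshold = max(scores) * cutoff_percent / 100
    # Candidates scoring less than the threshold are eliminated.
    return [score for score in scores if score >= threshold]


print(apply_relative_cutoff([80, 60, 56, 55, 30], 70))
```

With a top score of 80 and a 70% cut-off, the threshold is 56, so the candidates scoring 55 and 30 are removed.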
Matching Document Parameters
Oracle ATG Web Commerce Search constructs a candidate list of retrieved documents, sorted by a term frequency (TF-IDF) metric. From this list, the top candidates are inspected for matching statements.
Parameter Name | Syntax and Default | Parameter Description |
---|---|---|
| The number of top candidates is limited by this parameter. | |
| The number of total candidates is limited by this parameter. Since this parameter takes effect before the documents are sorted, the candidate documents are collected on a first come, first served basis. However, the query terms are processed in inverse frequency order, guaranteeing the most highly weighted terms will fill in the document candidates first. |
Filtering by Thesaurus Link Strength
Oracle ATG Web Commerce Search expands query terms using a thesaurus. Thesaurus entries are characterized by link strength, ranging from equality to weak. By default, Search uses all link types during retrieval, but this behavior is controlled by the following parameter:
/link=none;/link=equality;/link=strong;/link=medium;/link=weak;
The value of none clears out any previous values, which would result in no term expansions being used during search. Any subsequent values are appended to the list of link types to use. Normally, only the following four setting combinations should be used:
/link=none;/link=equality;/link=strong;/link=medium;
/link=none;/link=equality;/link=strong;
/link=none;/link=equality;
/link=none;
The first example excludes weak links, the second excludes weak and medium links, the third excludes all but equality links, and the fourth disables all term expansion.
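The way successive /link values combine can be sketched in Python as follows. The name `resolve_links` is hypothetical, and the sketch assumes the list starts with all four link types, as the default behavior described above implies.

```python
ALL_LINKS = ("equality", "strong", "medium", "weak")


def resolve_links(values, initial=ALL_LINKS):
    """Return the link types in effect after applying each /link value in order."""
    active = list(initial)
    for value in values:
        if value == "none":
            active.clear()          # none clears out any previous values
        elif value not in active:
            active.append(value)    # later values are appended to the list
    return active


# Equivalent to /link=none;/link=equality;/link=strong;
print(resolve_links(["none", "equality", "strong"]))
```

Starting each combination with none makes the result independent of whatever values were set earlier, which is why all four recommended combinations begin that way.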
Extending Statement Result Text
By default, Oracle ATG Web Commerce Search retrieves a sentence term vector and constructs a statement result with the text of the sentence as the result string. However, some statements can be very small fragments, such as a section header, and lack enough context to be useful as a search result. Search can extend the statement text with subsequent statement text that is also retrieved by the query. This functionality is controlled by three parameters.
Parameter Name | Syntax and Default | Description |
---|---|---|
| The maximum size of a statement text that can be extended, in number of characters. Any statement text that is greater than or equal to this value will not be extended. | |
| The maximum size of an extended statement, in number of characters. If the extended statement size would exceed this value, the statement is not extended. The statement text is extended with successive statements until this limit is reached. | |
| The maximum intervening characters that can appear between the statement and its extension. Normally, only white space appears between statements, and large white space tends to indicate a separation of content which should not be joined together. |
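The interaction of the three limits can be sketched in Python. This is an illustration under stated assumptions, not the engine's algorithm: the argument names `max_base`, `max_extended`, and `max_gap` are hypothetical stand-ins for the three parameters, statements are given as (start offset, text) pairs in document order, and joined statements are separated by a single space.

```python
def extend_statement(statements, index, max_base, max_extended, max_gap):
    """Extend statements[index] with the statements that follow it.

    statements: list of (start_offset, text) pairs in document order,
    all assumed to have been retrieved by the query.
    """
    start, text = statements[index]
    if len(text) >= max_base:
        return text  # already large enough; never extended
    end = start + len(text)
    for next_start, next_text in statements[index + 1:]:
        if next_start - end > max_gap:
            break  # too much intervening content to join
        if len(text) + 1 + len(next_text) > max_extended:
            break  # the extension would exceed the size limit
        text = text + " " + next_text
        end = next_start + len(next_text)
    return text


statements = [
    (0, "Overview"),
    (9, "This section describes the settings."),
    (200, "Unrelated text."),
]
print(extend_statement(statements, 0, max_base=20, max_extended=100, max_gap=5))
```

Here the short header is joined with the sentence that follows it, while the third statement is left out because it starts far beyond the allowed gap.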
Statement Relevance Parameters
The parameters described in this section all act according to the computed weight of a statement. Oracle ATG Web Commerce Search relevancy computation uses a weighted sum of factors for a main score and a tie-breaker score, together forming the final relevancy value or weight.
Parameter Name | Syntax and Default | Parameter Description |
---|---|---|
| Quantifies how closely the surface query terms match the statement terms, discounting indirect matches through term expansions.
| |
| Quantifies whether the query text matches exactly within the statement text, without regard to case and white space. The weight of this factor and whether it is a main factor are controlled by the parameters.
This factor is disabled by default, but can be enabled as part of a search strategy (see the strategy attribute). | |
| Quantifies how close in proximity the query terms match the statement terms. The weight of this factor and whether it is a main factor are controlled by the following parameters.
| |
| Quantifies how well the document pertains to the query, using the term frequency calculation. The weight of this factor and whether it is a main factor are controlled by the parameters.
| |
| Quantifies how well the surrounding statements also match the query. The weight of this factor and whether it is a main factor are controlled by the parameters.
| |
| Quantifies how well the metadata of the statement’s index item match the weighted properties passed in with the query. The weight of this factor and whether it is a main factor are controlled by the parameters.
| |
| The recall factor is the percentage of statement term weight that the query matched. This calculation is biased towards small statements, which have small total term weights. Use this parameter to force all statements to have the same total weight in terms of this recall calculation. A value of 0 means the normal recall calculation is performed. A positive integer value means that the value is used as the recall denominator, in place of the statement’s total term weight. | |
| The recall factor is the percentage of statement term weight that the query matched. This calculation is biased towards statements with repeated terms, since each instance of a term is counted separately. Use this parameter to limit the number of occurrences that are significant in the recall calculation. A value of 0 means the normal recall calculation is performed. A value of 1 means that only 1 occurrence of each term is used. An integer value greater than 1 means that up to that number of occurrences are used in the calculation. | |
Exclude unknown terms | | A query term that does not exist in the dictionary and has not occurred in the index items provides no information to the system and cannot retrieve anything. A value of 1 means that the unknown terms are excluded from the query processing and do not affect the relevancy. A value of 0 means that unknown terms are included in the query processing and will hurt the relevancy of the results (since they cannot retrieve anything). |
Special treatment for all-caps terms | | In a mixed case query, terms in all capital letters often refer to the most important information. A value of 0 means that no special treatment is given to these terms. A value of 1 means that these terms are required to appear in the statement results, the equivalent of the single + query operator. A value greater than 1 means that these terms are required to appear in the document results, the equivalent of the double ++ query operator. |
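The two recall adjustments described in the table above can be illustrated with a rough Python sketch. The function and argument names are hypothetical, and the sketch assumes that the occurrence limit applies to both the matched weight and the total weight; the source does not specify this, nor how the engine combines recall with the other factors.

```python
def recall(statement_terms, matched_terms, fixed_denominator=0, max_occurrences=0):
    """Fraction of statement term weight matched by the query.

    statement_terms: list of (term, weight), one entry per occurrence.
    fixed_denominator: 0 for the normal calculation, otherwise the value
        used in place of the statement's total term weight.
    max_occurrences: 0 for no limit, otherwise the number of occurrences
        of each term that count toward the calculation.
    """
    seen = {}
    total = 0.0
    matched = 0.0
    for term, weight in statement_terms:
        seen[term] = seen.get(term, 0) + 1
        if max_occurrences and seen[term] > max_occurrences:
            continue  # extra occurrences are not significant
        total += weight
        if term in matched_terms:
            matched += weight
    denominator = fixed_denominator if fixed_denominator > 0 else total
    return matched / denominator if denominator else 0.0


terms = [("red", 2.0), ("red", 2.0), ("shoe", 3.0), ("sale", 1.0)]
print(recall(terms, {"red"}))                        # normal calculation
print(recall(terms, {"red"}, max_occurrences=1))     # repeated term counted once
print(recall(terms, {"red"}, fixed_denominator=10))  # same denominator for all statements
```

Limiting occurrences lowers the score of the statement that repeats "red", and a fixed denominator removes the advantage of statements with small total term weight.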
Document Relevance Parameters
Oracle ATG Web Commerce Search constructs a candidate list of retrieved documents, sorted by a term frequency (TF-IDF) metric. The parameters described in this section all act according to the computed weight of a document. Search relevancy computation uses a weighted sum of factors for a main score and a tie-breaker score, together forming the final relevancy value or weight.
All terms are used in the statement matching algorithm, giving them some effect on the final results.
Parameter Name | Syntax and Default | Parameter Description |
---|---|---|
| Terms whose weight is less than this parameter are excluded from retrieval. | |
| Search uses term expansions for candidate retrieval, but excludes term expansions whose link strength is less than this parameter. The default value of |
Search Fields
Oracle ATG Web Commerce Search indexes structured content and records the fields from which each sentence term vector was created. Queries can then be constrained to a limited set of those fields, also called a fielded search. The following parameter establishes which fields are included in a search of structured content such as Oracle ATG Web Knowledge Manager solutions or an Oracle ATG Web Commerce catalog:
/activeSolutionZones=role:id,role:goal,role:symptom,role:question;
This parameter can also take a special value to denote that all fields should be searched:
/activeSolutionZones=*;
This is the default value.
Oracle ATG Web Commerce Search also indexes unstructured content and records the fields from which each sentence term vector was created. However, in this case, all sentences from the body of the unstructured content reside in a single field, called doc. The title of the content is stored in a role:title field and the URL is stored in a role:url field. The following parameter establishes which fields are included in a search of unstructured content; all other fields are excluded from the search:
/activeSentenceZones=doc;
This parameter can also take a special value to denote that all fields should be searched:
/activeSentenceZones=*;
To include the title and URL fields in the search, use the following:
/activeSentenceZones=doc,role:title,role:url;
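The effect of a zone list on which fields are searched can be sketched in Python. The helper name `is_searchable` is hypothetical; it takes the value of /activeSentenceZones (or /activeSolutionZones) as a comma-separated string and treats * as matching every field.

```python
def is_searchable(field, active_zones):
    """Return True if a sentence from the given field is included in the search."""
    zones = [zone.strip() for zone in active_zones.split(",")]
    return "*" in zones or field in zones


print(is_searchable("role:title", "doc"))                      # title excluded
print(is_searchable("role:title", "doc,role:title,role:url"))  # title included
print(is_searchable("role:url", "*"))                          # all fields searched
```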
Conditional Keyword Interpretation
If the query’s mode is nlp, Oracle ATG Web Commerce Search can treat user queries differently depending on the content of the query.
If the user query consists of N terms or fewer and the query is a simple list of content terms, then the engine will treat the query as a boolean AND on the documents. If the AND of the terms fails to return any results, the normal nlp mode is used instead. If the AND of terms succeeds, only those documents with all of the terms are returned.
The interpretation depends on the form of the user query. It must be a simple list of content terms, such as “book garden summer”, rather than a statement, such as “a book about gardening in the summer”. Search treats the simple list as an AND, but not the more complex statements or questions.
/implicitAndSize=N
The default value is 4. To disable this feature, set the value to 0.
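The decision logic described in this section can be sketched in Python. All names are hypothetical: `run_query` stands in for the engine, `documents` maps a document ID to its set of terms, `is_simple_list` is assumed to be determined elsewhere, and `nlp_search` is a caller-supplied stand-in for normal nlp processing.

```python
def run_query(terms, is_simple_list, documents, nlp_search, implicit_and_size=4):
    """Try an implicit AND for short, simple term lists; otherwise use nlp mode."""
    use_and = (
        implicit_and_size > 0          # 0 disables the feature
        and is_simple_list             # e.g. "book garden summer"
        and 0 < len(terms) <= implicit_and_size
    )
    if use_and:
        hits = [
            doc_id
            for doc_id, doc_terms in documents.items()
            if all(term in doc_terms for term in terms)
        ]
        if hits:
            return hits                # only documents containing every term
    return nlp_search(terms, documents)  # AND not applicable or returned nothing


def any_term_search(terms, documents):
    """Placeholder for nlp mode: documents containing at least one term."""
    return sorted(d for d, ts in documents.items() if any(t in ts for t in terms))


docs = {"a": {"book", "garden", "summer"}, "b": {"book", "winter"}}
print(run_query(["book", "garden", "summer"], True, docs, any_term_search))
print(run_query(["book", "spring"], True, docs, any_term_search))
```

The first query is satisfied by the AND and returns only document a; the second finds no document with both terms, so it falls back to the placeholder nlp search.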