The relQuestSettings attribute represents low-level numeric variables that control the search and relevancy processing. These settings can be declared in the XML using the format:

relQuestSettings="/param=value;/param=value;..."

They can also be changed in the <RelQuestSettings> tag in the global ATG Search configuration file <ATG10dir>\Search10.1\SearchEngine\platform\bin\AEConfig.xml.

<RelQuestSettings>/param=value;/param=value;...</RelQuestSettings>

Query XML attributes override settings in the AEConfig.xml file. The param string is the name of the parameter, and the value is an appropriate value for that parameter. Some parameters take a list of values, separated by commas. The remainder of this section describes the parameters. See also the strategy attribute, which allows you to set a number of parameters simultaneously.
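As a minimal sketch of the string format described above, the following Python helper assembles a relQuestSettings value from a dictionary. The function name build_rel_quest_settings is hypothetical and is not part of ATG Search; only the /param=value; syntax and the comma-separated list form come from this section.

```python
def build_rel_quest_settings(params):
    """Join parameters into the /param=value;/param=value; format.

    Hypothetical helper for illustration only. List or tuple values
    are joined with commas, as described for list-valued parameters.
    """
    parts = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        parts.append(f"/{name}={value};")
    return "".join(parts)


settings = build_rel_quest_settings({
    "retMax": 5000,
    "retLimit": 3000,
    "activeSentenceZones": ["doc", "role:title"],
})
print(settings)
```

The resulting string could be used as the value of the relQuestSettings attribute or as the content of the <RelQuestSettings> tag.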

Matching Statement Parameters

Oracle ATG Web Commerce Search constructs a candidate list of matching statements, sorted by an estimated relevancy metric. From the candidate list, the top candidates are matched in detail and have their final relevancy computed. The parameters described in this section apply to these candidate statements.

Note: The defaults are optimized to balance processing speed with result quality. Be cautious in making changes.

Parameter Name

Syntax and Default

Parameter Description

Statement matching maximum

/retMax=5000;

Limits the number of top candidates.

Statement matching cut-off

/retLimit=3000;

Detailed matching ends before the retMax number of top candidates is reached if the number of relevant statements reaches the value of this parameter.

In this context, a relevant statement is one whose relevancy exceeds the Statement Minimum Relevance (see next row). The value of this parameter should be less than or equal to the retMax value.

Statement minimum relevance

/relevMinFAQ=10;

/relevMinSent=10;

Candidates that have a relevancy score less than this parameter are eliminated from the results. The relevMinFAQ parameter is used for preferred answer statement matches, relevMinSent for all other statement matches.

Statement relevance cut-off

/relevCutoff=0;

Candidates that have a relevancy score less than a percentage of the most relevant statement are eliminated from the results; the percentage is controlled by this parameter.

For example, if the highest relevancy score is 80 and the relevancy cut-off percentage is 70%, then all candidates that have a score of less than 56 (70% of 80) are eliminated. The default is 0, which disables this mechanism.

Estimation minimum relevance

/estimateMin=10;

Only candidates that have an estimated relevancy that is greater than or equal to this parameter are matched in more detail. Normally, this value is less than or equal to the Statement Minimum Relevance threshold.

Statement candidate maximum

/estimateMax=50000;

The number of total candidates is limited by this parameter. Since this parameter takes effect before the statements are sorted, the candidate statements are collected on a first come, first served basis. However, the query terms are processed in inverse frequency order, guaranteeing the most highly weighted terms will fill in the statement candidates first.
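The interaction of these parameters can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's actual implementation: the function name select_statements, the estimate and score callables, and the tuple-based candidates are all hypothetical, and the estimateMax cap is left out for brevity.

```python
def select_statements(candidates, estimate, score, *,
                      estimate_min=10, ret_max=5000, ret_limit=3000,
                      relev_min=10, relev_cutoff=0):
    """Illustrative pipeline only; names and structure are assumptions."""
    # estimateMin: only candidates with a high enough estimate are
    # matched in detail. Sort by the estimate, best first.
    pool = [c for c in candidates if estimate(c) >= estimate_min]
    pool.sort(key=estimate, reverse=True)

    relevant = []
    for cand in pool[:ret_max]:           # retMax caps the top candidates
        s = score(cand)                   # detailed (final) relevancy
        if s >= relev_min:                # below the minimum is eliminated
            relevant.append((cand, s))
        if len(relevant) >= ret_limit:    # retLimit ends matching early
            break

    # relevCutoff: drop anything below a percentage of the best score.
    if relev_cutoff and relevant:
        best = max(s for _, s in relevant)
        threshold = best * relev_cutoff / 100
        relevant = [(c, s) for c, s in relevant if s >= threshold]
    return relevant


# Candidates as (name, estimated relevancy, final relevancy) tuples.
sample = [("a", 90, 80), ("b", 70, 60), ("c", 50, 56),
          ("d", 40, 55), ("e", 5, 99), ("f", 30, 8)]
kept = select_statements(sample, lambda c: c[1], lambda c: c[2],
                         relev_cutoff=70)
print([c[0] for c, _ in kept])
```

With a best score of 80 and a cut-off of 70, the threshold is 56, so candidate d (score 55) is dropped while c (score 56) is kept. Candidate e is never matched in detail because its estimate is below estimateMin.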

Matching Document Parameters

Oracle ATG Web Commerce Search constructs a candidate list of retrieved documents, sorted by a term frequency (TF-IDF) metric. From this list, the top candidates are inspected for matching statements.

Parameter Name

Syntax and Default

Parameter Description

Document retrieval maximum

/maxDocuments=1000;

The number of top candidates is limited by this parameter.

Document candidate maximum

/estimateDocMax=10000;

The number of total candidates is limited by this parameter.

Since this parameter takes effect before the documents are sorted, the candidate documents are collected on a first come, first served basis. However, the query terms are processed in inverse frequency order, guaranteeing the most highly weighted terms will fill in the document candidates first.
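The first come, first served collection described above might look something like the following sketch. It assumes a hypothetical postings dictionary that maps each query term to the documents containing it, and it treats a shorter posting list as a rarer, more highly weighted term; neither detail is taken from the product.

```python
def collect_candidates(postings, candidate_max):
    """Collect unique document ids until candidate_max is reached.

    postings: dict of term -> list of document ids (hypothetical).
    Terms are visited rarest first, so the most highly weighted
    terms contribute their documents before the cap applies.
    """
    collected = []
    seen = set()
    for term in sorted(postings, key=lambda t: len(postings[t])):
        for doc_id in postings[term]:
            if doc_id in seen:
                continue
            if len(collected) >= candidate_max:
                return collected
            seen.add(doc_id)
            collected.append(doc_id)
    return collected


postings = {
    "the": [1, 2, 3, 4, 5],
    "garden": [4, 7],
    "trowel": [9],
}
print(collect_candidates(postings, candidate_max=4))
```

Because the rare terms are processed first, their documents are collected before the common term "the" can fill the remaining slots.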

Filtering by Thesaurus Link Strength

Oracle ATG Web Commerce Search expands query terms using a thesaurus. Thesaurus entries are characterized by link strength, ranging from equality to weak. By default, Search uses all link types during retrieval, but this behavior is controlled by the following parameter:

/link=none;/link=equality;/link=strong;/link=medium;/link=weak;

The value of none clears out any previous values, which would result in no term expansions being used during search. Any subsequent values are appended to the list of link types to use. Normally, only the following four setting combinations should be used:

/link=none;/link=equality;/link=strong;/link=medium;
/link=none;/link=equality;/link=strong;
/link=none;/link=equality;
/link=none;

The first example excludes weak links, the second excludes weak and medium links, the third excludes all but equality links, and the fourth disables all term expansion.
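The four recommended combinations follow a simple pattern: clear the list with none, then append every link type from equality down to the weakest one you want to keep. A small sketch of that pattern, using a hypothetical helper named link_settings:

```python
# Link types ordered from strongest to weakest, as listed above.
LINK_ORDER = ["equality", "strong", "medium", "weak"]


def link_settings(weakest):
    """Build the /link=... string (hypothetical helper).

    weakest: the weakest link type to keep, or None to disable
    term expansion entirely.
    """
    parts = ["/link=none;"]  # clear any previously configured values
    if weakest is not None:
        last = LINK_ORDER.index(weakest)
        parts += [f"/link={name};" for name in LINK_ORDER[:last + 1]]
    return "".join(parts)


print(link_settings("strong"))
print(link_settings(None))
```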

Extending Statement Result Text

By default, Oracle ATG Web Commerce Search retrieves a sentence term vector and constructs a statement result with the text of the sentence as the result string. However, some statements can be very small fragments, such as a section header, and lack enough context to be useful as a search result. Search can extend the statement text with subsequent statement text that is also retrieved by the query. This functionality is controlled by three parameters.

Parameter Name

Syntax and Default

Description

Minimum Answer Length

/minAnswerLength=75;

The length, in number of characters, below which a statement text can be extended. Any statement text whose length is greater than or equal to this value is not extended.

Maximum Answer Length

/maxAnswerLength=250;

The maximum size of an extended statement, in number of characters. If the extended statement size would exceed this value, the statement is not extended. The statement text is extended with successive statements until this limit is reached.

Maximum Intervening Characters

/maxIntervening=5;

The maximum number of intervening characters that can appear between the statement and its extension. Normally, only white space appears between statements, and a large amount of white space tends to indicate a separation of content that should not be joined together.
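A rough sketch of how these three parameters could work together is shown below. It is not the engine's code: the function extend_statement, the (start offset, text) representation of statements, and the choice to fill the gap with spaces are all assumptions made for illustration.

```python
def extend_statement(statements, index,
                     min_len=75, max_len=250, max_gap=5):
    """Extend a short statement with the statements that follow it.

    statements: list of (start_offset, text) pairs in document order,
    all of which were retrieved by the query (hypothetical structure).
    """
    start, text = statements[index]
    if len(text) >= min_len:          # minAnswerLength: long enough already
        return text
    end = start + len(text)
    for next_start, next_text in statements[index + 1:]:
        gap = next_start - end
        if gap < 0 or gap > max_gap:  # maxIntervening: too far apart
            break
        if len(text) + gap + len(next_text) > max_len:
            break                     # maxAnswerLength would be exceeded
        text = text + " " * gap + next_text
        end = next_start + len(next_text)
    return text


statements = [
    (0, "Installation"),
    (13, "Run the installer and follow the prompts."),
    (100, "Then restart."),
]
print(extend_statement(statements, 0))
```

Here the short header "Installation" is joined with the sentence that follows it, but not with the third statement, which is separated by far more than maxIntervening characters.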

Statement Relevance Parameters

The parameters described in this section all act according to the computed weight of a statement. Oracle ATG Web Commerce Search relevancy computation uses a weighted sum of factors for a main score and a tie-breaker score, together forming the final relevancy value or weight.

Parameter Name

Syntax and Default

Parameter Description

Literal weight relevancy factor

/literalWgt=25;

/literalMain=1;

Quantifies how closely the surface query terms match the statement terms, discounting indirect matches through term expansions.

literalWgt must be a non-negative integer from 0 to 100.

literalMain is a Boolean variable. If true, this factor is main; if not, this is a tie-breaker factor.

Exact weight relevancy factor

/exactWgt=0;

/exactMain=1;

Quantifies whether the query text matches exactly within the statement text, without regard to case and white space. The weight of this factor and whether it is a main factor are controlled by the parameters.

exactWgt must be a non-negative integer from 0 to 100.

exactMain is a Boolean variable. If true, this factor is main; if not, this is a tie-breaker factor.

This factor is disabled by default, but can be enabled as part of a search strategy (see the strategy attribute).

Proximity weight relevancy factor

/proxWgt=8;

/proxMain=1;

Quantifies how close together the query terms appear when matched against the statement terms. The weight of this factor and whether it is a main factor are controlled by the parameters.

proxWgt must be a non-negative integer from 0 to 100.

proxMain is a Boolean variable. If true, this factor is main; if not, this is a tie-breaker factor.

Document weight relevancy factor

/docWgt=8;

/docMain=0;

Quantifies how well the document pertains to the query, using the term frequency calculation. The weight of this factor and whether it is a main factor are controlled by the parameters.

docWgt must be a non-negative integer from 0 to 100.

docMain is a Boolean variable. If true, this factor is main; if not, this is a tie-breaker factor.

Context relevancy factor

/contextWgt=17;

/contextMain=0;

/contextSize=2;

Quantifies how well the surrounding statements also match the query. The weight of this factor and whether it is a main factor are controlled by the parameters.

contextWgt must be a non-negative integer from 0 to 100.

contextMain is a Boolean variable. If true, this factor is main; if not, this is a tie-breaker factor.

contextSize controls the size of the context, in number of statements, and must be a positive integer.

Metadata relevancy factor

/metaWgt=25;

/metaMain=1;

/metaWgtMax=100;

Quantifies how well the metadata of the statement’s index item match the weighted properties passed in with the query. The weight of this factor and whether it is a main factor are controlled by the parameters.

metaWgt must be a non-negative integer from 0 to 100.

metaMain is a Boolean variable. If true, this factor is main; if not, this is a tie-breaker factor.

metaWgtMax specifies the maximum weighted property weight.

Match denominator

/matchDenom=0;

The recall factor is the percentage of statement term weight that the query matched. This calculation is biased towards small statements, which have small total term weights. Use this parameter to force all statements to have the same total weight in terms of this recall calculation.

A value of 0 means the normal recall calculation is performed. A positive integer value is used as the recall denominator, in place of the statement’s total term weight.

Duplicate term factor

/dupTermFactor=2;

The recall factor is the percentage of statement term weight that the query matched. This calculation is biased towards statements with repeated terms, since each instance of a term is counted separately. Use this parameter to limit the number of occurrences that are significant in the recall calculation.

A value of 0 means the normal recall calculation is performed. A value of 1 means that only 1 occurrence of each term is used. An integer value greater than 1 means that up to that number of occurrences are used in the calculation.

Exclude unknown terms

/exUnk=1;

A query term that does not exist in the dictionary and has not occurred in the index items provides no information to the system and cannot retrieve anything.

A value of 1 means that the unknown terms are excluded from the query processing and do not affect the relevancy.

A value of 0 means that unknown terms are included in the query processing and will hurt the relevancy of the results (since they cannot retrieve anything).

Special treatment for all-caps terms

/autoAllCapsMode=0;

In a mixed case query, often terms in all capital letters refer to the most important information.

A 0 value means that no special treatment is given to these terms.

A 1 value means that these terms are required to appear in the statement results, the equivalent of the single + query operator.

A value greater than 1 means that these terms are required to appear in the document results, the equivalent of the double ++ query operator.
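To make the weighted-sum and recall descriptions above more concrete, here is a simplified sketch. The weights and main/tie-breaker flags are the defaults listed in this section, but everything else is an assumption: the function names, the 0 to 1 scale for factor values, returning the two scores as a tuple, giving every term a weight of 1, and applying dupTermFactor to both the matched and total weights. The engine's actual formulas are not documented here.

```python
from collections import Counter

# (weight, is_main) pairs taken from the defaults listed above.
DEFAULT_FACTORS = {
    "literal": (25, True),
    "exact": (0, True),
    "prox": (8, True),
    "doc": (8, False),
    "context": (17, False),
    "meta": (25, True),
}


def relevancy(factor_values, factors=DEFAULT_FACTORS):
    """Return (main, tie_breaker) weighted sums (illustrative only)."""
    main = 0.0
    tie = 0.0
    for name, (weight, is_main) in factors.items():
        contribution = weight * factor_values.get(name, 0.0)
        if is_main:
            main += contribution
        else:
            tie += contribution
    return main, tie


def recall_factor(statement_terms, query_terms,
                  match_denom=0, dup_term_factor=2):
    """Percentage of statement term weight matched by the query.

    Every term occurrence has a weight of 1 in this sketch.
    """
    counts = Counter(statement_terms)
    if dup_term_factor > 0:
        # Count at most dup_term_factor occurrences of each term.
        counts = Counter({t: min(n, dup_term_factor)
                          for t, n in counts.items()})
    total = sum(counts.values())
    matched = sum(n for t, n in counts.items() if t in query_terms)
    # matchDenom, when positive, replaces the statement's total weight.
    denom = match_denom if match_denom > 0 else total
    return 100.0 * matched / denom if denom else 0.0


print(relevancy({"literal": 1.0, "prox": 0.5, "context": 1.0}))
statement = ["red", "red", "red", "shoe"]
print(recall_factor(statement, {"red"}, dup_term_factor=0))
print(recall_factor(statement, {"red"}, dup_term_factor=1))
```

Sorting results by the returned tuple would rank them by main score first and use the tie-breaker score only between equal main scores, which is one plausible reading of how the two scores combine.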

Document Relevance Parameters

Oracle ATG Web Commerce Search constructs a candidate list of retrieved documents, sorted by a term frequency (TF-IDF) metric. The parameters described in this section all act according to the computed weight of a document. Search relevancy computation uses a weighted sum of factors for a main score and a tie-breaker score, together forming the final relevancy value or weight.

All terms are used in the statement matching algorithm, giving them some effect on the final results.

Parameter Name

Syntax and Default

Parameter Description

Document Weight Term Threshold

/docWgtTermThresh=20;

Terms whose weight is less than this parameter are excluded from retrieval.

Document Weight Link Threshold

/docWgtExpansion=equal;

Search uses term expansions for candidate retrieval, but excludes term expansions whose link strength is less than this parameter.

The default value of equal restricts the retrieval to only equally-linked terms, retrieving results that are most similar to the original query terms. The other valid values are: strong, medium, and weak.

Search Fields

Oracle ATG Web Commerce Search indexes structured content and records the fields from which each sentence term vector was created. Queries can then be constrained to a limited set of those fields, also called a fielded search. The following parameter establishes which fields are included in a search of structured content such as Oracle ATG Web Knowledge Manager solutions or an Oracle ATG Web Commerce catalog:

/activeSolutionZones=role:id,role:goal,role:symptom,role:question;

This parameter can also take a special value to denote that all fields should be searched:

/activeSolutionZones=*;

This is the default value.

Oracle ATG Web Commerce Search also indexes unstructured content and records the fields from which each sentence term vector was created. However, in this case, all sentences from the body of the unstructured content reside in a single field, called doc. The title of the content is stored in a role:title field and the URL is stored in a role:url field. The following parameter establishes which fields are included in a search of unstructured content; all other fields are excluded from the search:

/activeSentenceZones=doc;

This parameter can also take a special value to denote that all fields should be searched:

/activeSentenceZones=*;

To include the title and URL fields in the search, use the following:

/activeSentenceZones=doc,role:title,role:url;
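The field lists above can be read as a simple membership test, with * acting as a wildcard. The following sketch shows that interpretation; parse_zones and is_searched are hypothetical names, not ATG Search functions.

```python
def parse_zones(value):
    """Return None for '*' (all fields), else the set of listed fields."""
    value = value.strip()
    if value == "*":
        return None
    return {zone.strip() for zone in value.split(",") if zone.strip()}


def is_searched(field, zones):
    """True if statements from this field take part in the search."""
    return zones is None or field in zones


zones = parse_zones("doc,role:title,role:url")
print(is_searched("role:title", zones))
print(is_searched("role:title", parse_zones("doc")))
```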

Conditional Keyword Interpretation

If the query’s mode is nlp, Oracle ATG Web Commerce Search can treat user queries differently depending on the content of the query.

If the user query consists of N terms or fewer and the query is a simple list of content terms, then the engine treats the query as a Boolean AND on the documents. If the AND of the terms fails to return any results, the normal nlp mode is used instead. If the AND of the terms succeeds, only those documents with all of the terms are returned.

The interpretation depends on the form of the user query. It must be a simple list of content terms, such as “book garden summer”, rather than a statement, such as “a book about gardening in the summer”. Search treats the simple list as an AND, but not the more complex statements or questions.

/implicitAndSize=N

The default value is 4. To disable this feature, set the value to 0.
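The decision described in this section can be sketched as follows. This is only an approximation under stated assumptions: the stop-word test stands in for however the engine actually recognizes a simple list of content terms, and the index dictionary, the nlp_search callable, and the function name search are all hypothetical.

```python
# Hypothetical stand-in for detecting non-content words.
STOP_WORDS = {"a", "an", "the", "about", "in", "of", "on", "for", "to"}


def search(query, index, nlp_search, implicit_and_size=4):
    """Try an implicit AND for short term lists, else fall back to nlp.

    index: dict of document id -> set of terms (hypothetical).
    nlp_search: callable that performs the normal nlp-mode search.
    """
    terms = query.lower().split()
    is_simple_list = not any(t in STOP_WORDS for t in terms)
    # A value of 0 disables the feature, since no query has 0 terms.
    if 0 < len(terms) <= implicit_and_size and is_simple_list:
        hits = [doc_id for doc_id, words in index.items()
                if all(t in words for t in terms)]
        if hits:
            return hits      # only documents containing every term
    return nlp_search(query)  # AND not applicable or returned nothing


index = {
    1: {"book", "garden", "summer"},
    2: {"book", "garden"},
}
fallback = lambda q: ["nlp results"]
print(search("book garden summer", index, fallback))
print(search("a book about gardening in the summer", index, fallback))
```

The first query is a short list of content terms and is answered by the AND; the second is a full statement, so it goes straight to the normal nlp processing.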