Question
What are the details of how search scoring works for solutions and service requests? I need information about how to improve the search effectiveness of text score and how to calculate normalized usage score. What are the profiles related to these elements?
Answer
The Oracle Knowledge Management Simple Search UI shows a set of tabs, each containing the search results for a particular repository such as Solutions, Service Requests, or Enterprise Search. It can also show search results for a custom-created repository. Because the search results for each repository, that is, each tab on the UI, can be generated by a different search method, no single search scoring algorithm is used across the application. For example, a default repository like Solutions uses Oracle Text, while an Enterprise Search repository delegates the search to an Oracle Enterprise Search server. Customers can write their own repositories and use whatever method they choose to return the search results.
Most questions about the scoring algorithms specifically refer to out-of-the-box seeded repositories, particularly Solutions and Service Requests. Because Knowledge Management uses Oracle Text for searching on these repositories, the scoring generally inherits the Oracle Text scoring algorithm.
Some general guidelines about searching are as follows:
Compare search scores for each search result row relative to each other, within a single search result set.
Comparing the absolute score number across search result sets is not meaningful.
The absolute score is not predictable and varies with the search method, whether the index has been optimized, and other factors. Therefore, do not focus on the exact value of a search score.
In general, overall average scores are higher using the "Any of the words" search method as compared to "All of the words."
The following sections contain search notes specific to each repository type.
Some specific information about scores follows:
Display score uses the formula Display score = (Text score)*(1 - x/100) + (Normalized usage score)*(x/100), where x is a percentage set by the Knowledge Management administrator.
Oracle Text calculates Text score at runtime.
The Knowledge Management concurrent batch program calculates the Normalized usage score.
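The weighting can be sketched in a few lines. This is only an illustration, assuming both scores are on a 0-100 scale and that the text score is weighted by (100 - x)% and the usage score by x%, as in the ratio given for Solutions later in this answer. The function name display_score is hypothetical and is not part of Knowledge Management.

```python
def display_score(text_score: float, usage_score: float, usage_pct: float) -> float:
    """Blend a text score and a normalized usage score.

    usage_pct is the administrator-supplied percentage x (0-100).
    Assumption: both input scores are already on a 0-100 scale.
    """
    if not 0 <= usage_pct <= 100:
        raise ValueError("usage_pct must be between 0 and 100")
    weight = usage_pct / 100
    return text_score * (1 - weight) + usage_score * weight


# With a weighting of 0%, usage has no effect on the result.
print(display_score(80, 20, 0))   # 80.0
# With 25% usage weighting, a low usage score pulls the result down.
print(display_score(80, 20, 25))  # 65.0
```

At 100% the result would equal the usage score alone, so the percentage acts as a dial between pure text relevancy and pure usage.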
Searches on Solutions
Keyword Search: Keyword criteria are matched against the solution title, statements (summary and detail), and the name and description of any items (products) associated with the solution.
(Optional) Attribute filters: Solution type and one or more products, platforms, and categories can act as search result filters.
* Product, platform, and category criteria can be used either as strict filters or as influencing factors, depending on the profile option Knowledge: Search results score includes weighting from Product, Platform and category filters. If this option is turned on, these attribute criteria do not behave as filters and only influence the scoring of solutions.
**As a filter: A solution must match the keyword criteria and also be associated with the products, platforms, and categories specified in the search criteria to qualify as a search result. Scoring is based entirely on the keyword match.
**As an influencing factor: A solution must match the keyword criteria to qualify as a search result. If it matches one or more product criteria, it gains one matched factor. If it matches one or more platform criteria, it gains a second matched factor. If it matches one or more category criteria, it gains a third. Solutions with more matched factors rank higher, that is, have a higher score, than solutions with fewer matched factors. Solutions with the same number of matched factors are ranked according to keyword match relevancy.
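The two behaviors can be contrasted with a small sketch. It models only the resulting order, not the actual score values, which the product computes internally. The Solution class, the rank function, and the rule that one shared value is enough for an attribute group to match are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Solution:
    name: str
    text_score: float  # keyword relevancy, assumed to come from Oracle Text
    products: set = field(default_factory=set)
    platforms: set = field(default_factory=set)
    categories: set = field(default_factory=set)


def matched_factors(sol, products, platforms, categories):
    """Count the attribute groups (0-3) that share at least one value with the criteria."""
    pairs = [(sol.products, products), (sol.platforms, platforms), (sol.categories, categories)]
    return sum(1 for have, want in pairs if want and have & want)


def rank(solutions, products=frozenset(), platforms=frozenset(),
         categories=frozenset(), as_filter=True):
    """Order keyword-matched solutions under either profile setting."""
    if as_filter:
        # Strict filter: every specified group must match; order by text score only.
        needed = sum(1 for want in (products, platforms, categories) if want)
        kept = [s for s in solutions
                if matched_factors(s, products, platforms, categories) == needed]
        return sorted(kept, key=lambda s: s.text_score, reverse=True)
    # Influencing factor: nothing is dropped; more matched factors rank first,
    # and ties are broken by text score.
    return sorted(
        solutions,
        key=lambda s: (matched_factors(s, products, platforms, categories), s.text_score),
        reverse=True,
    )


sols = [
    Solution("A", 90),
    Solution("B", 40, products={"Printer"}),
    Solution("C", 60, products={"Printer"}, platforms={"Linux"}),
]
criteria = dict(products={"Printer"}, platforms={"Linux"})
print([s.name for s in rank(sols, as_filter=True, **criteria)])   # ['C']
print([s.name for s in rank(sols, as_filter=False, **criteria)])  # ['C', 'B', 'A']
```

In the second call, solution A has the best keyword match but ranks last because it matches none of the attribute criteria.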
(Optional) Solution Usage influencing the score: Knowledge Management can be configured to have a solution's usage influence the overall score through the profile option Knowledge: Percentage of usage weighting in solution search result score, whose default is 0%.
* The overall score is a combination of the text score and the usage score. The text score is the keyword match relevancy derived from Oracle Text. The usage score is described below. They are combined in a percentage ratio based on the profile option: (Text Score)*(100-Profile value) + (Usage Score)*(Profile value).
Important: Solution usage scores are incorporated after text matching and thresholding. The search first identifies which solutions match the text and attribute criteria and scores them according to text match relevancy. It then takes the top N solutions, based on the text score only. The threshold, N, is controlled by a profile option. The usage scores for those top N solutions are then retrieved and combined with the text scores, and finally the solutions are re-sorted based on the overall combined score. This means that a solution with a very high usage score but a low text relevancy match may not appear in the search results at all, because it was filtered out by the top N threshold.
* Deriving the Usage Score: The usage score for solutions is calculated by the concurrent program Knowledge Management Calculate Solution Usage Score. The usage score is the sum of several factors that are adjusted by age and then normalized to a 0-100 range across all solutions.
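The order of operations described in the Important note above can be sketched as follows. This is an illustration only: the function name search_with_usage and the sample scores are hypothetical, both scores are assumed to be on a 0-100 scale, and the weights are divided by 100 so that the combined score stays on that scale (this does not change the ordering).

```python
def search_with_usage(text_scores, usage_scores, top_n, usage_pct):
    """Return (solution_id, combined_score) pairs, best first.

    text_scores and usage_scores map a solution ID to a score.
    """
    # Step 1: rank by text score only and keep the top N solutions.
    top = sorted(text_scores, key=text_scores.get, reverse=True)[:top_n]
    # Step 2: retrieve usage scores for the survivors and blend the two scores.
    w = usage_pct / 100
    combined = {
        sid: text_scores[sid] * (1 - w) + usage_scores.get(sid, 0.0) * w
        for sid in top
    }
    # Step 3: re-sort on the combined score.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


text = {"S1": 90, "S2": 70, "S3": 10}
usage = {"S1": 5, "S2": 80, "S3": 100}
result = search_with_usage(text, usage, top_n=2, usage_pct=50)
print(result)  # [('S2', 75.0), ('S1', 47.5)]
```

S3 has the highest usage score but never reaches the blending step, because its text score places it outside the top 2. S2 overtakes S1 only after the usage scores are combined.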
Service Requests
Keyword Search: Keyword criteria are matched against the service request summary and non-private notes including the full note detail.
* The keyword search method, "All of the keywords" or "Any of the keywords", is controlled by the Knowledge Management profile option Knowledge: Default Searching method.
Product/item is used as a strict filter on the search results. If specified, then a service request must be filed for that specific product to qualify as a search result.
Note (internal API implementation): If more than one product is specified, the API uses only the last one and ignores the others.
Score: This generally just follows the standard Oracle Text scoring.
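The product filter behavior noted above for service requests can be illustrated with a short sketch. The function name filter_service_requests and the tuple layout are hypothetical and do not reflect the actual API.

```python
def filter_service_requests(requests, products):
    """Apply the strict product filter to keyword-matched service requests.

    requests is a list of (sr_id, product) tuples.
    products is the list of products given in the search criteria.
    """
    if not products:
        # No product criteria: nothing is filtered out.
        return list(requests)
    # Only the last specified product is used; earlier ones are ignored.
    effective = products[-1]
    return [r for r in requests if r[1] == effective]


requests = [(101, "Printer"), (102, "Scanner"), (103, "Printer")]
print(filter_service_requests(requests, ["Printer", "Scanner"]))  # [(102, 'Scanner')]
```

Both printer requests are dropped even though "Printer" was specified, which is why a search with several products can return fewer results than expected.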