clustering

Query-time clustering dynamically organizes search results into groups, giving end users different views of the top results. The documents within one group, called a cluster node, share common topics or property values. A cluster node that contains a large document set can be divided into child cluster nodes, creating a hierarchy. Users can navigate directly to a specific cluster node. Effective query-time clustering balances the quality of the clusters against the time needed to compute them.

Object Type

Universal

State Properties

Property    Value
status      ACTIVE, INACTIVE

Supported Operations

activate
deactivate
export
getState
update

Administration GUI Page

Global Settings - Query-Time Clustering Configuration

XML Description

The <search:clustering> element describes clustering:

<search:clustering>
   <search:maxTreeDepth>
   <search:maxChildrenPerNode>
   <search:minDocsPerNode>
   <search:minOccurrenceWords>
   <search:maxExtractWords>
   <search:minOccurrencePhrases>
   <search:maxExtractPhrases>
   <search:maxPhraseLength>

Element Descriptions

<search:clustering>

Contains all of the elements for clustering parameters, which are described in the following paragraphs.

<search:maxTreeDepth>

Maximum number of levels in a cluster node hierarchy. (Optional)

A cluster node that contains a large document set can be divided into child cluster nodes. The resulting hierarchy gives end users a quick overview of the results. They can navigate directly to a specific cluster node, or refine their query by combining the original query with the cluster results.

<search:maxChildrenPerNode>

Maximum number of cluster nodes on each level.

<search:minDocsPerNode>

Minimum number of documents in a cluster node.
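
The three structural limits above (maxTreeDepth, maxChildrenPerNode, minDocsPerNode) can be pictured as constraints on a tree of cluster nodes. The following Python sketch is only an illustration: the ClusterNode class and the violations function are hypothetical names, not part of the product. It assumes that the top-level node counts as level 1 and that the child limit applies to each parent node.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    """Hypothetical in-memory cluster node; not a product data structure."""
    label: str
    doc_count: int
    children: list = field(default_factory=list)


def violations(node, max_tree_depth, max_children_per_node,
               min_docs_per_node, level=1):
    """Collect readable limit violations for a node and its descendants.

    Assumption: the top-level node is level 1, and the child limit is
    checked against each parent node separately.
    """
    problems = []
    if level > max_tree_depth:
        problems.append(f"{node.label}: level {level} exceeds maxTreeDepth")
    if node.doc_count < min_docs_per_node:
        problems.append(
            f"{node.label}: {node.doc_count} docs is below minDocsPerNode")
    if len(node.children) > max_children_per_node:
        problems.append(
            f"{node.label}: {len(node.children)} children exceeds "
            "maxChildrenPerNode")
    for child in node.children:
        problems.extend(
            violations(child, max_tree_depth, max_children_per_node,
                       min_docs_per_node, level + 1))
    return problems


# Made-up sample tree: "indexing" sits on level 3 and holds only 2 documents.
tree = ClusterNode("results", 40, [
    ClusterNode("databases", 25, [ClusterNode("indexing", 2)]),
    ClusterNode("networking", 15),
])

for problem in violations(tree, max_tree_depth=2,
                          max_children_per_node=5, min_docs_per_node=3):
    print(problem)
```

With these sample limits, only the "indexing" node is reported: it lies below the permitted depth and holds too few documents.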

<search:minOccurrenceWords>

Minimum number of times a word must occur to be extracted for topic clustering.

<search:maxExtractWords>

Maximum number of words to be extracted for topic clustering.

<search:minOccurrencePhrases>

Minimum number of times a phrase must occur to be extracted for topic clustering.

<search:maxExtractPhrases>

Maximum number of phrases to be extracted for topic clustering.

<search:maxPhraseLength>

Maximum number of words in a phrase to be extracted for topic clustering.
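
The minimum-occurrence and maximum-extract settings work as a pair: one sets a frequency floor, the other caps how many candidates survive. The Python sketch below shows one plausible reading of that interaction. The select_topic_terms function and the sample data are hypothetical, and the product's actual extraction algorithm is not described here and may differ.

```python
from collections import Counter


def select_topic_terms(terms, min_occurrence, max_extract):
    """Keep terms seen at least min_occurrence times, capped at max_extract.

    Hypothetical helper: an illustration of a frequency floor combined
    with a cap, not the product's actual algorithm.
    """
    counts = Counter(terms)
    frequent = [(term, n) for term, n in counts.items() if n >= min_occurrence]
    # Most frequent first; ties are broken alphabetically for a stable result.
    frequent.sort(key=lambda item: (-item[1], item[0]))
    return [term for term, _ in frequent[:max_extract]]


# Made-up word occurrences gathered from a set of top results.
words = ["index"] * 5 + ["crawler"] * 3 + ["query"] * 3 + ["cache"]

print(select_topic_terms(words, min_occurrence=3, max_extract=2))
# prints ['index', 'crawler']
```

The same helper applies to phrases if each phrase is passed as a single string, for example after discarding phrases longer than maxPhraseLength words.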

Example

This XML document configures clustering:

<?xml version="1.0" encoding="UTF-8"?>
<search:config productVersion="11.1.2.0.0" xmlns:search="http://xmlns.oracle.com/search">
   <search:clustering>
      <search:maxTreeDepth>7</search:maxTreeDepth>
      <search:maxChildrenPerNode>125</search:maxChildrenPerNode>
      <search:minDocsPerNode>1</search:minDocsPerNode>
      <search:minOccurrenceWords>9</search:minOccurrenceWords>
      <search:maxExtractWords>18</search:maxExtractWords>
      <search:minOccurrencePhrases>4</search:minOccurrencePhrases>
      <search:maxExtractPhrases>21</search:maxExtractPhrases>
      <search:maxPhraseLength>7</search:maxPhraseLength>
   </search:clustering>
</search:config>
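
Before submitting a configuration document like the one above, it can help to read it back and check the values. The following Python sketch uses only the standard library to load the clustering settings into a dictionary. The read_clustering_settings function is a hypothetical helper, and the sketch assumes that every child element holds an integer, as in the example.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example document.
NS = {"search": "http://xmlns.oracle.com/search"}

CONFIG_XML = """<?xml version="1.0" encoding="UTF-8"?>
<search:config productVersion="11.1.2.0.0" xmlns:search="http://xmlns.oracle.com/search">
   <search:clustering>
      <search:maxTreeDepth>7</search:maxTreeDepth>
      <search:maxChildrenPerNode>125</search:maxChildrenPerNode>
      <search:minDocsPerNode>1</search:minDocsPerNode>
      <search:minOccurrenceWords>9</search:minOccurrenceWords>
      <search:maxExtractWords>18</search:maxExtractWords>
      <search:minOccurrencePhrases>4</search:minOccurrencePhrases>
      <search:maxExtractPhrases>21</search:maxExtractPhrases>
      <search:maxPhraseLength>7</search:maxPhraseLength>
   </search:clustering>
</search:config>"""


def read_clustering_settings(xml_text):
    """Return the children of <search:clustering> as a dict of ints.

    Hypothetical helper; assumes every setting is an integer.
    """
    # Parse from bytes so the encoding declaration is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    clustering = root.find("search:clustering", NS)
    if clustering is None:
        raise ValueError("no <search:clustering> element found")
    settings = {}
    for child in clustering:
        # Tags arrive as "{namespace-uri}localName"; keep the local name.
        local_name = child.tag.split("}", 1)[-1]
        settings[local_name] = int(child.text.strip())
    return settings


settings = read_clustering_settings(CONFIG_XML)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Parsing the document this way also confirms that it is well-formed XML and that the search namespace prefix is declared.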