The ATG Search categorization feature assigns indexed content and queries to one or more categories. Categories are also known as topics, and are grouped together hierarchically into taxonomies.

The taxonomy is a simple tree structure, in which each category has a unique ID, a name, an optional display label, and zero or more child categories. The names and structure of the taxonomy are customer-defined using Search Administration.
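
As an illustration of the structure described above, the following is a minimal sketch of a taxonomy tree. The `Category` class, its field names, and the sample category names are assumptions made for this example; they are not the ATG Search data model or API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Category:
    """One node of a taxonomy tree (illustrative only)."""
    id: str                       # unique within the taxonomy
    name: str
    label: Optional[str] = None   # optional display label
    children: List["Category"] = field(default_factory=list)

    def add_child(self, child: "Category") -> "Category":
        """Attach a child category and return it for chaining."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Category"]:
        """Yield this category and all of its descendants, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()


# Hypothetical taxonomy: one top-level category with two children.
root = Category("c1", "Product X", label="Product X Family")
install = root.add_child(Category("c2", "Installation"))
root.add_child(Category("c3", "Troubleshooting"))
install.add_child(Category("c4", "Upgrades"))

ids = [c.id for c in root.walk()]
```

A category with an empty `children` list is a leaf, and walking from the root visits every category exactly once because the structure is a tree.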

For each category, administrators can define one or more rules that determine how content should be assigned to the category. The rules can apply to the index items, the user query, or any other text content. The categorization results are used in several ways:

ATG Search uses the taxonomy rules to assign categories to indexed content or to input queries. The process has four major stages:

  1. ATG Search matches term vectors in the content or query against all the rules in the taxonomy.

  2. All of the rules that successfully matched are collected for each category.

  3. From the collected rules, ATG Search determines which categories to assign to the content. This determination depends on several factors, including parameter values (see the Categorize Query chapter in this guide for details).

    The top applicable categories that meet the thresholds are selected as assignments. For index items, these assignments are converted into document sets linking the category to the item. For input queries, these assignments can either become query refinements returned in the response or be applied automatically as query document set constraints.

  4. After the category assignments have been selected, the following two processes take place.

    Taxonomy pruning is an optional process that eliminates category assignments that violate the requirement that content assigned to a child category must also be assigned to the parent. Taxonomy pruning works globally across all categories. The figure that follows illustrates the pruning process. Because item D1 is the only item assigned to the top-level category Product X, the other items assigned to its sub-categories are pruned.

    Programmatic routines execute outside the normal rule-matching algorithm. After all assignments have been made and any optional pruning has occurred, the routines execute according to their parameters. The result is additional category assignments, as specified by these special rules. Assignments made by routines are not subject to assignment parameters such as topicMaximum.
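
The stages above can be sketched in simplified form as follows. This is not the ATG Search algorithm: keyword rules with additive weights stand in for the real rule matching, and the `threshold` and `topic_maximum` arguments, the rule format, and all function names are assumptions made for this example. Programmatic routines are not modeled.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical rule format: (category_id, keyword, weight).
Rule = Tuple[str, str, float]


def match_rules(text: str, rules: List[Rule]) -> Dict[str, List[float]]:
    """Stages 1 and 2: match rules against the text and collect hits per category."""
    terms = set(text.lower().split())
    matched: Dict[str, List[float]] = defaultdict(list)
    for category_id, keyword, weight in rules:
        if keyword in terms:
            matched[category_id].append(weight)
    return matched


def select_categories(matched: Dict[str, List[float]],
                      threshold: float, topic_maximum: int) -> List[str]:
    """Stage 3: keep the top-scoring categories that meet the threshold."""
    scores = {cat: sum(weights) for cat, weights in matched.items()}
    eligible = [cat for cat, score in scores.items() if score >= threshold]
    eligible.sort(key=lambda cat: (-scores[cat], cat))
    return eligible[:topic_maximum]


def prune(assignments: Dict[str, Set[str]],
          parent_of: Dict[str, Optional[str]]) -> Dict[str, Set[str]]:
    """Stage 4 (optional): drop an item from a category unless it is
    also assigned to every ancestor of that category."""
    pruned: Dict[str, Set[str]] = {}
    for cat, items in assignments.items():
        kept = set()
        for item in items:
            ancestor = parent_of.get(cat)
            while ancestor is not None and item in assignments.get(ancestor, set()):
                ancestor = parent_of.get(ancestor)
            if ancestor is None:  # reached the root without a violation
                kept.add(item)
        pruned[cat] = kept
    return pruned


# Hypothetical taxonomy, rules, and index items.
parent_of = {"product_x": None, "install": "product_x", "trouble": "product_x"}
rules = [("product_x", "widget", 1.0), ("install", "install", 0.6),
         ("install", "setup", 0.6), ("trouble", "error", 0.4)]
docs = {"D1": "widget install setup guide",
        "D2": "install setup steps",
        "D3": "error codes"}

assignments: Dict[str, Set[str]] = defaultdict(set)
for doc_id, text in docs.items():
    for cat in select_categories(match_rules(text, rules), 0.5, 2):
        assignments[cat].add(doc_id)

pruned = prune(assignments, parent_of)
```

In this sample data, D1 and D2 are both assigned to the `install` category, but only D1 is also assigned to its parent `product_x`, so pruning removes D2 from `install`. D3 matches a rule whose weight falls below the assumed threshold and receives no assignment.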

For information on how to create topics and taxonomies and to apply them during content indexing, see the ATG Search Administration Guide. For information on applying categories to queries, see the Categorize Query chapter of this guide.

 