Each rule is assigned a weight value from -100 to 100. Use a negative weight value to lower the rating for topic assignment. Use a positive weight value to raise the rating for topic assignment. Use a weight value of zero (0) to preempt other rule matches.

For example, for the topic union strike, you might define the following rules (weights are in parentheses):

1. strike three (0)
2. strike out (0)
3. strike two (0)
4. strike one (0)
5. union strike (100)
6. strike (100)
7. baseball (-100)

Any document containing the word strike matches rule 6; however, these documents may or may not be about union strikes. Any document that contains one of the first four phrases also matches rule 6, but because rules 1-4 appear first, they preempt the rule 6 match. (For example, strike out matches rules 2 and 6. Because rule 2 comes first, it is selected and contributes a weight value of 0.) A document containing the phrase union strike matches rule 5 and receives a weight value of 100. Rule 7 lowers the topic rating by 100 for every mention of baseball in the document.
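The preemption behavior described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name score_topic, the tokenization, and the assumption that the weights of all selected matches are simply summed are hypothetical choices made for illustration. Rules are tried in list order, and a word already claimed by an earlier rule cannot be matched again by a later one.

```python
import re

# Rules for the example topic "union strike", in priority order (phrase, weight).
RULES = [
    ("strike three", 0),
    ("strike out", 0),
    ("strike two", 0),
    ("strike one", 0),
    ("union strike", 100),
    ("strike", 100),
    ("baseball", -100),
]

def score_topic(text, rules=RULES):
    """Return a topic rating for text (hypothetical scoring model).

    Assumption: earlier rules claim the words they match, so later rules
    cannot match the same words, and the weights of all selected matches
    are added together.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    claimed = [False] * len(tokens)
    total = 0
    for phrase, weight in rules:
        words = phrase.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            # Select the match only if none of its words is already claimed.
            if tokens[i:i + n] == words and not any(claimed[i:i + n]):
                for j in range(i, i + n):
                    claimed[j] = True
                total += weight
    return total
```

Under these assumptions, a sentence containing only strike out scores 0 because rule 2 preempts rule 6, while a sentence containing union strike scores 100 from rule 5 alone.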

The actual assignment of topics to documents occurs during indexing. The options that control this assignment are set in Text Processing Option (TPO) sets (see the Text Processing Option Sets chapter of this guide for more information):

  • Topic limit per doc—Controls the maximum number of topics to assign to an item; the default is 10.

  • Topic relevance threshold

  • Topic confidence threshold

The relevance and confidence thresholds set the values that a topic's rule weights must score a document above for the topic to be considered relevant to that item. The default values are 1 and 0, respectively.
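As a rough illustration of how these three options could interact, the following Python sketch filters candidate topics by both thresholds and then applies the topic limit. The names Candidate and assign_topics are hypothetical. The sketch assumes each candidate topic already carries a relevance score and a confidence score, uses a strict "above" comparison, and ranks by relevance. The guide does not state how the scores are computed, how a score equal to a threshold is treated, or how topics are ranked.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    """A candidate topic with hypothetical, precomputed scores."""
    topic: str
    relevance: float
    confidence: float

def assign_topics(candidates, topic_limit=10,
                  relevance_threshold=1, confidence_threshold=0):
    """Return the topics to assign to one item (illustrative only).

    Assumptions: a topic qualifies only if both scores are strictly
    above their thresholds, and qualifying topics are ranked by relevance.
    """
    eligible = [
        c for c in candidates
        if c.relevance > relevance_threshold
        and c.confidence > confidence_threshold
    ]
    eligible.sort(key=lambda c: c.relevance, reverse=True)
    # Keep at most topic_limit topics per item.
    return [c.topic for c in eligible[:topic_limit]]
```

With the default values, a candidate scoring 100 for relevance and 50 for confidence is assigned, while one with a relevance of 0 or a negative confidence is dropped.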

 