Example: Calculating Rules from Frequent Itemsets
Calculate association rules from frequent itemsets, using examples to illustrate rule generation and confidence calculation.
The following tables show the itemsets and frequent itemsets that were calculated in "Association". The frequent itemsets are the itemsets that occur with a minimum support of 67%; that is, at least 2 of the 3 transactions must include the itemset.
Table 7-9 Itemsets
Transaction | Itemsets |
---|---|
11 | (B,D) (B,E) (D,E) (B,D,E) |
12 | (A,B) (A,C) (A,E) (B,C) (B,E) (C,E) (A,B,C) (A,B,E) (A,C,E) (B,C,E) |
13 | (B,C) (B,D) (B,E) (C,D) (C,E) (D,E) (B,C,D) (B,C,E) (B,D,E) (C,D,E) |
Table 7-10 Frequent Itemsets with Minimum Support 67%
Itemset | Transactions | Support |
---|---|---|
(B,C) | 12 and 13 | 67% |
(B,D) | 11 and 13 | 67% |
(B,E) | 11, 12, and 13 | 100% |
(C,E) | 12 and 13 | 67% |
(D,E) | 11 and 13 | 67% |
(B,C,E) | 12 and 13 | 67% |
(B,D,E) | 11 and 13 | 67% |
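The support filter behind Table 7-10 can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the item contents of each transaction are an assumption inferred from the itemsets listed in Table 7-9, the function name `frequent_itemsets` is hypothetical, and the threshold is expressed as a transaction count (2 of 3) to avoid rounding issues with 67%.

```python
from itertools import combinations

# Assumed item contents of each transaction, inferred from Table 7-9.
transactions = {
    11: {"B", "D", "E"},
    12: {"A", "B", "C", "E"},
    13: {"B", "C", "D", "E"},
}

def frequent_itemsets(transactions, min_count, sizes=(2, 3)):
    """Map each itemset found in at least min_count transactions to their IDs."""
    items = sorted(set().union(*transactions.values()))
    result = {}
    for size in sizes:
        for candidate in combinations(items, size):
            # IDs of the transactions that contain every item of the candidate
            ids = sorted(
                tid for tid, contents in transactions.items()
                if set(candidate) <= contents
            )
            if len(ids) >= min_count:
                result[candidate] = ids
    return result

# Minimum support of 67% means at least 2 of the 3 transactions.
frequent = frequent_itemsets(transactions, min_count=2)
for itemset, ids in frequent.items():
    print(itemset, ids, f"{len(ids) / len(transactions):.0%}")
```

Under these assumptions the script lists the same seven itemsets as Table 7-10, with (B,E) as the only one present in all three transactions.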
A rule expresses a conditional probability. Confidence in a rule is calculated by dividing the probability of the items occurring together by the probability of the occurrence of the antecedent.
For example, if B (antecedent) is present, what is the chance that C (consequent) is also present? What is the confidence for the rule "IF B, THEN C"?
As shown in Table 7-9:
- All 3 transactions include B (3/3 or 100%)
- Only 2 transactions include both B and C (2/3 or 67%)
- Therefore, the confidence of the rule "IF B, THEN C" is 67/100 or 67%.
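The same calculation can be written as a minimal Python sketch. The helper names `support` and `confidence` are hypothetical, and the transaction contents are an assumption inferred from Table 7-9. Exact fractions are used so that 2/3 is not distorted by rounding to 67%.

```python
from fractions import Fraction

# Assumed item contents of transactions 11, 12, and 13 (inferred from Table 7-9).
transactions = [
    {"B", "D", "E"},
    {"A", "B", "C", "E"},
    {"B", "C", "D", "E"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    count = sum(1 for t in transactions if set(itemset) <= t)
    return Fraction(count, len(transactions))

def confidence(antecedent, consequent, transactions):
    """prob(antecedent and consequent) / prob(antecedent).

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    together = support(set(antecedent) | set(consequent), transactions)
    return together / support(antecedent, transactions)

conf_b_c = confidence({"B"}, {"C"}, transactions)  # IF B, THEN C
conf_c_b = confidence({"C"}, {"B"}, transactions)  # IF C, THEN B
print(f"IF B, THEN C: {float(conf_b_c):.0%}")
print(f"IF C, THEN B: {float(conf_c_b):.0%}")
```

Swapping antecedent and consequent changes the result: every transaction that contains C also contains B, so the reverse rule has a higher confidence than "IF B, THEN C".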
The following table shows the rules that can be derived from the frequent itemsets in Table 7-10.
Table 7-11 Frequent Itemsets and Rules
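Frequent Itemset | Rules | prob(antecedent and consequent) / prob(antecedent) | Confidence |
---|---|---|---|
(B,C) | IF B, THEN C | 67/100 | 67% |
(B,C) | IF C, THEN B | 67/67 | 100% |
(B,D) | IF B, THEN D | 67/100 | 67% |
(B,D) | IF D, THEN B | 67/67 | 100% |
(B,E) | IF B, THEN E | 100/100 | 100% |
(B,E) | IF E, THEN B | 100/100 | 100% |
(C,E) | IF C, THEN E | 67/67 | 100% |
(C,E) | IF E, THEN C | 67/100 | 67% |
(D,E) | IF D, THEN E | 67/67 | 100% |
(D,E) | IF E, THEN D | 67/100 | 67% |
(B,C,E) | IF B and C, THEN E | 67/67 | 100% |
(B,C,E) | IF B and E, THEN C | 67/100 | 67% |
(B,C,E) | IF C and E, THEN B | 67/67 | 100% |
(B,D,E) | IF B and D, THEN E | 67/67 | 100% |
(B,D,E) | IF B and E, THEN D | 67/100 | 67% |
(B,D,E) | IF D and E, THEN B | 67/67 | 100% |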
If the minimum confidence is 70%, ten rules are generated for these frequent itemsets. If the minimum confidence is 60%, sixteen rules are generated.
Tip:
Increase the minimum confidence if you want to decrease the build time for the model and generate fewer rules.
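The rule counts quoted for the 70% and 60% thresholds can be checked with a short Python sketch. It assumes that each rule has a single-item consequent and that the transactions contain the items implied by Table 7-9; the helper names `support_count`, `rules_from`, and `count_rules` are hypothetical.

```python
from fractions import Fraction

# Assumed item contents of transactions 11, 12, and 13 (inferred from Table 7-9).
transactions = [
    {"B", "D", "E"},
    {"A", "B", "C", "E"},
    {"B", "C", "D", "E"},
]

# Frequent itemsets from Table 7-10.
frequent = [
    ("B", "C"), ("B", "D"), ("B", "E"), ("C", "E"), ("D", "E"),
    ("B", "C", "E"), ("B", "D", "E"),
]

def support_count(items):
    """Number of transactions that contain every item in items."""
    return sum(1 for t in transactions if set(items) <= t)

def rules_from(itemset):
    """Yield (antecedent, consequent, confidence) for each single-item consequent."""
    for consequent in itemset:
        antecedent = tuple(i for i in itemset if i != consequent)
        conf = Fraction(support_count(itemset), support_count(antecedent))
        yield antecedent, consequent, conf

all_rules = [rule for itemset in frequent for rule in rules_from(itemset)]

def count_rules(min_confidence):
    """Number of candidate rules whose confidence meets the threshold."""
    return sum(1 for _, _, conf in all_rules if conf >= min_confidence)

print(count_rules(Fraction(70, 100)))  # rules kept at 70% minimum confidence
print(count_rules(Fraction(60, 100)))  # rules kept at 60% minimum confidence
```

Every candidate rule in this example has a confidence of either 2/3 or 1, so any threshold above 67% keeps only the rules with 100% confidence.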