15 Apriori
About Apriori
Learn how to find associations involving rare events in a large number of items using Apriori.
Association Rules and Frequent Itemsets
Apriori calculates rules expressing probabilistic relationships between items in frequent itemsets, indicating item co-occurrence probabilities.
Data Preparation for Apriori
Prepare transactional data for Apriori by organizing it into case identifiers and associated values for model processing.
Calculating Association Rules
Enumerate itemsets from transactions and calculate association rules.
Evaluating Association Rules
Evaluate association rules by using support and confidence.
Related Topics
Parent topic: Algorithms
15.1 About Apriori
Learn how to find associations involving rare events in a large number of items using Apriori.
An association machine learning problem can be decomposed into the following subproblems:

Find all combinations of items in a set of transactions that occur with a specified minimum frequency. These combinations are called frequent itemsets.

Calculate rules that express the probable co-occurrence of items within frequent itemsets.
Apriori calculates the probability of an item being present in a frequent itemset, given that one or more other items are present.
Association rule machine learning is not recommended for finding associations involving rare events in problem domains with a large number of items. Apriori discovers patterns with frequencies above the minimum support threshold. Therefore, to find associations involving rare events, the algorithm must run with very low minimum support values. However, doing so potentially explodes the number of enumerated itemsets, especially in cases with a large number of items. This increases the execution time significantly. Classification or anomaly detection is more suitable for discovering rare events when the data has a high number of attributes.
Related Topics
Parent topic: Apriori
15.2 Association Rules and Frequent Itemsets
Apriori calculates rules expressing probabilistic relationships between items in frequent itemsets, indicating item co-occurrence probabilities.
For example, a rule derived from frequent itemsets containing A, B, and C might state that if A and B are included in a transaction, then C is likely to also be included.
An association rule states that an item or group of items implies the presence of another item with some probability. Unlike decision tree rules, which predict a target, association rules express correlation.
Antecedent and Consequent
Defines antecedent and consequent in an Apriori algorithm.
Confidence
Specify the minimum confidence for rules, representing the conditional probability of the consequent given the antecedent.
Parent topic: Apriori
15.2.1 Antecedent and Consequent
Defines antecedent and consequent in an Apriori algorithm.
The IF component of an association rule is known as the antecedent. The THEN component is known as the consequent. The antecedent and the consequent are disjoint; they have no items in common.
Oracle Machine Learning for SQL supports association rules that have one or more items in the antecedent and a single item in the consequent.
Parent topic: Association Rules and Frequent Itemsets
15.2.2 Confidence
Specify the minimum confidence for rules, representing the conditional probability of the consequent given the antecedent.
Rules have an associated confidence, which is the conditional probability that the consequent occurs given the occurrence of the antecedent. You can specify the minimum confidence for rules.
Parent topic: Association Rules and Frequent Itemsets
15.3 Data Preparation for Apriori
Prepare transactional data for Apriori by organizing it into case identifiers and associated values for model processing.
Association models are designed to use transactional data. In transactional data, there is a one-to-many relationship between the case identifier and the values for each case. Each case ID/value pair is specified in a separate record (row).
Native Transactional Data and Star Schemas
Store transactional data in native or star schema formats, transforming non-native formats for Apriori processing.
Items and Collections
Understand that transactional data associates a subset of possible items with each case, reflecting purchase patterns in a store.
Sparse Data
Transactional data is typically sparse, with missing items indicating absence rather than null values.
Improved Sampling
Use improved sampling techniques to determine appropriate sample sizes for association rule generation with performance guarantees.
Parent topic: Apriori
15.3.1 Native Transactional Data and Star Schemas
Store transactional data in native or star schema formats, transforming non-native formats for Apriori processing.
Transactional data may be stored in native transactional format, with a non-unique case ID column and a values column, or it may be stored in some other configuration, such as a star schema. If the data is not stored in native transactional format, it must be transformed to a nested column for processing by the Apriori algorithm.
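As an informal illustration (not the algorithm's actual transformation step), grouping native transactional rows into one nested collection of items per case can be sketched in Python:

```python
from collections import defaultdict

# Native transactional format: one (case_id, item) pair per row,
# with a non-unique case ID column.
rows = [
    (11, "B"), (11, "D"), (11, "E"),
    (12, "A"), (12, "B"), (12, "C"), (12, "E"),
    (13, "B"), (13, "C"), (13, "D"), (13, "E"),
]

# Group the rows into one collection of items per case, the nested
# shape the algorithm ultimately processes.
baskets = defaultdict(set)
for case_id, item in rows:
    baskets[case_id].add(item)

print(dict(baskets))
```

The rows above reuse the sample transactions shown later in this chapter in Example 15-1.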
Related Topics
Parent topic: Data Preparation for Apriori
15.3.2 Items and Collections
Understand that transactional data associates a subset of possible items with each case, reflecting purchase patterns in a store.
In transactional data, a collection of items is associated with each case. The collection theoretically includes all possible members of the collection. For example, all products can theoretically be purchased in a single market-basket transaction. However, in actuality, only a tiny subset of all possible items is present in a given transaction; the items in the market basket represent only a small fraction of the items available for sale in the store.
Parent topic: Data Preparation for Apriori
15.3.3 Sparse Data
Transactional data is typically sparse, with missing items indicating absence rather than null values.
Missing items in a collection indicate sparsity. Missing items may be present with a null value, or they may simply be absent from the data.
Nulls in transactional data are assumed to represent values that are known but not present in the transaction. For example, three items out of hundreds of possible items might be purchased in a single transaction. The items that were not purchased are known but not present in the transaction.
Oracle Machine Learning assumes sparsity in transactional data. The Apriori algorithm is optimized for processing sparse data.
Note:
Apriori is not affected by Automatic Data Preparation.
Related Topics
Parent topic: Data Preparation for Apriori
15.3.4 Improved Sampling
Use improved sampling techniques to determine appropriate sample sizes for association rule generation with performance guarantees.
Association rules (AR) can use a good sample size with a performance guarantee, based on the work of Riondato and Upfal. The AR algorithm computes the sample size from the following inputs:

d-index of the dataset

Absolute error ε

Confidence level γ
d-index is defined as the maximum integer d such that the dataset contains at least d transactions of length at least d. It is an upper bound of the Vapnik-Chervonenkis (VC) dimension. The AR algorithm computes the d-index of the dataset by scanning the lengths of all transactions in the dataset.
Users specify the absolute error ε and confidence level γ parameters. A large d-index, a small AR support, a small ε, or a large γ can cause a large sample size. The sample size theoretically guarantees that the absolute error of both the support and the confidence of the approximated AR (from sampling) is less than ε compared to the exact AR, with probability (or confidence level) at least γ. In this document, this sample size is called the AR-specific sample size.
Sampling Implementation
Specify sampling settings to determine sample sizes or rely on algorithm-calculated sample sizes for efficient rule generation.
Parent topic: Data Preparation for Apriori
15.3.4.1 Sampling Implementation
Specify sampling settings to determine sample sizes or rely on algorithm-calculated sample sizes for efficient rule generation.
Usage Notes

If ODMS_SAMPLING is unspecified or set to ODMS_SAMPLING_DISABLE, sampling is not performed for AR and the exact AR is obtained.

If ODMS_SAMPLING is set to ODMS_SAMPLING_ENABLE and ODMS_SAMPLE_SIZE is specified as a positive integer, the user-specified sample size (ODMS_SAMPLE_SIZE) is used. The sampling is performed in the general data preparation stage, before the AR algorithm. The AR-specific sample size is not computed. The approximated AR is obtained.

If ODMS_SAMPLING is set to ODMS_SAMPLING_ENABLE and ODMS_SAMPLE_SIZE is not specified, the AR-specific sample size is computed and sampling is then performed in the AR algorithm. The approximated AR is obtained.
Note:
If the computed AR-specific sample size is larger than or equal to the total transaction size in the data set, sampling is not performed and the exact AR is obtained.
If users do not have a good idea of the choice of sample size for AR, it is suggested to leave ODMS_SAMPLE_SIZE unspecified, specify proper values only for the sampling parameters, and let the AR algorithm compute the suitable AR-specific sample size.
See Also:
DBMS_DATA_MINING — Machine Learning Function Settings for a listing and explanation of the available model settings.
Note:
The term hyperparameter is also used interchangeably with model setting.
Parent topic: Improved Sampling
15.4 Calculating Association Rules
Enumerate itemsets from transactions and calculate association rules.
The first step in association analysis is the enumeration of itemsets. An itemset is any combination of two or more items in a transaction.
Itemsets
Define itemsets as combinations of items within transactions, specifying the maximum number of items per itemset.
Frequent Itemsets
Identify frequently bought itemsets, filtered by a minimum user-specified support, to create rules.
Example: Calculating Rules from Frequent Itemsets
Calculate association rules from frequent itemsets, using examples to illustrate rule generation and confidence calculation.
Aggregates
Aggregates refer to the quantities associated with each item that the user opts for the association rules model to aggregate.
Example: Calculating Aggregates
This example shows how to calculate aggregates using the customer grocery purchase and profit data.
Including and Excluding Rules
Explains including rules and excluding rules used in association.
Performance Impact for Aggregates
Aggregating data for association rules requires additional memory and processing time.
Parent topic: Apriori
15.4.1 Itemsets
Define itemsets as combinations of items within transactions, specifying the maximum number of items per itemset.
The maximum number of items in an itemset is user-specified. If the maximum is two, then all the item pairs are counted. If the maximum is greater than two, then all the item pairs, all the item triples, and all the item combinations up to the specified maximum are counted.
The following table shows the itemsets derived from the transactions shown in the following example, assuming that the maximum number of items in an itemset is set to 3.
Table 15-1 Itemsets
Transaction  Itemsets 

11 
(B,D) (B,E) (D,E) (B,D,E) 
12 
(A,B) (A,C) (A,E) (B,C) (B,E) (C,E) (A,B,C) (A,B,E) (A,C,E) (B,C,E) 
13 
(B,C) (B,D) (B,E) (C,D) (C,E) (D,E) (B,C,D) (B,C,E) (B,D,E) (C,D,E) 
Example 15-1 Sample Transactional Data
TRANS_ID ITEM_ID
 
11 B
11 D
11 E
12 A
12 B
12 C
12 E
13 B
13 C
13 D
13 E
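The enumeration shown in Table 15-1 can be reproduced with a short Python sketch (an illustration only, not Oracle's candidate-generation code):

```python
from itertools import combinations

transaction = {"B", "D", "E"}   # transaction 11 from Example 15-1
max_items = 3                   # user-specified maximum itemset size

# Enumerate all item pairs and triples within the transaction.
itemsets = [
    combo
    for k in range(2, max_items + 1)
    for combo in combinations(sorted(transaction), k)
]
print(itemsets)
```

This yields the pairs (B,D), (B,E), (D,E) and the triple (B,D,E), matching the first row of Table 15-1.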
Parent topic: Calculating Association Rules
15.4.2 Frequent Itemsets
Identify frequently bought itemsets, filtered by a minimum user-specified support, to create rules.
Association rules are calculated from itemsets. If rules are generated from all possible itemsets, there can be a very high number of rules and the rules may not be very meaningful. Also, the model can take a long time to build. Typically it is desirable to only generate rules from itemsets that are wellrepresented in the data. Frequent itemsets are those that occur with a minimum frequency specified by the user.
The minimum frequent itemset support is a user-specified percentage that limits the number of itemsets used for association rules. An itemset must appear in at least this percentage of all the transactions if it is to be used as a basis for rules.
The following table shows the itemsets from Table 15-1 that are frequent itemsets with support > 66%.
Table 15-2 Frequent Itemsets
Frequent Itemset  Transactions  Support 

(B,C) 
2 of 3 
67% 
(B,D) 
2 of 3 
67% 
(B,E) 
3 of 3 
100% 
(C,E) 
2 of 3 
67% 
(D,E) 
2 of 3 
67% 
(B,C,E) 
2 of 3 
67% 
(B,D,E) 
2 of 3 
67% 
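A minimal Python sketch of the support filtering (illustrative, not the actual implementation) recovers the seven frequent itemsets listed in Table 15-2:

```python
from itertools import combinations

# Transactions from Example 15-1.
transactions = {
    11: {"B", "D", "E"},
    12: {"A", "B", "C", "E"},
    13: {"B", "C", "D", "E"},
}
min_support = 0.66   # itemsets must appear in more than 66% of transactions

# Count how many transactions contain each candidate itemset (sizes 2-3).
counts = {}
for items in transactions.values():
    for k in (2, 3):
        for combo in combinations(sorted(items), k):
            counts[combo] = counts.get(combo, 0) + 1

# Keep only the itemsets whose support exceeds the threshold.
frequent = {
    itemset: count / len(transactions)
    for itemset, count in counts.items()
    if count / len(transactions) > min_support
}
print(sorted(frequent))
```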
Related Topics
Parent topic: Calculating Association Rules
15.4.3 Example: Calculating Rules from Frequent Itemsets
Calculate association rules from frequent itemsets, using examples to illustrate rule generation and confidence calculation.
The following tables show the itemsets and frequent itemsets that were calculated in "Association". The frequent itemsets are the itemsets that occur with a minimum support of 67%; at least 2 of the 3 transactions must include the itemset.
Table 15-3 Itemsets
Transaction  Itemsets 

11 
(B,D) (B,E) (D,E) (B,D,E) 
12 
(A,B) (A,C) (A,E) (B,C) (B,E) (C,E) (A,B,C) (A,B,E) (A,C,E) (B,C,E) 
13 
(B,C) (B,D) (B,E) (C,D) (C,E) (D,E) (B,C,D) (B,C,E) (B,D,E) (C,D,E) 
Table 15-4 Frequent Itemsets with Minimum Support 67%
Itemset  Transactions  Support 

(B,C) 
12 and 13 
67% 
(B,D) 
11 and 13 
67% 
(B,E) 
11, 12, and 13 
100% 
(C,E) 
12 and 13 
67% 
(D,E) 
11 and 13 
67% 
(B,C,E) 
12 and 13 
67% 
(B,D,E) 
11 and 13 
67% 
A rule expresses a conditional probability. Confidence in a rule is calculated by dividing the probability of the items occurring together by the probability of the occurrence of the antecedent.
For example, if B (antecedent) is present, what is the chance that C (consequent) is also present? What is the confidence for the rule "IF B, THEN C"?
As shown in Table 15-3:

All 3 transactions include B (3/3 or 100%)

Only 2 transactions include both B and C (2/3 or 67%)

Therefore, the confidence of the rule "IF B, THEN C" is 67/100 or 67%.
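The same confidence calculation can be sketched in Python, using the transactions from Example 15-1:

```python
# Transactions from Example 15-1.
transactions = [
    {"B", "D", "E"},
    {"A", "B", "C", "E"},
    {"B", "C", "D", "E"},
]
n = len(transactions)

p_b  = sum("B" in t for t in transactions) / n           # P(B)    = 3/3
p_bc = sum({"B", "C"} <= t for t in transactions) / n    # P(B, C) = 2/3

confidence = p_bc / p_b   # confidence of "IF B, THEN C"
print(round(confidence, 2))
```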
The following table shows the rules that can be derived from the frequent itemsets in Table 15-4.
Table 15-5 Frequent Itemsets and Rules
Frequent Itemset  Rules  prob(antecedent and consequent) / prob(antecedent)  Confidence

(B,C)
B implies C  (2/3) / (3/3)  67%
C implies B  (2/3) / (2/3)  100%

(B,D)
B implies D  (2/3) / (3/3)  67%
D implies B  (2/3) / (2/3)  100%

(B,E)
B implies E  (3/3) / (3/3)  100%
E implies B  (3/3) / (3/3)  100%

(C,E)
C implies E  (2/3) / (2/3)  100%
E implies C  (2/3) / (3/3)  67%

(D,E)
D implies E  (2/3) / (2/3)  100%
E implies D  (2/3) / (3/3)  67%

(B,C,E)
(B,C) implies E  (2/3) / (2/3)  100%
(B,E) implies C  (2/3) / (3/3)  67%
(C,E) implies B  (2/3) / (2/3)  100%

(B,D,E)
(B,D) implies E  (2/3) / (2/3)  100%
(B,E) implies D  (2/3) / (3/3)  67%
(D,E) implies B  (2/3) / (2/3)  100%
If the minimum confidence is 70%, ten rules are generated for these frequent itemsets. If the minimum confidence is 60%, sixteen rules are generated.
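A short Python sketch (illustrative only) derives all candidate rules with single-item consequents and confirms these counts:

```python
# Transactions from Example 15-1.
transactions = [
    {"B", "D", "E"},
    {"A", "B", "C", "E"},
    {"B", "C", "D", "E"},
]
n = len(transactions)

def support(items):
    """Fraction of transactions containing every item in `items`."""
    return sum(set(items) <= t for t in transactions) / n

# Frequent itemsets from Table 15-4 (minimum support 67%).
frequent = [("B", "C"), ("B", "D"), ("B", "E"), ("C", "E"), ("D", "E"),
            ("B", "C", "E"), ("B", "D", "E")]

# One rule per single-item consequent; the remaining items form the antecedent.
rules = []
for itemset in frequent:
    for consequent in itemset:
        antecedent = tuple(i for i in itemset if i != consequent)
        conf = support(itemset) / support(antecedent)
        rules.append((antecedent, consequent, conf))

print(sum(conf >= 0.70 for *_, conf in rules))  # rules at 70% minimum confidence
print(sum(conf >= 0.60 for *_, conf in rules))  # rules at 60% minimum confidence
```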
Tip:
Increase the minimum confidence if you want to decrease the build time for the model and generate fewer rules.
Related Topics
Parent topic: Calculating Association Rules
15.4.4 Aggregates
Aggregates refer to the quantities associated with each item that the user opts for the association rules model to aggregate.
There can be more than one aggregate. For example, the user can specify the model to aggregate both profit and quantity.
Parent topic: Calculating Association Rules
15.4.5 Example: Calculating Aggregates
This example shows how to calculate aggregates using the customer grocery purchase and profit data.
Calculating Aggregates for Grocery Store Data
Assume a grocery store has the following data:
Table 15-6 Grocery Store Data
Customer  Item A  Item B  Item C  Item D 

Customer 1  Buys (Profit $5.00)  Buys (Profit $3.20)  Buys (Profit $12.00)  NA 
Customer 2  Buys (Profit $4.00)  NA  Buys (Profit $4.20)  NA 
Customer 3  Buys (Profit $3.00)  Buys (Profit $10.00)  Buys (Profit $14.00)  Buys (Profit $8.00) 
Customer 4  Buys (Profit $2.00)  NA  NA  Buys (Profit $1.00) 
The basket of each customer can be viewed as a transaction. The manager of the store is interested in not only the existence of certain association rules, but also in the aggregated profit if such rules exist.
In this example, one of the association rules can be (A, B)=>C for customer 1 and customer 3. Together with this rule, the store manager may want to know the following:

The total profit of item A appearing in this rule

The total profit of item B appearing in this rule

The total profit for consequent C appearing in this rule

The total profit of all items appearing in the rule
For this rule, the profit for item A is $5.00 + $3.00 = $8.00; for item B, the profit is $3.20 + $10.00 = $13.20; for consequent C, the profit is $12.00 + $14.00 = $26.00; for the antecedent itemset (A, B), the profit is $8.00 + $13.20 = $21.20. For the whole rule, the profit is $21.20 + $26.00 = $47.20.
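A small Python sketch (illustrative, not the model's aggregation machinery) computes the per-item and rule totals from Table 15-6:

```python
# Profit per (customer, item) from Table 15-6; customers not listed
# for an item did not buy it (NA).
profit = {
    ("Customer 1", "A"): 5.00, ("Customer 1", "B"): 3.20,
    ("Customer 1", "C"): 12.00,
    ("Customer 2", "A"): 4.00, ("Customer 2", "C"): 4.20,
    ("Customer 3", "A"): 3.00, ("Customer 3", "B"): 10.00,
    ("Customer 3", "C"): 14.00, ("Customer 3", "D"): 8.00,
    ("Customer 4", "A"): 2.00, ("Customer 4", "D"): 1.00,
}

# Customers whose baskets support the rule (A, B) => C.
supporting = ["Customer 1", "Customer 3"]

def aggregate(item):
    """Total profit of `item` over the transactions supporting the rule."""
    return sum(profit[(c, item)] for c in supporting)

profit_a = aggregate("A")              # 5.00 + 3.00   = 8.00
profit_b = aggregate("B")              # 3.20 + 10.00  = 13.20
profit_c = aggregate("C")              # 12.00 + 14.00 = 26.00
total = profit_a + profit_b + profit_c
print(round(total, 2))                 # profit for the whole rule
```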
Related Topics
Parent topic: Calculating Association Rules
15.4.6 Including and Excluding Rules
Explains including rules and excluding rules used in association.
Including rules enables a user to provide a list of items such that at least one item from the list must appear in the rules that are returned. Excluding rules enables a user to provide a list of items such that no item from the list can appear in the rules that are returned.
Note:
Since each association rule includes both an antecedent and a consequent, a set of including or excluding rules can be specified for the antecedent while another set of including or excluding rules can be specified for the consequent. Including or excluding rules can also be defined for the association rule.
Related Topics
Parent topic: Calculating Association Rules
15.4.7 Performance Impact for Aggregates
Aggregating data for association rules requires additional memory and processing time.
For each item, the user may supply several columns to aggregate. Aggregation requires more memory to buffer the extra data and more time to compute the aggregate values.
Parent topic: Calculating Association Rules
15.5 Evaluating Association Rules
Evaluate association rules by using support and confidence.
Minimum support and confidence are used to influence the build of an association model. Support and confidence are also the primary metrics for evaluating the quality of the rules generated by the model. Additionally, Oracle Machine Learning for SQL supports lift for association rules. These statistical measures can be used to rank the rules and hence the usefulness of the predictions.
Support
Measure support to indicate the frequency of item co-occurrence, helping identify significant itemsets in transactions.
Minimum Support Count
Define a minimum support count to ensure itemsets appear frequently enough in transactions to be considered significant.
Confidence
The confidence of a rule indicates the probability of both the antecedent and the consequent appearing in the same transaction.
Reverse Confidence
The reverse confidence of a rule is the number of transactions in which the rule occurs divided by the number of transactions in which the consequent occurs.
Lift
Measure lift to evaluate the strength of a rule over random co-occurrence, ensuring the rule's predictive value.
Parent topic: Apriori
15.5.1 Support
Measure support to indicate the frequency of item cooccurrence, helping identify significant itemsets in transactions.
The support of a rule indicates how frequently the items in the rule occur together. For example, cereal and milk might appear together in 40% of the transactions. If so, the following rules each have a support of 40%:
cereal implies milk
milk implies cereal
Support is the ratio of transactions that include all the items in the antecedent and consequent to the number of total transactions.
Support can be expressed in probability notation as follows:
support(A implies B) = P(A, B)
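As an illustration with hypothetical baskets (chosen so that cereal and milk co-occur in 40% of transactions, as in the example above), support is the same for both rule directions:

```python
# Hypothetical baskets: cereal and milk appear together in 2 of 5.
transactions = [
    {"cereal", "milk"},
    {"cereal", "milk", "bread"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"cereal", "bread"},
]

# support(A implies B) = P(A, B): the direction of the rule is irrelevant.
both = sum({"cereal", "milk"} <= t for t in transactions)
support = both / len(transactions)
print(support)   # identical for "cereal implies milk" and "milk implies cereal"
```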
Parent topic: Evaluating Association Rules
15.5.2 Minimum Support Count
Define a minimum support count to ensure itemsets appear frequently enough in transactions to be considered significant.
Minimum support count defines the minimum threshold, in transactions, that each rule must satisfy. When the number of transactions is unknown, the support percentage threshold parameter can be tricky to set appropriately. For this reason, support can also be expressed as a count of transactions, with the greater of the two thresholds being used to filter out infrequent itemsets. The default is 1, indicating that this criterion is not applied.
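The greater-of-the-two filtering can be sketched as follows; the variable names here are illustrative, not actual model-setting names:

```python
# Illustrative sketch of combining a percentage-based support threshold
# with an absolute minimum support count.
num_transactions = 1000
min_support_pct = 0.01      # percentage threshold: 1% -> 10 transactions
min_support_count = 25      # absolute count threshold

# The greater of the two thresholds filters out infrequent itemsets.
effective_threshold = max(min_support_pct * num_transactions,
                          min_support_count)
print(effective_threshold)  # the count dominates here
```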
Parent topic: Evaluating Association Rules
15.5.3 Confidence
The confidence of a rule indicates the probability of both the antecedent and the consequent appearing in the same transaction.
Confidence is the conditional probability of the consequent given the antecedent. For example, cereal appears in 50 transactions; 40 of the 50 might also include milk. The rule confidence is:
cereal implies milk with 80% confidence
Confidence is the ratio of the rule support to the number of transactions that include the antecedent.
Confidence can be expressed in probability notation as follows.
confidence(A implies B) = P(B | A), which is equal to P(A, B) / P(A)
Related Topics
Parent topic: Evaluating Association Rules
15.5.4 Reverse Confidence
The reverse confidence of a rule is defined as the number of transactions in which the rule occurs divided by the number of transactions in which the consequent occurs.
Reverse confidence eliminates rules that occur because the consequent is frequent. The default is 0.
Parent topic: Evaluating Association Rules
15.5.5 Lift
Measure lift to evaluate the strength of a rule over random co-occurrence, ensuring the rule's predictive value.
Both support and confidence must be used to determine if a rule is valid. However, there are times when both of these measures may be high, and yet still produce a rule that is not useful. For example:
Convenience store customers who buy orange juice also buy milk with a 75% confidence. The combination of milk and orange juice has a support of 30%.
This at first sounds like an excellent rule, and in most cases, it would be. It has high confidence and high support. However, what if convenience store customers in general buy milk 90% of the time? In that case, orange juice customers are actually less likely to buy milk than customers in general.
A third measure is needed to evaluate the quality of the rule. Lift indicates the strength of a rule over the random co-occurrence of the antecedent and the consequent, given their individual support. It provides information about the improvement: the increase in the probability of the consequent given the antecedent. Lift is defined as follows.
(Rule Support) /(Support(Antecedent) * Support(Consequent))
This can also be defined as the confidence of the combination of items divided by the support of the consequent. So in our milk example, assuming that 40% of the customers buy orange juice, the improvement would be:
30% / (40% * 90%)
which is 0.83, an improvement of less than 1.
Any rule with an improvement of less than 1 does not indicate a real cross-selling opportunity, no matter how high its support and confidence, because it actually offers less ability to predict a purchase than does random chance.
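The lift arithmetic from the milk and orange juice example can be checked directly:

```python
rule_support = 0.30     # support of the {orange juice, milk} combination
support_oj   = 0.40     # antecedent support (orange juice)
support_milk = 0.90     # consequent support (milk)

lift = rule_support / (support_oj * support_milk)
print(round(lift, 2))   # less than 1, so no real cross-selling signal

# Equivalent form: rule confidence divided by consequent support.
confidence = rule_support / support_oj   # ~0.75
assert abs(lift - confidence / support_milk) < 1e-9
```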
Tip:

Decrease the maximum rule length if you want to decrease the build time for the model and generate simpler rules.

Increase the minimum support if you want to decrease the build time for the model and generate fewer rules.
Parent topic: Evaluating Association Rules