4.2.10 Naive Bayes

The ore.odmNB function builds an in-database Naive Bayes model.

The Naive Bayes algorithm is based on conditional probabilities. Naive Bayes looks at the historical data and calculates conditional probabilities for the target values by observing the frequency of attribute values and of combinations of attribute values.

Naive Bayes assumes that each predictor is conditionally independent of the others. (Bayes' Theorem requires that the predictors be independent.)

For information on the ore.odmNB function arguments, call help(ore.odmNB).

Settings for a Naive Bayes Models

The following table lists settings that apply to Naive Bayes models.

Table 4-10 Naive Bayes Model Settings

Setting Name Setting Value Description

NABS_PAIRWISE_THRESHOLD

TO_CHAR(0<= numeric_expr <=1)

Value of pairwise threshold for NB algorithm

Default is 0.

NABS_SINGLETON_THRESHOLD

TO_CHAR(0<=numeric_expr <=1)

Value of singleton threshold for NB algorithm

Default value is 0.

Example 4-20 Using the ore.odmNB Function

This example creates an input ore.frame, builds a Naive Bayes model, makes predictions, and generates a confusion matrix.

m <- mtcars
m$gear <- as.factor(m$gear)
m$cyl  <- as.factor(m$cyl)
m$vs   <- as.factor(m$vs)
m$ID   <- 1:nrow(m)
mtcars_of <- ore.push(m)
row.names(mtcars_of) <- mtcars_of
# Build the model.
nb.mod  <- ore.odmNB(gear ~ ., mtcars_of)
summary(nb.mod)
# Make predictions and generate a confusion matrix.
nb.res  <- predict (nb.mod, mtcars_of, "gear")
with(nb.res, table(gear, PREDICTION))

Listing for This Example

R> m <- mtcars
R> m$gear <- as.factor(m$gear)
R> m$cyl  <- as.factor(m$cyl)
R> m$vs   <- as.factor(m$vs)
R> m$ID   <- 1:nrow(m)
R> mtcars_of <- ore.push(m)
R> row.names(mtcars_of) <- mtcars_of
R> # Build the model.
R> nb.mod  <- ore.odmNB(gear ~ ., mtcars_of)
R> summary(nb.mod)
 
Call:
ore.odmNB(formula = gear ~ ., data = mtcars_of)
 
Settings: 
          value
prep.auto    on
 
Apriori: 
      3       4       5 
0.46875 0.37500 0.15625 
Tables: 
$ID
  ( ; 26.5), [26.5; 26.5]  (26.5;  )
3              1.00000000           
4              0.91666667 0.08333333
5                         1.00000000
 
$am
          0         1
3 1.0000000          
4 0.3333333 0.6666667
5           1.0000000
 
$cyl
  '4', '6' '8'
3      0.2 0.8
4      1.0    
5      0.6 0.4
 
$disp
  ( ; 196.299999999999995), [196.299999999999995; 196.299999999999995]
3                                                           0.06666667
4                                                           1.00000000
5                                                           0.60000000
  (196.299999999999995;  )
3               0.93333333
4                         
5               0.40000000
 
$drat
  ( ; 3.385), [3.385; 3.385] (3.385;  )
3                  0.8666667  0.1333333
4                             1.0000000
5                             1.0000000
$hp
  ( ; 136.5), [136.5; 136.5] (136.5;  )
3                        0.2        0.8
4                        1.0           
5                        0.4        0.6
 
$vs
          0         1
3 0.8000000 0.2000000
4 0.1666667 0.8333333
5 0.8000000 0.2000000
 
$wt
  ( ; 3.2024999999999999), [3.2024999999999999; 3.2024999999999999]
3                                                        0.06666667
4                                                        0.83333333
5                                                        0.80000000
  (3.2024999999999999;  )
3              0.93333333
4              0.16666667
5              0.20000000
 
 
Levels: 
[1] "3" "4" "5"

R> # Make predictions and generate a confusion matrix.
R> nb.res  <- predict (nb.mod, mtcars_of, "gear")
R> with(nb.res, table(gear, PREDICTION))
    PREDICTION
gear  3  4  5
   3 14  1  0
   4  0 12  0
   5  0  1  4