This chapter describes the Oracle R Enterprise function ore.predict and provides some examples of its use.
Predictive models allow you to predict future behavior based on past behavior. After you build a model, you use it to score new data, that is, to make predictions.
R allows you to build many kinds of models. When you score data to predict new results using an R model, the data to score must be in an R data.frame. With the ore.predict function, you can use an R model to score database-resident data in an ore.frame object.
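For comparison, the conventional in-memory workflow that ore.predict replaces can be sketched with base R alone. This is only an illustration: the names trainData, newData, localModel, and scored are hypothetical, and the split of iris into training and held-out rows is an assumption made to stand in for "new" data.

```r
# Deterministic split: odd rows train the model, even rows act as new data.
trainRows <- seq(1, nrow(iris), by = 2)
trainData <- iris[trainRows, ]
newData   <- iris[-trainRows, ]

# Build the model in local R memory.
localModel <- lm(Sepal.Length ~ ., data = trainData)

# Score the new data; predict() requires newdata to be a local data.frame.
localPred <- predict(localModel, newdata = newData)

# Attach the predictions to the scored rows.
scored <- cbind(newData, PRED = localPred)
head(scored)
```

With ore.predict, the second argument would instead be an ore.frame, so the data to score would not have to be pulled into a local data.frame first.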
The ore.predict function provides the fastest way to operationalize R-based models for scoring in Oracle Database. The function has no dependencies on PMML or any other plug-ins.
Some advantages of using the ore.predict function to score data in the database are the following:
Uses R-generated models to score in-database data: the data to score is in an ore.frame object.
Maximizes the use of Oracle Database as a compute engine: the database provides a commercial-grade, high-performance, scalable scoring engine.
Simplifies application workflow: you can go from a model to SQL scoring in one step.
The ore.predict function is a generic function. It has the following usage:
ore.predict(object, newdata, ...)
The value of the object argument is one of the model objects listed in Table 5-1. The value of the newdata argument is an ore.frame object that contains the data to score. The ore.predict function has methods for use with specific R model classes. The ... argument represents the various additional arguments that are accepted by the different methods.
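The idea of a generic function with class-specific methods and a pass-through ... argument can be illustrated with a small base R sketch. The generic scoreModel and its methods are hypothetical names invented for this illustration; they are not part of Oracle R Enterprise and do not show how ore.predict itself is implemented.

```r
# Hypothetical generic: dispatches on the class of 'object'.
scoreModel <- function(object, newdata, ...) UseMethod("scoreModel")

# Method for lm objects; extra arguments in ... go straight to predict().
scoreModel.lm <- function(object, newdata, ...) {
  predict(object, newdata = newdata, ...)
}

# Method for glm objects; defaults to the response scale.
# glm objects have class c("glm", "lm"), so this method is chosen first.
scoreModel.glm <- function(object, newdata, type = "response", ...) {
  predict(object, newdata = newdata, type = type, ...)
}

lmFit  <- lm(Sepal.Length ~ ., data = iris)
glmFit <- glm(case ~ age + parity, data = infert, family = binomial())

# 'interval' is accepted only by the lm method, via ...
lmOut  <- scoreModel(lmFit, iris[1:3, ], interval = "prediction")
glmOut <- scoreModel(glmFit, infert[1:3, ])
```

The same call shape, scoreModel(object, newdata, ...), serves both model classes, while each method accepts the additional arguments that make sense for its class.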
The ore.predict function has methods that support the model objects listed in Table 5-1.
Table 5-1 Models Supported by the ore.predict Function
Class of Model | Description of Model |
---|---|
glm | Generalized linear model |
kmeans | k-Means clustering model |
lm | Linear regression model |
matrix | A |
multinom | Multinomial log-linear model |
nnet | Neural network model |
ore.model | An Oracle R Enterprise model from the |
prcomp | Principal components analysis on a matrix |
princomp | Principal components analysis on a numeric matrix |
rpart | Recursive partitioning and regression tree model |
For the function signatures of the ore.predict methods, invoke the help function on the following, as in help("ore.predict-kmeans"):
ore.predict-glm
ore.predict-kmeans
ore.predict-lm
ore.predict-matrix
ore.predict-multinom
ore.predict-nnet
ore.predict-ore.model
ore.predict-prcomp
ore.predict-princomp
ore.predict-rpart
The following examples demonstrate the use of the ore.predict function.
Example 5-1, "Using the ore.predict Function on a Linear Regression Model"
Example 5-2, "Using the ore.predict Function on a Generalized Linear Regression Model"
Example 5-3, "Using the ore.predict Function on an ore.model Model"
Example 5-1 builds a linear regression model, IRISModel, using the lm function on the iris data.frame. The example pushes the data set to the database as the temporary table IRIS and the corresponding ore.frame proxy, IRIS. The example scores the model by invoking ore.predict on it and then combines the predictions with the IRIS ore.frame object. Finally, it displays the first six rows of the resulting object.
Example 5-1 Using the ore.predict Function on a Linear Regression Model
IRISModel <- lm(Sepal.Length ~ ., data = iris)
IRIS <- ore.push(iris)
IRIS_pred <- ore.predict(IRISModel, IRIS, se.fit = TRUE,
                         interval = "prediction")
IRIS <- cbind(IRIS, IRIS_pred)
head(IRIS)
Listing for Example 5-1
R> IRISModel <- lm(Sepal.Length ~ ., data = iris)
R> IRIS <- ore.push(iris)
R> IRIS_pred <- ore.predict(IRISModel, IRIS, se.fit = TRUE,
+                          interval = "prediction")
R> IRIS <- cbind(IRIS, IRIS_pred)
R> head(IRIS)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species     PRED    SE.PRED
1          5.1         3.5          1.4         0.2  setosa 5.004788 0.04479188
2          4.9         3.0          1.4         0.2  setosa 4.756844 0.05514933
3          4.7         3.2          1.3         0.2  setosa 4.773097 0.04690495
4          4.6         3.1          1.5         0.2  setosa 4.889357 0.05135928
5          5.0         3.6          1.4         0.2  setosa 5.054377 0.04736842
6          5.4         3.9          1.7         0.4  setosa 5.388886 0.05592364
  LOWER.PRED UPPER.PRED
1   4.391895   5.617681
2   4.140660   5.373027
3   4.159587   5.386607
4   4.274454   5.504259
5   4.440727   5.668026
6   4.772430   6.005342
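To help interpret the PRED, SE.PRED, LOWER.PRED, and UPPER.PRED columns, the following sketch computes the analogous quantities locally with the base R predict function. The mapping of the local fit, se.fit, lwr, and upr values onto those column names is an assumption made for illustration, and the names localPred and localFrame are hypothetical.

```r
# Same model as in Example 5-1, built and scored entirely in local memory.
IRISModel <- lm(Sepal.Length ~ ., data = iris)

# With se.fit = TRUE, predict() returns a list; with an interval requested,
# its 'fit' element is a matrix with columns fit, lwr, and upr.
localPred <- predict(IRISModel, newdata = iris, se.fit = TRUE,
                     interval = "prediction")

# Assumed correspondence to the column names in the listing above.
localFrame <- data.frame(PRED       = localPred$fit[, "fit"],
                         SE.PRED    = localPred$se.fit,
                         LOWER.PRED = localPred$fit[, "lwr"],
                         UPPER.PRED = localPred$fit[, "upr"])

head(cbind(iris, localFrame))
```

Comparing this local result with the in-database result is one way to confirm that a model scores consistently in both environments.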
Example 5-2 builds a generalized linear model using the infert data set and then invokes the ore.predict function on the model.
Example 5-2 Using the ore.predict Function on a Generalized Linear Regression Model
infertModel <-
  glm(case ~ age + parity + education + spontaneous + induced,
      data = infert, family = binomial())
INFERT <- ore.push(infert)
INFERTpred <- ore.predict(infertModel, INFERT, type = "response",
                          se.fit = TRUE)
INFERT <- cbind(INFERT, INFERTpred)
head(INFERT)
Listing for Example 5-2
R> infertModel <-
+    glm(case ~ age + parity + education + spontaneous + induced,
+        data = infert, family = binomial())
R> INFERT <- ore.push(infert)
R> INFERTpred <- ore.predict(infertModel, INFERT, type = "response",
+                           se.fit = TRUE)
R> INFERT <- cbind(INFERT, INFERTpred)
R> head(INFERT)
  education age parity induced case spontaneous stratum pooled.stratum
1    0-5yrs  26      6       1    1           2       1              3
2    0-5yrs  42      1       1    1           0       2              1
3    0-5yrs  39      6       2    1           0       3              4
4    0-5yrs  34      4       2    1           0       4              2
5   6-11yrs  35      3       1    1           1       5             32
6   6-11yrs  36      4       2    1           1       6             36
       PRED    SE.PRED
1 0.5721916 0.20630954
2 0.7258539 0.17196245
3 0.1194459 0.08617462
4 0.3684102 0.17295285
5 0.5104285 0.06944005
6 0.6322269 0.10117919
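The effect of type = "response" can be seen in a local sketch that uses the base R predict function on the same model. The names linkPred, respPred, and localFrame are hypothetical, and treating the local fit and se.fit values as counterparts of PRED and SE.PRED is an assumption made for illustration.

```r
infertModel <- glm(case ~ age + parity + education + spontaneous + induced,
                   data = infert, family = binomial())

# Predictions on the linear-predictor (log-odds) scale.
linkPred <- predict(infertModel, newdata = infert, type = "link")

# Predictions on the response (probability) scale, with standard errors.
respPred <- predict(infertModel, newdata = infert, type = "response",
                    se.fit = TRUE)

# binomial() uses the logit link by default, so the response-scale value
# is the inverse logit of the linear predictor.
stopifnot(isTRUE(all.equal(unname(plogis(linkPred)), unname(respPred$fit))))

# Assumed correspondence to the PRED and SE.PRED columns in the listing.
localFrame <- data.frame(PRED = respPred$fit, SE.PRED = respPred$se.fit)
head(cbind(infert, localFrame))
```

Scoring on the response scale yields probabilities between 0 and 1, which is usually what an application consuming the scores expects from a binomial model.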
Example 5-3 pushes the iris data set to the database as the temporary table IRIS and the corresponding ore.frame proxy, IRIS. The example builds a linear regression model, IRISModel2, using the ore.lm function. It scores the model and adds a column to IRIS.
Example 5-3 Using the ore.predict Function on an ore.model Model
IRIS <- ore.push(iris)
IRISModel2 <- ore.lm(Sepal.Length ~ ., data = IRIS)
IRIS$PRED <- ore.predict(IRISModel2, IRIS)
head(IRIS, 3)
Listing for Example 5-3
R> IRIS <- ore.push(iris)
R> IRISModel2 <- ore.lm(Sepal.Length ~ ., data = IRIS)
R> IRIS$PRED <- ore.predict(IRISModel2, IRIS)
R> head(IRIS, 3)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species     PRED
1          5.1         3.5          1.4         0.2  setosa 5.004788
2          4.9         3.0          1.4         0.2  setosa 4.756844
3          4.7         3.2          1.3         0.2  setosa 4.773097