Predictive models allow you to predict future behavior based on past behavior. After you build a model, you use it to score new data, that is, to make predictions.
R allows you to build many kinds of models. When you score data to predict new results using an R model, the data to score must be in an R data.frame. With the ore.predict function, you can use an R model to score database-resident data in an ore.frame object.
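As a minimal local sketch of this build-then-score workflow, the following uses only base R and the built-in cars data set. The new_speeds data.frame is a made-up input for illustration. It shows the in-memory case, where the data to score is an R data.frame.

```r
# Build a linear model on past observations (base R, in memory).
stopModel <- lm(dist ~ speed, data = cars)

# The data to score must be a data.frame that contains the predictor columns.
new_speeds <- data.frame(speed = c(10, 15, 20))  # hypothetical inputs

# Score the new data: one predicted stopping distance per row.
new_speeds$PRED <- predict(stopModel, newdata = new_speeds)
print(new_speeds)
```

The ore.predict function plays the role of predict here when the data to score is an ore.frame object instead of a data.frame.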
The ore.predict function only scores data in ore.frame objects. It uses the R model as built and cannot rebuild it on the database-resident data. To build models on database-resident data with better scalability and performance, use the algorithms and functions described in Chapter 4, "Building Models in Oracle R Enterprise." These include both algorithms that are native to Oracle R Enterprise and those from Oracle Data Mining that are exposed in R.
The ore.predict function is a generic function. It has the following usage:
ore.predict(object, newdata, ...)
The value of the object argument is one of the R models or objects listed in Table 5-1. The value of the newdata argument is an ore.frame object that contains the data to score. The OREpredict package has methods for use with specific R model classes. The ... argument represents the various additional arguments that are accepted by the different methods.
Table 5-1 lists the methods employed by the generic ore.predict function, the class of the object the method accepts as the object argument, and a description of the type of model or object.
Table 5-1 Methods of the Generic ore.predict Function
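| OREpredict Method | Class of Object | Description of Object |
|---|---|---|
| ore.predict-glm | glm | Generalized linear model |
| ore.predict-kmeans | kmeans | k-Means clustering model |
| ore.predict-lm | lm | Linear regression model |
| ore.predict-matrix | matrix | A |
| ore.predict-multinom | multinom | Multinomial log-linear model |
| ore.predict-nnet | nnet | Neural network models |
| ore.predict-ore.model | ore.model | An Oracle R Enterprise model |
| ore.predict-prcomp | prcomp | Principal components analysis on a matrix |
| ore.predict-princomp | princomp | Principal components analysis on a numeric matrix |
| ore.predict-rpart | rpart | Recursive partitioning and regression tree model |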
For the arguments of the ore.predict methods, invoke the help function on the method, such as help("ore.predict-glm").
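To illustrate what it means for a generic function to select a method by the class of its object argument, here is a small sketch using S3 dispatch in base R. The generic score_model and its two methods are hypothetical names invented for this illustration. They are not part of the OREpredict package, and the sketch does not describe how ore.predict is implemented.

```r
# Hypothetical generic: dispatches on the class of 'object'.
score_model <- function(object, newdata, ...) UseMethod("score_model")

# Hypothetical method for linear regression models (class "lm").
score_model.lm <- function(object, newdata, ...) {
  data.frame(PRED = unname(predict(object, newdata = newdata, ...)))
}

# Hypothetical method for k-means models (class "kmeans"):
# assign each row to the center with the smallest squared distance.
score_model.kmeans <- function(object, newdata, ...) {
  x <- as.matrix(newdata[, colnames(object$centers), drop = FALSE])
  d <- apply(object$centers, 1,
             function(ctr) rowSums(sweep(x, 2, ctr)^2))
  d <- matrix(d, nrow = nrow(x))  # keep a matrix even for one row
  data.frame(CLUSTER = max.col(-d, ties.method = "first"))
}

# The same call selects a different method for each model class.
lmFit <- lm(Sepal.Length ~ Sepal.Width, data = iris)
set.seed(1)
kmFit <- kmeans(iris[, 1:4], centers = 3)
head(score_model(lmFit, iris))
head(score_model(kmFit, iris))
```

The caller uses one function name throughout, and any method-specific arguments travel through the ... argument.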
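Example 5-1 builds a linear regression model, irisModel, using the lm function on the iris data.frame. It pushes the data set to the database as iris_of, an ore.frame object. It then scores the data in iris_of by invoking ore.predict with the model, requesting standard errors and prediction intervals, and appends the resulting columns to iris_of.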
Example 5-1 Using the ore.predict Function on an LM Model
irisModel <- lm(Sepal.Length ~ ., data = iris)
iris_of <- ore.push(iris)
iris_of_pred <- ore.predict(irisModel, iris_of, se.fit = TRUE,
                            interval = "prediction")
iris_of <- cbind(iris_of, iris_of_pred)
head(iris_of)
Listing for Example 5-1
R> irisModel <- lm(Sepal.Length ~ ., data = iris)
R> iris_of <- ore.push(iris)
R> iris_of_pred <- ore.predict(irisModel, iris_of, se.fit = TRUE,
+                             interval = "prediction")
R> iris_of <- cbind(iris_of, iris_of_pred)
R> head(iris_of)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species     PRED    SE.PRED
1          5.1         3.5          1.4         0.2  setosa 5.004788 0.04479188
2          4.9         3.0          1.4         0.2  setosa 4.756844 0.05514933
3          4.7         3.2          1.3         0.2  setosa 4.773097 0.04690495
4          4.6         3.1          1.5         0.2  setosa 4.889357 0.05135928
5          5.0         3.6          1.4         0.2  setosa 5.054377 0.04736842
6          5.4         3.9          1.7         0.4  setosa 5.388886 0.05592364
  LOWER.PRED UPPER.PRED
1   4.391895   5.617681
2   4.140660   5.373027
3   4.159587   5.386607
4   4.274454   5.504259
5   4.440727   5.668026
6   4.772430   6.005342
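For comparison, the same quantities can be computed in memory with the base R predict function on the original iris data.frame. This sketch assumes that the PRED, SE.PRED, LOWER.PRED, and UPPER.PRED columns correspond to the fit, se.fit, lwr, and upr values that predict returns for an lm model. The column names in local_scores are chosen here only to mirror the listing.

```r
irisModel <- lm(Sepal.Length ~ ., data = iris)

# With se.fit = TRUE, predict() on an lm model returns a list:
# $fit is a matrix with columns fit, lwr, upr; $se.fit is a vector.
local_pred <- predict(irisModel, newdata = iris, se.fit = TRUE,
                      interval = "prediction")

# Arrange the local results in the same column layout as the listing.
local_scores <- data.frame(PRED       = local_pred$fit[, "fit"],
                           SE.PRED    = local_pred$se.fit,
                           LOWER.PRED = local_pred$fit[, "lwr"],
                           UPPER.PRED = local_pred$fit[, "upr"])
head(local_scores)
```

If the assumption holds, the first rows of local_scores should agree with the prediction columns shown in the listing.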
Example 5-2 builds a generalized linear model using the infert data set and then invokes the ore.predict function on the model.
Example 5-2 Using the ore.predict Function on a GLM Model
infertModel <-
  glm(case ~ age + parity + education + spontaneous + induced,
      data = infert, family = binomial())
INFERT <- ore.push(infert)
INFERTpred <- ore.predict(infertModel, INFERT, type = "response",
                          se.fit = TRUE)
INFERT <- cbind(INFERT, INFERTpred)
head(INFERT)
Listing for Example 5-2
R> infertModel <-
+ glm(case ~ age + parity + education + spontaneous + induced,
+ data = infert, family = binomial())
R> INFERT <- ore.push(infert)
R> INFERTpred <- ore.predict(infertModel, INFERT, type = "response",
+ se.fit = TRUE)
R> INFERT <- cbind(INFERT, INFERTpred)
R> head(INFERT)
  education age parity induced case spontaneous stratum pooled.stratum
1    0-5yrs  26      6       1    1           2       1              3
2    0-5yrs  42      1       1    1           0       2              1
3    0-5yrs  39      6       2    1           0       3              4
4    0-5yrs  34      4       2    1           0       4              2
5   6-11yrs  35      3       1    1           1       5             32
6   6-11yrs  36      4       2    1           1       6             36
       PRED    SE.PRED
1 0.5721916 0.20630954
2 0.7258539 0.17196245
3 0.1194459 0.08617462
4 0.3684102 0.17295285
5 0.5104285 0.06944005
6 0.6322269 0.10117919
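The type = "response" argument in Example 5-2 requests predictions on the probability scale. As a local illustration with the base R predict function (not ore.predict), the following sketch shows how those probabilities relate to the linear predictor for a binomial model with the default logit link. The variable names link_pred and resp_pred are introduced here for illustration.

```r
infertModel <-
  glm(case ~ age + parity + education + spontaneous + induced,
      data = infert, family = binomial())

# Predictions on the scale of the linear predictor (log-odds).
link_pred <- predict(infertModel, newdata = infert, type = "link")

# Predictions on the response scale (probabilities between 0 and 1).
resp_pred <- predict(infertModel, newdata = infert, type = "response")

# For the logit link, the inverse link is the logistic function,
# so applying plogis() to the log-odds recovers the probabilities.
all.equal(unname(plogis(link_pred)), unname(resp_pred))
```

Omitting type in base R predict for a glm model returns the linear predictor, so specifying type = "response" is what yields values that can be read directly as probabilities.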