Oracle R Enterprise includes several functions that create R models with data in Database tables.
These functions are available at this time: ore.lm(), ore.stepwise(), and ore.neural().
This approach has several advantages, as described in ore.lm() and ore.stepwise() Advantages.
ore.lm() performs least squares regression on data represented in an ore.frame object. The function creates a model matrix using the model.matrix method from the OREstats package. The model matrix and the response variable are then represented in SQL and passed to an in-database algorithm, which estimates the model using a block update QR decomposition with column pivoting. After estimating the coefficients, the algorithm makes a second pass over the data to estimate the model-level statistics. Finally, the model is returned as an ore.lm object.
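As a rough local illustration of what a model matrix contains, the sketch below applies base R's stats::model.matrix() to the in-memory longley data frame. It is an analogue only: it does not call the OREstats method and does not involve the database.

```r
# Base R analogue (assumption: illustrates the concept only,
# not the OREstats model.matrix method used by ore.lm()).
mm <- model.matrix(Employed ~ ., data = longley)

dim(mm)       # one row per observation; intercept plus one column per predictor
colnames(mm)  # "(Intercept)" followed by the predictor names
```

The response (Employed) is not part of the model matrix; it is handled separately, which matches the description above of the model matrix and the response variable being passed on together.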
The implementation of ore.lm() and ore.stepwise() provides several advantages, as described in ore.lm() and ore.stepwise() Advantages.
ore.lm() does not estimate coefficient values for a set of collinear terms.
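Base R's lm() treats exact collinearity in a comparable way, so a local sketch can show what "not estimated" looks like. The column GNP2 below is a hypothetical variable added only for this illustration, and how ore.lm() reports such terms may differ from lm().

```r
# Local illustration with base R's lm(); GNP2 is a made-up column
# that is an exact multiple of GNP, so the two terms are collinear.
d <- longley
d$GNP2 <- 2 * d$GNP

fit <- lm(Employed ~ GNP + GNP2 + Population, data = d)
coef(fit)  # the coefficient for GNP2 is reported as NA (not estimated)
```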
After the model is created, use summary() to create a summary of the model.
For an example, see Linear Regression Example.
These are important advantages of the way that ore.lm() and ore.stepwise() are implemented:
Both algorithms provide accurate solutions using out-of-core QR factorization. QR factorization decomposes a matrix into an orthogonal matrix and a triangular matrix.
QR-based estimates are often substantially more accurate than estimates from alternative techniques.
QR is an algorithm of choice for difficult rank-deficient models.
You can process data that does not fit into the machine's memory, that is, out-of-core data. QR factors a matrix into two matrices, one of which fits into memory while the other is stored on disk.
ore.lm() and ore.stepwise() can fit models on data sets with more than one billion rows.
ore.lm() and ore.stepwise() allow fast implementations of forward, backward, and stepwise model selection techniques.
ore.neural() has similar advantages.
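The idea behind a block update QR can be sketched locally in base R. The code below is not the in-database algorithm; it is a minimal illustration, on simulated data, of how a least squares fit can be built one block of rows at a time while keeping only a small triangular factor in memory. The data, block size, and variable names are all assumptions made for this sketch, and column pivoting for rank-deficient data is not handled.

```r
# Minimal sketch of block-wise QR least squares (base R only).
set.seed(1)
n <- 1000
p <- 4
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # intercept + 3 predictors
y <- drop(X %*% c(2, -1, 0.5, 3)) + rnorm(n, sd = 0.1)

blocks <- split(seq_len(n), ceiling(seq_len(n) / 100))  # 10 blocks of 100 rows

R <- matrix(0, 0, p)  # running p x p triangular factor (empty at the start)
z <- numeric(0)       # running first p entries of Q'y

for (idx in blocks) {
  # Stack the current triangular factor on top of the next block of rows,
  # then re-factor; only R and z are carried to the next iteration.
  A <- rbind(R, X[idx, , drop = FALSE])
  b <- c(z, y[idx])
  qrA <- qr(A)
  R <- qr.R(qrA)
  z <- qr.qty(qrA, b)[seq_len(p)]
}

beta_blockwise <- backsolve(R, z)                 # solve R %*% beta = z
beta_full <- unname(coef(lm(y ~ X - 1)))          # all rows in memory at once
all.equal(beta_blockwise, beta_full)
```

Because each iteration touches only one block plus a p-by-p matrix, memory use does not grow with the number of rows, which is the property that makes out-of-core fitting possible.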
This example pushes longley to a table and builds a regression model:
# longley consists of employment statistics:
head(longley)
     GNP.deflator     GNP Unemployed Armed.Forces Population Year Employed
1947         83.0 234.289      235.6        159.0    107.608 1947   60.323
1948         88.5 259.426      232.5        145.6    108.632 1948   61.122
1949         88.2 258.054      368.2        161.6    109.773 1949   60.171
1950         89.5 284.599      335.1        165.0    110.929 1950   61.187
1951         96.2 328.975      209.9        309.9    112.075 1951   63.221
1952         98.1 346.999      193.2        359.4    113.270 1952   63.639

# Push longley to a table
LONGLEY <- ore.push(longley)

# Fit full model
oreFit1 <- ore.lm(Employed ~ ., data = LONGLEY)
summary(oreFit1)
For more information, see the R help for ore.lm, invoked by help(ore.lm).
ore.stepwise() performs stepwise least squares regression on data represented in an ore.frame object. The function creates a model matrix using the model.matrix method from the OREstats package. The model matrix and the response variable are then represented in SQL and passed to an in-database algorithm, which estimates the model using a block update QR decomposition with column pivoting. After estimating the coefficients, the algorithm makes a second pass over the data to estimate the model-level statistics. Finally, the model is returned as an ore.stepwise object.
ore.stepwise() excludes collinear terms throughout the computation.
After the model is created, use summary() to view a summary of the model.
For an example, see Stepwise Regression Example.
This example pushes longley to a table and builds a stepwise model:
LONGLEY <- ore.push(longley)

# Two stepwise alternatives
oreStep1 <- ore.stepwise(Employed ~ .^2, data = LONGLEY,
                         add.p = 0.1, drop.p = 0.1)
oreStep2 <- step(ore.lm(Employed ~ 1, data = LONGLEY),
                 scope = terms(Employed ~ .^2, data = LONGLEY))
For more information, see the R help for ore.lm, invoked by help(ore.lm).
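For readers without a database connection, the general shape of stepwise selection can be tried locally with base R's lm() and step() on the in-memory longley data frame. This is only an analogue: step() selects terms by AIC, which is not the same criterion as the add.p and drop.p arguments of ore.stepwise(), and the scope here is restricted to a few main effects (an assumption made to keep the sketch small).

```r
# Local, in-memory analogue using base R only; no ore.frame is involved.
null_fit <- lm(Employed ~ 1, data = longley)  # start from the intercept-only model

step_fit <- step(
  null_fit,
  scope = Employed ~ GNP + Unemployed + Armed.Forces + Population + Year,
  direction = "both",  # terms may be added or dropped at each step
  trace = 0            # suppress the step-by-step log
)

formula(step_fit)  # the terms that survived selection
summary(step_fit)
```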
Neural network models can be used to capture intricate nonlinear relationships between inputs and outputs, or to find patterns in data.
ore.neural() builds a single-layer feedforward neural network on ore.frame data.
ore.neural() uses the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method to solve the underlying unconstrained nonlinear optimization problem that results from fitting a neural network.
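To make the optimization problem concrete, the sketch below fits a tiny network with one hidden unit by minimizing squared error with base R's optim() and method = "BFGS". It is not the ore.neural() implementation: the simulated data, the tanh activation, the linear output, and the starting weights are all assumptions chosen for illustration.

```r
# Toy single-hidden-unit network fitted with BFGS (base R only).
set.seed(42)
x <- seq(-2, 2, length.out = 50)
y <- tanh(1.5 * x - 0.5) + rnorm(50, sd = 0.05)  # simulated target

# w[1], w[2]: hidden-unit bias and weight; w[3], w[4]: output bias and weight.
predict_net <- function(w, x) w[3] + w[4] * tanh(w[1] + w[2] * x)

# Unconstrained objective: sum of squared errors over the training data.
sse <- function(w) sum((y - predict_net(w, x))^2)

w0 <- c(0, 0.5, 0, 0.5)                 # arbitrary starting weights
fit <- optim(w0, sse, method = "BFGS")  # gradient is approximated numerically

fit$par    # fitted weights
fit$value  # final sum of squared errors
```

The weights are free to take any real value, which is why the fitting problem is unconstrained, and the objective is nonlinear in the weights because of the activation function.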
The output of ore.neural() is an object of type ore.neural.
For detailed information about parameters and output, see the R help for ore.neural(). For an example, see Neural Network Example.
This example builds a neural network with default values, including hidden size 1.
The longley data set consists of statistics related to employment. This example pushes two subsets of longley to tables: it creates a model from rows 1 through 11 and then predicts results for a different subset, rows 12 through 16.
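# Push the training and test subsets to tables
trainData <- ore.push(longley[1:11, ])
testData <- ore.push(longley[12:16, ])

# Fit the network with default settings, then predict on the held-out rows
fit <- ore.neural('Employed ~ GNP + Population + Year', data = trainData)
ans <- predict(fit, newdata = testData)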