4.3 Cross-Validating Models

Predictive models are usually built on given data and verified on held-aside or unseen data. Cross-validation is a model validation technique that avoids the limitations of a single train-and-test experiment by building and testing multiple models through repeated sampling from the available data. Its purpose is to offer better insight into how well the model would generalize to new data, and to help avoid over-fitting and drawing wrong conclusions from misleading peculiarities of the seen data.
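The repeated build-and-test procedure described above can be sketched as k-fold cross-validation. The following is a minimal, generic illustration in Python using only the standard library. It is not the ore.CV implementation and does not use Oracle R Enterprise. The helper names (k_fold_indices, fit_line, cross_validate), the choice of five folds, the simple linear model, and the synthetic data are all assumptions made for the example.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle row indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def cross_validate(xs, ys, k=5, seed=0):
    """Return one RMSE per fold: fit on k-1 folds, score on the held-out fold."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append(mean(sq_err) ** 0.5)
    return scores

# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

fold_rmse = cross_validate(xs, ys, k=5)
print("per-fold RMSE:", [round(r, 3) for r in fold_rmse])
print("mean RMSE:", round(mean(fold_rmse), 3))
```

Every row is held out exactly once, so the mean of the per-fold errors depends less on a single lucky or unlucky split than one train-and-test experiment does.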

The ore.CV utility function, written in R, uses Oracle R Enterprise to cross-validate regression and classification models. The function is available for download from the following Oracle R Technologies blog post:

https://blogs.oracle.com/R/entry/model_cross_validation_with_ore

For a select set of algorithms and cases, ore.CV cross-validates models that were generated by Oracle R Enterprise regression and classification functions using in-database data.

The ore.CV function works with models generated by the following Oracle R Enterprise functions:

ore.lm
ore.stepwise
ore.glm
ore.neural
ore.odmDT
ore.odmGLM
ore.odmNB
ore.odmSVM

You can also use ore.CV to cross-validate models generated with some R regression functions through Oracle R Enterprise embedded R execution. Those R functions are the following:

lm
glm
svm

For more information about ore.CV, including usage examples and the function itself, see the blog post:

https://blogs.oracle.com/R/entry/model_cross_validation_with_ore