8 Cross-Validate Models

Cross-validation is a model evaluation technique that avoids the limitations of a single train-and-test experiment by building and testing multiple models on repeated samples of the available data.

Predictive models are usually built on one set of data and validated on held-out, unseen data. Cross-validation gives better insight into how well a model generalizes to new data, and helps you avoid over-fitting and drawing wrong conclusions from misleading peculiarities of the data used to build the model.
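The idea can be illustrated with a minimal k-fold sketch in plain R. This is not the ore.CV implementation and makes no OML4R calls; the built-in mtcars data set, the choice of five folds, the formula mpg ~ wt + hp, and root mean squared error as the metric are all assumptions chosen only for illustration.

```r
# Minimal k-fold cross-validation of an lm model in base R.
# Illustrative only: this does not use ore.CV or any OML4R function.
set.seed(42)                      # reproducible fold assignment
k   <- 5                          # number of folds (assumed for the example)
dat <- mtcars                     # built-in data set standing in for your data

# Randomly assign each row to one of k folds of nearly equal size.
fold <- sample(rep(seq_len(k), length.out = nrow(dat)))

# For each fold: fit on the other k - 1 folds, then score on the held-out fold.
rmse <- vapply(seq_len(k), function(i) {
  train <- dat[fold != i, ]
  test  <- dat[fold == i, ]
  fit   <- lm(mpg ~ wt + hp, data = train)
  pred  <- predict(fit, newdata = test)
  sqrt(mean((test$mpg - pred)^2))   # root mean squared error on held-out rows
}, numeric(1))

cv_rmse <- mean(rmse)             # average held-out error across the k folds
print(rmse)
print(cv_rmse)
```

Because every row is held out exactly once, the averaged error is typically a more stable estimate of performance on new data than the error from any single train-and-test split.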

The ore.CV utility function uses Oracle Machine Learning for R (OML4R) to cross-validate regression and classification models.

For a select set of algorithms and cases, ore.CV cross-validates models that were generated by OML4R regression and classification functions from in-database data.

The ore.CV function works with models generated by the following OML4R functions:

  • ore.odmDT

  • ore.odmGLM

  • ore.odmNB

  • ore.odmSVM

You can also use ore.CV to cross-validate models generated with some R regression functions through OML4R embedded R execution. Those R functions are the following:

  • lm

  • glm

  • svm

To use ore.CV, download the function, define it in your R session, and then run it.