4.2.3 Attribute Importance Model

The ore.odmAI attribute importance function ranks attributes according to their significance in predicting a target.

The ore.odmAI function uses the OML4SQL Minimum Description Length algorithm to calculate attribute importance. Minimum Description Length (MDL) is an information-theoretic model selection principle. It is an important concept in information theory (the study of the quantification of information) and in learning theory (the study of the capacity for generalization based on empirical data).

MDL assumes that the simplest, most compact representation of the data is the best and most probable explanation of the data. The MDL principle is used to build OML4SQL attribute importance models.
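As a rough illustration of the principle, the general two-part form of MDL can be written as follows. This is the textbook formulation, shown here only as a sketch; the symbols M, D, and L are notation assumed for this illustration, and the exact scoring that OML4SQL uses internally is not described in this section.

```latex
% Two-part MDL: choose the model with the shortest total description.
% L(M)        = length, in bits, of the description of model M
% L(D \mid M) = length, in bits, of the data D when encoded with the help of M
\hat{M} = \arg\min_{M \in \mathcal{M}} \bigl[\, L(M) + L(D \mid M) \,\bigr]
```

Under this reading, an attribute that allows the target values to be encoded more compactly would be expected to receive a higher importance.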

Attribute importance models built using OML4SQL cannot be applied to new data.

The ore.odmAI function produces a ranking of attributes and their importance values.

Note:

OREdm attribute importance models differ from OML4SQL attribute importance models in these ways: a model object is not retained, and an R model object is not returned. Only the importance ranking created by the model is returned.

For information on the ore.odmAI function arguments, invoke help(ore.odmAI).

Example 4-10 Using the ore.odmAI Function

This example pushes the data.frame iris to the database as the ore.frame iris_of. The example then builds an attribute importance model with Species as the target and all remaining columns as predictors.

iris_of <- ore.push(iris)
ore.odmAI(Species ~ ., iris_of)

Listing for This Example

R> iris_of <- ore.push(iris)
R> ore.odmAI(Species ~ ., iris_of)
 
Call:
ore.odmAI(formula = Species ~ ., data = iris_of)
 
Importance: 
             importance rank
Petal.Width   1.1701851    1
Petal.Length  1.1494402    2
Sepal.Length  0.5248815    3
Sepal.Width   0.2504077    4
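
Because only the ranking is returned, a typical next step is to use it for attribute selection. The following is a minimal sketch in base R, assuming the ranking is available as an ordinary data.frame. The values are copied from the listing above. The names imp, top_attrs, and reduced_formula, and the cutoff of two attributes, are illustrative choices and are not part of the OML4R API. The sketch does not call ore.odmAI and does not need a database connection.

```r
# Importance table copied from the listing above, rebuilt as a plain data.frame
# (hypothetical stand-in for the ranking that ore.odmAI reports)
imp <- data.frame(
  importance = c(1.1701851, 1.1494402, 0.5248815, 0.2504077),
  rank       = 1:4,
  row.names  = c("Petal.Width", "Petal.Length", "Sepal.Length", "Sepal.Width")
)

# Keep the two highest-ranked attributes (the cutoff of 2 is an arbitrary choice)
top_attrs <- rownames(imp)[imp$rank <= 2]

# Build a reduced formula that a downstream classifier could use
reduced_formula <- reformulate(top_attrs, response = "Species")
print(reduced_formula)
```

The reduced formula keeps only the petal measurements, which the ranking identifies as the strongest predictors of Species in this data set.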