Oracle9i Data Mining Concepts
Part Number A95961-02
This chapter discusses two major topics:
For an example of ODM basic usage, see Chapter 3.
This chapter provides an overview of the steps required to perform basic ODM tasks. For detailed examples of how to perform these tasks, see the ODM sample programs. The ODM sample programs are distributed with the ODM documentation. For an overview of the ODM sample programs, see Appendix A.
This chapter does not describe the ODM API classes and methods in detail. For detailed information about the ODM API, see the ODM Javadoc in the directory $ORACLE_HOME/dm/doc on any system where ODM is installed.
ODM depends on the following Oracle9i Java Archive (.jar) files:
$ORACLE_HOME/jdbc/lib/classes12.jar
$ORACLE_HOME/lib/xmlparserv2.jar
$ORACLE_HOME/rdbms/jlib/jmscommon.jar
$ORACLE_HOME/rdbms/jlib/aqapi.jar
$ORACLE_HOME/rdbms/jlib/xsu12.jar
$ORACLE_HOME/dm/lib/odmapi.jar
These files must be in your CLASSPATH to compile and execute ODM programs.
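A quick way to confirm the CLASSPATH requirement before running an ODM program is to compare the required archives against the current classpath. The following is a minimal sketch, not part of ODM: the class name `OdmClasspathCheck` and the method `missing` are hypothetical, and the check compares path strings literally, so it assumes the CLASSPATH entries are written as absolute paths under ORACLE_HOME.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper (not part of ODM): reports which required archives
// are absent from a classpath string.
public class OdmClasspathCheck {
    // Archive paths relative to ORACLE_HOME, as listed in this chapter.
    static final String[] REQUIRED = {
        "jdbc/lib/classes12.jar",
        "lib/xmlparserv2.jar",
        "rdbms/jlib/jmscommon.jar",
        "rdbms/jlib/aqapi.jar",
        "rdbms/jlib/xsu12.jar",
        "dm/lib/odmapi.jar"
    };

    // Returns the required archives (as relative paths) that do not appear
    // in the given classpath. Paths are compared literally; symbolic links
    // and relative entries are not resolved.
    static List<String> missing(String oracleHome, String classpath) {
        Set<File> entries = new HashSet<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            entries.add(new File(entry));
        }
        List<String> absent = new ArrayList<>();
        for (String relative : REQUIRED) {
            if (!entries.contains(new File(oracleHome, relative))) {
                absent.add(relative);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        String oracleHome = System.getenv("ORACLE_HOME");
        if (oracleHome == null) {
            System.out.println("ORACLE_HOME is not set");
            return;
        }
        List<String> absent = missing(oracleHome, System.getProperty("java.class.path"));
        System.out.println(absent.isEmpty()
            ? "All required archives are on the CLASSPATH"
            : "Missing from CLASSPATH: " + absent);
    }
}
```

Running the class with the same CLASSPATH that the ODM program will use prints either a confirmation or the list of missing archives.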
If you use a database character set that is not US7ASCII, WE8DEC, WE8ISO8859P1, or UTF8, you must also include nls_charset12.zip in your CLASSPATH. Otherwise, an ODM program fails with the following error:
oracle.jms.AQjmsException: Non supported character set:oracle-character-set-178
This section describes the steps required to perform several common data mining tasks using ODM.
All work in ODM is done using
This section summarizes the steps required to build a model.
Use the getCurrentStatus method to get the status of the task. Alternatively, use the waitForCompletion method to wait until all asynchronous activity for the task completes.
After successful completion of the task, a build results object exists.
The following sample programs illustrate building ODM models:
Data mining tasks are usually performed in sequence. The following sequence of tasks is typical:
To implement a sequence of dependent task executions, you can periodically check the asynchronous task execution status using the getCurrentStatus method, or block until completion using the waitForCompletion method. You can then perform the dependent task after the previous task completes.
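The polling and blocking approaches described above can be sketched with a local stand-in for an asynchronous task. This is an illustration only and does not use the ODM classes: `StandInTask`, the `Status` values, and the signatures shown for `execute`, `getCurrentStatus`, and `waitForCompletion` are assumptions made for this sketch. See the ODM Javadoc for the actual API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class TaskSequenceSketch {
    // Hypothetical status values for the stand-in task.
    enum Status { EXECUTING, SUCCESS, ERROR }

    // Local stand-in for an asynchronous task; NOT the ODM task class.
    static class StandInTask {
        private final Runnable work;
        private CompletableFuture<Void> future;

        StandInTask(Runnable work) { this.work = work; }

        // Starts the work on a background thread and returns immediately.
        void execute() { future = CompletableFuture.runAsync(work); }

        // Non-blocking status check, suitable for periodic polling.
        Status getCurrentStatus() {
            if (future == null) throw new IllegalStateException("execute() has not been called");
            if (!future.isDone()) return Status.EXECUTING;
            return future.isCompletedExceptionally() ? Status.ERROR : Status.SUCCESS;
        }

        // Blocks until the background work finishes, then reports the status.
        Status waitForCompletion() {
            if (future == null) throw new IllegalStateException("execute() has not been called");
            try {
                future.join();
            } catch (RuntimeException e) {
                // The failure is reported through the returned status.
            }
            return getCurrentStatus();
        }
    }

    // Polling alternative: check the status at a fixed interval.
    static Status pollUntilDone(StandInTask task, long intervalMillis) throws InterruptedException {
        Status status;
        while ((status = task.getCurrentStatus()) == Status.EXECUTING) {
            Thread.sleep(intervalMillis);
        }
        return status;
    }

    // Runs dependent steps in order; each step starts only after the
    // previous one has completed successfully.
    static List<String> runSequence() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        String[] steps = {"build", "test", "compute lift"};
        for (String step : steps) {
            StandInTask task = new StandInTask(() -> log.add(step));
            task.execute();
            if (task.waitForCompletion() != Status.SUCCESS) {
                throw new IllegalStateException(step + " failed");
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runSequence());
    }
}
```

Blocking keeps the calling code simple when nothing else needs to happen in the meantime. Polling leaves the caller free to do other work between status checks.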
For example, follow these steps to perform the build, test, and compute lift sequence:
MiningTestTask object. Either periodically check the status of the test operation or block until the task completes.
Model Seeker builds multiple models; it then evaluates and compares the models to find a "best" model.
Follow these steps to use Model Seeker:
ModelSeekerTask (MST) instance to hold the information needed to specify the models to build. The required information is defined in subclasses of the
You can specify a combination of as many instances of the following as desired:
(You cannot specify clustering models or Association Rules models.)
Use the getCurrentStatus method to get the status of the task, using the task name. Alternatively, use the waitForCompletion method to wait until all asynchronous activity for the required work completes.
Use the getResults method to view the summary information and the best model. Model Seeker discards all models that it builds except the best one.
The sample program Sample_ModelSeeker.java illustrates how to use Model Seeker.
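The build, evaluate, and keep-the-best pattern that Model Seeker follows can be pictured with a small generic sketch. This is not the Model Seeker implementation or its API. The class `BestModelSketch`, the method `selectBest`, and the scoring function are hypothetical, and the measure that Model Seeker actually uses to compare models is not described in this section.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration of "evaluate every candidate, keep only the best".
public class BestModelSketch {
    // Returns the candidate with the highest score. The first candidate wins
    // ties. All other candidates are simply not retained by the caller.
    static <M> M selectBest(List<M> candidates, ToDoubleFunction<M> score) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("at least one candidate is required");
        }
        M best = candidates.get(0);
        double bestScore = score.applyAsDouble(best);
        for (M candidate : candidates.subList(1, candidates.size())) {
            double candidateScore = score.applyAsDouble(candidate);
            if (candidateScore > bestScore) {
                best = candidate;
                bestScore = candidateScore;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder candidates and scores, invented for this sketch only.
        List<String> candidates = Arrays.asList("candidateA", "candidateB", "candidateC");
        double[] scores = {0.71, 0.84, 0.79};
        String best = selectBest(candidates, c -> scores[candidates.indexOf(c)]);
        System.out.println("Best candidate: " + best);
    }
}
```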
Models based on data sets with a large number of attributes can have very long build times. To minimize build time, you can use ODM Attribute Importance to identify the critical attributes and then build a model using these attributes only.
Identify the most important attributes by building an Attribute Importance model as follows:
The sample program Sample_AttributeImportanceBuild.java illustrates how to build an attribute importance model.
After identifying the important attributes, build a model using the selected attributes as follows:
MiningFunctionSetting. Only the attributes returned by Attribute Importance will be active for model building.
The sample program Sample_AttributeImportanceUsage.java illustrates how to build a model using the important attributes.
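The idea of restricting a build to the most important attributes can be sketched independently of ODM. The example below assumes that attribute importance is available as a map from attribute name to a numeric score. That representation, the class `ImportantAttributesSketch`, the method `topAttributes`, and the attribute names and scores are all hypothetical. The sample program named above shows the actual ODM calls.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration: keep only the highest-ranked attributes.
public class ImportantAttributesSketch {
    // Returns the names of the 'count' attributes with the highest scores,
    // most important first.
    static List<String> topAttributes(Map<String, Double> importance, int count) {
        return importance.entrySet().stream()
            .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
            .limit(count)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented attribute names and scores, for illustration only.
        Map<String, Double> importance = new LinkedHashMap<>();
        importance.put("AGE", 0.42);
        importance.put("ZIP", 0.05);
        importance.put("INCOME", 0.31);
        // Only the selected attributes would be marked active for the build.
        System.out.println("Active attributes: " + topAttributes(importance, 2));
    }
}
```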
You make predictions by applying a model to new data, that is, by scoring the data.
Any table that you score (apply a model to) must have the same format as the table used to build the model. If you build a model using a table that is in transactional format, any table that you apply that model to must be in transactional format. Similarly, if the table used to build the model was in nontransactional format, any table to which you apply the model must be in nontransactional format.
Note that you can score a single record, which must also be in the same format as the table used to build the model.