29 The Data Mining Sample Programs

Describes the data mining sample programs that ship with Oracle Database.

29.1 About the Data Mining Sample Programs

You can learn a great deal about the Oracle Data Mining application programming interface (API) from the data mining sample programs. The programs illustrate typical approaches to data preparation, algorithm selection, algorithm tuning, testing, and scoring.

The programs are easy to use. They include extensive inline comments to help you understand the code. They delete all temporary objects on exit; you can run the programs repeatedly without setup or cleanup.

The data mining sample programs are installed with Oracle Database Examples in the demo directory under Oracle Home. The demo directory contains sample programs that illustrate many features of Oracle Database. You can locate the data mining files by doing a directory listing of dm*.sql. The following example shows this directory listing on a Linux system.

Note that the directory listing in the following example includes one file, dmhpdemo.sql, that is not a data mining program.

Example 29-1 Directory Listing of the Data Mining Sample Programs

> cd $ORACLE_HOME/rdbms/demo
> ls dm*.sql
dmaidemo.sql      dmkmdemo.sql    dmsvddemo.sql              
dmardemo.sql      dmnbdemo.sql    dmsvodem.sql    
dmdtdemo.sql      dmnmdemo.sql    dmsvrdem.sql               
dmdtxvlddemo.sql  dmocdemo.sql    dmtxtnmf.sql                      
dmemdemo.sql      dmsh.sql        dmtxtsvm.sql
dmglcdem.sql      dmshgrants.sql                          
dmglrdem.sql      dmstardemo.sql                          
dmhpdemo.sql      dmsvcdem.sql

The data mining sample programs create a set of mining models in the user's schema. After executing the programs, you can list the models with a query like the one in the following example.

Example 29-2 Models Created by the Sample Programs

SELECT mining_function, algorithm, model_name FROM user_mining_models
    ORDER BY mining_function;
 
MINING_FUNCTION                ALGORITHM                      MODEL_NAME
------------------------------ ------------------------------ -------------------
ASSOCIATION_RULES              APRIORI_ASSOCIATION_RULES      AR_SH_SAMPLE
CLASSIFICATION                 GENERALIZED_LINEAR_MODEL       GLMC_SH_CLAS_SAMPLE
CLASSIFICATION                 SUPPORT_VECTOR_MACHINES        T_SVM_CLAS_SAMPLE
CLASSIFICATION                 SUPPORT_VECTOR_MACHINES        SVMC_SH_CLAS_SAMPLE
CLASSIFICATION                 SUPPORT_VECTOR_MACHINES        SVMO_SH_CLAS_SAMPLE
CLASSIFICATION                 NAIVE_BAYES                    NB_SH_CLAS_SAMPLE
CLASSIFICATION                 DECISION_TREE                  DT_SH_CLAS_SAMPLE
CLUSTERING                     EXPECTATION_MAXIMIZATION       EM_SH_CLUS_SAMPLE
CLUSTERING                     O_CLUSTER                      OC_SH_CLUS_SAMPLE
CLUSTERING                     KMEANS                         KM_SH_CLUS_SAMPLE
CLUSTERING                     KMEANS                         DM_STAR_CLUSTER
FEATURE_EXTRACTION             SINGULAR_VALUE_DECOMP          SVD_SH_SAMPLE
FEATURE_EXTRACTION             NONNEGATIVE_MATRIX_FACTOR      NMF_SH_SAMPLE
FEATURE_EXTRACTION             NONNEGATIVE_MATRIX_FACTOR      T_NMF_SAMPLE
REGRESSION                     SUPPORT_VECTOR_MACHINES        SVMR_SH_REGR_SAMPLE
REGRESSION                     GENERALIZED_LINEAR_MODEL       GLMR_SH_REGR_SAMPLE

29.2 Installing the Data Mining Sample Programs

Learn how to install Data Mining sample programs.

The data mining sample programs require:

  • Oracle Database Enterprise Edition with the Advanced Analytics option

  • Oracle Database sample schemas

  • Oracle Database Examples

  • A data mining user account

  • Execution of dmshgrants.sql by a system administrator

  • Execution of dmsh.sql by the data mining user

Follow these steps to install the data mining sample programs:

  1. Install or obtain access to Oracle Database 12c Enterprise Edition with the Advanced Analytics option. To install the Database, see the installation instructions for your platform at Oracle Database 12c Release 2.

  2. Ensure that the sample schemas are installed in the database. The sample schemas are installed by default with Oracle Database. See Oracle Database Sample Schemasfor details about the sample schemas.

  3. Verify that Oracle Database Examples has been installed with the database, or install it locally. Oracle Database Examples loads the Database sample programs into the rdbms/demo directory under Oracle home. See Oracle Database Examples Installation Guidefor installation instructions.

  4. Verify that a data mining user account has been created, or create it yourself if you have administrative privileges. See "Creating a Data Mining User".

  5. Ask your system administrator to run dmshgrants.sql, or run it yourself if you have administrative privileges. dmshgrants grants the privileges that are required for running the sample programs. These include SELECT access to tables in the SH schema as described in "The Data Mining Sample Data" and the system privileges listed in the following table.

    Pass the name of the data mining user to dmshgrants.

    SQL> CONNECT sys / as sysdba
    Enter password: sys_password
    Connected.
    SQL> @ $ORACLE_HOME/rdbms/demo/dmshgrants dmuser

    Table 29-1 System Privileges Granted by dmshgrants.sql to the Data Mining User

    Privilege Allows the data mining user to

    CREATE SESSION

    log in to a database session

    CREATE TABLE

    create tables, such as the settings tables for CREATE_MODEL

    CREATE VIEW

    create views, such as the views of tables in the SH schema

    CREATE MINING MODEL

    create data mining models

    EXECUTE ON ctxsys.ctx_ddl

    execute procedures in the ctxsys.ctx_ddl PL/SQL package; required for text mining

  6. Connect to the database as the data mining user and run dmsh.sql. This script creates views of the sample data in the schema of the data mining user.

    SQL> CONNECT dmuser
    Enter password: dmuser_password
    Connected.
    SQL> @ $ORACLE_HOME/rdbms/demo/dmsh

29.3 The Data Mining Sample Data

The data used by the sample data mining programs is based on these tables in the SH schema:

SH.CUSTOMERS 
SH.SALES 
SH.PRODUCTS 
SH.SUPPLEMENTARY_DEMOGRAPHICS
SH.COUNTRIES 

The dmshgrants script grants SELECT access to the tables in SH. The dmsh.sql script creates views of the SH tables in the schema of the data mining user. The views are described in the following table:

Table 29-2 The Data Mining Sample Data

View Name Description

MINING_DATA

Joins and filters data

MINING_DATA_BUILD_V

Data for building models

MINING_DATA_TEST_V

Data for testing models

MINING_DATA_APPLY_V

Data to be scored

MINING_BUILD_TEXT

Data for building models that include text

MINING_TEST_TEXT

Data for testing models that include text

MINING_APPLY_TEXT

Data, including text columns, to be scored

MINING_DATA_ONE_CLASS_V

Data for anomaly detection

The association rules program creates its own transactional data.