1 Installing the Software

This chapter is intended for anyone who wishes to install and use Oracle Data Mining on a personal computer for educational or demonstration purposes. It provides the basic information you will need to install the Data Mining software on Microsoft Windows and run the sample programs locally on your PC or laptop. To run the programs remotely, see the instructions in Chapter 2.

Complete the instructions in each section of this chapter before proceeding to the next section.

Install Oracle Database

Oracle Data Mining is part of Oracle Database Enterprise Edition. To perform data mining activities, you must be able to log on to an Oracle database, and your user ID must have the appropriate database privileges.

The instructions in this section explain how to perform a basic installation of Oracle Database with the sample schemas on your personal computer. The sample schemas are needed for the Data Mining sample programs.

Note:

These instructions are not intended as a replacement for Oracle Database installation documentation. If you have questions, if you encounter problems during the installation, or if you already have Oracle components installed on your PC, refer to the Oracle Database installation documentation for Microsoft Windows.

Additional documentation is available on the Installing and Upgrading page of the Oracle Database 11g Online Documentation Library:

http://www.oracle.com/pls/db111/db111.homepage

  1. Before you begin the installation, ensure that your computer meets the system requirements described in Oracle Database Installation Guide for Microsoft Windows.

  2. Stop any Oracle services that may be running on your computer.

    In Windows Control Panel, choose Administrative Tools, then Services. Find the service names that start with "Oracle". Choose Stop for each one.

  3. To start the installation, run SETUP.EXE from the Database installation directory.

    Oracle Universal Installer opens and displays the Select a Product to Install dialog. Choose Oracle Database 11g.

    Choose Next.

  4. The Installer displays the Select Installation Method page.

    • Choose Basic Installation.

    • Specify the Oracle base and home directories. Oracle home is a subdirectory of the Oracle base directory. You may install Oracle Database in an existing Oracle base, but you must specify a new Oracle home.

      Oracle Installer creates the directories for Oracle base and Oracle home if they do not already exist on your computer.

    • Choose Enterprise Edition (or Personal Edition) as the Installation Type.

    • Check the Create Starter Database box.

    • Specify a unique name for Global Database Name. You can use the default global database name provided by the Installer, as long as it does not already exist on your computer.

    • Specify a password for the database system accounts. The password must have at least eight characters and include both alphabetic and numeric characters. For details about specifying passwords, refer to Oracle Database Security Guide.

      You will have the opportunity to change the passwords for the database system accounts at a later time.

    • Click Next.

  5. The Installer performs prerequisite checks.

    If the checks succeed, choose Next to advance to the next step.

    If any check fails, correct the problem and then click Retry. If more extensive changes are needed, cancel the installation, fix the problem, and then restart Oracle Installer.

  6. On the Oracle Configuration Manager Registration page, you can choose to register your installation with your Metalink account.

    Registration is optional. To skip it, choose Next.

  7. The Summary page displays the settings and components for the installation.

    Click Install.

  8. The Installer proceeds with the installation.

  9. The Installer invokes the Configuration Assistants to configure and start the starter database.

    If the Configuration Assistants encounter an error, check the logs to determine the problem. You can choose to continue the installation and start the assistants manually later, or you can restart the installation. To continue the installation, click Install.

  10. Database Configuration Assistant creates the starter database.

  11. Database Configuration Assistant displays information about the starter database.

    Click the Password Management button.

  12. Unlock the SH account and specify a password. The SH schema is used by the sample programs.

    You can change the passwords for SYS and SYSTEM if you wish. Each password must have at least eight characters and include both alphabetic and numeric characters.

    For details about specifying passwords, refer to Oracle Database Security Guide.

    Click OK to return to the Database Configuration Assistant page.

    On the Database Configuration Assistant page, click OK.

  13. On the End of Installation page, confirm that the installation was successful.

    Click Exit to exit the Installer.
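
The password rule stated in steps 4 and 12 (at least eight characters, with both alphabetic and numeric characters) can be checked before you type a password into the Installer. The following Java sketch covers only that stated rule. The class name, method name, and sample passwords are hypothetical, and the database may enforce further requirements that are described in Oracle Database Security Guide and are not modeled here.

```java
// A minimal sketch of the password rule stated in this chapter only.
// PasswordRuleSketch and meetsStatedRule are hypothetical names; the
// database may apply additional checks that this code does not model.
final class PasswordRuleSketch {

    /**
     * Returns true if the candidate has at least eight characters and
     * contains at least one letter and at least one digit.
     */
    static boolean meetsStatedRule(String candidate) {
        if (candidate == null || candidate.length() < 8) {
            return false;
        }
        boolean hasLetter = false;
        boolean hasDigit = false;
        for (int i = 0; i < candidate.length(); i++) {
            char c = candidate.charAt(i);
            if (Character.isLetter(c)) {
                hasLetter = true;
            } else if (Character.isDigit(c)) {
                hasDigit = true;
            }
        }
        return hasLetter && hasDigit;
    }

    public static void main(String[] args) {
        // Hypothetical candidates, for illustration only.
        System.out.println(meetsStatedRule("demo1234")); // true
        System.out.println(meetsStatedRule("demo12"));   // false: too short
        System.out.println(meetsStatedRule("demodemo")); // false: no digit
    }
}
```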

Install Oracle Database Examples

The Oracle Data Mining sample programs are installed with Oracle Database Examples.

The Database Examples installation process copies the Oracle Data Mining sample programs, along with examples and demonstrations of other database features, to the \RDBMS\demo subdirectory of Oracle home.

To install Database Examples, perform these steps:

  1. Ensure that your computer meets the system requirements described in Oracle Database Examples Installation Guide.

  2. Stop any Oracle services that may be running on your computer.

    In Windows Control Panel, choose Administrative Tools, then Services. Find the service names that start with "Oracle". Choose Stop for each one.

  3. To start the installation, go to the Examples installation directory and run SETUP.EXE.

    Oracle Universal Installer opens and displays the Welcome page. Click Next to advance to the next page.

  4. On the Specify Home Details page, specify the Oracle home directory in which you installed Oracle Database. Do not assume that the directory displayed by the Installer is correct.

  5. The Summary page displays the settings and components for the installation.

    Click Install.

  6. The Installer proceeds with the installation.

  7. On the End of Installation page, confirm that the installation was successful.

    Click Exit to exit the Installer.

Create a Data Mining Demo User

To build and score Data Mining models, you must have an Oracle user ID with the appropriate privileges. Follow these instructions to create a demo user that has the privileges you need to run the sample programs and create and score models within your schema.

See Also:

To create data mining users that can perform broader data mining tasks, see Chapter 4, "Users and Privileges for Data Mining".

Note on ORACLE_HOME:

In the following sections, you will find references to ORACLE_HOME, the environment variable that represents the Oracle home directory.

The ORACLE_HOME environment variable is not required on Windows systems. If you wish to create it, you can do so by editing the system properties for your computer. In Control Panel, open System, choose the Advanced tab, and then click Environment Variables.

If ORACLE_HOME is not defined, you must specify the full path when executing scripts that reside under Oracle home.
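
As a rough illustration of this note, the following Java sketch builds the full-path form of the @ script commands used later in this section. It reads ORACLE_HOME if the variable is defined and otherwise falls back to a sample location. The class name, the method name, and the fallback directory are hypothetical; substitute the Oracle home that you specified during installation.

```java
// A minimal sketch, assuming the demo scripts reside in RDBMS\demo under
// Oracle home as described in this chapter. All names are hypothetical.
final class DemoScriptPathSketch {

    /** Builds the "@" command for a script in RDBMS\demo under the given Oracle home. */
    static String demoScriptCommand(String oracleHome, String scriptName) {
        String home = oracleHome;
        // Drop a trailing backslash so the joined path has no doubled separator.
        if (home.endsWith("\\")) {
            home = home.substring(0, home.length() - 1);
        }
        return "@ " + home + "\\RDBMS\\demo\\" + scriptName;
    }

    public static void main(String[] args) {
        String home = System.getenv("ORACLE_HOME");
        if (home == null) {
            // Hypothetical location, for illustration only.
            home = "C:\\app\\oracle\\product\\11.1.0\\db_1";
        }
        // Prints the command to type at the SQL*Plus prompt.
        System.out.println(demoScriptCommand(home, "dmsh"));
    }
}
```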

To create a demo user for Oracle Data Mining:

  1. From the Windows Start menu, open the program group for the Oracle home of the local database.

  2. Choose Application Development.

  3. Choose SQL*Plus.

  4. Log in with system privileges.

        Enter user-name: sys / as sysdba
        Enter password: password
    
  5. To create the user, type a command like the following.

    CREATE USER dmuser IDENTIFIED BY password
           DEFAULT TABLESPACE USERS
           TEMPORARY TABLESPACE TEMP
           QUOTA UNLIMITED ON USERS;
    
  6. Run dmshgrants.sql to grant access to the SH schema. Several tables in SH are used by the Data Mining sample programs. Specify the Data Mining user name as the parameter.

     @ %ORACLE_HOME%\RDBMS\demo\dmshgrants dmuser
    

    Note: If you have not upgraded to 11.1.0.7, run dmshgrants by specifying the SH password in addition to the Data Mining user name.

     @ %ORACLE_HOME%\RDBMS\demo\dmshgrants SH_password dmuser
    
  7. Now connect to the database as the Data Mining user.

    CONNECT dmuser
    Enter password: password
    
  8. Run dmsh.sql to populate the schema of the Data Mining user with tables, views, and other objects needed by the sample programs.

    @ %ORACLE_HOME%\RDBMS\demo\dmsh
    COMMIT;
    

Once you have completed these steps, you can run the Data Mining sample programs whenever you log in to the database as the Data Mining demo user.

Run the Sample Programs

To locate the sample programs on your computer, navigate to the RDBMS\demo subdirectory under Oracle home.

To display the Data Mining PL/SQL sample programs, search for the files that start with dm and end with .sql. (The list will include dmsh.sql and dmshgrants.sql, which are used to configure the Data Mining demo user ID.) The PL/SQL sample programs are listed in Table 1-1.

Table 1-1 Sample PL/SQL Data Mining Programs

Program File      Algorithm                             Mining Function or Task
----------------  ------------------------------------  ------------------------------
dmaidemo.sql      Minimum Description Length            Attribute Importance
dmardemo.sql      Apriori                               Association
dmdtdemo.sql      Decision Tree                         Classification
dmdtxvlddemo.sql  Decision Tree (cross validation)      Classification
dmglcdem.sql      Binary Logistic Regression (GLM)      Classification
dmglrdem.sql      Multivariate Linear Regression (GLM)  Regression
dmkmdemo.sql      k-Means                               Clustering
dmnbdemo.sql      Naive Bayes                           Classification
dmnmdemo.sql      Non-Negative Matrix Factorization     Feature Extraction
dmocdemo.sql      O-Cluster                             Clustering
dmsvcdem.sql      Support Vector Machine                Classification
dmsvodem.sql      Support Vector Machine                Anomaly Detection
dmsvrdem.sql      Support Vector Machine                Regression
dmtxtfe.sql       Term extraction using Oracle Text     Text transformation for mining
dmtxtnmf.sql      Non-Negative Matrix Factorization     Text mining using NMF
dmtxtsvm.sql      Support Vector Machine                Text mining using SVM


In the same directory, search for the files that start with dm and end with .java to display the Java samples. The Java sample programs are listed in Table 1-2.

Table 1-2 Sample Java Data Mining Programs

Program File       Algorithm                             Mining Function or Task
-----------------  ------------------------------------  -----------------------------------
dmaidemo.java      Minimum Description Length            Attribute importance
dmapplydemo.java   Naive Bayes                           Illustrate scoring methods
dmardemo.java      Apriori                               Association
dmexpimpdemo.java  export/import                         Model Export/Import
dmglcdemo.java     Binary Logistic Regression (GLM)      Classification
dmglrdemo.java     Multivariate Linear Regression (GLM)  Regression
dmkmdemo.java      k-Means                               Clustering
dmnbdemo.java      Naive Bayes                           Classification
dmnmdemo.java      Non-Negative Matrix Factorization     Feature extraction
dmocdemo.java      O-Cluster                             Clustering
dmpademo.java      Automated predict and explain         Predictive Analytics
dmsvcdemo.java     Support Vector Machine                Classification
dmsvodemo.java     Support Vector Machine (one class)    Classification
dmsvrdemo.java     Support Vector Machine                Regression
dmtreedemo.java    Decision Tree                         Classification
dmtxtnmfdemo.java  Non-Negative Matrix Factorization     Text mining with NMF
dmtxtsvmdemo.java  Support Vector Machine                Text mining with SVM classification
dmxfdemo.java      Binning, clipping, and normalization  Data Transformations
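
The file-name searches described in this section (names that start with dm and end with .sql or .java) can also be done programmatically. The following Java sketch lists matching files in a directory passed on the command line. The class and method names are hypothetical, and the comparison ignores case on the assumption that the files reside on a Windows file system.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// A minimal sketch of the "starts with dm, ends with .sql or .java" search.
// SampleFileFilterSketch and its methods are hypothetical names.
final class SampleFileFilterSketch {

    /** Returns true if the name starts with "dm" and ends with the extension, ignoring case. */
    static boolean isSampleFile(String fileName, String extension) {
        String lower = fileName.toLowerCase(Locale.ENGLISH);
        return lower.startsWith("dm")
                && lower.endsWith(extension.toLowerCase(Locale.ENGLISH));
    }

    /** Lists matching file names in the directory, sorted; empty if it cannot be read. */
    static List<String> listSamples(File demoDir, String extension) {
        List<String> matches = new ArrayList<String>();
        String[] names = demoDir.list(); // null when demoDir is not a readable directory
        if (names != null) {
            Arrays.sort(names);
            for (String name : names) {
                if (isSampleFile(name, extension)) {
                    matches.add(name);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Pass the RDBMS\demo directory as the first argument; defaults to the current directory.
        File demoDir = new File(args.length > 0 ? args[0] : ".");
        for (String name : listSamples(demoDir, ".sql")) {
            System.out.println(name);
        }
        for (String name : listSamples(demoDir, ".java")) {
            System.out.println(name);
        }
    }
}
```

As noted above, the .sql results include dmsh.sql and dmshgrants.sql, which configure the demo user and are not sample programs.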


View the Source Code

You will learn a great deal about the Data Mining APIs by investigating the source code of the sample programs. The programs illustrate typical approaches to data preparation, algorithm selection, algorithm tuning, testing, and scoring. All the programs include extensive comments to help you understand what the code is doing.

You can view the source code simply by opening the files in a text editor.

Run the PL/SQL Sample Programs

Now that you have a user ID with the required privileges and a schema populated with the required objects, you can run the sample programs. Each program creates a Data Mining model.

While the program is running, it displays the program code and the program output.

You can run the sample programs as many times as you wish. The programs clean up the results of the previous run before executing the current run.

To run the PL/SQL programs:

  1. Start SQL*Plus and log in as the Data Mining user.

        Enter user-name: dmuser
        Enter password: password
    
  2. Run the program by typing an at sign (@) followed by the fully qualified path of the program. This example runs dmnbdemo.sql, which creates a Naive Bayes model.

    SQL> @ %ORACLE_HOME%\RDBMS\demo\dmnbdemo
    

Prepare to Run the Java Programs

Before you can run the Java programs, you must set up your Java environment and compile the programs. You can do this in an Integrated Development Environment such as Oracle JDeveloper, or you can execute the following commands at the operating system prompt.

  1. Check that you are using Java version 1.5 or higher. To display the version, run the following command in a command window.

    >java -version
    
  2. Add %ORACLE_HOME%\jdk\bin\ to your PATH variable before the paths of any other Java versions.

  3. Add the following Data Mining JAR files to your Windows CLASSPATH:

    %ORACLE_HOME%\RDBMS\jlib\jdm.jar
    %ORACLE_HOME%\RDBMS\jlib\ojdm_api.jar
    %ORACLE_HOME%\RDBMS\jlib\xdb.jar
    %ORACLE_HOME%\jdbc\lib\ojdbc5.jar
    %ORACLE_HOME%\oc4j\j2ee\home\lib\connector.jar
    %ORACLE_HOME%\jlib\orai18n.jar
    %ORACLE_HOME%\jlib\orai18n-mapping.jar
    %ORACLE_HOME%\lib\xmlparserv2.jar
    
  4. Compile the programs listed in Table 1-2. To use the JAVAC executable, open a command window and go to \RDBMS\demo in Oracle home.

    >javac program_name.java
    

    For example:

    >javac dmnbdemo.java
    

    If JAVAC is not found, then check the value of the PATH variable.
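
The CLASSPATH value from step 3 is long enough that typing it by hand invites mistakes. The following Java sketch assembles it from an Oracle home directory, using the JAR locations listed in step 3 and the semicolon separator that Windows uses between CLASSPATH entries. The class name, the method name, and the fallback Oracle home are hypothetical.

```java
// A minimal sketch that joins the JAR files from step 3 into one CLASSPATH
// value. DataMiningClasspathSketch and buildClasspath are hypothetical names.
final class DataMiningClasspathSketch {

    // JAR locations relative to Oracle home, copied from step 3.
    private static final String[] RELATIVE_JARS = {
        "RDBMS\\jlib\\jdm.jar",
        "RDBMS\\jlib\\ojdm_api.jar",
        "RDBMS\\jlib\\xdb.jar",
        "jdbc\\lib\\ojdbc5.jar",
        "oc4j\\j2ee\\home\\lib\\connector.jar",
        "jlib\\orai18n.jar",
        "jlib\\orai18n-mapping.jar",
        "lib\\xmlparserv2.jar"
    };

    /** Returns the JAR paths under oracleHome joined with ';'. */
    static String buildClasspath(String oracleHome) {
        StringBuilder classpath = new StringBuilder();
        for (String jar : RELATIVE_JARS) {
            if (classpath.length() > 0) {
                classpath.append(';');
            }
            classpath.append(oracleHome).append('\\').append(jar);
        }
        return classpath.toString();
    }

    public static void main(String[] args) {
        String home = System.getenv("ORACLE_HOME");
        if (home == null) {
            // Hypothetical location, for illustration only.
            home = "C:\\app\\oracle\\product\\11.1.0\\db_1";
        }
        System.out.println(buildClasspath(home));
    }
}
```

Setting CLASSPATH replaces the Java default of searching the current directory, so you may also need to add a period (.) as an entry if you run the compiled programs from the demo directory.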

Run the Java Programs

You can run a Java program from the operating system prompt with a command like this:

>java program_name host_name:port_number:database_identifier user password
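
The first argument packs three values into one colon-separated string. The following Java sketch shows the expected shape of that argument by splitting it into host name, port number, and database identifier. It is a format check only and is not taken from the sample programs, whose own argument handling is not shown in this chapter. The class name, the method name, and the values localhost, 1521, and orcl are hypothetical; use the values for your own database.

```java
// A minimal sketch of the host_name:port_number:database_identifier format.
// ConnectionArgSketch and splitConnectionArg are hypothetical names.
final class ConnectionArgSketch {

    /** Splits the argument into exactly three non-empty parts or throws. */
    static String[] splitConnectionArg(String arg) {
        // A limit of -1 keeps trailing empty parts so they can be rejected below.
        String[] parts = arg.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected host_name:port_number:database_identifier, got: " + arg);
        }
        for (String part : parts) {
            if (part.length() == 0) {
                throw new IllegalArgumentException("Empty field in: " + arg);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only. With these values the
        // full command would be: java dmnbdemo localhost:1521:orcl dmuser password
        String[] parts = splitConnectionArg("localhost:1521:orcl");
        System.out.println("host name:           " + parts[0]);
        System.out.println("port number:         " + parts[1]);
        System.out.println("database identifier: " + parts[2]);
    }
}
```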

View the Models Created by the Sample Programs

In SQL*Plus, you can query the USER_MINING_MODELS view to list the models in your schema. The following example shows two mining models in the schema and displays the model name, mining function, and algorithm for each.

SQL> SET LINESIZE 100
SQL> SELECT model_name, mining_function, algorithm FROM user_mining_models;
 
MODEL_NAME               MINING_FUNCTION            ALGORITHM
------------------------ -------------------------- ------------------------------
AI_SH_SAMPLE             ATTRIBUTE_IMPORTANCE       MINIMUM_DESCRIPTION_LENGTH
AR_SH_SAMPLE             ASSOCIATION_RULES          APRIORI_ASSOCIATION_RULES

To list all the columns defined in a view, use the DESCRIBE command.

SQL> DESCRIBE user_mining_models

You can query the USER_MINING_MODEL_ATTRIBUTES and USER_MINING_MODEL_SETTINGS views to obtain information about the attributes and settings for the models in your schema.