1 Oracle Data Miner System Overview

This chapter introduces Oracle Data Miner and the programmatic interfaces of Oracle Machine Learning. It also supplies links to resources to help you learn more about the product.

Oracle Data Miner Architecture

Oracle Data Miner is an extension of Oracle SQL Developer, an integrated development environment for Oracle SQL.

Oracle Data Miner uses machine learning technology embedded in Oracle Database to create, run, and manage workflows that visually represent and encapsulate machine learning process steps and operations. It uses the ODMRSYS schema as a dedicated system repository.

The architecture of Oracle Data Miner is illustrated in Figure 1-1.

Figure 1-1 Oracle Data Miner Architecture

About the Oracle Data Miner Repository

Oracle Data Miner requires the installation of a repository, which resides in the ODMRSYS schema, in the database server. Oracle Data Miner users must have the privileges that are required for accessing objects in ODMRSYS.

The Oracle Data Miner repository manages:
  • Storage: The repository stores the projects and workflows of all Oracle Data Miner users who have established connections to this database.

  • Runtime Functions: The repository is the application layer for Oracle Data Miner. It controls the running of workflows and other runtime operations.

Database Features Used by Oracle Data Miner

Oracle Data Miner uses several Oracle Database features, such as Oracle Machine Learning for SQL, Oracle XML DB, and, optionally, Oracle Machine Learning for R.

Oracle Data Miner uses the following Oracle Database features:

  • Oracle Machine Learning: Provides the model building, testing, and scoring capabilities for Oracle Data Miner.

  • Oracle XML DB: Manages the metadata in the Oracle Data Miner repository.

  • Oracle Text: Supports text mining integrated with the modeling process.

  • Oracle Scheduler: Provides the engine for scheduling the Oracle Data Miner workflows.

  • Oracle Machine Learning for R: Runs user-defined R functions using embedded R execution.

Note:

Except for Oracle Machine Learning for R, these features are all included by default in Oracle Database Enterprise Edition. Oracle Machine Learning for R requires additional installation steps.

Oracle Data Miner and Oracle Machine Learning

Oracle Data Miner is a Graphical User Interface (GUI) for Oracle Machine Learning available through SQL Developer, which uses in-database machine learning capabilities.

Oracle Data Miner is a component of Oracle SQL Developer.

Components of Oracle Machine Learning:

  • Oracle Machine Learning (required by Oracle Data Miner)

    Oracle Machine Learning is a powerful machine learning engine embedded in the Oracle Database kernel. Oracle Machine Learning supports algorithms for multiple techniques, including classification, regression, clustering, feature selection, and association (market basket analysis), among others. The PL/SQL Application Programming Interface (API) of Oracle Machine Learning for SQL performs data preparation and creates, evaluates, and maintains mining models. SQL functions support scoring data with in-database machine learning models.

  • Oracle Machine Learning for R (not required by Oracle Data Miner)

    Oracle Data Miner provides limited support for Oracle Machine Learning for R. If a user supplies a user-defined R function that includes embedded R execution in the Oracle Data Miner SQL Query node, Oracle Data Miner uses Oracle Machine Learning for R to run that function.

About Oracle Machine Learning APIs

Oracle Data Miner is an application that uses the Oracle Machine Learning APIs in Oracle Database and Oracle Autonomous Database. Models produced using Oracle Data Miner can also be manipulated and used through the broader Oracle Machine Learning APIs.

The APIs are public and can be used directly for application development. The APIs are summarized in the following topics:

Oracle Machine Learning PL/SQL Packages

PL/SQL APIs manipulate machine learning models, which are first-class database schema objects.

Table 1-1 lists the PL/SQL packages and their descriptions.

Table 1-1 Oracle Machine Learning PL/SQL Packages

Package Description

DBMS_DATA_MINING

DDL procedures for managing mining models.

Mining model settings.

Procedures for testing mining models, functions for querying machine learning models, and an APPLY procedure for batch scoring.

DBMS_DATA_MINING_TRANSFORM

Procedures for specifying transformation expressions and applying the transformations to columns of data.

Transformations can be passed to the model creation process and embedded in the model definition, or they can be applied externally to data views.

DBMS_PREDICTIVE_ANALYTICS

Procedures that perform predict, explain, and profile operations without a user-created machine learning model.

Note:

The machine learning operations in the DBMS_PREDICTIVE_ANALYTICS package are available in code snippets in Oracle Data Miner.
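As a rough illustration of how an application might call one of the DBMS_DATA_MINING DDL procedures, the following Python sketch runs CREATE_MODEL through any DB-API style cursor (for example, one obtained from the python-oracledb driver). The helper function and the model, table, and column names are hypothetical and are not part of Oracle Data Miner. The sketch also assumes that the connected user has the privileges needed to create a mining model.

```python
# Anonymous PL/SQL block that calls DBMS_DATA_MINING.CREATE_MODEL.
# No settings table is passed, so the database uses its default
# algorithm for the classification function.
CREATE_MODEL_PLSQL = """
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => :model_name,
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => :data_table,
    case_id_column_name => :case_id,
    target_column_name  => :target);
END;
"""


def create_classification_model(cursor, model_name, data_table, case_id, target):
    """Build a classification model by executing the block above.

    `cursor` is any object with a DB-API style execute(sql, params) method.
    All names passed in are hypothetical examples.
    """
    binds = {
        "model_name": model_name,
        "data_table": data_table,
        "case_id": case_id,
        "target": target,
    }
    cursor.execute(CREATE_MODEL_PLSQL, binds)
    return binds
```

With a real connection, a call such as create_classification_model(conn.cursor(), "CHURN_MODEL", "CUSTOMERS_BUILD", "CUST_ID", "CHURNED") would create a model that then appears as a schema object. Every name in that call is a placeholder.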

Oracle Machine Learning SQL Scoring Functions

A set of specialized SQL functions provides the primary mechanism for scoring data in Oracle Machine Learning. When called as single-row functions, the SQL Machine Learning functions apply a user-supplied mining model to each row of input data.

From Oracle Database 12c onward, the functions can also be called as analytic functions, where the algorithmic processing is performed dynamically without a user-supplied model. The term Predictive Query refers to this mode of scoring.
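The two scoring modes described above differ mainly in what follows the function name. The following Python sketch composes one query of each kind as text. The helper functions and all model, table, and column names are hypothetical. The generated SQL is a minimal example, not the form that Oracle Data Miner itself generates.

```python
import re

# Accept only simple, unquoted identifiers so that names can be
# placed safely into the SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def _ident(name):
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple SQL identifier: {name!r}")
    return name


def single_row_scoring_sql(model, table, case_id):
    """Score each row with an existing, user-supplied model."""
    model, table, case_id = map(_ident, (model, table, case_id))
    return (
        f"SELECT {case_id}, "
        f"PREDICTION({model} USING *) AS pred, "
        f"PREDICTION_PROBABILITY({model} USING *) AS prob "
        f"FROM {table}"
    )


def predictive_query_sql(target, table, case_id):
    """Score as an analytic function: no model name, FOR <target> and OVER ()."""
    target, table, case_id = map(_ident, (target, table, case_id))
    return (
        f"SELECT {case_id}, "
        f"PREDICTION(FOR {target} USING *) OVER () AS pred "
        f"FROM {table}"
    )
```

In the single-row form the first argument names a stored model. In the analytic form the FOR clause names the target column, and the OVER clause marks the call as a Predictive Query.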

Table 1-2 Machine Learning SQL Scoring Functions

Function Name Function Description

CLUSTER_DETAILS

Returns cluster details for each row in the input data.

CLUSTER_DISTANCE

Returns the distance between each row and the centroid.

CLUSTER_ID

Returns the ID of the highest probability cluster for each row.

CLUSTER_PROBABILITY

Returns the probability of the highest probability cluster for each row.

CLUSTER_SET

Returns a set of cluster ID and probability pairs for each row.

FEATURE_COMPARE

Compares two documents, including short ones such as keyword phrases, or two attribute lists, for similarity or dissimilarity.

FEATURE_DETAILS

Returns feature details for each row in the input data.

FEATURE_ID

Returns the ID of the highest value feature for each row.

FEATURE_SET

Returns a set of feature ID and feature value pairs for each row.

FEATURE_VALUE

Returns the value of the highest value feature for each row.

PREDICTION

Returns the prediction for each row in the input.

PREDICTION_BOUNDS

Returns the upper and lower bounds of prediction for each row (GLM only).

PREDICTION_COST

Returns a cost for each row.

PREDICTION_DETAILS

Returns prediction details for each row.

PREDICTION_PROBABILITY

Returns the probability of each prediction.

PREDICTION_SET

Returns the prediction or cost with probability for each row.

Note:

The SQL scoring functions are available in code snippets in Oracle Data Miner.

Oracle Machine Learning Data Dictionary Views

The data dictionary views expose information about machine learning models that is stored in the Oracle Database system catalog. Each view is available in DBA, USER, and ALL versions (for example, DBA_MINING_MODELS, USER_MINING_MODELS, and ALL_MINING_MODELS).

Table 1-3 lists the Oracle Machine Learning data dictionary views and their descriptions.

Table 1-3 Oracle Machine Learning Data Dictionary Views

View Name Description

ALL_MINING_MODELS

Provides information about all accessible machine learning models.

ALL_MINING_MODEL_ATTRIBUTES

Provides information about the attributes of all accessible machine learning models.

ALL_MINING_MODEL_SETTINGS

Provides information about the settings of all accessible machine learning models.

ALL_MINING_MODEL_PARTITIONS

Provides all the model partitions accessible to the user.

ALL_MINING_MODEL_VIEWS

Provides a description of all the model views accessible to the user.

ALL_MINING_MODEL_XFORMS

Provides the user-specified transformations embedded in all models accessible to the user.
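
Because the same views exist for DBA, USER, and ALL access, a query against them differs only in the view prefix. The following Python sketch builds a simple listing query for whichever scope is requested. The helper function is hypothetical, and the column selection (MODEL_NAME, MINING_FUNCTION, ALGORITHM) is only one reasonable choice.

```python
# The three access scopes in which the mining model views are available.
VIEW_SCOPES = ("USER", "ALL", "DBA")


def mining_models_query(scope="USER"):
    """Return a query listing models from <scope>_MINING_MODELS."""
    scope = scope.upper()
    if scope not in VIEW_SCOPES:
        raise ValueError(f"scope must be one of {VIEW_SCOPES}, got {scope!r}")
    return (
        "SELECT model_name, mining_function, algorithm "
        f"FROM {scope}_MINING_MODELS "
        "ORDER BY model_name"
    )
```

The resulting text can be run in a SQL Developer worksheet or through any database driver. The DBA scope normally requires additional privileges.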

Resources For Learning About Oracle Data Miner

This section lists the resources such as documentation, forums, blogs, trainings, and tutorials for Oracle Data Miner.