1 Oracle Data Miner System Overview

This chapter introduces Oracle Data Miner and the programmatic interfaces of Oracle Data Mining. It also supplies links to resources to help you learn more about the product.

Oracle Data Miner Architecture

Oracle Data Miner is an extension of Oracle SQL Developer, a graphical development environment for Oracle SQL.

Oracle Data Miner uses the data mining technology embedded in Oracle Database to create, execute, and manage workflows that encapsulate data mining operations. It uses the ODMRSYS schema as a dedicated system repository.

The architecture of Oracle Data Miner is illustrated in Figure 1-1.

Figure 1-1 Oracle Data Miner Architecture for Big Data



About the Oracle Data Miner Repository

Oracle Data Miner requires the installation of a repository, the ODMRSYS schema, in the database server. Oracle Data Miner users must have the privileges that are required for accessing objects in ODMRSYS.

The Oracle Data Miner repository manages:
  • Storage: The repository stores the projects and workflows of all the Oracle Data Miner users that have established connections to this database.

  • Runtime Functions: The repository is the application layer for Oracle Data Miner. It controls the execution of workflows and other runtime operations.
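As a quick check, the presence of the repository schema can be confirmed from any connected session. The query below is a minimal sketch that relies only on the standard ALL_USERS view. It reports whether the schema exists, not whether the current user holds the privileges needed to access its objects.

```sql
-- Returns one row if the ODMRSYS repository schema exists in this database.
SELECT username, created
  FROM all_users
 WHERE username = 'ODMRSYS';
```

An empty result suggests that the repository has not yet been installed in the database you are connected to.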

Database Features Used by Oracle Data Miner

Oracle Data Miner uses the following Oracle Database features:

  • Oracle Data Mining: Provides the model building, testing, and scoring capabilities of Oracle Data Miner.

  • Oracle XML DB: Manages the metadata in the Oracle Data Miner repository.

  • Oracle Text: Supports text mining.

  • Oracle Scheduler: Schedules workflow execution.

  • Oracle R Enterprise: Executes embedded R scripts supplied by the user.

Note:

With the exception of Oracle R Enterprise, these features are all included by default in Oracle Database Enterprise Edition. Oracle R Enterprise requires additional installation steps.
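To see whether some of these features are installed, a DBA can query the DBA_REGISTRY view. The sketch below assumes access to the DBA views and covers only Oracle XML DB (component ID XDB) and Oracle Text (component ID CONTEXT). Oracle Scheduler and Oracle Data Mining are not checked by this query.

```sql
-- Lists the installed version and status of Oracle XML DB and Oracle Text.
SELECT comp_id, comp_name, version, status
  FROM dba_registry
 WHERE comp_id IN ('XDB', 'CONTEXT');
```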

Oracle Data Miner and Oracle Advanced Analytics

Oracle Data Miner is the Graphical User Interface (GUI) for Oracle Data Mining, the data mining engine in Oracle Database.

Oracle Data Mining is a component of the Oracle Advanced Analytics option of Oracle Database Enterprise Edition.

Components of Oracle Advanced Analytics:

  • Oracle Data Mining (required by Oracle Data Miner)

    Oracle Data Mining is a powerful data mining engine embedded in the Database kernel. Oracle Data Mining supports algorithms for classification, regression, clustering, feature selection, feature extraction, and association (market basket analysis). The Data Mining PL/SQL Application Programming Interface (API) performs data preparation and creates, evaluates, and maintains mining models. Data Mining SQL functions score data using mining models or predictive queries.

  • Oracle R Enterprise (not required by Oracle Data Miner)

    Oracle Data Miner provides limited support for Oracle R Enterprise. If a user supplies a script that includes embedded R in the Oracle Data Miner SQL Query node, then Oracle Data Miner uses Oracle R Enterprise to execute the script.

    Oracle R Enterprise integrates the open source R statistical programming language and environment with Oracle Database. Oracle R Enterprise supports a transparency layer, which allows R to act transparently on Oracle data, and embedded R execution, which allows the execution of R scripts in the database.
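Whether the option is enabled can usually be read from the V$OPTION view. The following is a sketch only: the exact parameter label is assumed to vary by release (for example, 'Data Mining' in older releases and 'Advanced Analytics' in Oracle Database 12c), so the query checks for both.

```sql
-- VALUE is 'TRUE' when the option is enabled in this database.
SELECT parameter, value
  FROM v$option
 WHERE parameter IN ('Advanced Analytics', 'Data Mining');
```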

About Data Mining APIs

Oracle Data Miner is an application based on the Data Mining APIs in Oracle Database.

The APIs are public and can be used directly for application development. The APIs are summarized in the following topics:

Data Mining PL/SQL Packages

PL/SQL APIs manipulate mining models, which are database schema objects.

Table 1-1 lists the PL/SQL packages and their descriptions.

Table 1-1 Oracle Data Mining PL/SQL Packages

Package Description

DBMS_DATA_MINING

DDL procedures for managing mining models.

Mining model settings.

Procedures for testing mining models, functions for querying mining models, and an APPLY procedure for batch scoring.

DBMS_DATA_MINING_TRANSFORM

Procedures for specifying transformation expressions and applying the transformations to columns of data.

Transformations can be passed to the model creation process and embedded in the model definition, or they can be applied externally to data views.

DBMS_PREDICTIVE_ANALYTICS

Procedures that perform predict, explain, and profile operations without a user-created mining model.

Note:

The mining operations in the DBMS_PREDICTIVE_ANALYTICS package are available in code snippets in Oracle Data Miner.
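The following sketch shows how two of these packages might be called directly. It is illustrative rather than definitive: the table DEMO_CUSTOMERS, its columns CUST_ID and AFFINITY_CARD, and the names DEMO_DT_SETTINGS, DEMO_DT_MODEL, and DEMO_EXPLAIN_RESULT are all hypothetical. The first block builds a decision tree classification model with DBMS_DATA_MINING.CREATE_MODEL. The second block runs an EXPLAIN operation without a user-created model.

```sql
-- Settings table (hypothetical name) that selects the algorithm.
CREATE TABLE demo_dt_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000)
);

BEGIN
  -- Ask for the Decision Tree algorithm instead of the default.
  INSERT INTO demo_dt_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ALGO_NAME, DBMS_DATA_MINING.ALGO_DECISION_TREE);
  COMMIT;

  -- Build a classification model on the hypothetical DEMO_CUSTOMERS table.
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'DEMO_DT_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'DEMO_CUSTOMERS',
    case_id_column_name => 'CUST_ID',
    target_column_name  => 'AFFINITY_CARD',
    settings_table_name => 'DEMO_DT_SETTINGS');
END;
/

BEGIN
  -- Rank the columns of DEMO_CUSTOMERS by how well they explain AFFINITY_CARD.
  -- The procedure creates the result table itself.
  DBMS_PREDICTIVE_ANALYTICS.EXPLAIN(
    data_table_name     => 'DEMO_CUSTOMERS',
    explain_column_name => 'AFFINITY_CARD',
    result_table_name   => 'DEMO_EXPLAIN_RESULT');
END;
/
```

Because the model is a schema object, it can afterward be removed with DBMS_DATA_MINING.DROP_MODEL('DEMO_DT_MODEL').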

Data Mining SQL Scoring Functions

A set of specialized SQL functions provides the primary mechanism for scoring data in Oracle Data Mining. When called as single-row functions, the SQL Data Mining functions apply a user-supplied mining model to each row of input data.
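As a minimal sketch of single-row scoring, the query below applies a model to every row of a table. The model name DEMO_DT_MODEL and the table DEMO_CUSTOMERS with its CUST_ID column are hypothetical. USING * passes all columns of the row to the model as candidate predictors.

```sql
-- Score each customer with a previously built classification model.
SELECT cust_id,
       PREDICTION(demo_dt_model USING *)             AS predicted_value,
       PREDICTION_PROBABILITY(demo_dt_model USING *) AS probability
  FROM demo_customers;
```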

In Oracle Database 12c, the functions can also be called as analytic functions, where the algorithmic processing is performed dynamically without a user-supplied mining model. The term Predictive Query refers to this mode of scoring.
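A predictive query can be sketched as follows. No model name appears; the FOR clause names the target and the OVER clause marks the call as an analytic function. The table DEMO_CUSTOMERS and its columns CUST_ID, AGE, and CUST_GENDER are hypothetical, and AGE is assumed to be numeric, so the prediction is treated as regression.

```sql
-- Predict AGE from the other columns without a user-created model.
-- PARTITION BY builds and applies a separate transient model per gender.
SELECT cust_id,
       age,
       PREDICTION(FOR age USING *) OVER (PARTITION BY cust_gender) AS predicted_age
  FROM demo_customers;
```

The transient models exist only for the duration of the query, so nothing needs to be dropped afterward.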

Table 1-2 Data Mining SQL Scoring Functions

Function Name Function Description

CLUSTER_DETAILS

Returns cluster details for each row in the input data.

CLUSTER_DISTANCE

Returns the distance between each row and the centroid.

CLUSTER_ID

Returns the ID of the highest probability cluster for each row.

CLUSTER_PROBABILITY

Returns the probability of the highest probability cluster for each row.

CLUSTER_SET

Returns a set of cluster ID and probability pairs for each row.

FEATURE_COMPARE

Compares two documents, including short ones such as keyword phrases, or two attribute lists for similarity or dissimilarity.

FEATURE_DETAILS

Returns feature details for each row in the input data.

FEATURE_ID

Returns the ID of the highest value feature for each row.

FEATURE_SET

Returns a set of feature ID and feature value pairs for each row.

FEATURE_VALUE

Returns the value of the highest value feature for each row.

PREDICTION

Returns the prediction for each row in the input.

PREDICTION_BOUNDS

Returns the upper and lower bounds of prediction for each row (GLM only).

PREDICTION_COST

Returns a cost for each row.

PREDICTION_DETAILS

Returns prediction details for each row.

PREDICTION_PROBABILITY

Returns the probability of each prediction.

PREDICTION_SET

Returns a set of predictions with either probabilities or costs for each row.

Note:

The SQL scoring functions are available in code snippets in Oracle Data Miner.
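The functions whose names end in _SET return a collection for each row, which is usually unnested with the TABLE operator. The sketch below assumes the hypothetical model DEMO_DT_MODEL and table DEMO_CUSTOMERS, and lists every candidate prediction with its probability for each customer.

```sql
-- Expand the per-row prediction set into one output row per candidate class.
SELECT t.cust_id, s.prediction, s.probability
  FROM (SELECT cust_id,
               PREDICTION_SET(demo_dt_model USING *) AS pset
          FROM demo_customers) t,
       TABLE(t.pset) s
 ORDER BY t.cust_id, s.probability DESC;
```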

Data Mining Data Dictionary Views

The data dictionary views expose information about mining models stored in the Oracle Database system catalog. Each view is available in DBA, USER, and ALL versions (for example, DBA_MINING_MODELS, USER_MINING_MODELS, and ALL_MINING_MODELS).

Table 1-3 lists the data mining data dictionary views and their descriptions.

Table 1-3 Data Mining Data Dictionary Views

View Name Description

ALL_MINING_MODELS

Provides information about all accessible mining models.

ALL_MINING_MODEL_ATTRIBUTES

Provides information about the attributes of all accessible mining models.

ALL_MINING_MODEL_SETTINGS

Provides information about the settings of all accessible mining models.

ALL_MINING_MODEL_PARTITIONS

Provides all the model partitions accessible to the user.

ALL_MINING_MODEL_VIEWS

Provides a description of all the model views accessible to the user.

ALL_MINING_MODEL_XFORMS

Provides the user-specified transformations embedded in all models accessible to the user.
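The views are queried like any other data dictionary view. As a brief sketch, the first query below lists the models owned by the current user, and the second shows the settings of one of them. The model name DEMO_DT_MODEL is hypothetical.

```sql
-- Models owned by the current user.
SELECT model_name, mining_function, algorithm, creation_date
  FROM user_mining_models
 ORDER BY model_name;

-- Settings recorded for one model.
SELECT setting_name, setting_value
  FROM user_mining_model_settings
 WHERE model_name = 'DEMO_DT_MODEL';
```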


Resources For Learning About Oracle Data Miner

This section lists resources for learning about Oracle Data Miner, such as documentation, forums, blogs, training, and tutorials.