22 About the Data Mining API
Overview of the Oracle Data Mining application programming interface (API) components.
22.1 About Mining Models
Mining models are database schema objects that perform data mining.
As with all schema objects, access to mining models is controlled by database privileges. Models can be exported and imported. They support comments, and they can be tracked in the Database auditing system.
Mining models are created by the CREATE_MODEL procedure in the DBMS_DATA_MINING PL/SQL package. Models are created for a specific mining function, and they use a specific algorithm to perform that function. Mining function is a data mining term that refers to a class of mining problems to be solved. Examples of mining functions are: regression, classification, attribute importance, clustering, anomaly detection, and feature extraction. Oracle Data Mining supports one or more algorithms for each mining function.
22.2 Data Mining Data Dictionary Views
Lists Oracle Data Mining data dictionary views.
The data dictionary views for Oracle Data Mining are listed in the following table. A database administrator (DBA) and USER versions of the views are also available.
Table 22-1 Data Dictionary Views for Oracle Data Mining
22.2.1 ALL_MINING_MODELS
Describes an example of ALL_MINING_MODELS and shows a sample query.
The following example describes ALL_MINING_MODELS and shows a sample query.
Example 22-1 ALL_MINING_MODELS
describe ALL_MINING_MODELS
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER NOT NULL VARCHAR2(128)
MODEL_NAME NOT NULL VARCHAR2(128)
MINING_FUNCTION VARCHAR2(30)
ALGORITHM VARCHAR2(30)
CREATION_DATE NOT NULL DATE
BUILD_DURATION NUMBER
MODEL_SIZE NUMBER
PARTITIONED VARCHAR2(3)
COMMENTS VARCHAR2(4000)The following query returns the models accessible to you that use the Support Vector Machine algorithm.
SELECT mining_function, model_name
FROM all_mining_models
WHERE algorithm = 'SUPPORT_VECTOR_MACHINES'
ORDER BY mining_function, model_name;
MINING_FUNCTION MODEL_NAME
------------------------- --------------------
CLASSIFICATION PART2_CLAS_SAMPLE
CLASSIFICATION PART_CLAS_SAMPLE
CLASSIFICATION SVMC_SH_CLAS_SAMPLE
CLASSIFICATION SVMO_SH_CLAS_SAMPLE
CLASSIFICATION T_SVM_CLAS_SAMPLE
REGRESSION SVMR_SH_REGR_SAMPLE Related Topics
22.2.2 ALL_MINING_MODEL_ATTRIBUTES
Describes an example of ALL_MINING_MODEL_ATTRIBUTES and shows a sample query.
The following example describes ALL_MINING_MODEL_ATTRIBUTES and shows a sample query. Attributes are the predictors or conditions that are used to create models and score data.
Example 22-2 ALL_MINING_MODEL_ATTRIBUTES
describe ALL_MINING_MODEL_ATTRIBUTES Name Null? Type ----------------------------------------- -------- ---------------------------- OWNER NOT NULL VARCHAR2(128) MODEL_NAME NOT NULL VARCHAR2(128) ATTRIBUTE_NAME NOT NULL VARCHAR2(128) ATTRIBUTE_TYPE VARCHAR2(11) DATA_TYPE VARCHAR2(106) DATA_LENGTH NUMBER DATA_PRECISION NUMBER DATA_SCALE NUMBER USAGE_TYPE VARCHAR2(8) TARGET VARCHAR2(3) ATTRIBUTE_SPEC VARCHAR2(4000)
The following query returns the attributes of an SVM classification model named T_SVM_CLAS_SAMPLE. The model has both categorical and numerical attributes and includes one attribute that is unstructured text.
SELECT attribute_name, attribute_type, target
FROM all_mining_model_attributes
WHERE model_name = 'T_SVM_CLAS_SAMPLE'
ORDER BY attribute_name;
ATTRIBUTE_NAME ATTRIBUTE_TYPE TAR
------------------------- -------------------- ---
AFFINITY_CARD CATEGORICAL YES
AGE NUMERICAL NO
BOOKKEEPING_APPLICATION NUMERICAL NO
BULK_PACK_DISKETTES NUMERICAL NO
COMMENTS TEXT NO
COUNTRY_NAME CATEGORICAL NO
CUST_GENDER CATEGORICAL NO
CUST_INCOME_LEVEL CATEGORICAL NO
CUST_MARITAL_STATUS CATEGORICAL NO
EDUCATION CATEGORICAL NO
FLAT_PANEL_MONITOR NUMERICAL NO
HOME_THEATER_PACKAGE NUMERICAL NO
HOUSEHOLD_SIZE CATEGORICAL NO
OCCUPATION CATEGORICAL NO
OS_DOC_SET_KANJI NUMERICAL NO
PRINTER_SUPPLIES NUMERICAL NO
YRS_RESIDENCE NUMERICAL NO
Y_BOX_GAMES NUMERICAL NO Related Topics
22.2.3 ALL_MINING_MODEL_PARTITIONS
Describes an example of ALL_MINING_MODEL_PARTITIONS and shows a sample query.
ALL_MINING_MODEL_PARTITIONS and shows a sample query.
Example 22-3 ALL_MINING_MODEL_PARTITIONS
describe ALL_MINING_MODEL_PARTITIONS Name Null? Type ----------------------------------------- -------- ---------------------------- OWNER NOT NULL VARCHAR2(128) MODEL_NAME NOT NULL VARCHAR2(128) PARTITION_NAME VARCHAR2(128) POSITION NUMBER COLUMN_NAME NOT NULL VARCHAR2(128) COLUMN_VALUE VARCHAR2(4000)
The following query returns the partition names and partition key values for two partitioned models. Model PART2_CLAS_SAMPLE has a two column partition key with system-generated partition names.
SELECT model_name, partition_name, position, column_name, column_value
FROM all_mining_model_partitions
ORDER BY model_name, partition_name, position;
MODEL_NAME PARTITION_ POSITION COLUMN_NAME COLUMN_VALUE
-------------------- ---------- -------- -------------------- ---------------
PART2_CLAS_SAMPLE DM$$_P0 1 CUST_GENDER F
PART2_CLAS_SAMPLE DM$$_P0 2 CUST_INCOME_LEVEL HIGH
PART2_CLAS_SAMPLE DM$$_P1 1 CUST_GENDER F
PART2_CLAS_SAMPLE DM$$_P1 2 CUST_INCOME_LEVEL LOW
PART2_CLAS_SAMPLE DM$$_P2 1 CUST_GENDER F
PART2_CLAS_SAMPLE DM$$_P2 2 CUST_INCOME_LEVEL MEDIUM
PART2_CLAS_SAMPLE DM$$_P3 1 CUST_GENDER M
PART2_CLAS_SAMPLE DM$$_P3 2 CUST_INCOME_LEVEL HIGH
PART2_CLAS_SAMPLE DM$$_P4 1 CUST_GENDER M
PART2_CLAS_SAMPLE DM$$_P4 2 CUST_INCOME_LEVEL LOW
PART2_CLAS_SAMPLE DM$$_P5 1 CUST_GENDER M
PART2_CLAS_SAMPLE DM$$_P5 2 CUST_INCOME_LEVEL MEDIUM
PART_CLAS_SAMPLE F 1 CUST_GENDER F
PART_CLAS_SAMPLE M 1 CUST_GENDER M
PART_CLAS_SAMPLE U 1 CUST_GENDER U Related Topics
22.2.4 ALL_MINING_MODEL_SETTINGS
Describes an example of ALL_MINING_MODEL_SETTINGS and shows a sample query.
The following example describes ALL_MINING_MODEL_SETTINGS and shows a sample query. Settings influence model behavior. Settings may be specific to an algorithm or to a mining function, or they may be general.
Example 22-4 ALL_MINING_MODEL_SETTINGS
describe ALL_MINING_MODEL_SETTINGS Name Null? Type ----------------------------------------- -------- ---------------------------- OWNER NOT NULL VARCHAR2(128) MODEL_NAME NOT NULL VARCHAR2(128) SETTING_NAME NOT NULL VARCHAR2(30) SETTING_VALUE VARCHAR2(4000) SETTING_TYPE VARCHAR2(7)
The following query returns the settings for a model named SVD_SH_SAMPLE. The model uses the Singular Value Decomposition algorithm for feature extraction.
SELECT setting_name, setting_value, setting_type
FROM all_mining_model_settings
WHERE model_name = 'SVD_SH_SAMPLE'
ORDER BY setting_name;
SETTING_NAME SETTING_VALUE SETTING
------------------------------ ------------------------------ -------
ALGO_NAME ALGO_SINGULAR_VALUE_DECOMP INPUT
ODMS_MISSING_VALUE_TREATMENT ODMS_MISSING_VALUE_AUTO DEFAULT
ODMS_SAMPLING ODMS_SAMPLING_DISABLE DEFAULT
PREP_AUTO OFF INPUT
SVDS_SCORING_MODE SVDS_SCORING_SVD DEFAULT
SVDS_U_MATRIX_OUTPUT SVDS_U_MATRIX_ENABLE INPUT Related Topics
22.2.5 ALL_MINING_MODEL_VIEWS
Describes an example of ALL_MINING_MODEL_VIEWS and shows a sample query.
ALL_MINING_MODEL_VIEWS and shows a sample query. Model views provide details on the models.
Example 22-5 ALL_MINING_MODEL_VIEWS
describe ALL_MINING_MODEL_VIEWS
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER NOT NULL VARCHAR2(128)
MODEL_NAME NOT NULL VARCHAR2(128)
VIEW_NAME NOT NULL VARCHAR2(128)
VIEW_TYPE VARCHAR2(128)
The following query returns the model views for a model SVD_SH_SAMPLE. The model uses the Singular Value Decomposition algorithm for feature extraction.
SELECT view_name, view_type
FROM all_mining_model_views
WHERE model_name = 'SVD_SH_SAMPLE'
ORDER BY view_name;
VIEW_NAME VIEW_TYPE
------------------------- --------------------------------------------------
DM$VESVD_SH_SAMPLE Singular Value Decomposition S Matrix
DM$VGSVD_SH_SAMPLE Global Name-Value Pairs
DM$VNSVD_SH_SAMPLE Normalization and Missing Value Handling
DM$VSSVD_SH_SAMPLE Computed Settings
DM$VUSVD_SH_SAMPLE Singular Value Decomposition U Matrix
DM$VVSVD_SH_SAMPLE Singular Value Decomposition V Matrix
DM$VWSVD_SH_SAMPLE Model Build Alerts Related Topics
22.2.6 ALL_MINING_MODEL_XFORMS
Describes an example of ALL_MINING_MODEL_XFORMS and provides a sample query.
ALL_MINING_MODEL_XFORMS and provides a sample query.
Example 22-6 ALL_MINING_MODEL_XFORMS
describe ALL_MINING_MODEL_XFORMS Name Null? Type ----------------------------------------- -------- ---------------------------- OWNER NOT NULL VARCHAR2(128) MODEL_NAME NOT NULL VARCHAR2(128) ATTRIBUTE_NAME VARCHAR2(128) ATTRIBUTE_SUBNAME VARCHAR2(4000) ATTRIBUTE_SPEC VARCHAR2(4000) EXPRESSION CLOB REVERSE VARCHAR2(3)
The following query returns the embedded transformations for a model PART2_CLAS_SAMPLE.
SELECT attribute_name, expression
FROM all_mining_model_xforms
WHERE model_name = 'PART2_CLAS_SAMPLE'
ORDER BY attribute_name;
ATTRIBUTE_NAME
-------------------------
EXPRESSION
--------------------------------------------------------------------------------
CUST_INCOME_LEVEL
CASE CUST_INCOME_LEVEL WHEN 'A: Below 30,000' THEN 'LOW'
WHEN 'L: 300,000 and above' THEN 'HIGH'
ELSE 'MEDIUM' END Related Topics
22.3 Data Mining PL/SQL Packages
The PL/SQL interface to Oracle Data Mining is implemented in three packages.
Related Topics
22.3.1 DBMS_DATA_MINING
Understand the routines of DBMS_DATA_MINING package.
The DBMS_DATA_MINING package contains routines for creating mining models, for performing operations on mining models, and for querying mining models. The package includes routines for:
-
Creating, dropping, and performing other DDL operations on mining models
-
Obtaining detailed information about model attributes, rules, and other information internal to the model (model details)
-
Computing test metrics for classification models
-
Specifying costs for classification models
-
Exporting and importing models
-
Building models using Oracle's native algorithms as well as algorithms written in R
Related Topics
22.3.2 DBMS_DATA_MINING_TRANSFORM
Understand the routines of DBMS_DATA_MINING_TRANSFORM package.
The DBMS_DATA_MINING_TRANSFORM package contains routines that perform data transformations such as binning, normalization, and outlier treatment. The package includes routines for:
-
Specifying transformations in a format that can be embedded in a mining model.
-
Specifying transformations as relational views (external to mining model objects).
-
Specifying distinct properties for columns in the build data. For example, you can specify that the column must be interpreted as unstructured text, or that the column must be excluded from Automatic Data Preparation.
22.3.2.1 Transformation Methods in DBMS_DATA_MINING_TRANSFORM
Summarizes the methods for transforming data in DBMS_DATA_MINING_TRANSFORM package.
Table 22-3 DBMS_DATA_MINING_TRANSFORM Transformation Methods
| Transformation Method | Description |
|---|---|
|
|
|
|
|
|
|
|
Specifies transformations for embedding in a model |
The statements in the following example create an Support Vector Machine (SVM) Classification model called T_SVM_Clas_sample with an embedded transformation that causes the comments attribute to be treated as unstructured text data.
Example 22-7 Sample Embedded Transformation
DECLARE
xformlist dbms_data_mining_transform.TRANSFORM_LIST;
BEGIN
dbms_data_mining_transform.SET_TRANSFORM(
xformlist, 'comments', null, 'comments', null, 'TEXT');
DBMS_DATA_MINING.CREATE_MODEL(
model_name => 'T_SVM_Clas_sample',
mining_function => dbms_data_mining.classification,
data_table_name => 'mining_build_text',
case_id_column_name => 'cust_id',
target_column_name => 'affinity_card',
settings_table_name => 't_svmc_sample_settings',
xform_list => xformlist);
END;
/ 22.3.3 DBMS_PREDICTIVE_ANALYTICS
Understand the routines of DBMS_PREDICTIVE_ANALYTICS package.
The DBMS_PREDICTIVE_ANALYTICS package contains routines that perform an automated form of data mining known as predictive analytics. With predictive analytics, you do not need to be aware of model building or scoring. All mining activities are handled internally by the procedure. The DBMS_PREDICTIVE_ANALYTICS package includes these routines:
-
EXPLAINranks attributes in order of influence in explaining a target column. -
PREDICTpredicts the value of a target column based on values in the input data. -
PROFILEgenerates rules that describe the cases from the input data.
The EXPLAIN statement in the following example lists attributes in the view mining_data_build_v in order of their importance in predicting affinity_card.
Related Topics
22.4 Data Mining SQL Scoring Functions
Understand the different data mining SQL scoring functions.
The Data Mining SQL language functions use Oracle Data Mining to score data. The functions can apply a mining model schema object to the data, or they can dynamically mine the data by executing an analytic clause. SQL functions are available for all the data mining algorithms that support the scoring operation. All Data Mining SQL functions, as listed in the following table can operate on R Mining Model with the corresponding mining function. However, the functions are not limited to the ones listed here.
Table 22-4 Data Mining SQL Functions
The following example shows a query that returns the results of the CLUSTER_ID function. The query applies the model em_sh_clus_sample, which finds groups of customers that share certain characteristics. The query returns the identifiers of the clusters and the number of customers in each cluster.
Example 22-9 CLUSTER_ID Function
-- -List the clusters into which the customers in this -- -data set have been grouped. -- SELECT CLUSTER_ID(em_sh_clus_sample USING *) AS clus, COUNT(*) AS cnt FROM mining_data_apply_v GROUP BY CLUSTER_ID(em_sh_clus_sample USING *) ORDER BY cnt DESC; SQL> -- List the clusters into which the customers in this SQL> -- data set have been grouped. SQL> -- SQL> SELECT CLUSTER_ID(em_sh_clus_sample USING *) AS clus, COUNT(*) AS cnt 2 FROM mining_data_apply_v 3 GROUP BY CLUSTER_ID(em_sh_clus_sample USING *) 4 ORDER BY cnt DESC; CLUS CNT ---------- ---------- 9 311 3 294 7 215 12 201 17 123 16 114 14 86 19 64 15 56 18 36
Related Topics