4 Create a Model
Explains how to create Oracle Machine Learning for SQL models and to query model details.
4.1 Model Detail Views
Model detail views are algorithm-specific. Querying them gives you additional information about the model you created. The names of model detail views begin with DM$. Some model views, such as the Global Name-Value Pairs view (DM$VGmodel_name), Computed Settings view (DM$VSmodel_name), Model Build Alerts view (DM$VWmodel_name), and Normalization and Missing Value Handling view (DM$VNmodel_name), are shared by all algorithms and are documented separately. In addition, classification, clustering, and regression algorithms share some common views. The columns returned by these views may differ between algorithms.
The following are the model views, grouped by model function:
Association:
Classification, Regression, and Anomaly Detection:
 Model Detail View for Multivariate State Estimation Technique - Sequential Probability Ratio Test
 Model Detail Views for XGBoost
Clustering:
Feature Extraction:
Feature Selection:
Data Preparation and Other:
Time Series:
4.1.1 Model Detail Views for Association Rules
The model detail view DM$VRmodel_name contains the generated rules for association models.
Model Views  Description
DM$VAmodel_name  Association Rules For Transactional Data
DM$VGmodel_name  Global Name-Value Pairs
DM$VImodel_name  Association Rule Itemsets
DM$VRmodel_name  Association Rules
DM$VSmodel_name  Computed Settings
DM$VTmodel_name  Association Rule Itemsets For Transactional Data
DM$VWmodel_name  Model Build Alerts
The Association Rules view (DM$VRmodel_name) has different sets of columns depending on the settings. The settings ODMS_ITEM_ID_COLUMN_NAME and ODMS_ITEM_VALUE_COLUMN_NAME determine how each item is defined. If ODMS_ITEM_ID_COLUMN_NAME is set, the input format is called transactional input; otherwise, it is called 2-Dimensional input. With transactional input, if ODMS_ITEM_VALUE_COLUMN_NAME is not set, each item is defined by ITEM_NAME; otherwise, each item is defined by ITEM_NAME and ITEM_VALUE. With 2-Dimensional input, each item is defined by ITEM_NAME, ITEM_SUBNAME, and ITEM_VALUE. The setting ASSO_AGGREGATES specifies the columns to aggregate, which are displayed in the view.
Note:
The setting ASSO_AGGREGATES is not allowed for 2-Dimensional input.
Transactional Input Without ASSO_AGGREGATES Setting
When you set ITEM_NAME (ODMS_ITEM_ID_COLUMN_NAME) and do not set ITEM_VALUE (ODMS_ITEM_VALUE_COLUMN_NAME), the view contains the following columns. The consequent item is defined with only the name field. If you also set ITEM_VALUE, the view has the additional column CONSEQUENT_VALUE, which specifies the value field.
Name Type
 
PARTITION_NAME VARCHAR2(128)
RULE_ID NUMBER
RULE_SUPPORT NUMBER
RULE_CONFIDENCE NUMBER
RULE_LIFT NUMBER
RULE_REVCONFIDENCE NUMBER
ANTECEDENT_SUPPORT NUMBER
NUMBER_OF_ITEMS NUMBER
CONSEQUENT_SUPPORT NUMBER
CONSEQUENT_NAME VARCHAR2(4000)
ANTECEDENT SYS.XMLTYPE
Table 4-1 Rule View Columns for Transactional Inputs
Column Name  Description
PARTITION_NAME  A partition in a partitioned model to retrieve details.
RULE_ID  The identifier of the rule.
RULE_SUPPORT  The number of transactions that satisfy the rule.
RULE_CONFIDENCE  The likelihood of a transaction satisfying the rule.
RULE_LIFT  The degree of improvement in the prediction over random chance when the rule is satisfied.
RULE_REVCONFIDENCE  The number of transactions in which the rule occurs divided by the number of transactions in which the consequent occurs.
ANTECEDENT_SUPPORT  The ratio of the number of transactions that satisfy the antecedent to the total number of transactions.
NUMBER_OF_ITEMS  The total number of attributes referenced in the antecedent and consequent of the rule.
CONSEQUENT_SUPPORT  The ratio of the number of transactions that satisfy the consequent to the total number of transactions.
CONSEQUENT_NAME  The name of the consequent.
CONSEQUENT_VALUE  The value of the consequent. This column is present when
ANTECEDENT  The antecedent is described as an itemset. At the itemset level, it specifies the number of aggregates, and if not zero, the names of the columns to be aggregated (as well as the mapping to

Transactional Input With ASSO_AGGREGATES Setting

Rule view when ODMS_ITEM_ID_COLUMN_NAME is set and Item_value (ODMS_ITEM_VALUE_COLUMN_NAME) is not set.
Rule view when ODMS_ITEM_ID_COLUMN_NAME is set and Item_value (ODMS_ITEM_VALUE_COLUMN_NAME) is set with TYPE as numerical; the view has a CONSEQUENT_VALUE column.
Rule view when ODMS_ITEM_ID_COLUMN_NAME is set and Item_value (ODMS_ITEM_VALUE_COLUMN_NAME) is set with TYPE as categorical; the view has a CONSEQUENT_VALUE column.
For the example that produces the following rules, see “Example: Calculating Aggregates” in Oracle Machine Learning for SQL Concepts.
The view reports two sets of aggregates results:

ANT_RULE_PROFIT refers to the total profit for the antecedent itemset with respect to the rule. The profit for each individual item of the antecedent itemset is shown in the ANTECEDENT (XMLType) column. CON_RULE_PROFIT refers to the total profit for the consequent item with respect to the rule.
In the example, for rule (A, B) => C, the rule itemset (A, B, C) occurs in the transactions of customer 1 and customer 3. The ANT_RULE_PROFIT is $21.20. The ANTECEDENT is shown as follows, which indicates that item A has profit 5.00 + 3.00 = $8.00 and item B has profit 3.20 + 10.00 = $13.20, which sum to ANT_RULE_PROFIT.
<itemset NUMAGGR="1" ASSO_AGG0="profit"><item><item_name>A</item_name><ASSO_AGG0>8.0E+000</ASSO_AGG0></item><item><item_name>B</item_name><ASSO_AGG0>1.32E+001</ASSO_AGG0></item></itemset>
The CON_RULE_PROFIT is 12.00 + 14.00 = $26.00.

ANT_PROFIT refers to the total profit for the antecedent itemset, while CON_PROFIT refers to the total profit for the consequent item. The difference between CON_PROFIT and CON_RULE_PROFIT (the same applies to ANT_PROFIT and ANT_RULE_PROFIT) is that CON_PROFIT counts all profit for the consequent item across all transactions where the consequent occurs, while CON_RULE_PROFIT only counts across transactions where the rule itemset occurs.
For example, item C occurs in transactions for customers 1, 2, and 3, so CON_PROFIT is 12.00 + 4.20 + 14.00 = $30.20, while CON_RULE_PROFIT only counts transactions for customers 1 and 3, where the rule itemset (A, B, C) occurs.
Similarly, ANT_PROFIT counts all transactions where itemset (A, B) occurs, while ANT_RULE_PROFIT counts only transactions where the rule itemset (A, B, C) occurs. In this example, by coincidence, both count transactions for customers 1 and 3, and have the same value.
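The arithmetic in this example can be checked with a short Python sketch. It parses the ANTECEDENT itemset shown above and sums the per-item aggregates, then adds up the consequent profits quoted in the text. The assignment of 12.00 to customer 1 and 14.00 to customer 3 is an assumption made for illustration; the text only states that these two amounts belong to customers 1 and 3.

```python
import xml.etree.ElementTree as ET

# ANTECEDENT value copied from the example above.
antecedent_xml = (
    '<itemset NUMAGGR="1" ASSO_AGG0="profit">'
    '<item><item_name>A</item_name><ASSO_AGG0>8.0E+000</ASSO_AGG0></item>'
    '<item><item_name>B</item_name><ASSO_AGG0>1.32E+001</ASSO_AGG0></item>'
    '</itemset>'
)

itemset = ET.fromstring(antecedent_xml)
aggregate_name = itemset.get("ASSO_AGG0")  # name of the aggregated column

# Per-item aggregate of the antecedent, keyed by item name.
item_profit = {
    item.findtext("item_name"): float(item.findtext("ASSO_AGG0"))
    for item in itemset.findall("item")
}
ant_rule_profit = sum(item_profit.values())

# Profit of consequent item C per customer (customer mapping is assumed).
c_profit_by_customer = {1: 12.00, 2: 4.20, 3: 14.00}
rule_customers = {1, 3}  # customers whose transactions contain (A, B, C)

con_rule_profit = sum(
    profit for customer, profit in c_profit_by_customer.items()
    if customer in rule_customers
)
con_profit = sum(c_profit_by_customer.values())

print(aggregate_name, round(ant_rule_profit, 2),
      round(con_rule_profit, 2), round(con_profit, 2))
```

The per-item values in the XML add up to ANT_RULE_PROFIT, and CON_PROFIT exceeds CON_RULE_PROFIT by exactly the profit from customer 2, whose transaction does not contain the full rule itemset.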
Example 4-1 Examples
The following example shows the view when the setting ASSO_AGGREGATES specifies that the columns profit and sales are to be aggregated. In this example, the ITEM_VALUE column is not specified.
Name Type
 
PARTITION_NAME VARCHAR2(128)
RULE_ID NUMBER
RULE_SUPPORT NUMBER
RULE_CONFIDENCE NUMBER
RULE_LIFT NUMBER
RULE_REVCONFIDENCE NUMBER
ANTECEDENT_SUPPORT NUMBER
NUMBER_OF_ITEMS NUMBER
CONSEQUENT_SUPPORT NUMBER
CONSEQUENT_NAME VARCHAR2(4000)
ANTECEDENT SYS.XMLTYPE
ANT_RULE_PROFIT BINARY_DOUBLE
CON_RULE_PROFIT BINARY_DOUBLE
ANT_PROFIT BINARY_DOUBLE
CON_PROFIT BINARY_DOUBLE
ANT_RULE_SALES BINARY_DOUBLE
CON_RULE_SALES BINARY_DOUBLE
ANT_SALES BINARY_DOUBLE
CON_SALES BINARY_DOUBLE
The rule view has a CONSEQUENT_VALUE column when ODMS_ITEM_ID_COLUMN_NAME is set and Item_value (ODMS_ITEM_VALUE_COLUMN_NAME) is set with TYPE as numerical or categorical.
2-Dimensional Inputs
In Oracle Machine Learning for SQL, association models can be built using either transactional or two-dimensional data formats. For two-dimensional input, each item is defined by three fields: NAME, VALUE, and SUBNAME. The NAME field is the name of the column. The VALUE field is the content of the column. The SUBNAME field is used when the input data table contains a nested table. In that case, SUBNAME is the name of the nested table's column. See Example: Creating a Nested Column for Market Basket Analysis. In this example, there is a nested column. The CONSEQUENT_SUBNAME is the ATTRIBUTE_NAME part of the nested column, that is, 'O/S Documentation Set - English', and CONSEQUENT_VALUE is the value part of the nested column, which is 1.
The view uses three columns for the consequent. The rule view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
RULE_ID NUMBER
RULE_SUPPORT NUMBER
RULE_CONFIDENCE NUMBER
RULE_LIFT NUMBER
RULE_REVCONFIDENCE NUMBER
ANTECEDENT_SUPPORT NUMBER
NUMBER_OF_ITEMS NUMBER
CONSEQUENT_SUPPORT NUMBER
CONSEQUENT_NAME VARCHAR2(4000)
CONSEQUENT_SUBNAME VARCHAR2(4000)
CONSEQUENT_VALUE VARCHAR2(4000)
ANTECEDENT SYS.XMLTYPE
Note:
All three consequent columns are of type VARCHAR2. ASSO_AGGREGATES is not applicable for the 2-Dimensional input format.
The following table displays rule view columns for 2-Dimensional input with the descriptions of only the fields that are specific to 2-Dimensional inputs.
Table 4-2 Rule View for 2-Dimensional Input
Column Name  Description
CONSEQUENT_SUBNAME  For two-dimensional inputs,
CONSEQUENT_VALUE  The value of the consequent when setting
ANTECEDENT  The antecedent is described as an itemset. The itemset contains As an example, assuming that this is not a nested table input, and the antecedent contains one item: (name For 2-Dimensional input with nested table, the subname field is filled.
Global Name-Value Pairs View for Association Rules
The Global Name-Value Pairs view produces a single column for an association model. The following table describes the columns returned for an association model.
Table 4-3 Global Name-Value Pairs View for an Association Model
Name  Description 


The number of itemsets generated. 

The maximum support. 

The total number of rows used in the build. 

The number of association rules in the model generated. 

The number of the transactions in the input data. 
4.1.2 Model Detail View for Frequent Itemsets
The model detail view DM$VImodel_name contains information about frequent itemsets.
The Association Rule Itemsets view (DM$VImodel_name) has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ITEMSET_ID NUMBER
SUPPORT NUMBER
NUMBER_OF_ITEMS NUMBER
ITEMSET SYS.XMLTYPE
Table 4-4 Association Rule Itemsets View
Column Name  Description
PARTITION_NAME  A partition in a partitioned model
ITEMSET_ID  Itemset identifier
SUPPORT  Support of the itemset
NUMBER_OF_ITEMS  Number of items in the itemset
ITEMSET  Frequent itemset. The structure of the
4.1.3 Model Detail Views for Transactional Itemsets
The model detail view DM$VTmodel_name contains information about the transactional itemsets.
For the very common case of transactional data without aggregates, the Association Rule Itemsets For Transactional Data view (DM$VTmodel_name) provides the itemset information in transactional format. This view can help improve performance for some queries as compared to the view with the XML column. The transactional itemsets view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ITEMSET_ID NUMBER
ITEM_ID NUMBER
SUPPORT NUMBER
NUMBER_OF_ITEMS NUMBER
ITEM_NAME VARCHAR2(4000)
Table 4-5 Association Rule Itemsets For Transactional Data View
Column Name  Description
PARTITION_NAME  A partition in a partitioned model
ITEMSET_ID  Itemset identifier
ITEM_ID  Item identifier
SUPPORT  Support of the itemset
NUMBER_OF_ITEMS  Number of items in the itemset
ITEM_NAME  The name of the item
4.1.4 Model Detail View for Transactional Rule
The model detail view DM$VAmodel_name contains information about transactional rules and transactional itemsets.
Transactional data without aggregates also has an Association Rules For Transactional Data view (DM$VAmodel_name). This view can improve performance for some queries as compared to the view with the XML column. The transactional rule view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
RULE_ID NUMBER
ANTECEDENT_PREDICATE VARCHAR2(4000)
CONSEQUENT_PREDICATE VARCHAR2(4000)
RULE_SUPPORT NUMBER
RULE_CONFIDENCE NUMBER
RULE_LIFT NUMBER
RULE_REVCONFIDENCE NUMBER
RULE_ITEMSET_ID NUMBER
ANTECEDENT_SUPPORT NUMBER
CONSEQUENT_SUPPORT NUMBER
NUMBER_OF_ITEMS NUMBER
Table 4-6 Association Rules For Transactional Data View
Column Name  Description
PARTITION_NAME  A partition in a partitioned model
RULE_ID  Rule identifier
ANTECEDENT_PREDICATE  Name of the antecedent item
CONSEQUENT_PREDICATE  Name of the consequent item
RULE_SUPPORT  Support of the rule
RULE_CONFIDENCE  The likelihood that a transaction satisfies the rule when it contains the antecedent
RULE_LIFT  The degree of improvement in the prediction over random chance when the rule is satisfied
RULE_REVCONFIDENCE  The number of transactions in which the rule occurs divided by the number of transactions in which the consequent occurs
RULE_ITEMSET_ID  Itemset identifier
ANTECEDENT_SUPPORT  The ratio of the number of transactions that satisfy the antecedent to the total number of transactions
CONSEQUENT_SUPPORT  The ratio of the number of transactions that satisfy the consequent to the total number of transactions
NUMBER_OF_ITEMS  Number of items in the rule
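To make the rule statistics concrete, the following Python sketch computes them for one rule over a small set of baskets. The baskets are hypothetical, and the sketch assumes the usual textbook definitions, with support expressed as a ratio of transactions and lift computed as confidence divided by consequent support. It is an illustration of the definitions in the table, not a description of how the database computes the view.

```python
from fractions import Fraction

# Hypothetical transactions; each set is one basket.
baskets = [
    {"A", "B", "C"},
    {"C"},
    {"A", "B", "C"},
    {"A", "B"},
]

def support(itemset):
    """Fraction of baskets that contain every item of the itemset."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return Fraction(hits, len(baskets))

antecedent = {"A", "B"}
consequent = {"C"}

rule_support = support(antecedent | consequent)
antecedent_support = support(antecedent)
consequent_support = support(consequent)

# Confidence: rule occurrences relative to antecedent occurrences.
rule_confidence = rule_support / antecedent_support
# Lift: improvement of the rule over predicting the consequent at random.
rule_lift = rule_confidence / consequent_support
# Reverse confidence: rule occurrences relative to consequent occurrences.
rule_revconfidence = rule_support / consequent_support

print(rule_support, rule_confidence, rule_lift, rule_revconfidence)
```

Exact fractions are used so that the ratios can be compared without rounding concerns.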
4.1.5 Model Detail Views for Classification Algorithms
Model detail views for classification algorithms are the target map view and scoring cost view, which are applicable to all classification algorithms.
Model Views  Description
DM$VAmodel_name  Variable Importance
DM$VCmodel_name  Scoring Cost Matrix
DM$VGmodel_name  Global Name-Value Pairs
DM$VSmodel_name  Computed Settings
DM$VTmodel_name  Classification Targets
DM$VWmodel_name  Model Build Alerts
The Classification Targets view (DM$VTmodel_name) describes the target distribution for classification models. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
TARGET_VALUE NUMBER/VARCHAR2
TARGET_COUNT NUMBER
TARGET_WEIGHT NUMBER
Table 4-7 Classification Targets View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
TARGET_VALUE  Target value, numerical or categorical
TARGET_COUNT  Number of rows for a given
TARGET_WEIGHT  Weight for a given
The Scoring Cost Matrix view (DM$VCmodel_name) describes the scoring cost matrix for classification models. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ACTUAL_TARGET_VALUE NUMBER/VARCHAR2
PREDICTED_TARGET_VALUE NUMBER/VARCHAR2
COST NUMBER
Table 4-8 Scoring Cost Matrix View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
ACTUAL_TARGET_VALUE  A valid target value
PREDICTED_TARGET_VALUE  Predicted target value
COST  Associated cost for the actual and predicted target value pair
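As an illustration of how rows shaped like this view are typically used, the following Python sketch picks the prediction with the lowest expected cost. The cost rows, class probabilities, and the expected-cost rule are assumptions chosen for the example; the source does not describe the scoring procedure itself.

```python
# Hypothetical rows: (ACTUAL_TARGET_VALUE, PREDICTED_TARGET_VALUE, COST).
cost_rows = [
    ("yes", "yes", 0.0),
    ("yes", "no", 5.0),   # missing a true "yes" is expensive
    ("no", "yes", 1.0),
    ("no", "no", 0.0),
]

# Hypothetical class probabilities for one scored row.
probabilities = {"yes": 0.3, "no": 0.7}

def expected_cost(predicted):
    """Sum of P(actual) * cost(actual, predicted) over all actual values."""
    return sum(
        probabilities[actual] * cost
        for actual, pred, cost in cost_rows
        if pred == predicted
    )

# Choose the prediction that minimizes expected cost.
best_prediction = min(probabilities, key=expected_cost)
print(best_prediction, expected_cost("yes"), expected_cost("no"))
```

With these numbers the cheaper prediction is "yes" even though "no" is more probable, which is the kind of trade-off a cost matrix is meant to express.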
4.1.6 Model Detail Views for CUR Matrix Decomposition
Model detail views for CUR Matrix Decomposition contain information about the scores and ranks of attributes and rows.
CUR Matrix Decomposition models have the following views:
Attribute importance and rank: DM$VCmodel_name
Row importance and rank: DM$VRmodel_name
Global statistics: DM$VG
The attribute importance and rank view DM$VCmodel_name has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
ATTRIBUTE_IMPORTANCE NUMBER
ATTRIBUTE_RANK NUMBER
Table 4-9 Attribute Importance and Rank View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
ATTRIBUTE_NAME  Attribute name
ATTRIBUTE_SUBNAME  Attribute subname. The value is null for non-nested columns.
ATTRIBUTE_VALUE  Value of the attribute
ATTRIBUTE_IMPORTANCE  Attribute leverage score
ATTRIBUTE_RANK  Attribute rank based on leverage score
The view DM$VRmodel_name exposes the leverage scores and ranks of all selected rows. This view is created when users choose to compute row importance and the CASE_ID column is present. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CASE_ID Original cid data types,
including NUMBER, VARCHAR2,
DATE, TIMESTAMP,
TIMESTAMP WITH TIME ZONE,
TIMESTAMP WITH LOCAL TIME ZONE
ROW_IMPORTANCE NUMBER
ROW_RANK NUMBER
Table 4-10 Row Importance and Rank View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
CASE_ID  Case ID. The supported case ID types are the same as those supported for the GLM, SVD, and ESA algorithms.
ROW_IMPORTANCE  Row leverage score
ROW_RANK  Row rank based on leverage score
Table 4-11 CUR Matrix Decomposition Statistics Information in Model Global View
Name  Description 


Number of SVD components (SVD rank) 

Number of rows used in the model build 
4.1.7 Model Detail Views for Decision Tree
The model detail views specific to Decision Tree are the hierarchy view, node statistics view, node description view, and the cost matrix view.
The Decision Tree Hierarchy view (DM$VPmodel_name) describes the decision tree hierarchy and the split information for each level in the decision tree. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
PARENT NUMBER
SPLIT_TYPE VARCHAR2
NODE NUMBER
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
OPERATOR VARCHAR2
VALUE SYS.XMLTYPE
Table 4-12 Decision Tree Hierarchy View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
PARENT  Node ID of the parent
SPLIT_TYPE  The main or surrogate split
NODE  The node ID
ATTRIBUTE_NAME  The attribute used as the splitting criterion at the parent node to produce this node.
ATTRIBUTE_SUBNAME  Split attribute subname. The value is null for non-nested columns.
OPERATOR  Split operator
VALUE  Value used as the splitting criterion. This is an XML element described using the For example,
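Rows from a hierarchy view of this shape can be walked from any node back to the root to recover the conditions that lead to it. The Python sketch below shows one way to do that. The rows are hypothetical, VALUE is simplified to a plain string rather than an XML element, and the sketch assumes that the root node (ID 0 here) has no row of its own.

```python
# Hypothetical rows: (PARENT, NODE, ATTRIBUTE_NAME, OPERATOR, VALUE).
rows = [
    (0, 1, "AGE", "<=", "30"),
    (0, 2, "AGE", ">", "30"),
    (1, 3, "INCOME", "<=", "50000"),
    (1, 4, "INCOME", ">", "50000"),
]

# Map each node to its parent and to the split condition that produced it.
parent_of = {
    node: (parent, f"{attribute} {operator} {value}")
    for parent, node, attribute, operator, value in rows
}

def path_to(node):
    """Return the split conditions from the root down to the given node."""
    conditions = []
    while node in parent_of:
        parent, condition = parent_of[node]
        conditions.append(condition)
        node = parent
    return list(reversed(conditions))

print(path_to(4))
```

Only main splits are modeled here; surrogate splits, which the SPLIT_TYPE column distinguishes, would need to be filtered out first.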
The Decision Tree Statistics view (DM$VImodel_name) describes the statistics associated with individual tree nodes. The statistics include a target histogram for the data in the node. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
NODE NUMBER
NODE_SUPPORT NUMBER
PREDICTED_TARGET_VALUE NUMBER/VARCHAR2
TARGET_VALUE NUMBER/VARCHAR2
TARGET_SUPPORT NUMBER
Table 4-13 Decision Tree Statistics View
Parameter  Description
PARTITION_NAME  Partition name in a partitioned model
NODE  The node ID
NODE_SUPPORT  Number of records in the training set that belong to the node
PREDICTED_TARGET_VALUE  Predicted target value
TARGET_VALUE  A target value seen in the training data
TARGET_SUPPORT  The number of records that belong to the node and have the value specified in the
The Decision Tree Nodes view (DM$VOmodel_name) provides a higher-level description of each node. The DM$VOmodel_name view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
NODE NUMBER
NODE_SUPPORT NUMBER
PREDICTED_TARGET_VALUE NUMBER/VARCHAR2
PARENT NUMBER
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
OPERATOR VARCHAR2
VALUE SYS.XMLTYPE
Table 4-14 Decision Tree Nodes View
Parameter  Description
PARTITION_NAME  Partition name in a partitioned model
NODE  The node ID
NODE_SUPPORT  Number of records in the training set that belong to the node
PREDICTED_TARGET_VALUE  Predicted target value
PARENT  The ID of the parent
ATTRIBUTE_NAME  Specifies the attribute name
ATTRIBUTE_SUBNAME  Specifies the attribute subname
OPERATOR  Attribute predicate operator: a conditional operator taking one of the following values: IN, =, <>, <, >, <=, and >=
VALUE  Value used as the description criterion. This is an XML element described using the For example,
The Decision Tree Build Cost Matrix view (DM$VMmodel_name) describes the cost matrix used by the Decision Tree build. The DM$VMmodel_name view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ACTUAL_TARGET_VALUE NUMBER/VARCHAR2
PREDICTED_TARGET_VALUE NUMBER/VARCHAR2
COST NUMBER
Table 4-15 Decision Tree Build Cost Matrix View
Parameter  Description
PARTITION_NAME  Partition name in a partitioned model
ACTUAL_TARGET_VALUE  Valid target value
PREDICTED_TARGET_VALUE  Predicted target value
COST  Associated cost for the actual and predicted target value pair
4.1.8 Model Detail Views for Generalized Linear Model
Model detail views specific to Generalized Linear Model (GLM), such as the details and row diagnostics views for linear and logistic regression models, are described here.
The GLM Regression Attribute Diagnostics view (DM$VDmodel_name) describes the final model information for both linear regression models and logistic regression models.
For linear regression, the view DM$VDmodel_name has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
FEATURE_EXPRESSION VARCHAR2(4000)
COEFFICIENT BINARY_DOUBLE
STD_ERROR BINARY_DOUBLE
TEST_STATISTIC BINARY_DOUBLE
P_VALUE BINARY_DOUBLE
VIF BINARY_DOUBLE
STD_COEFFICIENT BINARY_DOUBLE
LOWER_COEFF_LIMIT BINARY_DOUBLE
UPPER_COEFF_LIMIT BINARY_DOUBLE
For logistic regression, the view DM$VDmodel_name has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
TARGET_VALUE NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
FEATURE_EXPRESSION VARCHAR2(4000)
COEFFICIENT BINARY_DOUBLE
STD_ERROR BINARY_DOUBLE
TEST_STATISTIC BINARY_DOUBLE
P_VALUE BINARY_DOUBLE
STD_COEFFICIENT BINARY_DOUBLE
LOWER_COEFF_LIMIT BINARY_DOUBLE
UPPER_COEFF_LIMIT BINARY_DOUBLE
EXP_COEFFICIENT BINARY_DOUBLE
EXP_LOWER_COEFF_LIMIT BINARY_DOUBLE
EXP_UPPER_COEFF_LIMIT BINARY_DOUBLE
Table 4-17 Model View for Linear and Logistic Regression Models
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
TARGET_VALUE  Valid target value
ATTRIBUTE_NAME  The attribute name when there is no subname, or the first part of the attribute name when there is a subname.
ATTRIBUTE_SUBNAME  Nested column subname. The value is null for non-nested columns. When the nested column is numeric, the machine learning attribute is identified by the combination
ATTRIBUTE_VALUE  A unique value that can be assumed by a categorical column or nested categorical column. For categorical columns, a machine learning attribute is identified by a unique
FEATURE_EXPRESSION  The feature name constructed by the algorithm when feature selection is enabled. If feature selection is not enabled, the feature name is the fully-qualified attribute name (attribute_name.attribute_subname if the attribute is in a nested column). For categorical attributes, the algorithm constructs a feature name that has the following form: fully-qualified_attribute_name.attribute_value. When feature generation is enabled, a term in the model can be a single machine learning attribute or the product of up to 3 machine learning attributes. Component machine learning attributes can be repeated within a single term. If feature generation is not enabled or, if feature generation is enabled, but no multiple component terms are discovered by the Note: In 12c Release 2, the algorithm does not subtract the mean from numerical components.
COEFFICIENT  The estimated coefficient.
STD_ERROR  Standard error of the coefficient estimate.
TEST_STATISTIC  For linear regression, the t-value of the coefficient estimate. For logistic regression, the Wald chi-square value of the coefficient estimate.
P_VALUE  Probability of the
VIF  Variance Inflation Factor. The value is zero for the intercept. For logistic regression,
STD_COEFFICIENT  Standardized estimate of the coefficient.
LOWER_COEFF_LIMIT  Lower confidence bound of the coefficient.
UPPER_COEFF_LIMIT  Upper confidence bound of the coefficient.
EXP_COEFFICIENT  Exponentiated coefficient for logistic regression. For linear regression,
EXP_LOWER_COEFF_LIMIT  Exponentiated coefficient for lower confidence bound of the coefficient for logistic regression. For linear regression,
EXP_UPPER_COEFF_LIMIT  Exponentiated coefficient for upper confidence bound of the coefficient for logistic regression. For linear regression,
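The relationships among several of these columns can be sketched in Python for a single logistic regression coefficient. The coefficient, standard error, and 95% confidence level are hypothetical values chosen for the example, and the normal-quantile interval is a common textbook approximation rather than a statement of how the view is populated.

```python
import math
from statistics import NormalDist

# Hypothetical COEFFICIENT and STD_ERROR values for one attribute.
coefficient = 0.8
std_error = 0.2

# Wald chi-square: the squared ratio of the estimate to its standard error.
wald_chi_square = (coefficient / std_error) ** 2

# Assumed 95% confidence level; the source does not state the level used.
confidence_level = 0.95
quantile = NormalDist().inv_cdf(0.5 + confidence_level / 2)

lower_coeff_limit = coefficient - quantile * std_error
upper_coeff_limit = coefficient + quantile * std_error

# Exponentiated values, interpretable as odds ratios in logistic regression.
exp_coefficient = math.exp(coefficient)
exp_lower_coeff_limit = math.exp(lower_coeff_limit)
exp_upper_coeff_limit = math.exp(upper_coeff_limit)

print(round(wald_chi_square, 3), round(exp_coefficient, 3))
```

Because exponentiation is monotonic, the exponentiated limits bracket the exponentiated coefficient in the same way the raw limits bracket the raw coefficient.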
The GLM Regression Row Diagnostics view (DM$VAmodel_name) describes row-level information for both linear regression models and logistic regression models. For linear regression, the view DM$VAmodel_name has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CASE_ID NUMBER/VARCHAR2, DATE, TIMESTAMP,
TIMESTAMP WITH TIME ZONE,
TIMESTAMP WITH LOCAL TIME ZONE
TARGET_VALUE BINARY_DOUBLE
PREDICTED_TARGET_VALUE BINARY_DOUBLE
Hat BINARY_DOUBLE
RESIDUAL BINARY_DOUBLE
STD_ERR_RESIDUAL BINARY_DOUBLE
STUDENTIZED_RESIDUAL BINARY_DOUBLE
PRED_RES BINARY_DOUBLE
COOKS_D BINARY_DOUBLE
Table 4-18 GLM Regression Row Diagnostics View for Linear Regression
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
CASE_ID  Name of the case identifier
TARGET_VALUE  The actual target value as taken from the input row
PREDICTED_TARGET_VALUE  The model predicted target value for the row
Hat  The diagonal element of the n*n (n = number of rows) Hat matrix that corresponds to a specific input row. The model predictions for the input data are the product of the Hat matrix and the vector of input target values. The diagonal elements (Hat values) represent the influence of the i-th row on the i-th fitted value. Large Hat values indicate that the i-th row is a point of high leverage, a potential outlier.
RESIDUAL  The difference between the predicted and actual target value for a specific input row.
STD_ERR_RESIDUAL  The standard error residual, sometimes called the Studentized residual, rescales the residual to have constant variance across all input rows in an effort to make the input row residuals comparable. The process multiplies the residual by the square root of the row weight divided by the product of the model mean square error and 1 minus the Hat value.
STUDENTIZED_RESIDUAL  The Studentized deletion residual adjusts the standard error residual for the influence of the current row.
PRED_RES  The predictive residual is the weighted square of the deletion residuals, computed as the row weight multiplied by the square of the residual divided by 1 minus the Hat value.
COOKS_D  Cook's distance is a measure of the combined impact of the i-th case on all of the estimated regression coefficients.
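The row diagnostics described above can be reproduced for a small simple linear regression in plain Python. The data are hypothetical, all row weights are assumed to be 1, the residual sign (actual minus predicted) is an assumption, and Cook's distance uses the standard textbook formula, which the source does not spell out.

```python
import math

# Hypothetical data for a one-predictor regression with an intercept.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1]
n, num_params = len(x), 2  # parameters: intercept and slope

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar

predicted = [intercept + slope * xi for xi in x]
residual = [yi - pi for yi, pi in zip(y, predicted)]

# Diagonal of the Hat matrix for simple linear regression.
hat = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

mse = sum(r * r for r in residual) / (n - num_params)
weight = 1.0  # assumed equal row weights

# Residual times sqrt(weight / (MSE * (1 - Hat))).
std_err_residual = [
    r * math.sqrt(weight / (mse * (1 - h))) for r, h in zip(residual, hat)
]
# Weight times the squared deletion residual r / (1 - Hat).
pred_res = [weight * (r / (1 - h)) ** 2 for r, h in zip(residual, hat)]
# Textbook Cook's distance.
cooks_d = [
    (r * r / (num_params * mse)) * h / (1 - h) ** 2
    for r, h in zip(residual, hat)
]

for row in zip(x, hat, std_err_residual, cooks_d):
    print([round(v, 4) for v in row])
```

A useful sanity check is that the Hat values sum to the number of model parameters, so rows at the extremes of x carry most of the leverage.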
For logistic regression, the view DM$VAmodel_name has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CASE_ID NUMBER/VARCHAR2, DATE, TIMESTAMP,
TIMESTAMP WITH TIME ZONE,
TIMESTAMP WITH LOCAL TIME ZONE
TARGET_VALUE NUMBER/VARCHAR2
TARGET_VALUE_PROB BINARY_DOUBLE
Hat BINARY_DOUBLE
WORKING_RESIDUAL BINARY_DOUBLE
PEARSON_RESIDUAL BINARY_DOUBLE
DEVIANCE_RESIDUAL BINARY_DOUBLE
C BINARY_DOUBLE
CBAR BINARY_DOUBLE
DIFDEV BINARY_DOUBLE
DIFCHISQ BINARY_DOUBLE
Table 4-19 GLM Regression Row Diagnostics View for Logistic Regression
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
CASE_ID  Name of the case identifier
TARGET_VALUE  The actual target value as taken from the input row
TARGET_VALUE_PROB  Model estimate of the probability of the predicted target value.
Hat  The Hat value concept from linear regression is extended to logistic regression by multiplying the linear regression Hat value by the variance function for logistic regression, the predicted probability multiplied by 1 minus the predicted probability.
WORKING_RESIDUAL  The working residual is the residual of the working response. The working response is the response on the linearized scale. For logistic regression it has the form: the i-th row residual divided by the variance of the i-th row prediction. The variance of the prediction is the predicted probability multiplied by 1 minus the predicted probability.
PEARSON_RESIDUAL  The Pearson residual is a rescaled version of the working residual, accounting for the weight. For logistic regression, the Pearson residual multiplies the residual by a factor that is computed as the square root of the weight divided by the variance of the predicted probability for the i-th row.
DEVIANCE_RESIDUAL  The
C  Measures the overall change in the fitted logits due to the deletion of the i-th observation for all points including the one deleted (the i-th point). It is computed as the square of the Pearson residual multiplied by the Hat value divided by the square of 1 minus the Hat value. It is a confidence interval displacement diagnostic that provides a scalar measure of the influence of individual observations.
CBAR  C and CBAR are extensions of Cook's distance for logistic regression. CBAR measures the overall change in the fitted logits due to the deletion of the i-th observation for all points excluding the one deleted (the i-th point). It is computed as the square of the Pearson residual multiplied by the Hat value divided by (1 minus the Hat value). It is a confidence interval displacement diagnostic that measures the influence of deleting an individual observation.
DIFDEV  A statistic that measures the change in deviance that occurs when an observation is deleted from the input. It is computed as the square of the deviance residual plus
DIFCHISQ  A statistic that measures the change in the Pearson chi-square statistic that occurs when an observation is deleted from the input. It is computed as
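The formulas stated in the table for the working residual, Pearson residual, C, and CBAR can be traced for a single row with a few lines of Python. The target, predicted probability, Hat value, and weight below are hypothetical, and the sketch covers only the quantities whose formulas are fully given above.

```python
import math

# Hypothetical values for one input row.
actual = 1.0          # observed target coded as 0 or 1
probability = 0.8     # predicted probability of the target
hat_value = 0.1       # Hat value for the row
weight = 1.0          # assumed row weight

# Variance of the prediction: p * (1 - p).
variance = probability * (1 - probability)
row_residual = actual - probability

# Residual divided by the variance of the prediction.
working_residual = row_residual / variance
# Residual multiplied by sqrt(weight / variance).
pearson_residual = row_residual * math.sqrt(weight / variance)

# Confidence interval displacement diagnostics.
c_stat = pearson_residual ** 2 * hat_value / (1 - hat_value) ** 2
cbar_stat = pearson_residual ** 2 * hat_value / (1 - hat_value)

print(round(working_residual, 4), round(pearson_residual, 4),
      round(c_stat, 4), round(cbar_stat, 4))
```

The two displacement statistics differ only by one factor of 1 minus the Hat value, so C is always at least as large as CBAR for a given row.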
Global Details for GLM: Linear Regression
The following table describes the Global Name-Value Pairs (DM$VG) for a linear regression model.
Table 4-20 Global Details for Linear Regression
Name  Description 


Adjusted R-Square 

Akaike's information criterion 

Coefficient of variation 

Indicates whether the model build process has converged to specified tolerance. The following are the possible values:


Corrected total degrees of freedom 

Corrected total sum of squares 

Dependent mean 

Error degrees of freedom 

Error mean square 

Error sum of squares 

Model F value statistic 

Estimated mean square error of the prediction, assuming multivariate normality 

Hocking Sp statistic 

Tracks the number of SGD iterations. Applicable only when the solver is SGD. 

JP statistic (the final prediction error) 

Model degrees of freedom 

Model F value probability 

Model mean square error 

Model sum of square errors 

Number of parameters (the number of coefficients, including the intercept) 

Number of rows 

R-Square 

The number of predictors excluded from the model due to multicollinearity 

Root mean square error 

Schwarz's Bayesian information criterion 
Global Details for GLM: Logistic Regression
The following table describes the Global Name-Value Pairs (DM$VG) for a logistic regression model.
Table 4-21 Global Details for Logistic Regression
Name  Description 


Akaike's criterion for the fit of the baseline, intercept-only, model 

Akaike's criterion for the fit of the intercept and the covariates (predictors) model 

Indicates whether the model build process has converged to specified tolerance. The following are the possible values:


Dependent mean 

Tracks the number of SGD iterations (number of IRLS iterations). Applicable only when the solver is SGD. 

Likelihood ratio degrees of freedom 

Likelihood ratio chi-square value 

Likelihood ratio chi-square probability value 

-2 log likelihood of the baseline, intercept-only, model 

-2 log likelihood of the model 

Number of parameters (the number of coefficients, including the intercept) 

Number of rows 

Percent of correct predictions 

Percent of incorrectly predicted rows 

Percent of cases where the estimated probabilities are equal for both target classes 

Pseudo R-square Cox and Snell 

Pseudo R-square Nagelkerke 

The number of predictors excluded from the model due to multicollinearity 

Schwarz's Criterion for the fit of the baseline, intercept-only, model 

Schwarz's Criterion for the fit of the intercept and the covariates (predictors) model 
Note:

When ridge regression is enabled, fewer global details are returned. For information about ridge, see Oracle Machine Learning for SQL Concepts.

When the value is NULL for a partitioned model, an exception is thrown. When the value is not null, it must contain the desired partition name.
4.1.9 Model Detail View for Multivariate State Estimation Technique - Sequential Probability Ratio Test
The model detail view specific to Multivariate State Estimation Technique - Sequential Probability Ratio Test (MSET-SPRT) contains information about Global Name-Value Pairs.
The following table lists the Global Name-Value Pairs (DM$VGmodel_name) for an MSET-SPRT model. This statistic is included when, due to memory constraints, MSET-SPRT cannot use the MSET_MEMORY_VECTORS value set by the user.
Table 4-22 MSET-SPRT Information in the Model Global View
Name  Description
NUM_MVEC  The number of memory vectors used by the model.
4.1.10 Model Detail Views for Naive Bayes
The model detail views specific to Naive Bayes are the prior view and result view.
The Naive Bayes Target Priors view (DM$VPmodel_name) describes the priors of the targets for a Naive Bayes model. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
TARGET_NAME VARCHAR2(128)
TARGET_VALUE NUMBER/VARCHAR2
PRIOR_PROBABILITY BINARY_DOUBLE
COUNT NUMBER
Table 4-23 Naive Bayes Target Priors View for Naive Bayes
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
TARGET_NAME  Name of the target column
TARGET_VALUE  Target value, numerical or categorical
PRIOR_PROBABILITY  Prior probability for a given
COUNT  Number of rows for a given
The Naive Bayes Conditional Probabilities view (DM$VVmodel_name) describes the conditional probabilities of the Naive Bayes model. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
TARGET_NAME VARCHAR2(128)
TARGET_VALUE NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
CONDITIONAL_PROBABILITY BINARY_DOUBLE
COUNT NUMBER
Table 424 Naive Bayes Conditional Probabilities View for Naive Bayes
Column Name  Description 


The name of a feature in the model 

Name of the target column 

Target value, numerical or categorical 

Column name 

Nested column subname. The value is null for nonnested columns. 

Machine learning attribute value for the column 

Conditional probability of a machine learning attribute for a given target 

Number of rows for a given machine learning attribute and a given target 
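As an illustration, the two Naive Bayes views can be queried like any other view. The following is a minimal sketch, not a definitive recipe: NB_SH_CLAS_SAMPLE is a hypothetical model name, and the view names are formed by appending it to the DM$VP and DM$VV prefixes.

```sql
-- NB_SH_CLAS_SAMPLE is a hypothetical model name used only for illustration.
-- Prior probability of each target value.
SELECT target_name,
       target_value,
       prior_probability
FROM   DM$VPNB_SH_CLAS_SAMPLE
ORDER  BY prior_probability DESC;

-- The 20 largest conditional probabilities across all targets.
SELECT target_value,
       attribute_name,
       attribute_subname,
       attribute_value,
       conditional_probability
FROM   DM$VVNB_SH_CLAS_SAMPLE
ORDER  BY conditional_probability DESC NULLS LAST
FETCH  FIRST 20 ROWS ONLY;
```

The FETCH FIRST clause assumes Oracle Database 12c or later.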
4.1.11 Model Detail Views for Neural Network
Model detail views specific to Neural Network contain information about the weights of the neurons: input layer and hidden layers.
The Neural Network Weights view (DM$VAmodel_name) has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
LAYER NUMBER
IDX_FROM NUMBER
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
IDX_TO NUMBER
TARGET_VALUE NUMBER/VARCHAR2
WEIGHT BINARY_DOUBLE
Table 426 Neural Network Weights View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
LAYER 
Layer ID; 0 is the input layer 
IDX_FROM 
Node index that the weight connects from (attribute ID for the input layer) 
ATTRIBUTE_NAME 
Attribute name (only for the input layer) 
ATTRIBUTE_SUBNAME 
Attribute subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Categorical attribute value 
IDX_TO 
Node index that the weight connects to 
TARGET_VALUE 
Target value. The value is null for regression. 
WEIGHT 
Value of the weight 
The Global Name-Value Pairs view (DM$VGmodel_name) is a preexisting view. The following name-value pairs are specific to a Neural Network model.
Table 427 Global Name-Value Pairs View for Neural Network
Name  Description 


Indicates whether the model build process has converged to specified tolerance. The following are the possible values:


Number of iterations 

Loss function value (if it is with 

Number of rows in the model (or partitioned model) 
4.1.12 Model Detail Views for Random Forest
Model detail views specific to Random Forest contain variable importance measures and statistics.
Model detail views and statistics specific to Random Forest are:

Variable Importance statistics in the DM$VAmodel_name view 
Random Forest statistics in the Global Name-Value Pairs (DM$VGmodel_name) view
One of the important outputs from a Random Forest model build is a ranking of attributes based on their relative importance. This is measured using Mean Decrease Gini. The DM$VAmodel_name view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(128)
ATTRIBUTE_IMPORTANCE BINARY_DOUBLE
Table 428 Variable Importance Model View
Column Name  Description 

PARTITION_NAME 
Partition name. The value is null for models which are not partitioned. 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_IMPORTANCE 
Measure of importance for an attribute in the forest (Mean Decrease Gini value) 
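The attribute ranking described above can be retrieved by sorting the view on ATTRIBUTE_IMPORTANCE. This is a sketch only; RF_CLAS_SAMPLE is a hypothetical model name.

```sql
-- RF_CLAS_SAMPLE is a hypothetical model name used only for illustration.
-- Ten most important attributes by Mean Decrease Gini.
SELECT attribute_name,
       attribute_subname,
       attribute_importance
FROM   DM$VARF_CLAS_SAMPLE
ORDER  BY attribute_importance DESC NULLS LAST
FETCH  FIRST 10 ROWS ONLY;
```

For a partitioned model, adding PARTITION_NAME to the select list and the ORDER BY clause keeps the rankings of different partitions apart.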
The Global Name-Value Pairs (DM$VGmodel_name) view is a preexisting view. The following name-value pairs are added to the view.
Table 429 Random Forest Statistics Information In Model Global View
Name  Description 


Average depth of the trees in the forest 

Average number of nodes per tree 

Maximum depth of the trees in the forest 

Maximum number of nodes per tree 

Minimum depth of the trees in the forest 

Minimum number of nodes per tree 

The total number of rows used in the build 
4.1.13 Model Detail View for Support Vector Machine
Model detail views specific to Support Vector Machine (SVM) contain linear coefficients and support vector statistics.
Model Views  Description 

DM$VCS model_name 
Scoring Cost Matrix 
DM$VG model_name 
Global Name-Value Pairs 
DM$VN model_name 
Normalization and Missing Value Handling 
DM$VS model_name 
Computed Settings 
DM$VT model_name 
Classification Targets 
DM$VW model_name 
Model Build Alerts 
The linear coefficient view DM$VLmodel_name describes the coefficients of a linear SVM algorithm. The target_value field in the view is present only for classification and has the type of the target. Regression models do not have a target_value field.
The reversed_coefficient field shows the value of the coefficient after reversing the automatic data preparation transformations. If data preparation is disabled, then coefficient and reversed_coefficient have the same value. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
TARGET_VALUE NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
COEFFICIENT BINARY_DOUBLE
REVERSED_COEFFICIENT BINARY_DOUBLE
Table 430 Linear Coefficient View for Support Vector Machine
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
TARGET_VALUE 
Target value, numerical or categorical 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Value of a categorical attribute 
COEFFICIENT 
Projection coefficient value 
REVERSED_COEFFICIENT 
Coefficient transformed on the original scale 
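A query along the following lines lists the coefficients of a linear SVM classification model, largest magnitude first. It is a sketch under stated assumptions: SVM_CLAS_SAMPLE is a hypothetical model name, and TARGET_VALUE must be dropped from the select list for a regression model because that column is then absent.

```sql
-- SVM_CLAS_SAMPLE is a hypothetical linear SVM classification model.
SELECT target_value,
       attribute_name,
       attribute_subname,
       attribute_value,
       coefficient,
       reversed_coefficient
FROM   DM$VLSVM_CLAS_SAMPLE
ORDER  BY target_value,
          ABS(coefficient) DESC NULLS LAST;
```

Comparing COEFFICIENT with REVERSED_COEFFICIENT shows the effect of automatic data preparation; the two are equal when data preparation is disabled.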
Table 431 Support Vector Statistics Information In Model Global View
Name  Description 


Indicates whether the model build process has converged to specified tolerance:


Number of iterations performed during build 

Number of rows used for the build 

Number of rows removed due to 0 norm. This applies to one-class linear models only. 
4.1.14 Model Detail Views for XGBoost
The model detail views specific to XGBoost are the Feature Importance view and the Global Name-Value Pairs view.
The DM$VImodel_name view reports the feature importance values for each attribute of each partition of the model.
The view has the following columns for tree models (gbtree and dart boosters).
Name Type
 
PNAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
GAIN BINARY_DOUBLE
COVER BINARY_DOUBLE
FREQUENCY BINARY_DOUBLE
Table 432 Feature Importance View for a Tree Model
Column Name  Description 

PNAME 
The name of a partition in a partitioned model. 
ATTRIBUTE_NAME 
The column name. 
ATTRIBUTE_SUBNAME 
The nested column subname; the value is null for non-nested columns. 
ATTRIBUTE_VALUE 
The value of a categorical attribute. 
GAIN 
The fractional contribution of each feature to the model based on the total gain of a feature’s splits; a higher percentage means a more important predictive feature. 
COVER 
The number of observations related to the feature. 
FREQUENCY 
A percentage representing the relative number of times a feature has been used in trees. 
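For a tree booster, the feature importance view can be sorted on GAIN to see which features contribute most. The sketch below assumes a hypothetical model named XGB_CLAS_SAMPLE.

```sql
-- XGB_CLAS_SAMPLE is a hypothetical XGBoost model built with a tree booster.
SELECT pname,
       attribute_name,
       attribute_subname,
       attribute_value,
       gain,
       cover,
       frequency
FROM   DM$VIXGB_CLAS_SAMPLE
ORDER  BY pname,
          gain DESC NULLS LAST;
```

This query does not apply to a model built with the gblinear booster, because the view then has a different set of columns.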
For a linear model (gblinear booster), the feature importance is the absolute magnitude of the linear coefficients.
The view has the following columns for linear models.
Name Type
 
PNAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
WEIGHT BINARY_DOUBLE
CLASS BINARY_DOUBLE
Table 433 Feature Importance View for a Linear Model
Column Name  Description 

PNAME 
The name of a partition in a partitioned model. 
ATTRIBUTE_NAME 
The column name. 
ATTRIBUTE_SUBNAME 
The nested column subname; the value is null for non-nested columns. 
ATTRIBUTE_VALUE 
The value of a categorical attribute. 
WEIGHT 
The linear coefficient of the feature. 
CLASS 
The class label for a multiclass model. 
The DM$VGmodel_name view reports global statistics for an XGBoost model. The statistics include an evaluation of the training data set done by the evaluation metric you specified with the learning task eval_metric setting, or by the default eval_metric if you didn't specify one. The view contains only the result of the last training iteration. When you specify more than one eval_metric, the view contains multiple rows, one for each eval_metric.
4.1.15 Model Detail Views for Clustering Algorithms
Oracle Machine Learning for SQL supports these clustering algorithms: Expectation Maximization (EM), k-Means (KM), and Orthogonal Partitioning Clustering (O-Cluster, OC).
All clustering algorithms share the following views:
Model Views  Description 

DM$VD model_name 
Clustering Description 
DM$VA model_name 
Clustering Attribute Statistics 
DM$VH model_name 
Clustering Histograms 
DM$VR model_name 
Clustering Rules 
The Cluster Description view DM$VDmodel_name describes cluster-level information about a clustering model. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CLUSTER_ID NUMBER
CLUSTER_NAME NUMBER/VARCHAR2
RECORD_COUNT NUMBER
PARENT NUMBER
TREE_LEVEL NUMBER
LEFT_CHILD_ID NUMBER
RIGHT_CHILD_ID NUMBER
Table 434 Clustering Description View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
CLUSTER_ID 
The ID of a cluster in the model 
CLUSTER_NAME 
Specifies the label of the cluster 
RECORD_COUNT 
Specifies the number of records 
PARENT 
The ID of the parent 
TREE_LEVEL 
Specifies the number of splits from the root 
LEFT_CHILD_ID 
The ID of the child cluster on the left side of the split 
RIGHT_CHILD_ID 
The ID of the child cluster on the right side of the split 
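The cluster hierarchy recorded in this view can be explored with an ordinary query. The following sketch lists the clusters that have no children; KM_SH_CLUS_SAMPLE is a hypothetical model name, and the assumption that leaf clusters carry null child IDs is not stated in this document.

```sql
-- KM_SH_CLUS_SAMPLE is a hypothetical clustering model.
-- Assumption: a leaf cluster has NULL in both child ID columns.
SELECT cluster_id,
       cluster_name,
       record_count,
       parent,
       tree_level
FROM   DM$VDKM_SH_CLUS_SAMPLE
WHERE  left_child_id IS NULL
AND    right_child_id IS NULL
ORDER  BY record_count DESC;
```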
The attribute view DM$VAmodel_name describes attribute-level information about a clustering model. The values of the mean, variance, and mode for a particular cluster can be obtained from this view. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CLUSTER_ID NUMBER
CLUSTER_NAME NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
MEAN BINARY_DOUBLE
VARIANCE BINARY_DOUBLE
MODE_VALUE VARCHAR2(4000)
Table 435 Clustering Attribute Statistics
Column Name  Description 

PARTITION_NAME 
A partition in a partitioned model 
CLUSTER_ID 
The ID of a cluster in the model 
CLUSTER_NAME 
Specifies the label of the cluster 
ATTRIBUTE_NAME 
Specifies the attribute name 
ATTRIBUTE_SUBNAME 
Specifies the attribute subname 
MEAN 
The average value of a numeric attribute 
VARIANCE 
The variance of a numeric attribute 
MODE_VALUE 
The mode, that is, the most frequent value of a categorical attribute 
The histogram view DM$VHmodel_name describes histogram-level information about a clustering model. The bin information as well as bin counts can be obtained from this view. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CLUSTER_ID NUMBER
CLUSTER_NAME NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
BIN_ID NUMBER
LOWER_BIN_BOUNDARY BINARY_DOUBLE
UPPER_BIN_BOUNDARY BINARY_DOUBLE
ATTRIBUTE_VALUE VARCHAR2(4000)
COUNT NUMBER
Table 436 Clustering Histograms View
Column Name  Description 

PARTITION_NAME 
A partition in a partitioned model 
CLUSTER_ID 
The ID of a cluster in the model 
CLUSTER_NAME 
Specifies the label of the cluster 
ATTRIBUTE_NAME 
Specifies the attribute name 
ATTRIBUTE_SUBNAME 
Specifies the attribute subname 
BIN_ID 
Bin ID 
LOWER_BIN_BOUNDARY 
Numeric lower bin boundary 
UPPER_BIN_BOUNDARY 
Numeric upper bin boundary 
ATTRIBUTE_VALUE 
Categorical attribute value 
COUNT 
Histogram count 
The rule view DM$VRmodel_name describes the rule-level information about a clustering model. The information is provided at the attribute predicate level. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CLUSTER_ID NUMBER
CLUSTER_NAME NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
OPERATOR VARCHAR2(2)
NUMERIC_VALUE NUMBER
ATTRIBUTE_VALUE VARCHAR2(4000)
SUPPORT NUMBER
CONFIDENCE BINARY_DOUBLE
RULE_SUPPORT NUMBER
RULE_CONFIDENCE BINARY_DOUBLE
Table 437 Clustering Rules View
Column Name  Description 

PARTITION_NAME 
A partition in a partitioned model 
CLUSTER_ID 
The ID of a cluster in the model 
CLUSTER_NAME 
Specifies the label of the cluster 
ATTRIBUTE_NAME 
Specifies the attribute name 
ATTRIBUTE_SUBNAME 
Specifies the attribute subname 
OPERATOR 
Attribute predicate operator: a conditional operator taking one of the following values: IN, =, <>, <, >, <=, and >= 
NUMERIC_VALUE 
Numeric lower bin boundary 
ATTRIBUTE_VALUE 
Categorical attribute value 
SUPPORT 
Attribute predicate support 
CONFIDENCE 
Attribute predicate confidence 
RULE_SUPPORT 
Rule level support 
RULE_CONFIDENCE 
Rule level confidence 
4.1.16 Model Detail Views for Expectation Maximization
Model detail views specific to Expectation Maximization (EM) contain additional information about an EM model.
The following views contain information that is not in the clustering views for an EM model. For the clustering views, refer to "Model Detail Views for Clustering Algorithms".
The Expectation Maximization Components view (DM$VOmodel_name) describes the EM components. The component view contains information about their prior probabilities and what cluster they map to. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
COMPONENT_ID NUMBER
CLUSTER_ID NUMBER
PRIOR_PROBABILITY BINARY_DOUBLE
Table 438 Expectation Maximization Components View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
COMPONENT_ID 
Unique identifier of a component 
CLUSTER_ID 
The ID of a cluster in the model 
PRIOR_PROBABILITY 
Component prior probability 
The Expectation Maximization Gaussian view (DM$VMmodel_name) provides information about the mean and variance parameters for the attributes by Gaussian distribution models. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
COMPONENT_ID NUMBER
ATTRIBUTE_NAME VARCHAR2(4000)
MEAN BINARY_DOUBLE
VARIANCE BINARY_DOUBLE
The Expectation Maximization Bernoulli parameters view (DM$VFmodel_name) provides information about the parameters of the multi-valued Bernoulli distributions used by the EM model. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
COMPONENT_ID NUMBER
ATTRIBUTE_NAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
FREQUENCY BINARY_DOUBLE
Table 439 Expectation Maximization Bernoulli parameters View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
COMPONENT_ID 
Unique identifier of a component 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_VALUE 
Categorical attribute value 
FREQUENCY 
The frequency of the multi-valued Bernoulli distribution for the attribute/value combination specified by ATTRIBUTE_NAME and ATTRIBUTE_VALUE 
For 2-Dimensional columns, EM provides an attribute ranking similar to that of attribute importance. This ranking is based on a rank-weighted average over Kullback-Leibler divergence computed for pairs of columns. This unsupervised attribute importance is shown in the Unsupervised Attribute Importance view (DM$VImodel_name), which has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_IMPORTANCE_VALUE BINARY_DOUBLE
ATTRIBUTE_RANK NUMBER
Table 440 Unsupervised Attribute Importance View for Expectation Maximization
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_IMPORTANCE_VALUE 
Importance value 
ATTRIBUTE_RANK 
An attribute rank based on the importance value 
The pairwise Kullback-Leibler divergence is reported in the Attribute Pair Kullback-Leibler Divergence view (DM$VBmodel_name). This metric evaluates how much the observed joint distribution of two attributes diverges from the expected distribution under the assumption of independence. That is, the higher the value, the more dependent the two attributes are. The dependency value is scaled based on the size of the grid used for each pairwise computation. That ensures that all values fall within the [0; 1] range and are comparable. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME_1 VARCHAR2(128)
ATTRIBUTE_NAME_2 VARCHAR2(128)
DEPENDENCY BINARY_DOUBLE
Table 441 Attribute Pair Kullback-Leibler Divergence View for Expectation Maximization
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
ATTRIBUTE_NAME_1 
Name of the first attribute 
ATTRIBUTE_NAME_2 
Name of the second attribute 
DEPENDENCY 
Scaled pairwise Kullback-Leibler divergence 
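Because a higher DEPENDENCY value means a stronger dependency between the two attributes, sorting the view in descending order surfaces the most strongly related pairs. The sketch below assumes a hypothetical EM model named EM_SH_CLUS_SAMPLE.

```sql
-- EM_SH_CLUS_SAMPLE is a hypothetical Expectation Maximization model.
-- Ten attribute pairs with the highest scaled Kullback-Leibler divergence.
SELECT attribute_name_1,
       attribute_name_2,
       dependency
FROM   DM$VBEM_SH_CLUS_SAMPLE
ORDER  BY dependency DESC NULLS LAST
FETCH  FIRST 10 ROWS ONLY;
```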
The projection table DM$VPmodel_name shows the coefficients used by random projections to map nested columns to a lower dimensional space. The view has rows only when nested or text data is present in the build data. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
FEATURE_NAME VARCHAR2(4000)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
COEFFICIENT NUMBER
Table 442 Projection table for Expectation Maximization
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
FEATURE_NAME 
Name of feature 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Categorical attribute value 
COEFFICIENT 
Projection coefficient. The representation is sparse; only the non-zero coefficients are returned. 
Global Details for Expectation Maximization
The following table describes global details for EM.
Table 443 Global Details for Expectation Maximization
Name  Description 


Indicates whether the model build process has converged to specified tolerance. The possible values are:


Log-likelihood on the build data 

Number of components produced by the model 

Number of clusters produced by the model 

Number of rows used in the build 

The random seed value used for the model build 

The number of empty components excluded from the model 
4.1.17 Model Detail Views for k-Means
Model detail views specific to k-Means (KM) contain information about the clustering description view (DM$VD), the scoring centroids view (DM$VC), and the Global Name-Value Pairs view (DM$VG).
Model Views  Description 

DM$VA model_name 
Clustering Attribute Statistics 
DM$VC model_name 
k-Means Scoring Centroids 
DM$VD model_name 
Clustering Description 
DM$VG model_name 
Global Name-Value Pairs 
DM$VH model_name 
Clustering Histograms 
DM$VN model_name 
Normalization and Missing Value Handling 
DM$VR model_name 
Clustering Rules 
DM$VS model_name 
Computed Settings 
DM$VW model_name 
Model Build Alerts 
"Model Detail Views for Clustering Algorithms" discusses the model views that are common across clustering algorithms. The Global Name-Value Pairs view (DM$VG), the Computed Settings view (DM$VS), the Model Build Alerts view (DM$VW), and the Normalization and Missing Value Handling view (DM$VN) are addressed individually.
The following views contain information that is specific to a k-Means model.
The k-Means Clustering Description view DM$VDmodel_name has an additional column:
Name Type
 
DISPERSION BINARY_DOUBLE
Table 444 Clustering Description for k-Means
Column Name  Description 

DISPERSION 
A measure used to quantify whether a set of observed occurrences are dispersed compared to a standard statistical model. 
The k-Means Scoring Centroids view DM$VCmodel_name describes the centroid of each leaf cluster:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CLUSTER_ID NUMBER
CLUSTER_NAME NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
VALUE BINARY_DOUBLE
Table 445 k-Means Scoring Centroids View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
CLUSTER_ID 
The ID of a cluster in the model 
CLUSTER_NAME 
Specifies the label of the cluster 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Categorical attribute value 
VALUE 
Specifies the centroid value 
The following table describes the Global Name-Value Pairs view (DM$VG) for k-Means.
Table 446 k-Means Global Name-Value Pairs View
Name  Description 


Indicates whether the model build process has converged to specified tolerance. The following are the possible values:


Number of rows used in the build 

Number of rows removed due to 0 norm. This applies only to models using cosine distance. 
4.1.18 Model Detail Views for O-Cluster
Model detail views specific to O-Cluster (OC) contain information about the description view, histograms view, and global view.
The following views contain information that is specific to an O-Cluster model. For the clustering views, refer to "Model Detail Views for Clustering Algorithms". The OC algorithm uses the same descriptive statistics views as Expectation Maximization (EM) and k-Means (KM). The following are the statistics views:
The Cluster Description view (DM$VDmodel_name) describes the O-Cluster components. The Cluster Description view has additional fields that specify the split predicate. The view has the following columns:
Name Type
 
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
OPERATOR VARCHAR2(2)
VALUE SYS.XMLTYPE
Table 447 Cluster Description View for O-Cluster
Column Name  Description 

ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
OPERATOR 
Split operator 
VALUE 
List of split values 
The format of the SYS.XMLTYPE value is as follows: <Element>splitval1</Element>
The OC algorithm uses a Clustering Histograms view (DM$VHmodel_name) with different columns than EM and KM. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CLUSTER_ID NUMBER
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
BIN_ID NUMBER
LABEL VARCHAR2(4000)
COUNT NUMBER
Table 448 Clustering Histograms View for O-Cluster
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
CLUSTER_ID 
Unique identifier of a component 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
BIN_ID 
Unique identifier 
LABEL 
Bin label 
COUNT 
Bin histogram count 
The following table describes the Global Name-Value Pairs (DM$VGmodel_name) view specific to O-Cluster.
Table 449 O-Cluster Statistics Information In Model Global View
Name  Description 


The total number of rows used in the build 
4.1.19 Model Detail Views for Explicit Semantic Analysis
Model detail views specific to Explicit Semantic Analysis (ESA) contain information about attribute statistics and features.
Model Views  Description 

DM$VA model_name 
Explicit Semantic Analysis Matrix 
DM$VF model_name 
Explicit Semantic Analysis Features 
DM$VG model_name 
Global Name-Value Pairs 
DM$VN model_name 
Normalization and Missing Value Handling 
DM$VS model_name 
Computed Settings 
DM$VW model_name 
Model Build Alerts 
DM$VX model_name 
Text Features 

Explicit Semantic Analysis Matrix (DM$VAmodel_name): This view has different columns for feature extraction and classification. For feature extraction, this view contains model attribute coefficients per feature. For classification, this view contains model attribute coefficients per target class. 
Explicit Semantic Analysis Features (DM$VFmodel_name): This view is applicable only for feature extraction.
The Explicit Semantic Analysis Matrix view (DM$VAmodel_name) has the following columns for feature extraction:
Name Type
 
PARTITION_NAME VARCHAR2(128)
FEATURE_ID NUMBER/VARCHAR2, DATE, TIMESTAMP,
TIMESTAMP WITH TIME ZONE,
TIMESTAMP WITH LOCAL TIME ZONE
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
COEFFICIENT BINARY_DOUBLE
Table 450 Explicit Semantic Analysis Matrix for Feature Extraction
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
FEATURE_ID 
Unique identifier of a feature as it appears in the training data 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Categorical attribute value 
COEFFICIENT 
A measure of the weight of the attribute with respect to the feature 
The DM$VAmodel_name view comprises attribute coefficients for all target classes.
The Explicit Semantic Analysis Matrix view (DM$VAmodel_name) has the following columns for classification:
Name Type
 
PARTITION_NAME VARCHAR2(128)
TARGET_VALUE NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
COEFFICIENT BINARY_DOUBLE
Table 451 Explicit Semantic Analysis Matrix for Classification
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
TARGET_VALUE 
Value of the target 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Categorical attribute value 
COEFFICIENT 
A measure of the weight of the attribute with respect to the feature 
The Explicit Semantic Analysis Features view (DM$VFmodel_name) has a unique row for every feature in one view. This view is helpful if the model was pre-built and the source training data is not available. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
FEATURE_ID NUMBER/VARCHAR2, DATE, TIMESTAMP,
TIMESTAMP WITH TIME ZONE,
TIMESTAMP WITH LOCAL TIME ZONE
Table 452 Explicit Semantic Analysis Features for Explicit Semantic Analysis
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
FEATURE_ID 
Unique identifier of a feature as it appears in the training data 
The following table describes the Global Name-Value Pairs view (DM$VGmodel_name) specific to ESA.
Table 453 Explicit Semantic Analysis Statistics Information In Model Global View
Name  Description 


The total number of input rows 

Number of rows removed by filters 
4.1.20 Model Detail Views for Non-Negative Matrix Factorization
Model detail views specific to Non-Negative Matrix Factorization (NMF) contain information about the encoding H matrix and H inverse matrix.
The views specific to NMF are:

Non-Negative Matrix Factorization H Matrix view (DM$VEmodel_name) 
Non-Negative Matrix Factorization Inverse H Matrix view (DM$VImodel_name)
The view DM$VEmodel_name describes the encoding (H) matrix of an NMF model. The FEATURE_NAME column type may be either NUMBER or VARCHAR2. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
FEATURE_ID NUMBER
FEATURE_NAME NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
COEFFICIENT BINARY_DOUBLE
Table 454 Non-Negative Matrix Factorization H Matrix View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
FEATURE_ID 
The ID of a feature in the model 
FEATURE_NAME 
The name of a feature in the model 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Specifies the value of the attribute 
COEFFICIENT 
The attribute encoding that represents its contribution to the feature 
The view DM$VImodel_name describes the inverse H matrix of an NMF model. The FEATURE_NAME column type may be either NUMBER or VARCHAR2. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
FEATURE_ID NUMBER
FEATURE_NAME NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
COEFFICIENT BINARY_DOUBLE
Table 455 Non-Negative Matrix Factorization Inverse H Matrix View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
FEATURE_ID 
The ID of a feature in the model 
FEATURE_NAME 
The name of a feature in the model 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Specifies the value of the attribute 
COEFFICIENT 
The attribute encoding that represents its contribution to the feature 
The following table describes the Global Name-Value Pairs view (DM$VGmodel_name) specific to NMF.
Table 456 Global Name-Value Pairs View for NMF
Name  Description 


Convergence error 

Indicates whether the model build process has converged to specified tolerance. The following are the possible values:


Number of iterations performed during build 

Number of rows used in the build input data set 

Number of rows used by the build 
4.1.21 Model Detail Views for Singular Value Decomposition
Model detail views specific to Singular Value Decomposition (SVD) contain information about the S matrix, right-singular vectors, and left-singular vectors.
Model Views  Description 

DM$VE model_name 
Singular Value Decomposition S Matrix 
DM$VG model_name 
Global Name-Value Pairs 
DM$VN model_name 
Normalization and Missing Value Handling 
DM$VS model_name 
Computed Settings 
DM$VU model_name 
Singular Value Decomposition U Matrix 
DM$VV model_name 
Singular Value Decomposition V Matrix 
DM$VW model_name 
Model Build Alerts 
The Singular Value Decomposition S Matrix view (DM$VEmodel_name) leverages the fact that each singular value in the SVD model has a corresponding principal component in the associated Principal Components Analysis (PCA) model to relate a common set of information for both classes of models. For an SVD model, it describes the content of the S matrix. When PCA scoring is selected as a build setting, the variance and percentage cumulative variance for the corresponding principal components are shown as well. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
FEATURE_ID NUMBER
FEATURE_NAME NUMBER/VARCHAR2
VALUE BINARY_DOUBLE
VARIANCE BINARY_DOUBLE
PCT_CUM_VARIANCE BINARY_DOUBLE
Table 457 Singular Value Decomposition S Matrix View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
FEATURE_ID 
The ID of a feature in the model 
FEATURE_NAME 
The name of a feature in the model 
VALUE 
The matrix entry value 
VARIANCE 
The variance explained by a component. This column is present only for SVD models built with PCA scoring selected. It is non-null only if the build data is centered, either manually or by a build setting. 
PCT_CUM_VARIANCE 
The percent cumulative variance explained by the components thus far. The components are ranked by the explained variance in descending order. This column is present only for SVD models built with PCA scoring selected. It is non-null only if the build data is centered, either manually or by a build setting. 
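To inspect the singular values together with the variance columns, the S Matrix view can be queried directly. This is a sketch only: SVD_SH_SAMPLE is a hypothetical model name, and the query assumes the model was built with PCA scoring, since the VARIANCE and PCT_CUM_VARIANCE columns are otherwise not present.

```sql
-- SVD_SH_SAMPLE is a hypothetical SVD model built with PCA scoring selected.
SELECT feature_id,
       value AS singular_value,
       variance,
       pct_cum_variance
FROM   DM$VESVD_SH_SAMPLE
ORDER  BY variance DESC NULLS LAST;
```

Reading PCT_CUM_VARIANCE down the result shows how many leading components are needed to reach a chosen share of the total variance.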
The Singular Value Decomposition V Matrix view (DM$VVmodel_name) describes the right-singular vectors of an SVD model. For a PCA model, it describes the principal components (eigenvectors). The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
FEATURE_ID NUMBER
FEATURE_NAME NUMBER/VARCHAR2
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_VALUE VARCHAR2(4000)
VALUE BINARY_DOUBLE
Table 458 Singular Value Decomposition V Matrix View
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
FEATURE_ID 
The ID of a feature in the model 
FEATURE_NAME 
The name of a feature in the model 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_VALUE 
Categorical attribute value. For numerical attributes, the value is null. 
VALUE 
The matrix entry value 
The Singular Value Decomposition U Matrix view (DM$VUmodel_name) describes the left-singular vectors of an SVD model. For a PCA model, it describes the projection of the data in the principal components. This view does not exist unless the setting dbms_data_mining.svds_u_matrix_output is set to dbms_data_mining.svds_u_matrix_enable. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
CASE_ID NUMBER/VARCHAR2, DATE, TIMESTAMP,
TIMESTAMP WITH TIME ZONE,
TIMESTAMP WITH LOCAL TIME ZONE
FEATURE_ID NUMBER
FEATURE_NAME NUMBER/VARCHAR2
VALUE BINARY_DOUBLE
Table 459 Singular Value Decomposition U Matrix View or Projection Data in Principal Components
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
CASE_ID 
Unique identifier of the row in the build data described by the U matrix projection. 
FEATURE_ID 
The ID of a feature in the model 
FEATURE_NAME 
The name of a feature in the model 
VALUE 
The matrix entry value 
Global Details for Singular Value Decomposition
The following table describes the Global Name-Value Pairs view (DM$VGmodel_name) specific to an SVD model.
Table 460 Global Name-Value Pairs View for Singular Value Decomposition
Name  Description 


Number of features (components) produced by the model 

The total number of rows used in the build 

Suggested cutoff that indicates how many of the top computed features capture most of the variance in the model. Using only the features below this cutoff would be a reasonable strategy for dimensionality reduction. 
4.1.22 Model Detail Views for Minimum Description Length
Model detail views specific to Minimum Description Length (MDL) (for calculating attribute importance) contain information about attribute importance models.
The Attribute Importance view (DM$VAmodel_name) describes the attribute importance as well as the attribute importance rank. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
ATTRIBUTE_IMPORTANCE_VALUE BINARY_DOUBLE
ATTRIBUTE_RANK NUMBER
Table 461 Attribute Importance View for Minimum Description Length
Column Name  Description 

PARTITION_NAME 
Partition name in a partitioned model 
ATTRIBUTE_NAME 
Column name 
ATTRIBUTE_SUBNAME 
Nested column subname. The value is null for non-nested columns. 
ATTRIBUTE_IMPORTANCE_VALUE 
Importance value 
ATTRIBUTE_RANK 
Rank based on importance 
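The ranked attributes can be listed by ordering on ATTRIBUTE_RANK. The sketch below assumes a hypothetical attribute importance model named AI_SH_SAMPLE.

```sql
-- AI_SH_SAMPLE is a hypothetical Minimum Description Length model.
SELECT attribute_name,
       attribute_subname,
       attribute_importance_value,
       attribute_rank
FROM   DM$VAAI_SH_SAMPLE
ORDER  BY attribute_rank;
```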
The following table describes the Global Name-Value Pairs view (DM$VGmodel_name) specific to MDL.
Table 462 Global Name-Value Pairs View for MDL
Name  Description 


The total number of rows used in the build 
4.1.23 Model Detail Views for Binning
The binning view DM$VB
describes the bin boundaries used in automatic data preparation.
The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
BIN_ID NUMBER
LOWER_BIN_BOUNDARY BINARY_DOUBLE
UPPER_BIN_BOUNDARY BINARY_DOUBLE
ATTRIBUTE_VALUE VARCHAR2(4000)
Table 4-63 Model Details View for Binning
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
ATTRIBUTE_NAME  Specifies the attribute name
ATTRIBUTE_SUBNAME  Specifies the attribute subname
BIN_ID  Bin ID (or bin identifier)
LOWER_BIN_BOUNDARY  Numeric lower bin boundary
UPPER_BIN_BOUNDARY  Numeric upper bin boundary
ATTRIBUTE_VALUE  Categorical value
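The following sketch shows one way to inspect the bins of a single attribute. The model name BIN_DEMO and the attribute name AGE are hypothetical placeholders; the view name assumes the usual pattern of the DM$VB prefix followed by the model name.

```sql
-- BIN_DEMO (model) and AGE (attribute) are hypothetical names.
SELECT bin_id,
       lower_bin_boundary,
       upper_bin_boundary,
       attribute_value
FROM   DM$VBBIN_DEMO
WHERE  attribute_name = 'AGE'
ORDER  BY bin_id;
```

Per the column descriptions, the boundary columns carry the numeric bin limits and ATTRIBUTE_VALUE carries the categorical value, so selecting all three covers both kinds of attribute.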
4.1.24 Model Detail Views for Global Information
Model detail views for global information contain information about global statistics, alerts, and computed settings.
The Global Name-Value Pairs view (DM$VGmodel_name) describes global statistics related to the model build, such as the number of rows used in the build, the convergence status, and the model quality metrics. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
NAME VARCHAR2(30)
NUMERIC_VALUE NUMBER
STRING_VALUE VARCHAR2(4000)
Table 4-64 Global Name-Value Pairs View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
NAME  Name of the statistic
NUMERIC_VALUE  Numeric value of the statistic
STRING_VALUE  Categorical value of the statistic
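Because each statistic is stored as a name with either a numeric or a string value, a simple query can list all global statistics for a model. This is a sketch only; MODEL_DEMO is a hypothetical model name.

```sql
-- MODEL_DEMO is a hypothetical model name.
-- Each row holds one statistic; only one of the two value columns
-- is expected to be populated for a given statistic.
SELECT partition_name,
       name,
       numeric_value,
       string_value
FROM   DM$VGMODEL_DEMO
ORDER  BY partition_name, name;
```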
The Model Build Alerts view (DM$VWmodel_name) lists alerts issued during the model build. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ERROR_NUMBER BINARY_DOUBLE
ERROR_TEXT VARCHAR2(4000)
Table 4-65 Model Build Alerts View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
ERROR_NUMBER  Error number (valid when event is Error)
ERROR_TEXT  Error message
The Computed Settings view (DM$VSmodel_name) lists the settings computed by the algorithm. The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
SETTING_NAME VARCHAR2(30)
SETTING_VALUE VARCHAR2(4000)
Table 4-66 Computed Settings View
Column Name  Description
PARTITION_NAME  Partition name in a partitioned model
SETTING_NAME  Name of the setting
SETTING_VALUE  Value of the setting
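After a build, the alerts and computed settings views are often checked together. The two statements below are a sketch of that check, using the hypothetical model name MODEL_DEMO and only the columns documented above.

```sql
-- MODEL_DEMO is a hypothetical model name.

-- Alerts raised during the build (DM$VW prefix).
SELECT partition_name,
       error_number,
       error_text
FROM   DM$VWMODEL_DEMO;

-- Settings the algorithm computed for itself (DM$VS prefix).
SELECT partition_name,
       setting_name,
       setting_value
FROM   DM$VSMODEL_DEMO
ORDER  BY partition_name, setting_name;
```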
4.1.25 Model Detail Views for Normalization and Missing Value Handling
The Normalization and Missing Value Handling view DM$VN describes the normalization parameters used in Automatic Data Preparation (ADP) and the replacement values used when a NULL value is encountered. Missing value replacement applies only to two-dimensional columns, not to nested columns.
The view has the following columns:
Name Type
 
PARTITION_NAME VARCHAR2(128)
ATTRIBUTE_NAME VARCHAR2(128)
ATTRIBUTE_SUBNAME VARCHAR2(4000)
NUMERIC_MISSING_VALUE BINARY_DOUBLE
CATEGORICAL_MISSING_VALUE VARCHAR2(4000)
NORMALIZATION_SHIFT BINARY_DOUBLE
NORMALIZATION_SCALE BINARY_DOUBLE
Table 4-67 Normalization and Missing Value Handling View
Column Name  Description
PARTITION_NAME  A partition in a partitioned model
ATTRIBUTE_NAME  Column name
ATTRIBUTE_SUBNAME  Nested column subname. The value is null for non-nested columns.
NUMERIC_MISSING_VALUE  Numeric missing value replacement
CATEGORICAL_MISSING_VALUE  Categorical missing value replacement
NORMALIZATION_SHIFT  Normalization shift value
NORMALIZATION_SCALE  Normalization scale value
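The sketch below reads the normalization parameters and applies them to an arbitrary sample input. It assumes the usual shift-and-scale convention, normalized = (x - shift) / scale; that formula is an assumption made here for illustration and is not stated in this section. The model name ADP_DEMO and the input value 42 are hypothetical.

```sql
-- ADP_DEMO is a hypothetical model name; 42 is an arbitrary sample input.
-- Assumption: normalized = (x - shift) / scale.
SELECT attribute_name,
       numeric_missing_value,
       categorical_missing_value,
       normalization_shift,
       normalization_scale,
       (42 - normalization_shift) / normalization_scale AS normalized_sample
FROM   DM$VNADP_DEMO
WHERE  normalization_scale IS NOT NULL
AND    normalization_scale <> 0
ORDER  BY attribute_name;
```

The filter on NORMALIZATION_SCALE skips attributes that were not normalized and avoids dividing by zero.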
4.1.26 Model Detail Views for Exponential Smoothing
Model detail views specific to Exponential Smoothing (ESM) contain information about the model output and global information.
An ESM model has the following views:
 Model output: DM$VPmodel_name
 Model global information: DM$VGmodel_name
The Exponential Smoothing Forecast view (DM$VPmodel_name) contains the result of an ESM model. Each output record includes fields such as partition, CASE_ID, value, prediction, lower, and upper, and the records are ordered by partition and CASE_ID (time). Each partition has a separate smoothing model. Within a partition, for each time point (CASE_ID) that the input time series covers, the value is the observed or accumulated value at that time point, and the prediction is the one-step-ahead forecast at that time point. For each time point beyond the range of the input time series (a future prediction), the value is NULL, and the prediction is the model forecast for that time point. Lower and upper are the lower and upper bounds of the user-specified confidence interval for the prediction.
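As a sketch of how the forecast view might be read, the first statement below returns the full series and the second keeps only the future time points, which the description above identifies by a NULL value. The model name ESM_DEMO is hypothetical, and the column names are assumed to match the record fields named in the text (CASE_ID, VALUE, PREDICTION, LOWER, UPPER).

```sql
-- ESM_DEMO is a hypothetical model name.
-- Column names are assumed to match the record fields described above.

-- Fitted values and forecasts, in time order.
SELECT case_id, value, prediction, lower, upper
FROM   DM$VPESM_DEMO
ORDER  BY case_id;

-- Forecasts only: time points beyond the input series have a NULL value.
SELECT case_id, prediction, lower, upper
FROM   DM$VPESM_DEMO
WHERE  value IS NULL
ORDER  BY case_id;
```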
The Global Name-Value Pairs view (DM$VGmodel_name) contains the global information of the model along with the estimated smoothing constants, the estimated initial state, and global diagnostic measures.
Table 4-68 Global Name-Value Pairs View for ESM
Name  Description 


Negative log-likelihood of model 

Smoothing constant 

Akaike information criterion 

Corrected Akaike information criterion 

Average mean square error over user-specified time window 

Trend smoothing constant 

Bayesian information criterion 

Seasonal smoothing constant 

Model estimate of value one time interval prior to start of observed series 

Model estimate of seasonal effect for season i one time interval prior to start of observed series 

Model estimate of trend one time interval prior to start of observed series 

Model mean absolute error 

Model mean square error 

Damping parameter 

Model standard error 

Model standard deviation of residuals 
4.1.27 Model Detail Views for Text Features
The model details view for text features is DM$VXmodel_name.
The text feature view DM$VXmodel_name describes the extracted text features if text attributes are present. The view has the following schema:
Name Type
 
PARTITION_NAME VARCHAR2(128)
COLUMN_NAME VARCHAR2(128)
TOKEN VARCHAR2(4000)
DOCUMENT_FREQUENCY NUMBER
Table 4-69 Text Feature View for Extracted Text Features
Column Name  Description
PARTITION_NAME  A partition in a partitioned model to retrieve details
COLUMN_NAME  Name of the identifier column
TOKEN  Text token which is usually a word or stemmed word
DOCUMENT_FREQUENCY  A measure of token frequency in the entire training set
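A common use of this view is to look at the most frequent tokens the model extracted. The following is a sketch under stated assumptions: TEXT_DEMO is a hypothetical model name, and the row-limiting clause requires Oracle Database 12c or later.

```sql
-- TEXT_DEMO is a hypothetical model name.
-- Lists the ten tokens with the highest document frequency.
SELECT column_name,
       token,
       document_frequency
FROM   DM$VXTEXT_DEMO
ORDER  BY document_frequency DESC
FETCH  FIRST 10 ROWS ONLY;
```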