MySQL HeatWave User Guide
After training a classification model, you can query the default model explanation or query new model explanations. You can also generate prediction explanations. Explanations help you understand which features had the most influence on generating predictions.
Feature importance is presented as an attribution value. A positive value indicates that a feature contributed toward the prediction. A negative value can have different interpretations depending on the specific model explainer used for the model. For example, a negative value for the permutation importance explainer means that the feature is not important.
Complete the following tasks:
After training a model, you can query the default model explanation with the Permutation Importance explainer.
To generate explanations for other model explainers, see Generate Model Explanations and ML_EXPLAIN.
Query the model_explanation column from the model catalog and specify the model handle previously created. Update user1 with your own user name. Use JSON_PRETTY to view the output in an easily readable format.
mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG
WHERE model_handle='classification_use_case';
+---------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation) |
+---------------------------------------------------------------------------------------------------+
| {
"permutation_importance": {
"Debt": 0.5014,
"Assets": 0.0,
"Gender": 0.0,
"Income": 0.0,
"ClientID": 0.0,
"LoanType": 0.0,
"ClientAge": 0.1231,
"Education": 0.0,
"LoanAmount": 0.0,
"Occupation": 0.0,
"CreditScore": 0.0,
"Liabilities": 0.0525
}
} |
+---------------------------------------------------------------------------------------------------+
1 row in set (0.0382 sec)
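The output lists a feature importance value for each column. In this example, Debt, ClientAge, and Liabilities have nonzero values, and all other columns have a value of 0.0.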
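To compare features more easily, the JSON object can be unpacked into one row per feature and sorted. The following is a minimal sketch, not an official HeatWave routine. It uses only standard MySQL JSON functions (JSON_KEYS, JSON_TABLE, JSON_EXTRACT) and assumes the same schema name, model handle, and permutation_importance key shown in the example above. The column aliases feature and importance are illustrative names.

```sql
-- Sketch: list features from the default model explanation, most important first.
-- Assumes the ML_SCHEMA_user1 schema and the model handle from the example above.
SELECT k.feature,
       CAST(JSON_EXTRACT(m.model_explanation,
                         CONCAT('$.permutation_importance."', k.feature, '"'))
            AS DECIMAL(10,4)) AS importance
FROM ML_SCHEMA_user1.MODEL_CATALOG AS m,
     -- JSON_KEYS returns the feature names as a JSON array;
     -- JSON_TABLE expands that array into one row per feature.
     JSON_TABLE(JSON_KEYS(m.model_explanation, '$.permutation_importance'),
                '$[*]' COLUMNS (feature VARCHAR(64) PATH '$')) AS k
WHERE m.model_handle = 'classification_use_case'
ORDER BY importance DESC;
```

With the sample values shown above, Debt would sort first, followed by ClientAge and Liabilities.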
After training a model, you can generate a table of prediction explanations on the testing_data dataset by using the default Permutation Importance prediction explainer.
To generate explanations for other prediction explainers, see Generate Prediction Explanations and ML_EXPLAIN_TABLE.
If not already done, load the model. You can use the session variable for the model, which is valid for the duration of the connection. Alternatively, you can use the model handle previously set. For the second parameter, which sets the user name, you can pass NULL.
The following example uses the session variable.
mysql> CALL sys.ML_MODEL_LOAD(@model, NULL);
The following example uses the model handle.
mysql> CALL sys.ML_MODEL_LOAD('classification_use_case', NULL);
Use the ML_EXPLAIN_TABLE routine to generate explanations for predictions made in the test dataset.
mysql> CALL sys.ML_EXPLAIN_TABLE(table_name, model_handle, output_table_name, [options]);
Replace table_name, model_handle, and output_table_name with your own values. Add options as needed.
The following example runs ML_EXPLAIN_TABLE on the testing dataset previously created.
mysql> CALL sys.ML_EXPLAIN_TABLE('classification_data.Loan_Testing', @model, 'classification_data.Loan_Testing_explanations',
JSON_OBJECT('prediction_explainer', 'permutation_importance'));
Query OK, 0 rows affected (12.2957 sec)
Where:
classification_data.Loan_Testing is the fully qualified name of the test dataset.
@model is the session variable for the model handle.
classification_data.Loan_Testing_explanations is the fully qualified name of the output table with explanations.
permutation_importance is the prediction explainer selected to generate explanations.
Query the Notes and ml_results columns from the output table to review which column had the largest impact towards the prediction and which column contributed the most against it. You can also review individual attribution values for each column. Use \G to view the output in an easily readable format.
mysql> SELECT Notes, ml_results FROM classification_data.Loan_Testing_explanations\G
*************************** 1. row ***************************
Notes: Debt (18000.0) had the largest impact towards predicting Approved
ml_results: {"attributions": {"Debt": 0.87, "Liabilities": -0.0, "ClientAge": 0.0, "LoanAmount": 0.0},
"predictions": {"Approved": "Approved"}, "notes": "Debt (18000.0) had the largest impact towards predicting Approved"}
*************************** 2. row ***************************
Notes: ClientAge (29) had the largest impact towards predicting Rejected, whereas Debt (12000.0) contributed the most against predicting Rejected
ml_results: {"attributions": {"Debt": -0.01, "Liabilities": 0.02, "ClientAge": 0.17, "LoanAmount": 0.08},
"predictions": {"Approved": "Rejected"}, "notes": "ClientAge (29) had the largest impact towards predicting Rejected, whereas Debt (12000.0) contributed the most against predicting Rejected"}
*************************** 3. row ***************************
Notes: Debt (25000.0) had the largest impact towards predicting Approved
ml_results: {"attributions": {"Debt": 0.87, "Liabilities": -0.0, "ClientAge": 0.0, "LoanAmount": 0.0},
"predictions": {"Approved": "Approved"}, "notes": "Debt (25000.0) had the largest impact towards predicting Approved"}
*************************** 4. row ***************************
Notes: ClientAge (56) had the largest impact towards predicting Rejected, whereas Debt (35000.0) contributed the most against predicting Rejected
ml_results: {"attributions": {"Debt": -0.07, "Liabilities": 0.52, "ClientAge": 0.75, "LoanAmount": 0.01},
"predictions": {"Approved": "Rejected"}, "notes": "ClientAge (56) had the largest impact towards predicting Rejected, whereas Debt (35000.0) contributed the most against predicting Rejected"}
*************************** 5. row ***************************
Notes: LoanAmount (90000.0) had the largest impact towards predicting Rejected
ml_results: {"attributions": {"Debt": 0.0, "Liabilities": 0.01, "ClientAge": 0.1, "LoanAmount": 0.14},
"predictions": {"Approved": "Rejected"}, "notes": "LoanAmount (90000.0) had the largest impact towards predicting Rejected"}
*************************** 6. row ***************************
Notes: ClientAge (27) had the largest impact towards predicting Rejected
ml_results: {"attributions": {"Debt": -0.0, "Liabilities": 0.01, "ClientAge": 0.16, "LoanAmount": 0.08},
"predictions": {"Approved": "Rejected"}, "notes": "ClientAge (27) had the largest impact towards predicting Rejected"}
*************************** 7. row ***************************
Notes: Debt (15000.0) had the largest impact towards predicting Approved, whereas ClientAge (49) contributed the most against predicting Approved
ml_results: {"attributions": {"Debt": 0.49, "Liabilities": -0.07, "ClientAge": -0.43, "LoanAmount": 0.0},
"predictions": {"Approved": "Approved"}, "notes": "Debt (15000.0) had the largest impact towards predicting Approved, whereas ClientAge (49) contributed the most against predicting Approved"}
*************************** 8. row ***************************
Notes: ClientAge (53) had the largest impact towards predicting Rejected, whereas Debt (30000.0) contributed the most against predicting Rejected
ml_results: {"attributions": {"Debt": -0.13, "Liabilities": 0.56, "ClientAge": 0.68, "LoanAmount": -0.07},
"predictions": {"Approved": "Rejected"}, "notes": "ClientAge (53) had the largest impact towards predicting Rejected, whereas Debt (30000.0) contributed the most against predicting Rejected"}
*************************** 9. row ***************************
Notes: Debt (22000.0) had the largest impact towards predicting Approved
ml_results: {"attributions": {"Debt": 0.87, "Liabilities": -0.0, "ClientAge": 0.0, "LoanAmount": 0.0},
"predictions": {"Approved": "Approved"}, "notes": "Debt (22000.0) had the largest impact towards predicting Approved"}
*************************** 10. row ***************************
Notes: No features had a significant impact on model prediction
ml_results: {"attributions": {"Debt": 0.0, "Liabilities": 0.0, "ClientAge": 0.0, "LoanAmount": 0.0},
"predictions": {"Approved": "Approved"}, "notes": "No features had a significant impact on model prediction"}
10 rows in set (0.0461 sec)
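Because ml_results holds JSON, individual fields can be pulled out as ordinary columns for filtering and sorting. The following is a minimal sketch using standard MySQL JSON functions. The JSON paths ($.predictions.Approved, $.attributions.Debt, $.attributions.ClientAge) are taken from the sample output above and will differ for other datasets. The column aliases are illustrative names.

```sql
-- Sketch: show rejected predictions with selected attribution values as columns.
-- JSON paths are assumed from the sample output; adjust them to your own target and feature names.
SELECT JSON_UNQUOTE(JSON_EXTRACT(ml_results, '$.predictions.Approved')) AS predicted_value,
       CAST(JSON_EXTRACT(ml_results, '$.attributions.Debt') AS DECIMAL(10,4)) AS debt_attribution,
       CAST(JSON_EXTRACT(ml_results, '$.attributions.ClientAge') AS DECIMAL(10,4)) AS clientage_attribution,
       Notes
FROM classification_data.Loan_Testing_explanations
-- Keep only rows where the model predicted Rejected.
WHERE JSON_UNQUOTE(JSON_EXTRACT(ml_results, '$.predictions.Approved')) = 'Rejected'
-- Rows where ClientAge pushed hardest towards the prediction come first.
ORDER BY clientage_attribution DESC;
```

Sorting by a single attribution column is one way to find the rows where a given feature mattered most. The Notes column still gives the per-row summary.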
To generate prediction explanations for one or more rows of data, see Generate Prediction Explanations for a Row of Data.
Learn how to Score a Classification Model.