MySQL HeatWave User Guide
After running the ML_TRAIN routine, use the ML_EXPLAIN routine to train model explainers for MySQL HeatWave AutoML. By default, the ML_TRAIN routine trains the Permutation Importance model explainer.
Explanations help you understand which features have the most influence on a prediction. Feature importance is presented as an attribution value. A positive value indicates that a feature contributed toward the prediction. A negative value can have different interpretations depending on the specific model explainer used for the model. For example, a negative value for the permutation importance explainer means that the feature is not important.
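The sign convention described above can be illustrated with a short Python sketch that runs outside the MySQL session. The attribution values are copied from the permutation_importance output shown later in this topic. The helper name split_by_sign is hypothetical, and grouping a value of exactly zero with the negative values is an assumption made for illustration, not documented behavior.

```python
# Attribution values copied from the permutation_importance output in this topic.
importances = {
    "age": -0.0057,
    "capital-gain": 0.0304,
    "occupation": 0.0,
    "workclass": 0.0189,
}


def split_by_sign(values):
    """Partition features into positive attributions and the rest.

    Assumption: a value of exactly 0 is grouped with negative values,
    because neither indicates a contribution toward the prediction.
    """
    contributing = {name: v for name, v in values.items() if v > 0}
    not_important = {name: v for name, v in values.items() if v <= 0}
    return contributing, not_important


contributing, not_important = split_by_sign(importances)
print("Contributing:", sorted(contributing))
print("Not important:", sorted(not_important))
```

For the permutation importance explainer, the features in the second group are the ones the text above describes as not important. Other explainers may assign a different meaning to negative values, so the same split should not be applied to them without checking.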
Model explainers are used when you run the ML_EXPLAIN routine to explain what the model learned from the training dataset. The model explainer provides a list of feature importances to show what features the model considered important based on the entire training dataset. The ML_EXPLAIN routine can train these model explainers:
The Permutation Importance model explainer, specified as permutation_importance, is the default model explainer. ML_TRAIN generates this model explainer when it runs.
The Partial Dependence model explainer, specified as partial_dependence, shows how changing the values of one or more columns changes the value that the model predicts. When you train this model explainer, you need to specify some additional options. See ML_EXPLAIN to learn more.
The SHAP model explainer, specified as shap, produces feature importance values based on Shapley values.
The Fast SHAP model explainer, specified as fast_shap, is a subsampling version of the SHAP model explainer, which usually has a faster runtime.
The model explanation is stored in the model catalog along with the machine learning model in the model_explanation column. See The Model Catalog. If you run ML_EXPLAIN again for the same model handle and model explainer, the field is overwritten with the new result.
You cannot generate model explanations for the following model types:
Forecasting
Recommendation
Anomaly detection
Anomaly detection for logs
Topic modeling
Before running ML_EXPLAIN, you must train and then load the model you want to use.
The following example trains a model on the census_data.census_train dataset with the classification machine learning task.
mysql> CALL sys.ML_TRAIN('census_data.census_train', 'revenue', JSON_OBJECT('task', 'classification'), @census_model);
The following example loads the trained model.
mysql> CALL sys.ML_MODEL_LOAD(@census_model, NULL);
For more information about training and loading models, see Train a Model and Load a Model.
After training and loading the model, you can generate model explanations. For option and parameter descriptions, see ML_EXPLAIN.
After training and loading a model, you can retrieve the default model explanation, which is generated with the permutation_importance explainer, from the model catalog. See The Model Catalog.
mysql> SELECT column FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=model_handle;
The following example retrieves the model_explanation column from the model catalog for the previously trained model. The JSON_PRETTY function displays the output in an easily readable format.
mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation) |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
  "permutation_importance": {
    "age": 0.0292,
    "sex": 0.0023,
    "race": 0.0019,
    "fnlwgt": 0.0038,
    "education": 0.0008,
    "workclass": 0.0068,
    "occupation": 0.0223,
    "capital-gain": 0.0479,
    "capital-loss": 0.0117,
    "relationship": 0.0234,
    "education-num": 0.0352,
    "hours-per-week": 0.0148,
    "marital-status": 0.024,
    "native-country": 0.0
  }
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)
Replace user1 and @census_model with your own user name and session variable.
The explanation displays values of permutation importance for each column.
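To read the explanation at a glance, it can help to sort the features by importance. The following Python sketch assumes the JSON text returned by the query above has been copied into a string. It runs outside the MySQL session, and the function name rank_features is hypothetical.

```python
import json

# JSON text copied from the model_explanation output shown above.
explanation_json = """
{
  "permutation_importance": {
    "age": 0.0292, "sex": 0.0023, "race": 0.0019, "fnlwgt": 0.0038,
    "education": 0.0008, "workclass": 0.0068, "occupation": 0.0223,
    "capital-gain": 0.0479, "capital-loss": 0.0117, "relationship": 0.0234,
    "education-num": 0.0352, "hours-per-week": 0.0148,
    "marital-status": 0.024, "native-country": 0.0
  }
}
"""


def rank_features(explanation, explainer):
    """Return (feature, importance) pairs sorted from most to least important."""
    scores = explanation[explainer]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


explanation = json.loads(explanation_json)
ranked = rank_features(explanation, "permutation_importance")

# Print the three most influential features.
for name, score in ranked[:3]:
    print(f"{name}: {score}")
# capital-gain: 0.0479
# education-num: 0.0352
# age: 0.0292
```

In this output, capital-gain has the largest permutation importance and native-country, at 0.0, has the smallest.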
To generate a model explanation, run the ML_EXPLAIN routine.
mysql> CALL sys.ML_EXPLAIN ('table_name', 'target_column_name', model_handle, [options]);
The following example generates a model explanation on the trained and loaded model with the shap model explainer.
mysql> CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model, JSON_OBJECT('model_explainer', 'shap'));
Where:
census_data.census_train is the fully qualified name of the table that contains the training dataset (schema_name.table_name).
revenue is the name of the target column, which contains ground truth values.
@census_model is the session variable for the trained model.
model_explainer is set to shap for the SHAP model explainer.
After running ML_EXPLAIN, you can view the model explanation in the model catalog. See The Model Catalog. The following example retrieves the model explanation generated by the previous command. The output lists an importance value for each column under the shap explainer, together with the values for the permutation_importance explainer.
mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation) |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
  "shap": {
    "age": 0.0467,
    "sex": 0.033,
    "race": 0.0155,
    "fnlwgt": 0.0185,
    "education": 0.016,
    "workclass": 0.0255,
    "occupation": 0.0001,
    "capital-gain": 0.0217,
    "capital-loss": 0.0001,
    "relationship": 0.0426,
    "education-num": 0.0186,
    "hours-per-week": 0.0148,
    "marital-status": 0.024,
    "native-country": 0.0
  },
  "permutation_importance": {
    "age": -0.0057,
    "sex": 0.0002,
    "race": 0.0001,
    "fnlwgt": 0.0103,
    "education": 0.0108,
    "workclass": 0.0189,
    "occupation": 0.0,
    "capital-gain": 0.0304,
    "capital-loss": 0.0,
    "relationship": 0.0195,
    "education-num": 0.0152,
    "hours-per-week": 0.0235,
    "marital-status": 0.0099,
    "native-country": 0.0
  }
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)
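Because the explanation now holds results from two explainers, one way to compare them is to check which features rank highly under both. The Python sketch below uses the values from the output above and runs outside the MySQL session. The helper name top_features and the choice of the top five features are illustrative assumptions, not part of MySQL HeatWave AutoML.

```python
# Values copied from the model_explanation output shown above.
explanation = {
    "shap": {
        "age": 0.0467, "sex": 0.033, "race": 0.0155, "fnlwgt": 0.0185,
        "education": 0.016, "workclass": 0.0255, "occupation": 0.0001,
        "capital-gain": 0.0217, "capital-loss": 0.0001, "relationship": 0.0426,
        "education-num": 0.0186, "hours-per-week": 0.0148,
        "marital-status": 0.024, "native-country": 0.0,
    },
    "permutation_importance": {
        "age": -0.0057, "sex": 0.0002, "race": 0.0001, "fnlwgt": 0.0103,
        "education": 0.0108, "workclass": 0.0189, "occupation": 0.0,
        "capital-gain": 0.0304, "capital-loss": 0.0, "relationship": 0.0195,
        "education-num": 0.0152, "hours-per-week": 0.0235,
        "marital-status": 0.0099, "native-country": 0.0,
    },
}


def top_features(scores, count):
    """Return the names of the `count` features with the largest values."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:count]


top_shap = top_features(explanation["shap"], 5)
top_permutation = top_features(explanation["permutation_importance"], 5)

# Features that appear in the top five of both explainers.
shared = sorted(set(top_shap) & set(top_permutation))
print("Top SHAP features:", top_shap)
print("Top permutation importance features:", top_permutation)
print("Shared:", shared)
```

With these values, relationship and workclass appear in the top five for both explainers. The two explainers measure importance differently, so their rankings are not expected to match exactly.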
Review ML_EXPLAIN for parameter descriptions and options.
Learn how to Generate Prediction Explanations.
Learn more about The Model Catalog.