6.5.5 Generate Model Explanations

Before You Begin

Review the following:

Explanations Overview

Explanations help you understand which features have the most influence on a prediction. Feature importance is presented as an attribution value. A positive value indicates that a feature contributed toward the prediction. A negative value can have different interpretations depending on the specific model explainer used for the model. For example, a negative value for the permutation importance explainer means that the feature is not important.

Model Explainers

Model explainers are trained when you run the ML_EXPLAIN routine, and they explain what the model learned from the training dataset. A model explainer provides a list of feature importances that shows which features the model considered important, based on the entire training dataset. The ML_EXPLAIN routine can train these model explainers:

The Permutation Importance model explainer, specified as permutation_importance, is the default model explainer. ML_TRAIN generates this model explainer when it runs.
The Partial Dependence model explainer, specified as partial_dependence, shows how changing the values of one or more columns changes the value that the model predicts. When you train this model explainer, you need to specify some additional options. See ML_EXPLAIN to learn more.
The SHAP model explainer, specified as shap, produces feature importance values based on Shapley values.
The Fast SHAP model explainer, specified as fast_shap, is a subsampling version of the SHAP model explainer that usually runs faster.

The model explanation is stored in the model catalog along with the machine learning model in the model_explanation column. See The Model Catalog. If you run ML_EXPLAIN again for the same model handle and model explainer, the field is overwritten with the new result.
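
As a rough sketch of how to check which explainers currently have a stored result, the top-level keys of the model_explanation field can be listed with the standard MySQL JSON_KEYS function. The schema name ML_SCHEMA_user1 and the @census_model session variable are assumptions borrowed from the examples later in this section; substitute your own user name and model handle.

```sql
-- List the explainers that have a stored explanation for one model.
-- ML_SCHEMA_user1 and @census_model are assumed placeholder names.
SELECT JSON_KEYS(model_explanation) AS stored_explainers
FROM ML_SCHEMA_user1.MODEL_CATALOG
WHERE model_handle = @census_model;
```

For a model that has only the default explainer, this would be expected to return a single-element array that contains permutation_importance.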

Unsupported Model Types

You cannot generate model explanations for the following model types:

Forecasting
Recommendation
Anomaly detection
Anomaly detection for logs
Topic modeling

Prepare to Generate a Model Explanation

Before running ML_EXPLAIN, you must train and then load the model you want to use.

The following example trains a model on the census_train dataset with the classification machine learning task.

mysql> CALL sys.ML_TRAIN('census_data.census_train', 'revenue', JSON_OBJECT('task', 'classification'), @census_model);

The following example loads the trained model.

mysql> CALL sys.ML_MODEL_LOAD(@census_model, NULL);

For more information about training and loading models, see Train a Model and Load a Model.

After training and loading the model, you can generate model explanations. For option and parameter descriptions, see ML_EXPLAIN.

Retrieve the Default Permutation Importance Explanation

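After training and loading a model, you can retrieve the default model explanation, which uses the permutation_importance explainer, from the model catalog. See The Model Catalog. In the following syntax, column is the name of the catalog column to retrieve, and the model_handle to the right of the equals sign is the handle of your model.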

mysql> SELECT column FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=model_handle;

The following example retrieves the model_explanation column from the model catalog for the previously trained model. The JSON_PRETTY function displays the output in an easily readable format.

mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation)                                                                                                  |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
  "permutation_importance": {
    "age": 0.0292,
    "sex": 0.0023,
    "race": 0.0019,
    "fnlwgt": 0.0038,
    "education": 0.0008,
    "workclass": 0.0068,
    "occupation": 0.0223,
    "capital-gain": 0.0479,
    "capital-loss": 0.0117,
    "relationship": 0.0234,
    "education-num": 0.0352,
    "hours-per-week": 0.0148,
    "marital-status": 0.024,
    "native-country": 0.0
  }
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)

Replace user1 and @census_model with your own user name and session variable.

The explanation displays values of permutation importance for each column.
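
To rank features rather than read the raw JSON, one possible approach is to expand the explanation into rows with the standard MySQL JSON_KEYS, JSON_TABLE, and JSON_EXTRACT functions. This is only a sketch: it assumes the model_explanation layout shown in the output above, and the aliases m, k, feature, and importance are names introduced for illustration.

```sql
-- Turn the permutation_importance object into one row per feature,
-- ordered from most to least important.
-- ML_SCHEMA_user1 and @census_model are placeholders, as in the example above.
SELECT k.feature,
       CAST(
         JSON_EXTRACT(
           m.model_explanation,
           -- Quote the key so that names with hyphens, such as
           -- capital-gain, form a valid JSON path.
           CONCAT('$.permutation_importance."', k.feature, '"')
         ) AS DECIMAL(10,4)
       ) AS importance
FROM ML_SCHEMA_user1.MODEL_CATALOG AS m,
     JSON_TABLE(
       JSON_KEYS(m.model_explanation, '$.permutation_importance'),
       '$[*]' COLUMNS (feature VARCHAR(64) PATH '$')
     ) AS k
WHERE m.model_handle = @census_model
ORDER BY importance DESC;
```

Sorting in SQL keeps the ranking next to the catalog data. Features with values at or below zero sort to the bottom, which matches the permutation importance interpretation that such features are not important.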

Generate a Model Explanation

To generate a model explanation, run the ML_EXPLAIN routine.

mysql> CALL sys.ML_EXPLAIN ('table_name', 'target_column_name', model_handle, [options]);

The following example generates a model explanation on the trained and loaded model with the shap model explainer.

mysql> CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model, JSON_OBJECT('model_explainer', 'shap'));

Where:

census_data.census_train is the fully qualified name of the table that contains the training dataset (schema_name.table_name).
revenue is the name of the target column, which contains ground truth values.
@census_model is the session variable that holds the model handle of the trained model.
model_explainer is set to shap for the SHAP model explainer.

After running ML_EXPLAIN, you can view the model explanation in the model catalog. See The Model Catalog. The following example retrieves the model explanation produced by the previous command. The output contains an importance value for each column from the shap explainer, alongside the existing permutation_importance values.

mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation)                                                                                                  |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
  "shap": {
    "age": 0.0467,
    "sex": 0.033,
    "race": 0.0155,
    "fnlwgt": 0.0185,
    "education": 0.016,
    "workclass": 0.0255,
    "occupation": 0.0001,
    "capital-gain": 0.0217,
    "capital-loss": 0.0001,
    "relationship": 0.0426,
    "education-num": 0.0186,
    "hours-per-week": 0.0148,
    "marital-status": 0.024,
    "native-country": 0.0
  },
  "permutation_importance": {
    "age": -0.0057,
    "sex": 0.0002,
    "race": 0.0001,
    "fnlwgt": 0.0103,
    "education": 0.0108,
    "workclass": 0.0189,
    "occupation": 0.0,
    "capital-gain": 0.0304,
    "capital-loss": 0.0,
    "relationship": 0.0195,
    "education-num": 0.0152,
    "hours-per-week": 0.0235,
    "marital-status": 0.0099,
    "native-country": 0.0
  }
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)
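
The other explainers listed under Model Explainers are requested through the same model_explainer option. The following is a hedged sketch, not a verified run. The fast_shap and partial_dependence names come from this section, but the additional option names columns_to_explain and target_value, the column age, and the class value '>50K' are assumptions made for illustration. Confirm the required options in ML_EXPLAIN before use.

```sql
-- Fast SHAP: same call shape as the shap example, different explainer name.
CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model,
                    JSON_OBJECT('model_explainer', 'fast_shap'));

-- Partial dependence requires additional options.
-- Assumed here: columns_to_explain names the columns to vary, and
-- target_value names the class of interest in the revenue column.
CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model,
                    JSON_OBJECT('model_explainer', 'partial_dependence',
                                'columns_to_explain', JSON_ARRAY('age'),
                                'target_value', '>50K'));
```

Running ML_EXPLAIN again for an explainer that already has a result overwrites that explainer's stored result, as described under Model Explainers.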

What's Next

Review ML_EXPLAIN for parameter descriptions and options.
Learn how to Generate Prediction Explanations.
Learn more about The Model Catalog.