MySQL AI User Guide
ML_EXPLAIN_TABLE
explains predictions for an entire table of unlabeled data.
Explanations are performed in parallel.
ML_EXPLAIN_TABLE
is a very memory-intensive process. We recommend limiting
the input table to a maximum of 100 rows. If the input table
has more than ten columns, limit it to ten rows.
Review the following:
You cannot generate prediction explanations on a table for the following model types:
Forecasting
Recommendation
Anomaly detection
Anomaly detection for logs
Topic modeling
You can specify the output table and the input table as the same table if all the following conditions are met:
The input table does not have the columns that are created for the output table when generating predictions. Output columns are specific to each machine learning task. Some of these columns include:
Prediction
ml_results
[input_column_name]_attribution
The input table does not have a primary key, and it does
not have a column named
_4aad19ca6e_pk_id
. This is because
ML_EXPLAIN_TABLE
adds a column as the primary key with the name
_4aad19ca6e_pk_id
to the output
table.
If you specify the output table and the input table as the same name, the predictions are inserted into the input table.
Before running
ML_EXPLAIN_TABLE
,
you must train, and then load the model you want to use.
The following example trains a dataset with the classification machine learning task.
mysql> CALL sys.ML_TRAIN('census_data.census_train', 'revenue', JSON_OBJECT('task', 'classification'), @census_model);
The following example loads the trained model.
mysql> CALL sys.ML_MODEL_LOAD(@census_model, NULL);
For more information about training and loading models, see Train a Model and Load a Model.
After training and loading the model, you can generate prediction explanations for a table. For parameter and option descriptions, see ML_EXPLAIN_TABLE.
After training and loading a model, you can run
ML_EXPLAIN_TABLE
to generate a table of prediction explanations with the
default Permutation Importance explainer>. However, if you
train the shap
prediction explainer with
ML_EXPLAIN
,
you need to run
ML_EXPLAIN
again with the permutation_importance
explainer before running
ML_EXPLAIN_TABLE
with the same explainer.
Run the
ML_EXPLAIN_TABLE
routine.
mysql> CALL sys.ML_EXPLAIN_TABLE(table_name
, model_handle
, output_table_name
, [options
]);
The following example runs
ML_EXPLAIN_TABLE
with the permutation_importance
explainer.
mysql> CALL sys.ML_EXPLAIN_TABLE('census_data.census_train', @census_model, 'census_data.census_train_permutation', JSON_OBJECT('prediction_explainer', 'permutation_importance'));
Where:
census_data.census_train
is the
fully qualified name of the table that contains the
training dataset
(schema_name.table_name
).
@census_model
is the session
variable for the trained model.
census_data.census_train_permutation
is the fully qualified name of the output table that
contains the explanations
(schema_name.table_name
).
prediction_explainer
is set to
permutation_importance
for the
Permutation Importance prediction explainer.
Query the output table to review a sample of the results.
mysql> SELECT * FROM table_name
LIMIT N
;
The following example queries the top three rows of the output table.
mysql> SELECT * FROM census_train_permutation LIMIT 3;
+-------------------+-----+-----------+--------+------------+---------------+--------------------+-----------------+--------------+-------+--------+--------------+--------------+----------------+----------------+---------+------------+-----------------+---------------------------+----------------------------+-----------------------+----------------------------+--------------------------+------------------+-----------------+-----------------------+--------------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| _4aad19ca6e_pk_id | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | revenue | Prediction | age_attribution | education-num_attribution | marital-status_attribution | education_attribution | hours-per-week_attribution | relationship_attribution | race_attribution | sex_attribution | workclass_attribution | fnlwgt_attribution | capital-gain_attribution | Notes | ml_results |
+-------------------+-----+-----------+--------+------------+---------------+--------------------+-----------------+--------------+-------+--------+--------------+--------------+----------------+----------------+---------+------------+-----------------+---------------------------+----------------------------+-----------------------+----------------------------+--------------------------+------------------+-----------------+-----------------------+--------------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | 37 | Private | 99146 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 1977 | 50 | United-States | >50K | <=50K | -0.1 | -0.08 | -0.05 | -0.05 | -0.03 | -0.03 | 0.02 | -0.02 | 0.01 | 0 | 0 | race (White) had the largest impact towards predicting <=50K, whereas age (37) contributed the most against predicting <=50K | {"attributions": {"age": -0.1, "education-num": -0.08, "marital-status": -0.05, "education": -0.05, "hours-per-week": -0.03, "relationship": -0.03, "race": 0.02, "sex": -0.02, "workclass": 0.01, "fnlwgt": 0.0, "capital-gain": 0.0}, "predictions": {"revenue": "<=50K"}, "notes": "race (White) had the largest impact towards predicting <=50K, whereas age (37) contributed the most against predicting <=50K"} |
| 2 | 34 | Private | 27409 | 9th | 5 | Married-civ-spouse | Craft-repair | Husband | White | Male | 0 | 0 | 50 | United-States | <=50K | <=50K | 0 | 0 | -0.04 | 0.06 | -0.03 | 0.02 | 0.02 | -0.02 | 0.01 | 0 | 0 | education (9th) had the largest impact towards predicting <=50K, whereas marital-status (Married-civ-spouse) contributed the most against predicting <=50K | {"attributions": {"age": 0.0, "education-num": 0.0, "marital-status": -0.04, "education": 0.06, "hours-per-week": -0.03, "relationship": 0.02, "race": 0.02, "sex": -0.02, "workclass": 0.01, "fnlwgt": 0.0, "capital-gain": 0.0}, "predictions": {"revenue": "<=50K"}, "notes": "education (9th) had the largest impact towards predicting <=50K, whereas marital-status (Married-civ-spouse) contributed the most against predicting <=50K"} |
| 3 | 30 | Private | 299507 | Assoc-acdm | 12 | Separated | Other-service | Unmarried | White | Female | 0 | 0 | 40 | United-States | <=50K | <=50K | 0 | 0 | 0 | 0 | 0 | 0.03 | 0.01 | 0.02 | 0 | 0 | 0 | relationship (Unmarried) had the largest impact towards predicting <=50K | {"attributions": {"age": 0.0, "education-num": 0.0, "marital-status": 0.0, "education": 0.0, "hours-per-week": 0.0, "relationship": 0.03, "race": 0.01, "sex": 0.02, "workclass": 0.0, "fnlwgt": -0.0, "capital-gain": 0.0}, "predictions": {"revenue": "<=50K"}, "notes": "relationship (Unmarried) had the largest impact towards predicting <=50K"} |
+-------------------+-----+-----------+--------+------------+---------------+--------------------+-----------------+--------------+-------+--------+--------------+--------------+----------------+----------------+---------+------------+-----------------+---------------------------+----------------------------+-----------------------+----------------------------+--------------------------+------------------+-----------------+-----------------------+--------------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
The results display information on the columns that had the largest impact towards the predictions and the columns that contributed the most against the prediction.
A warning displays if the model is of low quality.
To generate a table of prediction explanations with the SHAP
explainer, you must first run the SHAP explainer with
ML_EXPLAIN
.
Run the
ML_EXPLAIN
routine.
mysql> CALL sys.ML_EXPLAIN ('table_name
', 'target_column_name
', model_handle
, [options]);
The following example run the shap
prediction explainer.
mysql> CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model, JSON_OBJECT('prediction_explainer', 'shap'));
Where:
census_data.census_train
is the
fully qualified name of the table that contains the
training dataset
(schema_name.table_name
).
revenue
is the name of the target
column, which contains ground truth values.
@census_model
is the session
variable for the trained model.
prediction_explainer
is set to
shap
for the SHAP prediction
explainer.
Run the
ML_EXPLAIN_TABLE
routine.
mysql> CALL sys.ML_EXPLAIN_TABLE(table_name
, model_handle
, output_table_name
, [options
]);
The following example runs the shap
prediction explainer.
mysql> CALL sys.ML_EXPLAIN_TABLE('census_data.census_train', @census_model, 'census_data.census_train_explanations', JSON_OBJECT('prediction_explainer', 'shap'));
Where:
census_data.census_train
is the
fully qualified name of the table that contains the
training dataset
(schema_name.table_name
).
@census_model
is the session
variable for the trained model.
census_data.census_train_explanations
is the fully qualified name of the output table that
contains the explanations
(schema_name.table_name
).
prediction_explainer
is set to
shap
for the SHAP prediction
explainer.
Query the output table to review a sample of the results.
mysql> SELECT * FROM table_name
LIMIT N
;
The following example queries the top three rows of the output table.
mysql> SELECT * FROM census_train_explanations LIMIT 3;
+-------------------+-----+-----------+--------+------------+---------------+--------------------+-----------------+--------------+-------+--------+--------------+--------------+----------------+----------------+---------+------------+-----------------+--------------------------+-----------------------+---------------------------+--------------------+----------------------------+----------------------------+------------------+--------------------------+-----------------+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| _4aad19ca6e_pk_id | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | revenue | Prediction | age_attribution | capital-gain_attribution | education_attribution | education-num_attribution | fnlwgt_attribution | hours-per-week_attribution | marital-status_attribution | race_attribution | relationship_attribution | sex_attribution | workclass_attribution | ml_results |
+-------------------+-----+-----------+--------+------------+---------------+--------------------+-----------------+--------------+-------+--------+--------------+--------------+----------------+----------------+---------+------------+-----------------+--------------------------+-----------------------+---------------------------+--------------------+----------------------------+----------------------------+------------------+--------------------------+-----------------+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | 37 | Private | 99146 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 1977 | 50 | United-States | >50K | <=50K | -0.0302428 | 0.0105781 | -0.0250973 | -0.0307275 | -0.0394281 | -0.0115214 | -0.0417846 | 0.00787589 | -0.0510622 | -0.0265045 | 0.0125372 | {"predictions": {"revenue": "<=50K"}, "attributions": {"age_attribution": -0.03024279141207572, "capital-gain_attribution": 0.01057805203258716, "education_attribution": -0.025097336669631397, "education-num_attribution": -0.03072748073245629, "fnlwgt_attribution": -0.03942808683149933, "hours-per-week_attribution": -0.01152140445039615, "marital-status_attribution": -0.041784639464290424, "race_attribution": 0.00787588729783209, "relationship_attribution": -0.05106219218656989, "sex_attribution": -0.026504470317501083, "workclass_attribution": 0.012537165912050216}} |
| 2 | 34 | Private | 27409 | 9th | 5 | Married-civ-spouse | Craft-repair | Husband | White | Male | 0 | 0 | 50 | United-States | <=50K | <=50K | 0.0208705 | 0.0108355 | 0.0445232 | 0.00589651 | -0.042798 | -0.0120815 | -0.0419652 | 0.00772465 | -0.0284059 | -0.0263092 | 0.0133885 | {"predictions": {"revenue": "<=50K"}, "attributions": {"age_attribution": 0.020870525610603933, "capital-gain_attribution": 0.010835454035912382, "education_attribution": 0.044523246311996556, "education-num_attribution": 0.005896512933976113, "fnlwgt_attribution": -0.04279802962032341, "hours-per-week_attribution": -0.012081540592359136, "marital-status_attribution": -0.04196518165858853, "race_attribution": 0.007724645581626567, "relationship_attribution": -0.028405864878017332, "sex_attribution": -0.026309175183378983, "workclass_attribution": 0.013388466297739862}} |
| 3 | 30 | Private | 299507 | Assoc-acdm | 12 | Separated | Other-service | Unmarried | White | Female | 0 | 0 | 40 | United-States | <=50K | <=50K | 0.0415136 | 0.0104095 | 0.0205295 | 0.00435879 | -0.00299281 | 0.0095282 | 0.0237676 | 0.00823583 | 0.0288422 | 0.0345606 | 0.00340862 | {"predictions": {"revenue": "<=50K"}, "attributions": {"age_attribution": 0.04151362477220458, "capital-gain_attribution": 0.01040951100227322, "education_attribution": 0.020529478225449593, "education-num_attribution": 0.004358785226045173, "fnlwgt_attribution": -0.0029928127703645266, "hours-per-week_attribution": 0.009528199786418556, "marital-status_attribution": 0.023767622241075377, "race_attribution": 0.008235833643263954, "relationship_attribution": 0.028842206656526016, "sex_attribution": 0.03456064630884248, "workclass_attribution": 0.0034086183477553336}} |
+-------------------+-----+-----------+--------+------------+---------------+--------------------+-----------------+--------------+-------+--------+--------------+--------------+----------------+----------------+---------+------------+-----------------+--------------------------+-----------------------+---------------------------+--------------------+----------------------------+----------------------------+------------------+--------------------------+-----------------+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
The results display feature importance values for each column.
A warning displays if the model is of low quality.
Review ML_EXPLAIN_TABLE for parameter descriptions and options.
Learn how to Score a Model to get insight into the quality of the model.