MySQL AI User Guide
The model_metadata column in the model catalog allows you to view detailed information on trained models. For example, you can view the algorithm used to train the model, the columns in the training table, and values for the model explanation.
When you run the ML_MODEL_IMPORT routine, the imported table has a model_metadata column that stores the metadata for the table. If you import a model from a table, model_metadata stores the name of the database and table. If you import a model object, model_metadata stores a JSON_OBJECT that contains key-value pairs of the metadata. See Section 7.1.4, “ML_MODEL_IMPORT” to learn more.
The default value for model_metadata is NULL.
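As a minimal sketch of what such a key-value object can look like, the following statements build a small metadata document with the standard MySQL JSON_OBJECT() function and print it. The @import_meta variable name is hypothetical, and the two keys and their values are illustrative picks from the keys documented in this topic. Which keys ML_MODEL_IMPORT actually expects is not covered here; see Section 7.1.4.

```sql
-- Hypothetical metadata for a model object. The keys come from this topic;
-- the values are illustrative only.
SET @import_meta = JSON_OBJECT(
  'task', 'classification',
  'format', 'ONNXv1.0'
);

-- Inspect the document to confirm it holds the intended key-value pairs.
SELECT JSON_PRETTY(@import_meta) AS import_metadata;
```

Note that the MySQL JSON_OBJECT() function takes comma-separated key and value arguments, as shown here.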
model_metadata contains the following metadata as key-value pairs in JSON format:
task: string
The task type specified in the ML_TRAIN query. The default is classification when used with ML_MODEL_IMPORT.
build_timestamp: number
A timestamp indicating when the model was created (UNIX epoch time). A model is created when the ML_TRAIN routine finishes executing.
target_column_name: string
The name of the column in the training table that was specified as the target column.
train_table_name: string
The name of the input table specified in the ML_TRAIN query.
column_names: JSON array
The feature columns used to train the model.
model_explanation: JSON object literal
The model explanation generated during training. See Generate Model Explanations.
notes: string
The notes specified in the ML_TRAIN query. This key also records any error messages that occur during model training.
format: string
The model can be in one of the following formats:
    HWMLv1.0
    HWMLv2.0
    ONNXv1.0
    ONNXv2.0
status: string
The status of the model. The default is Ready when used with ML_MODEL_IMPORT. The possible values are:
    Creating: The model is being created.
    Ready: The model is trained and active.
    Error: Either training was canceled or an error occurred during training. Any error message appears in the notes column. The error message also appears in the notes key of model_metadata.
model_quality: string
The quality of the model object for classification and regression tasks. The value is either low or high. For other tasks, this value is NULL.
training_time: number
The time in seconds taken to train the model.
algorithm_name: string
The name of the chosen algorithm.
training_score: number
The cross-validation score achieved for the model by training.
n_rows: number
The number of rows in the training table.
n_columns: number
The number of columns in the training table.
n_selected_rows: number
The number of rows selected by adaptive sampling.
n_selected_columns: number
The number of columns selected by feature selection.
optimization_metric: string
The optimization metric used for training. See Section 7.1.14, “Optimization and Scoring Metrics” to review available metrics.
selected_column_names: JSON array
The names of the columns selected by feature selection.
contamination: number
The contamination factor for the anomaly detection task. See Anomaly Detection Options to learn more.
options: JSON object literal
The options specified in the ML_TRAIN query.
training_params: JSON object literal
Internal task-dependent parameters used during ML_TRAIN.
onnx_inputs_info: JSON object literal
Information about the format of the ONNX model inputs. This only applies to ONNX models. See Manage External ONNX Models.
Do not provide onnx_inputs_info if the model is not in ONNX format. Doing so generates an error.
    data_types_map: JSON object literal
    This maps the data type of each column to an ONNX model data type. The default value is:
    JSON_OBJECT("tensor(int64)": "int64", "tensor(float)": "float32", "tensor(string)": "str_")
onnx_outputs_info: JSON object literal
Information about the format of the ONNX model outputs. This only applies to ONNX models. See Manage External ONNX Models.
Do not provide onnx_outputs_info if the model is not in ONNX format, or if task is NULL. Doing so generates an error.
    predictions_name: string
    This name determines which of the ONNX model outputs is associated with predictions.
    prediction_probabilities_name: string
    This name determines which of the ONNX model outputs is associated with prediction probabilities.
    labels_map: JSON object literal
    This maps prediction probabilities to predictions, known as labels.
training_drift_metric: JSON object literal
Contains data drift information about the training data. See Analyze Data Drift. This only applies to classification and regression models.
    mean: number
    The mean value of the drift metrics across all the training data. The value is ≥ 0.
    variance: number
    The variance of the drift metrics across all the training data. The value is ≥ 0.
Both mean and variance should be low.
chunks: number
The total number of chunks that the model has been split into.
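The keys described above can be read individually with the standard MySQL JSON functions. The following is a minimal sketch that runs against a literal JSON document rather than the model catalog. The @meta variable is hypothetical and holds only a subset of the keys, with illustrative values.

```sql
-- Hypothetical sample: a subset of model_metadata keys with illustrative values.
SET @meta = '{
  "task": "regression",
  "status": "Ready",
  "algorithm_name": "RandomForestRegressor",
  "build_timestamp": 1730395944,
  "training_drift_metric": {"mean": 0.3326, "variance": 3.2482}
}';

SELECT
  -- String keys: JSON_UNQUOTE() strips the surrounding double quotes.
  JSON_UNQUOTE(JSON_EXTRACT(@meta, '$.task'))           AS task,
  JSON_UNQUOTE(JSON_EXTRACT(@meta, '$.algorithm_name')) AS algorithm_name,
  -- build_timestamp is UNIX epoch time; convert it to a DATETIME value
  -- in the session time zone.
  FROM_UNIXTIME(
    CAST(JSON_EXTRACT(@meta, '$.build_timestamp') AS UNSIGNED)
  )                                                     AS built_at,
  -- Nested keys are addressed with a dotted path.
  JSON_EXTRACT(@meta, '$.training_drift_metric.mean')     AS drift_mean,
  JSON_EXTRACT(@meta, '$.training_drift_metric.variance') AS drift_variance;
```

When the source is the model_metadata column itself, the ->> operator is a shorter equivalent of JSON_UNQUOTE(JSON_EXTRACT(...)). It works only on column references, not on user variables, which is why the sketch spells out the function calls.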
You can query the model metadata in the model catalog with the following command. Replace user1 with your own user name.
mysql> SELECT JSON_PRETTY(model_metadata) FROM ML_SCHEMA_user1.MODEL_CATALOG\G
*************************** 1. row ***************************
JSON_PRETTY(model_metadata): {
  "task": "regression",
  "notes": null,
  "chunks": 1,
  "format": "HWMLv2.0",
  "n_rows": 407284,
  "status": "Ready",
  "options": {
    "task": "regression",
    "model_explainer": "permutation_importance",
    "prediction_explainer": "permutation_importance"
  },
  "n_columns": 14,
  "column_names": [
    "VendorID",
    "store_and_fwd_flag",
    "RatecodeID",
    "PULocationID",
    "DOLocationID",
    "passenger_count",
    "extra",
    "mta_tax",
    "tolls_amount",
    "improvement_surcharge",
    "trip_type",
    "lpep_pickup_datetime_day",
    "lpep_pickup_datetime_hour",
    "lpep_pickup_datetime_minute"
  ],
  "contamination": null,
  "model_quality": "high",
  "training_time": 515.13427734375,
  "algorithm_name": "RandomForestRegressor",
  "training_score": -5.610334873199463,
  "build_timestamp": 1730395944,
  "n_selected_rows": 130931,
  "training_params": {
    "recommend": "ratings",
    "force_use_X": false,
    "recommend_k": 3,
    "remove_seen": true,
    "ranking_topk": 10,
    "lsa_components": 100,
    "ranking_threshold": 1,
    "feedback_threshold": 1
  },
  "train_table_name": "heatwaveml_bench.nyc_taxi_train",
  "model_explanation": {
    "permutation_importance": {
      "extra": 0.0,
      "mta_tax": 0.0019,
      "VendorID": 0.0048,
      "trip_type": 0.0003,
      "RatecodeID": 0.0152,
      "DOLocationID": 0.4178,
      "PULocationID": 0.2714,
      "tolls_amount": 0.0851,
      "passenger_count": 0.0,
      "store_and_fwd_flag": 0.0,
      "improvement_surcharge": 0.0015,
      "lpep_pickup_datetime_day": 0.0,
      "lpep_pickup_datetime_hour": 0.0161,
      "lpep_pickup_datetime_minute": 0.0
    }
  },
  "n_selected_columns": 9,
  "target_column_name": "tip_amount",
  "optimization_metric": "neg_mean_squared_error",
  "selected_column_names": [
    "DOLocationID",
    "PULocationID",
    "RatecodeID",
    "VendorID",
    "improvement_surcharge",
    "lpep_pickup_datetime_hour",
    "mta_tax",
    "tolls_amount",
    "trip_type"
  ],
  "training_drift_metric": {
    "mean": 0.3326,
    "variance": 3.2482
  }
}
*************************** 2. row ***************************
JSON_PRETTY(model_metadata): {
  "task": "regression",
  "notes": null,
  "chunks": 0,
  "format": "HWMLv2.0",
  "n_rows": null,
  "status": "Error",
  "options": {},
  "n_columns": null,
  "column_names": null,
  "contamination": null,
  "model_quality": null,
  "training_time": null,
  "algorithm_name": null,
  "training_score": null,
  "build_timestamp": 1730403865,
  "n_selected_rows": null,
  "training_params": null,
  "train_table_name": "nyc_taxi.nyc_taxi_train",
  "model_explanation": {},
  "n_selected_columns": null,
  "target_column_name": "tip_amount",
  "optimization_metric": null,
  "selected_column_names": null,
  "training_drift_metric": {
    "mean": null,
    "variance": null
  }
}
*************************** 3. row ***************************
JSON_PRETTY(model_metadata): {
  "task": "regression",
  "notes": null,
  "chunks": 0,
  "format": "HWMLv2.0",
  "n_rows": null,
  "status": "Creating",
  "options": {},
  "n_columns": null,
  "column_names": null,
  "contamination": null,
  "model_quality": null,
  "training_time": null,
  "algorithm_name": null,
  "training_score": null,
  "build_timestamp": 1730404027,
  "n_selected_rows": null,
  "training_params": null,
  "train_table_name": "nyc_taxi.nyc_taxi_train",
  "model_explanation": {},
  "n_selected_columns": null,
  "target_column_name": "tip_amount",
  "optimization_metric": null,
  "selected_column_names": null,
  "training_drift_metric": {
    "mean": null,
    "variance": null
  }
}
3 rows in set (0.0859 sec)
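To work with individual keys instead of the full document, the ->> operator and JSON_TABLE() can be applied to the model_metadata column. The sketch below is an illustration under stated assumptions: demo_catalog is a hypothetical temporary table that stands in for ML_SCHEMA_user1.MODEL_CATALOG so that the statements run on their own, model_id is an assumed key column, and the rows are abbreviated versions of the output above.

```sql
-- Hypothetical stand-in for the model catalog, reduced to two columns.
CREATE TEMPORARY TABLE demo_catalog (
  model_id       INT PRIMARY KEY,
  model_metadata JSON
);

-- Abbreviated metadata for a Ready, an Error, and a Creating model.
INSERT INTO demo_catalog VALUES
  (1, '{"status": "Ready", "algorithm_name": "RandomForestRegressor",
        "training_score": -5.610334873199463, "build_timestamp": 1730395944,
        "model_explanation": {"permutation_importance":
          {"DOLocationID": 0.4178, "PULocationID": 0.2714, "tolls_amount": 0.0851}}}'),
  (2, '{"status": "Error", "algorithm_name": null,
        "training_score": null, "build_timestamp": 1730403865,
        "model_explanation": {}}'),
  (3, '{"status": "Creating", "algorithm_name": null,
        "training_score": null, "build_timestamp": 1730404027,
        "model_explanation": {}}');

-- Count models by status.
SELECT model_metadata->>'$.status' AS status, COUNT(*) AS models
FROM demo_catalog
GROUP BY status;

-- One summary row per model that finished training successfully.
SELECT model_id,
       model_metadata->>'$.algorithm_name' AS algorithm_name,
       model_metadata->>'$.training_score' AS training_score,
       FROM_UNIXTIME(
         CAST(model_metadata->>'$.build_timestamp' AS UNSIGNED)
       ) AS built_at
FROM demo_catalog
WHERE model_metadata->>'$.status' = 'Ready';

-- Rank the features of model 1 by permutation importance.
-- JSON_KEYS() lists the feature names; JSON_TABLE() turns that array into rows.
SELECT f.feature,
       CAST(JSON_EXTRACT(
              c.model_metadata,
              CONCAT('$.model_explanation.permutation_importance."', f.feature, '"')
            ) AS DECIMAL(10,4)) AS importance
FROM demo_catalog AS c,
     JSON_TABLE(
       JSON_KEYS(c.model_metadata, '$.model_explanation.permutation_importance'),
       '$[*]' COLUMNS (feature VARCHAR(64) PATH '$')
     ) AS f
WHERE c.model_id = 1
ORDER BY importance DESC;
```

To run the same queries against real data, replace demo_catalog with the model catalog and model_id with a column that identifies the model there. Filtering on status first keeps models in the Error or Creating state, whose remaining keys are mostly null, out of the results.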