MySQL AI User Guide
Run the ML_TRAIN routine on a training dataset to produce a trained machine learning model.
Review how to Prepare Data.
Review Additional AutoML Requirements.
ML_TRAIN supports training of the following models:
Classification: Assign items to defined categories.
Regression: Generate a prediction based on the relationship between a dependent variable and one or more independent variables.
Forecasting: Use a time-series dataset to generate forecasts.
Anomaly Detection: Detect unusual patterns in data.
Recommendation: Generate user and product recommendations.
Topic Modeling: Generate words and similar expressions that best characterize a set of documents.
The training dataset used with ML_TRAIN must reside in a table on the MySQL server.
ML_TRAIN stores machine learning models in the MODEL_CATALOG table. See The Model Catalog to learn more.
Training a model can take from a few minutes to a few hours, depending on the following:
The number of rows and columns in the dataset. AutoML supports tables up to 10 GB in size, with a maximum of 100 million rows or 1017 columns.
The specified ML_TRAIN parameters.
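As a rough pre-check against the size limits listed above, the following sketch counts the rows and columns of a candidate training table and estimates its data size. It uses the census_data.census_train table from the example later in this topic; substitute your own schema and table names. It is an assumption that the DATA_LENGTH value reported by information_schema matches how AutoML measures the 10 GB limit, so treat the size figure as an approximation only.

```sql
-- Exact row count (can be slow on very large tables).
SELECT COUNT(*) AS row_count
FROM census_data.census_train;

-- Number of columns defined on the table.
SELECT COUNT(*) AS column_count
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'census_data'
  AND TABLE_NAME = 'census_train';

-- Approximate data size in GB as reported by the storage engine.
-- Assumption: this only approximates the size AutoML checks against.
SELECT ROUND(DATA_LENGTH / POWER(1024, 3), 2) AS data_gb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'census_data'
  AND TABLE_NAME = 'census_train';
```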
            
To learn more about ML_TRAIN requirements and options, see ML_TRAIN or Machine Learning Use Cases.
The quality and reliability of a trained model can be assessed using the ML_SCORE routine. For more information, see Score a Model. ML_TRAIN displays the following message if a trained model has a low score: Model Has a low training score, expect low quality model explanations.
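A minimal sketch of scoring a trained model might look like the following. It assumes the model handle is held in the @census_model session variable used in the training example in this topic. The census_data.census_validate table and the @score variable are hypothetical names introduced here for illustration, and balanced_accuracy is one possible metric for a classification model. Check the ML_SCORE and ML_MODEL_LOAD references for the exact argument lists in your release.

```sql
-- Load the trained model before scoring it.
-- The second argument (user) is passed as NULL here.
CALL sys.ML_MODEL_LOAD(@census_model, NULL);

-- Score the model against a labeled validation table (hypothetical name).
-- The result is written to the @score user variable.
CALL sys.ML_SCORE('census_data.census_validate', 'revenue', @census_model, 'balanced_accuracy', @score, NULL);

-- Retrieve the computed score.
SELECT @score;
```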
        
Before training a model, it is good practice to define your own model handle instead of letting ML_TRAIN generate one automatically. A handle you define is easier to remember for later routines on the trained model. Otherwise, you must query for the handle or rely on the session variable, which is no longer available after the current connection terminates. See Defining Model Handle to learn more.
To train a machine learning model:
Optionally, set a session variable to the model handle that you want to use. ML_TRAIN uses this value as the model handle.
mysql> SET @variable = 'model_handle';
Replace @variable and model_handle with your own definitions. For example:
mysql> SET @census_model = 'census_test';
The model handle is set to census_test.
Run the ML_TRAIN routine.
mysql> CALL sys.ML_TRAIN('table_name', 'target_column_name', JSON_OBJECT('task', 'task_name'), @variable);
Replace table_name, target_column_name, task_name, and variable with your own values.
The following example runs ML_TRAIN on the census_data.census_train training dataset.
mysql> CALL sys.ML_TRAIN('census_data.census_train', 'revenue', JSON_OBJECT('task', 'classification'), @census_model);
Where:
census_data.census_train is the fully qualified name of the table that contains the training dataset (schema_name.table_name).
revenue is the name of the target column, which contains ground truth values.
JSON_OBJECT('task', 'classification') specifies the machine learning task type.
@census_model is the session variable set previously, which sets the model handle to the user-defined name census_test. If you do not define the model handle before training the model, the model handle is generated automatically, and the session variable stores it only for the duration of the connection. User variables are written as @var_name. Any valid name for a user-defined variable is permitted. See Work with Model Handles to learn more.
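The same call shape applies to the other task types. As an illustrative sketch only, a regression model could be trained as follows. The housing_data.house_train table, the price target column, and the house_price_test handle are hypothetical names that do not appear elsewhere in this guide. Some tasks, such as forecasting, need additional options that are described in the ML_TRAIN reference.

```sql
-- Hypothetical model handle for the regression example.
SET @house_model = 'house_price_test';

-- Train a regression model on a hypothetical table with a numeric target column.
CALL sys.ML_TRAIN('housing_data.house_train', 'price', JSON_OBJECT('task', 'regression'), @house_model);
```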
When training completes, query the model catalog for the model handle and the name of the training table to confirm that the model handle is set correctly. Replace user1 with your own user name.
mysql> SELECT model_handle, train_table_name FROM ML_SCHEMA_user1.MODEL_CATALOG;
+-----------------------------------------------------+---------------------------------+
| model_handle                                        | train_table_name                |
+-----------------------------------------------------+---------------------------------+
| census_test                                         | census_data.census_train        |
+-----------------------------------------------------+---------------------------------+
1 row in set (0.0450 sec)
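If the model catalog already holds several models, the check can be narrowed to the handle stored in the session variable. This sketch uses only the columns shown in the query above and assumes that @census_model is still set in the current connection.

```sql
-- Return only the catalog row for the model trained in this session.
-- Replace user1 with your own user name.
SELECT model_handle, train_table_name
FROM ML_SCHEMA_user1.MODEL_CATALOG
WHERE model_handle = @census_model;
```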
When done working with a trained model, it is good practice to unload it. See Unload a Model.
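As a brief sketch, unloading the model from the example above could look like this, assuming that the model is currently loaded and that @census_model still holds its handle. See Unload a Model for the full description of the routine.

```sql
-- Unload the trained model identified by the session variable.
CALL sys.ML_MODEL_UNLOAD(@census_model);
```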
For details on all training options and to view more examples for task-specific models, see ML_TRAIN.
Learn how to Load a Model.