MySQL AI User Guide
The table used to train a model cannot exceed 10 GB, 100 million rows, or 1017 columns.
Each dataset must reside in a single table on the MySQL server, because AutoML routines operate on a single table.
Table columns must use supported data types. See Supported Data Types for AutoML to learn more.
NaN (Not a Number) values are not recognized by MySQL and should be replaced by NULL.
Refer to the following requirements for specific machine learning models.
Classification models: The target column must have at least two distinct values, and each distinct value should appear in at least five rows.
Regression models: The target column must be numeric.
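The two target-column requirements above can be checked on the client before training. The following is a minimal Python sketch, not part of MySQL AI: the function names are hypothetical, the target column is assumed to have been fetched into a list already, and NULL values (None) are assumed not to count as a class.

```python
from collections import Counter
from decimal import Decimal


def check_classification_target(values, min_classes=2, min_rows_per_class=5):
    """Return a list of problems found in a classification target column.

    Assumption: NULLs (None) are skipped and do not count as a class.
    """
    counts = Counter(v for v in values if v is not None)
    problems = []
    if len(counts) < min_classes:
        problems.append(
            f"only {len(counts)} distinct value(s); need at least {min_classes}"
        )
    for label, n in counts.items():
        if n < min_rows_per_class:
            problems.append(
                f"value {label!r} appears in {n} row(s); "
                f"need at least {min_rows_per_class}"
            )
    return problems


def is_numeric_target(values):
    """True if every non-NULL value is an int, float, or Decimal (bool excluded)."""
    return all(
        isinstance(v, (int, float, Decimal)) and not isinstance(v, bool)
        for v in values
        if v is not None
    )
```

An empty list from check_classification_target means the column meets both classification conditions; is_numeric_target covers the regression condition.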
The ML_TRAIN routine ignores columns that are missing more than 20% of their values and columns that have the same value in every row. Missing values in numerical columns are replaced with the average value of the column, standardized to a mean of 0 and a standard deviation of 1. Missing values in categorical columns are replaced with the most frequent value, and either one-hot or ordinal encoding is used to convert categorical values to numeric values. The input data as it exists in the MySQL database is not modified by ML_TRAIN.
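To make the preprocessing rules concrete, the following Python sketch mimics them on in-memory lists. It illustrates the described behavior only and is not the ML_TRAIN implementation. The function names are hypothetical, None stands in for NULL, and two details are assumptions the text does not specify: the population standard deviation is used, and a column counts as constant when its non-NULL values are all equal. Categorical encoding is not shown.

```python
from collections import Counter
from statistics import mean, pstdev

MISSING_PERCENT = 20  # from the text: columns missing more than 20% are ignored


def is_ignored(column):
    """True if the column is missing more than 20% of its values or is constant."""
    if not column:
        return True
    present = [v for v in column if v is not None]
    missing = len(column) - len(present)
    # Integer arithmetic avoids floating-point surprises at exactly 20%.
    too_sparse = missing * 100 > MISSING_PERCENT * len(column)
    constant = len(set(present)) <= 1  # assumption: NULLs are not counted
    return too_sparse or constant


def impute_and_standardize(column):
    """Fill NULLs with the column mean, then scale to mean 0 and std 1."""
    present = [v for v in column if v is not None]
    mu = mean(present)
    filled = [mu if v is None else v for v in column]
    sigma = pstdev(filled)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("constant column; it would have been ignored")
    return [(v - mu) / sigma for v in filled]


def impute_most_frequent(column):
    """Fill NULLs in a categorical column with its most frequent value."""
    present = [v for v in column if v is not None]
    mode = Counter(present).most_common(1)[0][0]
    return [mode if v is None else v for v in column]
```

Because missing numerical values are filled with the mean before scaling, they end up as 0 in the standardized column. All three functions return new lists and leave the input untouched, mirroring the statement that ML_TRAIN does not modify the stored data.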
        
To use AutoML, ensure that the MySQL user name that trains a model does not contain a period character ("."). For example, a user named 'joesmith'@'%' is permitted to train a model, but a user named 'joe.smith'@'%' is not. The model catalog schema created by the ML_TRAIN procedure incorporates the user name in the schema name (for example, ML_SCHEMA_joesmith), and a period is not a permitted schema name character.
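The user name rule can be expressed as a small client-side check. This Python sketch is illustrative only: the function names are hypothetical, and the account string is assumed to be in the 'user'@'host' form. Only the user part is inspected, because host names and IP addresses routinely contain periods.

```python
def can_train_models(account):
    """True if the user part of a 'user'@'host' account has no period."""
    # Split on the last "@" so that only the host part is dropped.
    user = account.rsplit("@", 1)[0].strip("'`\"")
    return "." not in user


def model_catalog_schema(user_name):
    """Schema name pattern from the text's example, ML_SCHEMA_joesmith."""
    return f"ML_SCHEMA_{user_name}"
```

For example, can_train_models("'joe.smith'@'%'") returns False, while can_train_models("'joesmith'@'%'") returns True.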
      
Learn more about the following:
Learn how to Create a Machine Learning Model.
Review Machine Learning Use Cases to create machine learning models with sample datasets.