2.35 Changes in MySQL HeatWave 8.0.30 (2022-07-26, General Availability)

Advisor

MySQL HeatWave Advisor Auto Encoding, which recommends string column encodings, now provides encoding recommendations that optimize query performance. Recommendations are based on performance models that use query execution data. Previously, string column encoding recommendations were optimized for cluster memory usage only. A performance improvement estimate is provided with string column encoding recommendations. (Bug #34145862)
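As a rough illustration, encoding recommendations of this kind are requested through the Advisor stored procedure. The sketch below assumes the `sys.heatwave_advisor` routine and its `auto_enc` option. The schema name `tpch` is hypothetical, and the tables are assumed to be loaded in MySQL HeatWave with a representative set of queries already executed.

```sql
-- Request string column encoding recommendations for one schema.
-- 'tpch' is a hypothetical schema name used only for illustration.
CALL sys.heatwave_advisor(
  JSON_OBJECT(
    'target_schema', JSON_ARRAY('tpch'),
    'auto_enc', JSON_OBJECT('mode', 'recommend')
  )
);
```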

MySQL HeatWave AutoML

You can now train MySQL HeatWave AutoML models on tables containing DATE, TIME, DATETIME, TIMESTAMP, and YEAR data types. (Bug #33895503)
MySQL HeatWave AutoML now generates a model explanation when you train a machine learning model. Model explanations help identify the features that are most important to a model. For more information, see The Model Catalog.
The following columns were added to the MODEL_CATALOG table:
- column_names: The feature columns used to train the model.
- last_accessed: The last time the model was accessed. MySQL HeatWave AutoML routines update this value to the current timestamp when accessing the model.
- model_explanation: The model explanation generated during training.
- model_type: The type of model (algorithm) selected by ML_TRAIN to build the model.
- task: The task type specified in the ML_TRAIN query (classification or regression).
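The new columns can be inspected with an ordinary query. This is a minimal sketch that assumes the model catalog lives in a per-user schema named `ML_SCHEMA_<user_name>`; the user name `user1` is hypothetical.

```sql
-- List trained models with the columns added in this release.
-- 'user1' is a hypothetical user name; adjust the schema accordingly.
SELECT model_handle,
       model_type,        -- algorithm selected by ML_TRAIN
       task,              -- classification or regression
       column_names,      -- feature columns used for training
       last_accessed,
       model_explanation
FROM ML_SCHEMA_user1.MODEL_CATALOG;
```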
ML_PREDICT_* and ML_EXPLAIN_* routine performance was improved, resulting in faster prediction and explanation processing. (WL #15088, WL #15014)
The following MySQL HeatWave AutoML enhancements were implemented:
- ML_TRAIN options for advanced users. These options permit users to customize various aspects of the ML training pipeline including algorithm selection, feature selection, and hyperparameter optimization.
  - The model_list option permits specifying the type of model to be trained.
  - The exclude_model_list option specifies model types to exclude from consideration during model selection.
  - The optimization_metric option specifies the scoring metric to optimize for when training a machine learning model.
  - The exclude_column_list option specifies feature columns to exclude from consideration when training a machine learning model.
  For more information, see ML_TRAIN.
- Support was added for Support Vector Machine (SVC and LinearSVC) classification and regression models. For a complete list of supported model types, see Model Types.
- The ML_TRAIN routine now reports a message if a trained model does not meet expected quality criteria.
- ML_EXPLAIN_ROW and ML_EXPLAIN_TABLE routines now provide information to help interpret explanations. The routines also report a warning when a model quality issue is detected, enabling users to revisit their data in order to improve model quality.
(WL #15089)
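The advanced ML_TRAIN options described above might be combined as in the following sketch. The table `ml_data.census_train`, the target column `revenue`, the excluded column `customer_id`, and the metric value `accuracy` are assumptions chosen for illustration; consult ML_TRAIN for the values that are actually accepted.

```sql
-- Train a classifier restricted to the newly supported SVM model types.
-- Table, column, and metric names below are hypothetical.
CALL sys.ML_TRAIN(
  'ml_data.census_train',            -- training table (assumed to exist)
  'revenue',                         -- target column
  JSON_OBJECT(
    'task', 'classification',
    'model_list', JSON_ARRAY('SVC', 'LinearSVC'),
    'optimization_metric', 'accuracy',
    'exclude_column_list', JSON_ARRAY('customer_id')
  ),
  @census_model                      -- receives the model handle
);

-- The handle can then be passed to other AutoML routines.
SELECT @census_model;
```

This sketch uses model_list on its own and does not combine it with exclude_model_list, since the two options express opposite selection strategies.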

Functionality Added or Changed

The amount of heap memory allocated on the MySQL node for each table loaded into MySQL HeatWave was reduced, increasing the maximum number of tables that can be loaded. For MySQL.HeatWave.VM.E3.Standard shapes, the maximum was raised from 100k tables to 400k tables. For MySQL.HeatWave.BM.E3.Standard shapes, the maximum was raised from 400k tables to 1600k tables. The actual number of tables that can be loaded depends on the data in the tables. (Bug #33951708)
The performance_schema.rpd_column_id table was modified to remove redundant data. The NAME, SCHEMA_NAME, and TABLE_NAME columns were removed, and a TABLE_ID column was added. (Bug #33899183)
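Queries that previously read schema and table names from rpd_column_id would now need a join on the new TABLE_ID column. The sketch below assumes that performance_schema.rpd_table_id exposes ID, SCHEMA_NAME, and TABLE_NAME columns and that rpd_column_id retains a COLUMN_NAME column; verify both against the Performance Schema reference for your release.

```sql
-- Recover schema and table names for each column ID via TABLE_ID.
-- Column names on rpd_table_id and COLUMN_NAME are assumptions.
SELECT t.SCHEMA_NAME,
       t.TABLE_NAME,
       c.ID AS column_id,
       c.COLUMN_NAME
FROM performance_schema.rpd_column_id AS c
JOIN performance_schema.rpd_table_id AS t
  ON c.TABLE_ID = t.ID
ORDER BY t.SCHEMA_NAME, t.TABLE_NAME, c.ID;
```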
Support was added for the FROM_DAYS() temporal function. The GREATEST() and LEAST() comparison and string functions now support DATE, DATETIME, TIME, and TIMESTAMP columns. (WL #14956)
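A query of the following shape illustrates these functions. The `orders` table and its columns are hypothetical: `order_day_number` is assumed to be an integer day number, and `shipped_at` and `invoiced_at` are assumed to be DATETIME columns in a table loaded into MySQL HeatWave.

```sql
-- Convert a day number to a DATE and compare two DATETIME columns.
-- Table and column names are hypothetical.
SELECT order_id,
       FROM_DAYS(order_day_number)        AS order_date,
       GREATEST(shipped_at, invoiced_at)  AS latest_event,
       LEAST(shipped_at, invoiced_at)     AS earliest_event
FROM orders;
```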
Support was added for built-in server-side data masking and de-identification to help protect sensitive data from unauthorized use by hiding real values and replacing them with substitutes. Data masking and de-identification operations are performed on the server, and queries involving data masking and de-identification functions are accelerated by MySQL HeatWave. For the supported functions, see Data Masking and De-Identification Functions. (WL #15143)
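For orientation, a masking query might look like the sketch below. It assumes the MySQL Enterprise Data Masking and De-Identification functions `mask_inner()`, `mask_pan()`, and `mask_ssn()` are installed; whether each of them is on the MySQL HeatWave supported list is not stated in this section. The `customers` table and its columns are hypothetical.

```sql
-- Return masked substitutes instead of the real values.
-- Table and column names are hypothetical.
SELECT mask_inner(last_name, 1, 1) AS masked_name,   -- keep first and last character
       mask_pan(card_number)       AS masked_card,   -- mask all but the last four digits
       mask_ssn(ssn)               AS masked_ssn
FROM customers;
```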
Optimizations were implemented to improve performance for JOIN and GROUP BY queries with execution plans involving multiple consecutive rounds of data partitioning. (WL #15143)
expr IN (value,...) comparisons, where the expression is a single value and compared values are constants of the same data type and encoding, have been optimized. For example, the following IN() comparison has been optimized:
```
SELECT * FROM Customers WHERE Country IN ('Germany', 'France', 'Spain');
```
(WL #14952)