MySQL HeatWave User Guide
After preparing the data for a forecasting model, you can train the model.
This topic has the following sections.
Review and complete all the tasks to Prepare Data for a Forecasting Model.
Define the following required parameters to train a forecasting model.
Set the task
parameter to
forecasting
.
datetime_index
: Define the column
that has date and time data. The model uses this column
as an index for the forecast variable. The following
data types for this column are supported:
DATETIME
,
TIMESTAMP
, DATE
,
TIME
, and YEAR
, or
an auto-incrementing index.
The forecast models
SARIMAXForecaster
,
VARMAXForecaster
, and
DynFactorForecaster
cannot back test,
that is forecast into training data, when using
exogenous_variables
. Therefore, the
predict table must not overlap the
datetime_index
with the training
table. The start date in the predict table must be a
date immediately following the last date in the training
table when exogenous_variables
are
used. For example, the predict table has to start with
year 2024 if the training table with
YEAR
data type
datetime_index
ends with year 2023.
The predict table cannot start with year, for example,
2025 or 2030, because that would miss out 1 and 6 years,
respectively.
When options
do not include
exogenous_variables
, the predict
table can overlap the datetime_index
with the training table. This supports back testing,
with the exception of the following models:
SARIMAXForecaster, VARMAXForecaster, and
DynFactorForecaster.
The valid range of years for
datetime_index
dates must be between
1678 and 2261. An error generates if any part of the
training table or predict table has dates outside this
range. The last date in the training table plus the
predict table length must still be inside the valid year
range. For example, if the
datetime_index
in the training table
has YEAR
data type, and the last date
is year 2023, the predict table length must be less than
238 rows: 2261 minus 2023 equals 238 rows.
endogenous_variables
: Define the
column or columns to be forecast. One of these columns
must also be specified as the
target_column_name
.
Based on the type of forecasting model you train, set the
appropriate JSON
options:
exogenous_variables
: Define the
column or columns that have independent, non-forecast,
predictive variables. These optional variables are not
forecast, but help to predict the future values of the
forecast variables. These variables affect a model
without being affected by it. For example, for sales
forecasting these variables might be advertising
expenditure, occurrence of promotional events, weather,
or holidays. Review
Forecasting
Models to see which models support exogenous
variables.
model_list
: Set the type of
forecasting model algorithm. See
Forecasting
Models to review supported algorithms.
include_column_list
: Define the
columns of exogenous_variables
that
must be included for training and should not be dropped.
You cannot run the following routines for a trained forecasting model:
After following the steps to
Prepare
Data for Forecasting Model, train the model with the
ML_TRAIN
routine and use the
electricity_demand_training
table
previously created. Before training the model, it is good
practice to define the model handle instead of automatically
creating one. See
Define Model Handle.
Optionally, set the value of the session variable, which sets the model handle to this same value.
mysql> SET @variable
= 'model_handle
';
Replace @variable
and
model_handle
with your own
definitions. For example:
mysql> SET @model='forecasting_use_case';
The model handle is set to
forecasting_use_case
.
Run the ML_TRAIN
routine.
mysql> CALL sys.ML_TRAIN('table_name
', 'target_column_name
', JSON_OBJECT('task', 'task_name
'), model_handle
);
Replace table_name
,
target_column_name
,
task_name
, and
model_handle
with your own
values.
The following example runs
ML_TRAIN
on the training
dataset previously created.
mysql> CALL sys.ML_TRAIN('forecasting_data.electricity_demand_train', 'demand',
JSON_OBJECT('task', 'forecasting', 'datetime_index', 'date', 'endogenous_variables', JSON_ARRAY('demand')), @model);
Where:
forecasting_data.electricity_demand_train
is the fully qualified name of the table that
contains the training dataset
(database_name.table_name
).
demand
is the name of the target
column, which contains ground truth values.
The JSON_OBJECT
defines the
following:
'task', 'forecasting'
specifies the machine learning task type.
'datetime_index', 'date'
defines the date
column as
the one with data and time data.
'endogenous_variables',
JSON_ARRAY('demand')
defines the
endogenous variables in a
JSON_ARRAY
. Since it is a
univariate model, the only endogenous variable
is demand
.
@model
is the session variable
previously set that defines the model handle to the
name defined by the user:
forecasting_use_case
. If you do
not define the model handle before training the
model, the model handle is automatically generated,
and the session variable only stores the model
handle for the duration of the connection. User
variables are written as
@
.
Any valid name for a user-defined variable is
permitted. See
Work with
Model Handles to learn more.
var_name
When the training operation finishes, the model handle
is assigned to the @model
session
variable, and the model is stored in the model catalog.
View the entry in the model catalog with the following
query. Replace user1
with
your MySQL account name.
mysql> SELECT model_id, model_handle, train_table_name FROM ML_SCHEMA_user1
.MODEL_CATALOG WHERE model_handle = 'forecasting_use_case';
+----------+----------------------------------------------+--------------------------------------------+
| model_id | model_handle | train_table_name |
+----------+----------------------------------------------+--------------------------------------------+
| 3 | forecasting_use_case | forecasting_data.electricity_demand_train |
+----------+----------------------------------------------+--------------------------------------------+
Learn how to Generate Predictions for a Forecasting Model.
Review additional Syntax Examples for Forecast Training