MySQL HeatWave User Guide
Lakehouse Auto Parallel Load extends the Auto Parallel Load feature of MySQL HeatWave. It simplifies loading data from Object Storage into MySQL HeatWave by automating many of the steps involved, including:
Excluding schemas, tables, and columns that Auto Parallel Load cannot load.
Verifying that there is sufficient memory available for the data.
Optimizing load parallelism based on machine learning models.
Loading data into MySQL HeatWave.
Defining LAKEHOUSE as the engine for tables that MySQL HeatWave loads.
Defining the ENGINE_ATTRIBUTE for tables that MySQL HeatWave loads.
Lakehouse Auto Parallel Load also includes Lakehouse Incremental Load, which can refresh tables after an initial load.
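As an illustration of how a load request is put together, the following Python sketch assembles the SQL for one call. It assumes the sys.HEATWAVE_LOAD procedure and the db_name / tables / engine_attribute input shape used from MySQL 8.4.0 onward. The database name, table name, and pre-authenticated request (PAR) URL are hypothetical placeholders, and only the "normal" and "dryrun" modes are modeled.

```python
import json

# Modes modeled in this sketch: "normal" performs the load,
# "dryrun" only generates the load script.
ALLOWED_MODES = {"normal", "dryrun"}


def sql_quote(text):
    """Quote text as a MySQL string literal (default SQL mode)."""
    return "'" + text.replace("\\", "\\\\").replace("'", "''") + "'"


def build_load_script(db_name, tables, mode="dryrun"):
    """Return the SQL text for one Lakehouse Auto Parallel Load call.

    tables maps each table name to its external file parameters,
    which become that table's engine_attribute entry.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    input_list = [{
        "db_name": db_name,
        "tables": [
            {"table_name": name, "engine_attribute": attrs}
            for name, attrs in tables.items()
        ],
    }]
    return "\n".join([
        "SET @input_list = " + sql_quote(json.dumps(input_list, indent=2)) + ";",
        "SET @options = JSON_OBJECT('mode', " + sql_quote(mode) + ");",
        "CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), @options);",
    ])


# Hypothetical database, table, and PAR URL, for illustration only.
script = build_load_script(
    "tpch",
    {"supplier": {
        "dialect": {"format": "csv"},
        "file": [{"par": "https://objectstorage.example.com/p/TOKEN/n/ns/b/bucket/o/supplier.csv"}],
    }},
)
print(script)
```

Running the generated script in dryrun mode first lets you review the statements that would be executed before any data is loaded.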
Lakehouse Auto Parallel Load includes schema inference, and uses it in one of two ways:
If the tables do not yet exist, Lakehouse Auto Parallel Load analyzes the data, infers the table structure, and creates the database and all tables. This requires only the name of the database, the name of each table, and the external file parameters. Lakehouse Auto Parallel Load then generates the CREATE DATABASE and CREATE TABLE statements.
Lakehouse Auto Parallel Load uses header information from the external files to define the column names. If this is not available, Lakehouse Auto Parallel Load defines the column names sequentially: col_1, col_2, col_3 ...
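The column naming rule can be sketched in a few lines of Python. This is an illustration of the rule as described here, not the inference code that MySQL HeatWave runs. The function name and the has_header flag are assumptions made for the example.

```python
import csv
import io


def infer_column_names(csv_text, has_header):
    """Return column names for the CSV data in csv_text.

    When has_header is True, the first record supplies the names.
    Otherwise the names are positional: col_1, col_2, col_3 ...
    """
    # Read only the first record; an empty file yields no columns.
    first_row = next(csv.reader(io.StringIO(csv_text)), [])
    if has_header:
        return [field.strip() for field in first_row]
    return [f"col_{i}" for i in range(1, len(first_row) + 1)]


print(infer_column_names("id,name,balance\n1,Acme,10.5\n", has_header=True))
print(infer_column_names("1,Acme,10.5\n", has_header=False))
```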
If the tables already exist, Lakehouse Auto Parallel Load analyzes the data, infers the table structure, and then modifies the structure to avoid errors during data load. For example, if a table defines a column as TINYINT, but Lakehouse Auto Parallel Load infers that the data requires SMALLINT, MEDIUMINT, INT, or BIGINT, then Lakehouse Auto Parallel Load modifies the structure accordingly. If the inferred data type is incompatible with the table definition, Lakehouse Auto Parallel Load raises an error and specifies the column as NOT SECONDARY.
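The integer widening example can be modeled with a short Python sketch. It uses the signed ranges of the MySQL integer types and is only a simplified illustration: the function names are hypothetical, unsigned columns are ignored, and the error raised for incompatible data is reduced to a flag that marks the column as NOT SECONDARY.

```python
# Signed ranges of the MySQL integer types, narrowest first.
INT_TYPES = [
    ("TINYINT", -2**7, 2**7 - 1),
    ("SMALLINT", -2**15, 2**15 - 1),
    ("MEDIUMINT", -2**23, 2**23 - 1),
    ("INT", -2**31, 2**31 - 1),
    ("BIGINT", -2**63, 2**63 - 1),
]
ORDER = [name for name, _, _ in INT_TYPES]


def narrowest_int_type(values):
    """Return the narrowest integer type that holds every value, or None."""
    lowest, highest = min(values), max(values)
    for name, type_min, type_max in INT_TYPES:
        if type_min <= lowest and highest <= type_max:
            return name
    return None  # at least one value is outside the BIGINT range


def reconcile(declared, raw_values):
    """Return (column_type, not_secondary) for an existing integer column.

    declared must be one of the names in ORDER.
    """
    try:
        values = [int(v) for v in raw_values]
    except ValueError:
        # Non-integer data is incompatible with the declared type.
        return declared, True
    if not values:
        return declared, False
    inferred = narrowest_int_type(values)
    if inferred is None:
        return declared, True
    if ORDER.index(inferred) > ORDER.index(declared):
        return inferred, False  # widen the column to fit the data
    return declared, False      # the declared type is already wide enough


print(reconcile("TINYINT", ["1", "200"]))
print(reconcile("TINYINT", ["abc"]))
```

The sketch only ever widens a column. A declared type that is already wide enough for the data is left unchanged.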
If you are on a version earlier than MySQL 8.4.0, refer to Lakehouse Auto Parallel Load with The external_tables Option for more information on the appropriate syntax to use.