MySQL HeatWave User Guide

4.3.1 About MySQL HeatWave Lakehouse

The Lakehouse feature of MySQL HeatWave enables query processing on data in Object Storage. MySQL HeatWave Lakehouse reads the source data from Object Storage, transforms it to the memory optimized MySQL HeatWave format, saves it in the MySQL HeatWave persistence storage layer in Object Storage, and then loads the data to MySQL HeatWave Cluster memory. While Lakehouse provides in-memory query processing on data in Object Storage, it does not load data into a DB System table.

To learn how to use Lakehouse with MySQL HeatWave AutoML, see Use MySQL HeatWave AutoML with Lakehouse.

External Tables

When MySQL HeatWave loads data and creates a table from a file in Object Storage, it is an external Lakehouse table. External tables are tables that do not store any data, but refer to data stored externally in Object Storage. The external table stores the location of the data.

See the following to learn more:

Lakehouse Engine

The lakehouse storage engine allows you to create tables that point to external data sources.

For MySQL HeatWave Lakehouse, lakehouse is the primary engine, and rapid is the secondary engine.

As of MySQL 9.4.0, you can use the CREATE EXTERNAL TABLE statement, which automatically sets ENGINE to lakehouse, and SECONDARY_ENGINE to rapid.

For example:

mysql> CREATE EXTERNAL TABLE table_1 (col_1 INT)

In versions earlier than MySQL 9.4.0, you must use the CREATE TABLE statement, and manually set ENGINE to lakehouse, and SECONDARY_ENGINE to rapid.

For example:

mysql> CREATE TABLE table_1 (col_1 INT)
ENGINE=LAKEHOUSE
SECONDARY_ENGINE=RAPID;

Data Storage

When using Lakehouse, you do not directly access the source data stored in Object Storage. This memory optimized internal version of the data is retained only as long as MySQL HeatWave Cluster has the Lakehouse option enabled.

The data in external tables is deleted if you do any of the following:

Workflows

You have three options to load data into MySQL HeatWave Cluster: uniform resource identifiers (URIs) (as of MySQL 9.3.1), resource principals, and pre-authenticated requests (PAR). Refer to the following workflows and resources showing how to use each method to load data.

The workflow to use URIs:

  1. Prepare the files to load in the proper format. See Supported File Formats.

  2. Upload the files to load into Object Storage. See Uploading an Object Storage Object to a Bucket in Oracle Cloud Infrastructure Documentation.

  3. Load the external files into Lakehouse. See the following:

  4. Once Lakehouse successfully creates the external table from the external files, you can use the data in MySQL HeatWave. See the following:

The workflow to use Resource Principals:

  1. Prepare the files to load in the proper format. See Supported File Formats.

  2. Upload the files to load into Object Storage. See Uploading an Object Storage Object to a Bucket in Oracle Cloud Infrastructure Documentation.

  3. Load the external files into MySQL HeatWave Cluster. See the following:

  4. Once Lakehouse successfully creates the external table from the external files, you can use the data in MySQL HeatWave. See the following:

The workflow to use PARs:

  1. Prepare the files to load in the proper format. See Supported File Formats.

  2. Upload the files to load into Object Storage. See Uploading an Object Storage Object to a Bucket in Oracle Cloud Infrastructure Documentation.

  3. Create the PARs for the files previously uploaded. You have three options to create PARs: through the OCI console, the command line, and API. See Creating a Pre-Authenticated Request in Object Storage in Oracle Cloud Infrastructure Documentation.

  4. Load the external files into MySQL HeatWave Cluster. See the following:

  5. Once Lakehouse successfully creates the external table from the external files, you can use the data in MySQL HeatWave. See the following:

What's Next