MySQL HeatWave User Guide
To use pre-authenticated requests (PAR) for loading data with Lakehouse Auto Parallel Load, review the following examples for different ways to set them up.
This topic contains the following sections:
Review the requirements and recommendations to Access Object Storage with Pre-Authenticated Requests.
Review How To Load Data From Object Storage Using Auto Parallel Load.
You can create the following types of PARs:
Bucket or prefix PAR: This type of PAR ends with
/o/
. You can load multiple files with
this PAR. You can also use name
,
prefix
, or pattern
parameters to specify files in the bucket or Object
Storage folder.
Object PAR: This type of PAR ends with
/o/object_name
. This PAR specifies
individual files. You cannot use
name
, prefix
, or
pattern
parameters with this PAR.
When creating PARs consider the following recommendations:
Only use read-only PARs.
Set a short expiration date for the PAR URL that matches the data loading plan.
Do not make a PAR URL publicly accessible.
If the target defines a bucket or uses a prefix or pattern:
Use Enable Object Listing when creating the PAR in the Oracle Cloud Infrastructure (OCI) console.
When creating the PAR from the command line, include
the --access-type AnyObjectRead
parameter.
Use a resource principal for access to more sensitive data in Object Storage as it is more secure. See Access Object Storage with Resource Principals.
To set up PARs, you configure the following parameters:
par
: Provide the PAR URL.
name
: Use this to specify a file for
an object or bucket PAR.
pattern
: Use this to set a regular
expression that defines a set of Object Storage files.
The pattern follows the modified
Modified
ECMAScript regular expression grammar.
prefix
: Use this to define a set of
Object Storage folders and files.
The following examples use these parameters and commands to set up the loading of data with PARs. Replace the values in the examples with your own.
SET @input_list
creates a session
variable that stores all the parameters for the loading
of the file.
db_name
identifies the database name
to store the table. MySQL HeatWave automatically creates the
database if it does not exist.
table_name
sets the table name to
store the data. MySQL HeatWave automatically creates the table
if it does not exist.
engine_attribute
defines the
parameters of the external file.
dialect
defines the format options of
the file.
file
defines the file or files to
load.
CALL sys.HEATWAVE_LOAD(CAST(@input_list AS
JSON), NULL);
runs Auto Parallel Load to load the external
tables from the parameters set in
@input_list
and with no additional
options.
If you are on MySQL 9.1.2 and earlier, you need to update
dialect
with the field
delimiter
and record
delimiter
parameters. As of MySQL 9.2.0, MySQL HeatWave Lakehouse
can automatically detect these values. See
Lakehouse
External Table Syntax to learn more.
To review all syntax options for loading external files, see Auto Parallel Load Syntax and Lakehouse External Table Syntax.
To load a single file with a pre-authenticated request, you can create a pre-authenticated request for that individual file or specify the file in the command.
The following example loads a single file by using a PAR for that file.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/data_file_1.csv"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
The following example loads a single file by using a PAR for
a bucket and specifying the file in the
name
parameter. The file is in the
data_files
Object Storage folder.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/", "name": "data_files/data_file_1.csv"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
If you want to load multiple files into one external table, you can do one of the following:
Create a folder in the Object Storage bucket and upload the required files into that folder. See Managing Folders in an Object Storage Bucket in Oracle Cloud Infrastructure Documentation.
Create a PAR for a bucket or folder, and then specify
each file to load with the name
parameter.
Create a PAR for each file to load, and then specify the
files as separate items in the
ENGINE_ATTRIBUTE
parameter.
To load multiple external files into one external table, you can create a folder in the Object Storage bucket and upload the required files into that folder. See Managing Folders in an Object Storage Bucket in Oracle Cloud Infrastructure Documentation.
If you create a folder with the files to load, create a PAR for that folder and do the following:
For the Pre-Authenticated Request
Target
, select Objects with
prefix.
Use Enable Object Listing.
The following example loads all the files uploaded to the folder in the Object Storage bucket into one table.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
The following example uses one PAR for a bucket and
specifies two files to load with the name
parameter.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenant_1/b/bucket_1/o/", "name": "data_files/data_file_1.csv"}, {"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenant_1/b/bucket_1/o/", "name": "data_files/data_file_2.csv"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
The following example loads two separate PARs to load two files.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenant_1/b/bucket_1/o/data_files/data_file_1.csv"}, {"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenant_1/b/bucket_1/o/data_files/data_file_2.csv"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
To load multiple external files into multiple tables with
one command, create a pre-authenticated for each required
file and create the necessary number of
tables
items.
The following example loads two files into two tables.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": { "dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenant_1/b/bucket_1/o/data_files/data_file_1.csv"}]}}, {"table_name": "table_2", "engine_attribute": { "dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenant_1/b/bucket_1/o/data_files/data_file_2.csv"}] }}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
You can add a prefix
parameter to a PAR
to load specific files. To do this, create a PAR for the
folder that stores the files to load.
The following example loads all files in the
data_files
folder that begin with
data_file_
. For example, MySQL HeatWave loads
the files data_file_1
,
data_file_2
, and
data_file_3
into
table_1
.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/", "prefix": "data_files/data_file_"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
You can add a pattern
parameter to a PAR
to use regular expression to load specific files.
The regular expression syntax requires certain characters to have an escape character.
The escape character is the backslash character, and it is a
reserved character in both JSON and MySQL. Therefore, it is
necessary to escape the backslash character twice, and
specify \\
for both JSON and MySQL.
However, the regular expression escape sequence depends upon
the NO_BACKSLASH_ESCAPES
SQL mode:
Use \\.
to escape a period if
NO_BACKSLASH_ESCAPES
is enabled.
Use \\\\.
to escape a period if
NO_BACKSLASH_ESCAPES
is not enabled. The following examples use this sequence
because it is the default mode.
See the following to learn more:
The following example loads all files in the
data_files
folder that have a numerical
suffix of one or more digits. For example, MySQL HeatWave loads the
files data_file_1.csv
,
data_file_2.csv
, and
data_file_3.csv
into
table_1
.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/", "pattern": "data_files/data_file_\\\\d+\\\\.csv"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
The following examples load all files in the
bank_data
folder that have an
alphabetical suffix of one or more lowercase characters. For
example, MySQL HeatWave loads the files
data_file_a.csv
,
data_file_b.csv
, and
data_file_c.csv
.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/", "pattern": "data_files/data_file_[a-z]+\\\\.csv"}]} }]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
To update the previous example to include uppercase and
lowercase characters, replace [a-z]
with
[A-Za-z]
.
The following example loads all files in the
bank_data
folder that have a numerical
suffix that end in 0
with one preceding
digit. For example, MySQL HeatWave loads the files
data_file_10.csv
,
data_file_20.csv
, and
data_file_30.csv
.
mysql>SET @input_list = '[{"db_name": "lakehouse_db", "tables": [{"table_name": "table_1", "engine_attribute": {"dialect": {"format": "csv"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/", "pattern": "data_files/data_file_\\\\d0\\\\.csv"}] }}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
To update the previous example to load files with one or
more digits preceding the 0, update
\\\\d0\\\\
with
\\\\d+0\\\\
.
After successfully loading data and creating external tables, learn how to do the following: