MySQL HeatWave User Guide
As of MySQL 9.4.0, you can set external table options using
SQL syntax. The following options apply to
CREATE TABLE
and ALTER
TABLE
statements. For ALTER TABLE
statements, ADD COLUMN
is not supported for
external tables.
The SQL syntax is supported only for
loading
data manually. To
load data
using Auto Parallel Load, you must use
JSON
Syntax for setting the
ENGINE_ATTRIBUTE
options.
This topic contains the following sections:
The following syntax shows how to specify these external table options with SQL syntax.
CREATE EXTERNAL TABLE table_name (col_name data_type ['Col_Options_SQL
'], ...)Tbl_Options_SQL
;
The CREATE EXTERNAL TABLE
statement
automatically sets ENGINE
to
lakehouse
and
SECONDARY_ENGINE
to
rapid
. If you use the CREATE
TABLE
statement, you need to manually set
ENGINE
to lakehouse
and SECONDARY_ENGINE
to
rapid
.
To learn how to load data into tables created with the
CREATE
TABLE
statement, see
Load
Structured Data Manually.
Tbl_Options_SQL
(table_option
[table_option
]...)table_option
: { | FILE_FORMAT [=]format_spec
| FILES [=] (file_spec
[,file_spec
] ...) | ALLOW_MISSING_FILES [=] {0 | 1 | DEFAULT} | VERIFY_KEY_CONSTRAINTS [=] {0 | 1 | DEFAULT} | STRICT_LOAD [=] {0 | 1 | DEFAULT} } format_spec: { [general_options
] [field_options
] [LINES [TERMINATED BY 'string']] [IGNORE number {LINES | ROWS}]general_options
general_option
[general_option
] ... general_option: { [FORMAT {csv | avro | json | parquet}] | [COMPRESSION {zip | gzip | bzip2 | auto}] | [CHARACTER SETcharset name
] | [HEADER {ON | OFF}] } field_options: {FIELDS | COLUMNS} field_option [field_option] field_option: DATE FORMAT 'date_format
' | DATETIME FORMAT 'datetime_format
' | TIME FORMAT 'time_format
' | ESCAPED BY 'char
' | TERMINATED BY | { [OPTIONALLY] ENCLOSED BY 'char
'}file_spec
: {file_attribute
[file_attribute
] ...} file_attribute: { {URI | URL} [=] 'uri_string
' | FILE_NAME [=] 'object_name
' | FILE_PATTERN [=] 'pattern
' | FILE_PREFIX [=] 'prefix_name
' | ALLOW_MISSING_FILES [=] {0 | 1 | DEFAULT} | STRICT_LOAD [=] {0 | 1 | DEFAULT} }
For a complete list of table options, see
CREATE TABLE
.
For FILES
options, you can set the
following:
Uniform Resource Identifiers (URI)
Available for MySQL HeatWave on OCI. You can provide an OCIFS URI, PAR URI, or Native URI.
Pre-Authenticated Requests (PAR)
See Object Storage Pre-Authenticated Requests in Oracle Cloud Infrastructure Documentation.
You cannot use resource principals to specify details for Object Storage files. To use resource principals, you must use External Table ENGINE_ATTRIBUTE Options. Alternatively, you can convert resource principals to URIs.
You can override some options at the column level by
specifying column-level parameters with the
EXTERNAL FORMAT
statement. You can set
DATE
, TIME
,
DATETIME
, and
TIMESTAMP
as optional column-level
parameters.
EXTERNAL_FORMAT {'date_format
' | 'time_format
' | 'timestamp_format
'}
The following example loads a table from a file specified
with a uniform resource identifier (URI) and sets the
date_format
,
time_format
, and
timestamp_format
at the column level.
mysql>CREATE EXTERNAL TABLE table_1( col_1 date EXTERNAL FORMAT '%W %M %Y', col_2 time EXTERNAL FORMAT 'hh:mm:ss', col_3 timestamp EXTERNAL FORMAT '%Y-%m-%d %H:%i:%s', col_4 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
For more information about column definitions, see File Format Options.
The following table parameters are optional and specify options at the table level.
ALLOW_MISSING_FILES
: Ignores the
missing files and completes the data loading based on
the selected option. This overrides the table parameter
STRICT_LOAD
for missing files. The
supported options are:
1
: An error does not occur for
any missing file, and data loading continues with
the existing files. An error occurs if all files are
not available.
0
: If any file is missing, an
error occurs and data is not loaded.
If you set no value or DEFAULT
,
the setting is defined by
STRICT_LOAD
. If no value is set
for STRICT_LOAD
, the setting is
defined by the
sql_mode
.
Review Table 4.6, “Combinations of Settings for ALLOW_MISSING_FILES and STRICT_LOAD”.
A missing file is defined as:
With the FILE_NAME
parameter:
There is no file with that name.
With the FILE_PATTERN
parameter:
There are no files that match the pattern.
With the FILE_PREFIX
parameter:
There are no files with that prefix.
The following example loads a table and sets
ALLOW_MISSING_FILES
to
1
in the table options. If any of the
specified files are not available, an error does not
occur and data continues loading for the existing files.
If all files are missing, an error occurs. If a
file-specific setting is set, this may override the
ALLOW_MISSING_FILES
setting.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv', URI = 'oci://mybucket@mynamespace/data_files/data_file_2.csv', URI = 'oci://mybucket@mynamespace/data_files/data_file_3.csv') ALLOW_MISSING_FILES = 1;
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
STRICT_LOAD
: Loads data in strict
mode or non-strict mode, based on the selected option.
By default, STRICT_LOAD
is set to the
sql_mode
value. See
Strict SQL Mode. The
FILES
parameter
STRICT_LOAD
can override this
setting. The supported options are:
1
: Loads the data in strict mode.
The data loading stops if there is an error due to
missing files, empty columns, formatting errors or
parsing errors.
0
: Loads the data in non-strict
mode. Missing files, empty columns, formatting
errors or parsing errors display a warning, and data
is loaded.
If you set no value or DEFAULT
,,
the setting is defined by the
sql_mode
.
Review Table 4.6, “Combinations of Settings for ALLOW_MISSING_FILES and STRICT_LOAD”.
This setting overrides the global
sql_mode
setting for
handling missing files. The default is the value of
sql_mode
. See
Strict SQL Mode. The
FILES
parameter
STRICT_LOAD
can override this
setting.
The following example loads a table from a file
specified with a uniform resource identifier (URI) and
sets STRICT_LOAD
to
0
in the table options.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv') STRICT_LOAD = 0;
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
VERIFY_KEY_CONSTRAINTS
: Lakehouse
validates primary key and unique key constraints during
the initial load based on the selected option. The
supported options are:
1
: The default value. Lakehouse
validates primary key and unique key constraints
only during the initial load of the table. If there
are subsequent loads or refreshes of the table,
validation does not occur.
0
: Lakehouse does not validate
primary key and unique key constraints.
DEFAULT
: Lakehouse does not
validate primary key and unique key constraints.
The following example loads a table from a file
specified with a uniform resource identifier (URI) and
sets VERIFY_KEY_CONSTRAINTS
to
0
in the table options.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv') VERIFY_KEY_CONSTRAINTS = 0;
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
Refer to the following table to review how different
combinations of settings for
ALLOW_MISSING_FILES
and
STRICT_LOAD
affect errors for missing
files.
Table 4.6 Combinations of Settings for ALLOW_MISSING_FILES and STRICT_LOAD
Setting for ALLOW_MISSING_FILES and STRICT_LOAD | Description of Possible Errors |
---|---|
|
No error generated unless all files are missing. |
|
Error generated for any missing files. |
|
No error generated unless all files are missing. |
|
Error generated for any missing files. |
|
Error generated for any missing files. |
|
No error generated unless all files are missing. |
|
No error generated unless all files are missing. |
|
Error generated for any missing files. |
|
Error generation depends on setting for
sql_mode . |
File specifications are enclosed in
FILES
.
A FILES
clause is required if you specify
any additional external table options. The following
parameters are available:
FILES
: Defines the Object Storage
files. You can define the file locations using uniform
resource identifier (URI) or PAR URL.
Lakehouse supports a maximum of 170 file locations. To
define more than the maximum number of files, store the
files under the same bucket or use
FILE_PREFIX
or
FILE_PATTERN
.
FILES
parameters for
uniform
resource identifiers:
URI
: The URL of the URI you
specify. Do not provide a pattern, prefix, or name
parameter with the URI
. These
details are inferred by the
object_path
you define.
The following examples use the URI syntax for OCIFS:
oci://bucket_name
@namespace_name
/object_path
Name: If the object_path
is
neither a glob pattern nor prefix.
The following example loads a table with a single file specified with an OCIFS URI.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
Prefix: If the object_path
is
not a glob pattern and ends with an unencoded
/
character, such as a folder
path.
The following example loads a table with a prefix from a file specified with an OCIFS URI. The prefix specifies files in an Object Storage folder.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
Glob pattern: If the
object_path
contains at least
one of the following characters:
?
, *
, or
[
. To use these characters as
literals, you need to escape them or encode them
as needed depending on the URI syntax. Regex
patterns are not supported. See
Glob
Patterns from the Oracle Cloud Infrastructure
Documentation to learn more.
The following example loads a table by using an
unencoded *
character to
create a glob pattern.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_files/data_file_*.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
Review additional examples that use URIs to load data manually.
FILES
parameters for
pre-authenticated
requests:
URL
: The PAR URL. Review
PAR
Recommendations.
The following example loads a table with a single file specified with a PAR.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
FILE_NAME
: A specific Object
Storage file name. For
pre-authenticated
requests, the file name provided in the PAR
URL is considered.
The following example uses the
FILE_NAME
parameter to specify a
file in an Object Storage bucket. The PAR points to
the Object Storage bucket.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/' FILE_NAME = 'data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
FILE_PATTERN
: A regular
expression that defines a set of Object Storage
files.
The following example uses a pattern to load files that have a numerical suffix.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/' FILE_PATTERN = 'data_files/data_file_\\\\d+\\\\.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
See the following to learn more:
FILE_PREFIX
: The prefix for a set
of Object Storage files. The prefix or bucket name
present in the PAR URL is considered.
The following example uses a prefix to specify files to load in an Object Storage folder.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/' FILE_PREFIX = 'data_files/');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
Review additional examples that use PARs to load data manually.
The following parameters enclosed in
FILES
are optional:
ALLOW_MISSING_FILES
: Ignores the
missing files and completes the data loading based on
the selected option. This overrides the table option
parameter STRICT_LOAD
for missing
files. To learn about the default value, see the
ALLOW_MISSING_FILES
table option
parameter in
Table Parameters.
The supported options are:
1
: If any file is missing, no
error occurs and data loading continues with the
existing files. An error occurs if all the files are
not available.
0
: If any file is missing, an
error occurs and data is not loaded.
The following example loads a table from a glob pattern
specified with a uniform resource identifier (URI) and
sets ALLOW_MISSING_FILES
to
1
in the FILES
parameters. If any of the specified files in the pattern
are not available, an error does not occur and data
continues loading for the existing files. If all files
are missing, an error occurs.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_[1-10].csv' ALLOW_MISSING_FILES = 1);
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
The following example loads a table and sets
ALLOW_MISSING_FILES
to
0
for the first file and
ALLOW_MISSING_FILES
to
1
for the second file. If the first
file is missing, an error occurs. If the second file is
missing, the load continues with the loading of the
first file.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv' ALLOW_MISSING_FILES = 0, URI = 'oci://mybucket@mynamespace/data_files/data_file_2.csv' ALLOW_MISSING_FILES = 1);
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
STRICT_LOAD
: Loads data in strict
mode or non-strict mode, based on the selected option.
This overrides the STRICT_LOAD
table
option parameter. To learn about the default value, see
the STRICT_LOAD
table option
parameter in
Table Parameters.
The supported options are:
1
: Loads the data in strict mode.
The data loading stops if there is an error due to
missing files, empty columns, formatting errors or
parsing errors.
0
: Loads the data in non-strict
mode. Missing files, empty columns, formatting
errors or parsing errors display a warning, and data
is loaded.
The following example loads a table from a file
specified with a uniform resource identifier (URI) and
sets STRICT_LOAD
to
0
in the FILES
parameters.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv' STRICT_LOAD = 0);
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
File format options are enclosed in
FILE_FORMAT
.
The following file format options are available:
FORMAT
: The file format defined in a
table. You can define only one file format per table.
The supported file formats are:
csv
: The default file format.
The following example loads a CSV file into a table with a uniform resource identifier (URI).
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
json
: Only Newline Delimited JSON
files are supported. Tables created with
json
format must only have a
single column that conforms to the
JSON data type.
The following example loads a JSON file into a table with a uniform resource identifier (URI).
mysql>CREATE EXTERNAL TABLE table_1( col_1 json) FILE_FORMAT = (FORMAT json) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.json');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
parquet
: The
Parquet
data type.
The following example loads a Parquet file into a table with a uniform resource identifier (URI).
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT parquet) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.parquet');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
avro
: The
Avro
data type.
The following example loads an Avro file into a table with a uniform resource identifier (URI).
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT avro) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.avro');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
COMPRESSION
: Use this to load
compressed files.
Compression is supported for all file formats:
csv
: The default value is no
compression. You can set
compression
to
zip
, gzip
, or
bzip2
.
The following example loads a table from a file
specified with a uniform resource identifier (URI)
and uses a CSV file compressed in
zip
format.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv COMPRESSION 'zip') FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv.zip');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
json
: The default value is no
compression. You can set
compression
to
zip
, gzip
, or
bzip2
.
The following example loads a table from a file
specified with a uniform resource identifier (URI)
and uses a JSON file compressed in
gzip
format.
mysql>CREATE EXTERNAL TABLE table_1( col_1 json) FILE_FORMAT = (FORMAT json COMPRESSION 'gzip') FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.json.gz');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
avro
: The default value is
auto
. You cannot set any other
options for COMPRESSION
, as Avro
files declare their compression format in the file
metadata.
The following example loads a table from a file
specified with a uniform resource identifier (URI)
and uses an Avro file with the
auto
compression option.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT avro COMPRESSION 'auto') FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.avro');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
parquet
: The default value is
auto
. You cannot set any other
options for COMPRESSION
, as
Parquet files declar their compression format in the
file metadata.
The following example loads a table from a file
specified with a uniform resource identifier (URI)
and uses an Parquet file with the
auto
compression option.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT parquet COMPRESSION 'auto') FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.parquet');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
Note the following when loading compressed files:
Each compressed file must only contain a single original file. You cannot provide a compressed file with archives or folders with multiple files.
Files in zip
format must be
compressed with the Deflate algorithm.
If loading multiple files, all the files must be in the same compression format.
When loading multiple compressed files, it is best to prepare files of similar size, and in the range of 80-120 MB each. Otherwise, you may experience longer loading times.
CHARACTER SET
: Defines the character
encoding. The default is 'utf8mb4'
.
HEADER
: Adds a header row to the
CSV
file, based on the selected
option (ON
or
OFF
). The default is
OFF
.
If HEADER
and
IGNORE
are both defined, Lakehouse
first skips the number of rows, and then uses the next
row as the header row.
The following field options are available:
The FIELDS
and
COLUMNS
clauses define the format for
fields and columns. The following parameters are
available:
DATE FORMAT
: The date format,
see:
date_format
.
You can also set date formats for each column by
using the EXTERNAL_FORMAT
statement, which overrides the format in the table
parameter.
TIME FORMAT
: The time format,
see:
String and Numeric Literals in Date and Time Context.
You can also set time formats for each column by
using the EXTERNAL_FORMAT
statement, which overrides the format in the table
parameter.
DATETIME FORMAT
: The datetime
format, see:
date_format
. You can
also set datetime formats for each column by using
the EXTERNAL_FORMAT
statement,
which overrides the format in the table parameter.
The following example loads a table from a file
specified with a uniform resource identifier (URI)
and sets the DATE FORMAT
,
TIME FORMAT
, and
DATETIME FORMAT
.
mysql>CREATE EXTERNAL TABLE table_1( col_1 date, col_2 time, col_3 timestamp) FILE_FORMAT = (FORMAT csv FIELDS DATE FORMAT '%W %M %Y' TIME FORMAT 'hh:mm:ss' DATETIME FORMAT '%Y-%m-%d %H:%i:%s') FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
TERMINATED BY
: Defines one or
more characters used to enclose fields. The maximum
field delimiter length is 64 characters. You can add
any string as a
custom_field_delimiter
.
When FIELDS TERMINATED BY
is set
to auto
, Lakehouse
automatically detects the field delimiters.
The following field delimiters are detected automatically by Lakehouse:
|
: Pipe.
,
: Comma
\t
: Tab
;
: Semicolon
The following example loads a table from a file
specified with a uniform resource identifier (URI)
and sets , and sets FIELDS TERMINATED
BY
and LINES TERMINATED
BY
.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv FIELDS TERMINATED BY '\r\n' LINES TERMINATED BY ';') FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
[OPTIONALLY] ENCLOSED BY
: Defines
one or more characters used to enclose fields. The
default is '\''
. You can add any
string as a
custom_quotation_marks
.
ESCAPED BY
: Defines one or more
characters used to escape special characters. The
default is '\'
. You might have to
use an additional escape character depending on the
client you use. You can add any string as a
custom_escape_character
.
The following example loads a table from a file
specified with a uniform resource identifier (URI)
and sets ESCAPED BY
, and
[OPTIONALLY] ENCLOSED BY
.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv FIELDS ESCAPED BY '\\' OPTIONALLY ENCLOSED BY '\"') FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
The following additional file format options are available:
The LINES
clause defines the format
for each new line. The following parameters are
available:
TERMINATED BY
: Defines one or
more characters used to delimit records. The maximum
record delimiter length is 64 characters. You can
add any string as a delimiter.
The default record delimiter for is
'\n'
.
For CSV
files, if you set
LINES TERMINATED BY
to
auto
, Lakehouse automatically
detects the record delimiters and composite record
delimiters with field delimiters as prefixes.
The following record delimiters detected automatically by Lakehouse:
\r
: Carriage return.
\n
: Line feed. This is the
default.
\r\n
: Carriage return and
line feed
IGNORE
: The number of rows to skip at
the start of the file. The default value is
0
and the maximum value is
20
.
If HEADER
and
IGNORE
are both defined, Lakehouse
first skips the number of rows, and then uses the next
row as the header row.
The following example loads a table from a file
specified with a uniform resource identifier (URI) and
sets IGNORE
and
HEADER
. Since both parameters are
set, the first row is skipped and the second row is set
as the header row.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv HEADER ON IGNORE 1 LINES) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
The following examples load tables with additional options and show the proper order of options in the commands.
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 DATE EXTERNAL_FORMAT '%e/%c %Y') FILE_FORMAT = (FORMAT csv COMPRESSION 'zip' CHARACTER SET utf8mb4 HEADER OFF FIELDS DATE FORMAT '%Y%M%D' TERMINATED BY ',' LINES TERMINATED BY '\\n' IGNORE 2 LINES) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv.zip' ALLOW_MISSING_FILES = 0);
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv HEADER ON FIELDS ESCAPED BY '\\' IGNORE 1 LINES) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_[1-5].csv' STRICT_LOAD = 1) ALLOW_MISSING_FILES = 1;
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
mysql>CREATE EXTERNAL TABLE table_1( col_1 int, col_2 int, col_3 int) FILE_FORMAT = (FORMAT csv HEADER ON FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' IGNORE 1 LINES) FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv') VERIFY_KEY_CONSTRAINTS = 0;
mysql>ALTER TABLE table_1 SECONDARY_LOAD;
Learn how to Load Structured Data Manually