2.10 Loading a File

Use the LOAD command in SQLcl to load a comma-separated value file from a local directory or cloud storage location into a table.

2.10.1 LOAD Command

Loads a comma-separated value (CSV) file from a local directory or cloud storage location into a table.

Syntax

LOAD [TABLE] [schema.]table_name { <file-specification> | <cloud-storage-specification> } 
[NEW | SHOW | SHOW_DDL | CREATE | CREATE_DDL]

where

[schema.]table_name identifies the table to load. If the schema is omitted, the table in the connected user's schema is loaded.

file-specification has the following syntax:

{ <fully-qualified-file-name> | <file-name> }
  • fully-qualified-file-name: Identifies the full path to the file to load.

  • file-name: Identifies the file to load. The file must be located in the default path.

cloud-storage-specification has the following syntax:

{ CLOUDSTORAGE | CS | CLOUD_STORAGE } [ <url> | <qualified-name> ] 
  • url: Complete URL for the cloud storage file if a default cloud storage URL is not set using the Cloud Storage command.

  • qualified-name: Name of the object, optionally qualified by the namespace and bucket. The qualified name combined with the URL specified by the Cloud Storage command must fully identify the object URL. If both url and qualified-name are omitted, the default Cloud Storage URL must point to the object.
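A load from cloud storage might look like the following sketch. The URL, namespace, and bucket shown are placeholders; substitute the details of your own object storage location.

--Set the default cloud storage URL (hypothetical values)
cs https://objectstorage.us-ashburn-1.oraclecloud.com/n/mynamespace/b/mybucket/o

--Load the object COUNTRIES_DATA_TABLE.csv from that location
load countries cs COUNTRIES_DATA_TABLE.csv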

NEW creates a table and loads data.

[SHOW | SHOW_DDL] executes the DDL generation phase and shows the DDL.

[CREATE | CREATE_DDL] executes the DDL generation phase and creates the table.

Use SET LOAD and SET LOADFORMAT to specify properties for DDL analysis and generation.

Create table DDL generation pre-scans the data file to determine column properties. Use SET LOAD SCAN <n> to specify the number of rows to scan for DDL generation. The default is 100 rows; set it to 0 to turn off scanning.

Use SET LOAD COL_SIZE to change the generated column sizes. Use SET LOAD MAP_NAMES to map file column names to table column names.
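For example, the scan and column size properties might be set before a load as follows (the values shown are illustrative):

--Scan 500 rows when generating the CREATE TABLE DDL
set load scan 500

--Use the database maximum size for each detected data type
set load col_size max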

For more information about the Cloud Storage command, see Using Cloud Storage.

The defaults for the file format are:

  • The columns are delimited by a comma and may optionally be enclosed in double quotes.

  • Lines are terminated with standard line terminators for Windows, UNIX, or Mac.

  • The file is encoded in UTF-8.

The default load:

  • Processes 50 rows per batch.

  • If AUTOCOMMIT is set in SQLcl, a commit is done every 10 batches.

  • Terminates if more than 50 errors are found.

Use SET LOADFORMAT options to control how the file is read (delimiter, enclosures).

Use SET LOAD options to control how the data is loaded (rows per batch, date formats).
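For example, to read a semicolon-delimited file in larger batches, the format and load options might be set before the LOAD command (the option values shown are illustrative):

--File is delimited by semicolons instead of commas
set loadformat delimiter ;

--Use 100 rows per batch and commit every 5 batches
set load batch_rows 100 batches_per_commit 5

load countries C:\Users\JDOE\SQLcl\COUNTRIES_DATA_TABLE.csv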

Example

The following example shows how to load a file from local storage into a table.

--Create Table "countries"
create table countries(countries_id NUMBER(5),countries_name VARCHAR2(40));
Table COUNTRIES created

--Load file COUNTRIES_DATA_TABLE.csv in local storage to "countries" table
load countries C:\Users\JDOE\SQLcl\COUNTRIES_DATA_TABLE.csv

format csv

column_names on
delimiter ,
enclosure_left "
enclosure_right "
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names

--Number of rows processed: 30
--Number of rows in error: 0
--SUCCESS: Load processed without errors

--Check the number of rows in countries table
select count(*) from countries;

COUNT(*)
----------
30

The following example shows how to load data into a new table EMP.

load emp empfile.csv new

--Create new table and load data into table HR.EMP

csv
column_names on
delimiter ,
enclosures ""
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names

#INFO DATE format detected: RRRR-MM-DD


CREATE TABLE HR.EMP
 (
  EMPLOYEE_ID NUMBER(5),
  FIRST_NAME VARCHAR2(26),
  LAST_NAME VARCHAR2(26),
  EMAIL VARCHAR2(26),
  PHONE_NUMBER VARCHAR2(26),
  HIRE_DATE DATE,
  JOB_ID VARCHAR2(26),
  SALARY NUMBER(9, 2),
  COMMISSION_PCT VARCHAR2(26),
  MANAGER_ID NUMBER(5),
  DEPARTMENT_ID NUMBER(5)
 )
;

#INFO Table created
#INFO Number of rows processed: 21
#INFO Number of rows in error: 0
#INFO Last row processed in final committed batch: 21
SUCCESS: Processed without errors

The following example shows how to create a new table from a local file.

load emp1 empfile.csv create_ddl

--Create new table HR.EMP1

csv
column_names on
delimiter ,
enclosures ""
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names
batch_rows 50
batches_per_commit 10
clean_names transform
column_size rounded
commit on
date_format 
errors 50
map_column_names off
method insert
timestamp_format 
timestamptz_format 
locale English United States
scan_rows 100
truncate off
unknown_columns_fail on

--Pre-scans the date format
#INFO DATE format detected: RRRR-MM-DD

CREATE TABLE SYSTEM.EMP1
 (
  EMPLOYEE_ID NUMBER(5),
  FIRST_NAME VARCHAR2(26),
  LAST_NAME VARCHAR2(26),
  EMAIL VARCHAR2(26),
  PHONE_NUMBER VARCHAR2(26),
  HIRE_DATE DATE,
  JOB_ID VARCHAR2(26),
  SALARY NUMBER(9, 2),
  COMMISSION_PCT VARCHAR2(26),
  MANAGER_ID NUMBER(5),
  DEPARTMENT_ID NUMBER(5)
 )
;

#INFO Table created
SUCCESS: Processed without errors

2.10.1.1 SET LOAD Command

SET LOAD enables you to set options for loading data.

Syntax

SET LOAD default | [options...] 

where

default returns the load options to their default values.

options represents the following:

  • BATCH_ROWS|BATCHROWS <number_of_rows>

    Data loading is done in batches. Specifies the number of rows to include in each batch.

  • BATCHES_PER_COMMIT|BATCHESPERCOMMIT <batches_per_commit>

    Commit after processing the specified number of batches. If the number is 0, the commit happens at the end of the load. If the number is greater than or equal to 0, COMMIT ON is set.

  • CLEAN_NAMES [ TRANSFORM | TRANSFORM128 | QUOTE | QUOTE128 | UNIQUE ]

    Identifies the rule for making table and column names compliant with database identifiers. Names are cleaned before they are mapped, to provide consistency with previous releases. If both CLEAN_NAMES and MAP_COLUMN_NAMES are used, specify the cleaned names in the mapping.

    The standard identifiers are:

    • No longer than 30 or 128 characters.
    • Not a reserved word.
    • Starts with a letter and contains only letters, digits, or one of _$#.
    • Uppercase
    • Names that do not comply must be quoted. Length rules always apply.

    Note:

    Data that is enclosed in quotes will have quotes in the header row removed before names are cleaned.

    TRANSFORM (default)

    Indicates that names are transformed as follows:

    • Names are in uppercase.
    • If the name starts and ends with the quote character, the quotes are removed.
    • Names that are reserved words are appended with a dollar sign ($).
    • Names that start with a number or special character are prefixed with an X.
    • Spaces and hyphens are replaced with underscores (_). $ and # characters are retained.
    • Special characters other than $ and # are replaced with the number sign (#).
    • Names are truncated to 30 or 128 characters depending on database MAX_STRING_SIZE.
    • After names are cleaned, non-unique names within the column set are appended with a unique sequence number. If truncation is required, the sequence number is maintained.

    TRANSFORM128

    Applies all transform rules. Names may be up to 128 characters.

    QUOTE

    Quote non-compliant names and shorten to 30 or 128 characters depending on database MAX_STRING_SIZE.

    QUOTE128

    Quote non-compliant names. Names may be 128 characters.

    UNIQUE

    Compatibility option with previous releases of load service. Names that are not unique within the column set are appended with a unique sequential number. Truncation is not provided.
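As an illustration of the TRANSFORM rules, a hypothetical file header row would be cleaned as follows:

--File header:           Order Id,order-date,2Price,desc
--Cleaned by TRANSFORM:  ORDER_ID,ORDER_DATE,X2PRICE,DESC$

Here "Order Id" contains a space, "order-date" a hyphen, "2Price" starts with a number, and "desc" is a reserved word.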

  • COLUMN_SIZE|COLUMNSIZE|COLSIZE {ACTUAL|ROUND|ROUNDED|MAX|MAXIMUM}

    Create table column size strategy.

    ACTUAL uses the largest size found during the scan.

    ROUND|ROUNDED uses a size slightly larger than the largest size found during the scan.

    MAX|MAXIMUM uses the database maximum size for the data type that was detected.

  • COMMIT {ON|OFF}

    Enable or disable data commits.

  • DATE|DATE_FORMAT|DATEFORMAT format_mask 

    The format of all DATE data type columns being loaded. Specify no format_mask or DEFAULT to use the database default.

    For DATE columns, if the format is not set and SCAN_ROWS = 0, the data is not scanned for a valid mask.

  • ERRORS {number_of_rows | UNLIMITED | -1}

    Indicates the number of error rows allowed. If this number is exceeded, the load is terminated.

    -1 and UNLIMITED indicate no error limit.

    All rows in a batch may be in error if any row fails.

  • LOCALE { <language country> | DEFAULT | "" }

    Specify the locale language and, optionally, the country.

    DEFAULT or "" sets the default locale.

  • MAP_COLUMN_NAMES|MAPCOLUMNNAMES|MAPNAMES { OFF| (<file-col-name>=<table-col-name>,...) }

    Provide a mapping from the column names specified in the file to column names in the table.
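For example, file columns empno and ename (hypothetical names) could be mapped to table columns as follows:

set load map_column_names (empno=EMPLOYEE_ID, ename=LAST_NAME)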

  • METHOD INSERT

    Method to use for data loads.

  • SCAN_ROWS|SCANROWS|SCAN <0-5000> 

    Identify the number of rows to scan for create table generation. The default is 100 rows; set to 0 to turn off scanning.

  • TIMESTAMP|TIMESTAMP_FORMAT|TIMESTAMPFORMAT format_mask

    The format of all TIMESTAMP data type columns being loaded. Specify no format_mask or DEFAULT to use the database default. For TIMESTAMP columns, if the format is not set and SCAN_ROWS is not equal to 0, the data is scanned for a valid mask.

  • TIMESTAMPTZ|TIMESTAMPTZ_FORMAT|TIMESTAMPTZFORMAT format_mask

    The format of all TIMESTAMPTZ data type columns being loaded. Specify no format_mask or DEFAULT to use the database default. For TIMESTAMPTZ columns, if the format is not set and SCAN_ROWS is not equal to 0, the data is scanned for a valid mask.

  • TRUNCATE {OFF|ON}

    TRUNCATE ON truncates the table before loading.

  • UNKNOWN_COLUMNS_FAIL|UNKNOWNCOLUMNSFAIL|UNKNOWNFAIL {ON|OFF}

    ON: Terminates the load if any columns in the file do not map to a column in the table.

    OFF: Allows the load to proceed when columns in the file do not map to a column in the table.
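For example, to let a load proceed even when the file contains columns that have no match in the table, turn the option off before loading (reusing the table and file from the earlier example):

set load unknown_columns_fail off
load countries C:\Users\JDOE\SQLcl\COUNTRIES_DATA_TABLE.csv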