2.11 Loading a File

Use the LOAD command in SQLcl to load a comma-separated value file from a local directory or cloud storage location into a table.

2.11.1 LOAD Command

Loads a comma-separated value (csv) file from a local directory or cloud storage location into a table.

Syntax

LOAD [TABLE] [schema.]table_name { <file-specification> | <cloud-storage-specification> } 
[NEW | SHOW | SHOW_DDL | CREATE | CREATE_DDL]

where

[schema.]table_name identifies the table to load. If schema is omitted, the table in the schema of the connected user is loaded.

file-specification has the following syntax:

{ <fully-qualified-file-name> | <file-name> }
  • fully-qualified-file-name: Identifies the full path to the file to load.

  • file-name: Identifies the file to load. The file must be located in the default path.

cloud-storage-specification has the following syntax:

{ CLOUDSTORAGE | CS | CLOUD_STORAGE } [ <url> | <qualified-name> ] 
  • url: Complete URL for the cloud storage file if a default cloud storage URL is not set using the Cloud Storage command.

  • qualified-name: Name of the object, optionally qualified by the namespace and bucket. The qualified name combined with the URL specified by the Cloud Storage command must fully identify the object URL. If both url and qualified-name are omitted, the default Cloud Storage URL set by the Cloud Storage command must fully identify the object.
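As an illustration, a cloud storage load might look like the following; the URL, namespace, bucket, and object names are placeholders, not values from this guide:

```sql
-- Load using a complete object URL
load countries cs https://objectstorage.us-ashburn-1.oraclecloud.com/n/mynamespace/b/mybucket/o/countries.csv

-- If a default URL has been set with the Cloud Storage command,
-- the object can be identified by its qualified name alone
load countries cs countries.csv
```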

NEW creates a table and loads data.

[SHOW | SHOW_DDL] executes the DDL generation phase and shows the DDL.

[CREATE | CREATE_DDL] executes the DDL generation phase and creates the table.

Use SET LOAD and SET LOADFORMAT to specify properties for DDL analysis and generation.

Create table DDL generation pre-scans the data file to determine column properties. Use SET LOAD SCAN <n> to specify the number of rows to scan for DDL generation. The default is 100. To turn off scanning, set it to 0.

Use SET LOAD COL_SIZE to change column sizes that are generated. Use SET LOAD MAP_NAMES to map file column names to table column names.
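Taken together, these scan and mapping properties might be set as follows; the file and table column names here are hypothetical:

```sql
-- Scan 500 rows when generating DDL and round generated column sizes up
set load scan 500
set load col_size rounded

-- Map the file column EMP_NAME to the table column LAST_NAME
set load map_names (emp_name=last_name)
```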

For more information about the Cloud Storage command, see Using Cloud Storage.

The defaults for the file format are:

  • The columns are delimited by a comma and may optionally be enclosed in double quotes.

  • Lines are terminated with standard line terminators for Windows, UNIX, or Mac.

  • The file is encoded in UTF8.

The default load:

  • Processes with 50 rows per batch.

  • If AUTOCOMMIT is set in SQLcl, a commit is done every 10 batches.

  • The load is terminated if more than 50 errors are found.

Use SET LOADFORMAT to set options for reading the file (delimiter, enclosures).

Use SET LOAD to set options for loading the data (rows per batch, date formats).

Example

The following example shows how to load a file from local storage into a table.

--Create Table "countries"
create table countries(countries_id NUMBER(5),countries_name VARCHAR2(40));
Table COUNTRIES created

--Load file COUNTRIES_DATA_TABLE.csv in local storage to "countries" table
load countries C:\Users\JDOE\SQLcl\COUNTRIES_DATA_TABLE.csv

format csv

column_names on
delimiter ,
enclosure_left "
enclosure_right "
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names

--Number of rows processed: 30
--Number of rows in error: 0
SUCCESS: Load processed without errors

--Check the number of rows in countries table
select count(*) from countries;

COUNT(*)
---------
30

The following example shows how to load data into a new table EMP.

load emp empfile.csv new

--Create new table and load data into table HR.EMP

csv
column_names on
delimiter ,
enclosures ""
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names

#INFO DATE format detected: RRRR-MM-DD


CREATE TABLE HR.EMP
 (
  EMPLOYEE_ID NUMBER(5),
  FIRST_NAME VARCHAR2(26),
  LAST_NAME VARCHAR2(26),
  EMAIL VARCHAR2(26),
  PHONE_NUMBER VARCHAR2(26),
  HIRE_DATE DATE,
  JOB_ID VARCHAR2(26),
  SALARY NUMBER(9, 2),
  COMMISSION_PCT VARCHAR2(26),
  MANAGER_ID NUMBER(5),
  DEPARTMENT_ID NUMBER(5)
 )
;

#INFO Table created
#INFO Number of rows processed: 21
#INFO Number of rows in error: 0
#INFO Last row processed in final committed batch: 21
SUCCESS: Processed without errors

The following example shows how to create a new table from a local file.

load emp1 empfile.csv create_ddl

--Create new table EMP1

csv
column_names on
delimiter ,
enclosures ""
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names
batch_rows 50
batches_per_commit 10
clean_names transform
column_size rounded
commit on
date_format 
errors 50
map_column_names off
method insert
timestamp_format 
timestamptz_format 
locale English United States
scan_rows 100
truncate off
unknown_columns_fail on

--Pre-scans the date format
#INFO DATE format detected: RRRR-MM-DD

CREATE TABLE SYSTEM.EMP1
 (
  EMPLOYEE_ID NUMBER(5),
  FIRST_NAME VARCHAR2(26),
  LAST_NAME VARCHAR2(26),
  EMAIL VARCHAR2(26),
  PHONE_NUMBER VARCHAR2(26),
  HIRE_DATE DATE,
  JOB_ID VARCHAR2(26),
  SALARY NUMBER(9, 2),
  COMMISSION_PCT VARCHAR2(26),
  MANAGER_ID NUMBER(5),
  DEPARTMENT_ID NUMBER(5)
 )
;

#INFO Table created
SUCCESS: Processed without errors

2.11.1.1 SET LOAD Command

SET LOAD enables you to set options for loading data when using the LOAD command.

Syntax

SET LOAD default | [options...] 

where

default resets the load method properties to their default values.

options represents the following:

  • BATCH_ROWS|BATCHROWS <number_of_rows>

    Data loading is done in batches. Specifies the number of rows to include in each batch.

  • BATCHES_PER_COMMIT|BATCHESPERCOMMIT <number_of_batches>

    Commit after processing the specified number of batches. If the number is equal to 0, the commit happens at the end of the load. If the number is greater than or equal to 0, COMMIT ON is set.

  • CLEAN_NAMES [ TRANSFORM | TRANSFORM128 | QUOTE | QUOTE128 | UNIQUE ]

    Identifies the rule for making table and column names compliant with database identifiers. Names are cleaned before they are mapped, to provide consistency with previous releases. If both CLEAN_NAMES and MAP_COLUMN_NAMES are used, then the cleaned names must be specified in the mapping.

    The standard identifiers are:

    • No longer than 30 or 128 characters.
    • Not a reserved word.
    • Starts with a letter and contains only letters, digits, or one of _$#.
    • Uppercase.

    Names that do not comply must be quoted. Length rules always apply.

    Note:

    Data that is enclosed in quotes will have quotes in the header row removed before names are cleaned.

    TRANSFORM (default)

    Indicates that names are transformed as follows:

    • Names are in uppercase.
    • If the name starts and ends with the quote character, the quotes are removed.
    • Names that are reserved words are appended with a dollar sign ($).
    • Names that start with a number or special character are prefixed with an X.
    • Spaces and hyphens are replaced with underscores (_). $ and # characters are retained.
    • Special characters other than $ and # are replaced with the number sign (#).
    • Names are truncated to 30 or 128 characters depending on database MAX_STRING_SIZE.
    • After names are cleaned, non-unique names within the column set are appended with a unique sequence number. If truncation is required, the sequence number is maintained.

    TRANSFORM128

    Applies all transform rules. Names may be 128 characters.

    QUOTE

    Quote non-compliant names and shorten to 30 or 128 characters depending on database MAX_STRING_SIZE.

    QUOTE128

    Quote non-compliant names. Names may be 128 characters.

    UNIQUE

    Compatibility option with previous releases of load service. Names that are not unique within the column set are appended with a unique sequential number. Truncation is not provided.

  • COLUMN_SIZE|COLUMNSIZE|COLSIZE {ACTUAL|ROUND|ROUNDED|MAX|MAXIMUM}

    Create table column size strategy.

    ACTUAL uses the largest size found during the scan.

    ROUND|ROUNDED uses a size slightly larger than the largest size found during the scan.

    MAX|MAXIMUM uses the database maximum size for the data type that was detected.

  • COMMIT {ON|OFF}

    Enable or disable data commits.

  • DATE|DATE_FORMAT|DATEFORMAT format_mask

    The format of all DATE data type columns loaded. Specify no format_mask or DEFAULT to use the database default.

    For DATE columns, if the format is not set and SCAN_ROWS is equal to 0, the data is not scanned for a valid mask.

  • ERRORS {number_of_rows|UNLIMITED|-1}

    Indicates the number of error rows allowed. If this number is exceeded, the load is terminated.

    -1 and UNLIMITED indicate no error limit.

    All rows in a batch may be in error if any row in the batch fails.

  • LOCALE { <language country> | DEFAULT | "" }

    Specify locale language and optionally country.

    DEFAULT|"" : Set to default locale.

  • MAP_COLUMN_NAMES|MAPCOLUMNNAMES|MAPNAMES { OFF| (<file-col-name>=<table-col-name>,...) }

    Provide a mapping from the column names specified in the file to column names in the table.

  • METHOD INSERT

    Method to use for data loads.

  • SCAN_ROWS|SCANROWS|SCAN <1-5000> 

    Identify the number of rows to scan for create table generation. Default is 100 rows.

  • TIMESTAMP|TIMESTAMP_FORMAT|TIMESTAMPFORMAT format_mask

    The format of all TIMESTAMP data type columns being loaded. Specify no format_mask or DEFAULT to use the database default. For TIMESTAMP columns, if the format is not set and SCAN_ROWS is not equal to 0, the data is scanned for a valid mask.

  • TIMESTAMPTZ|TIMESTAMPTZ_FORMAT|TIMESTAMPTZFORMAT format_mask

    The format of all TIMESTAMPTZ data type columns being loaded. Specify no format_mask or DEFAULT to use the database default. For TIMESTAMPTZ columns, if the format is not set and SCAN_ROWS is not equal to 0, the data is scanned for a valid mask.

  • TRUNCATE {OFF|ON}

    TRUNCATE ON truncates the table before loading.

  • UNKNOWN_COLUMNS_FAIL|UNKNOWNCOLUMNSFAIL|UNKNOWNFAIL {ON|OFF}

    ON: Terminates the load if any columns in the file do not map to a column in the table.

    OFF: Allows the load to proceed when columns in the file do not map to a column in the table.
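As a sketch of how several SET LOAD options combine in a single command (the values are illustrative, not recommendations):

```sql
-- Load in batches of 100 rows, commit every 5 batches,
-- allow at most 10 error rows, and truncate the table first
set load batch_rows 100 batches_per_commit 5 errors 10 truncate on
```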

2.11.1.2 SET LOADFORMAT Command

SET LOADFORMAT enables you to set format properties for loading data when using the LOAD command.

Syntax

SET LOADFORMAT [ default|csv|delimited|html|insert|json|json-formatted|loader|t2|xml] [options...]

where

  • default: Load format properties return to default values.
  • csv: Comma-separated values.
  • delimited: (csv synonym) Delimited format, comma-separated values by default.
  • html: Hypertext Markup Language. For the UNLOAD command only.
  • insert: SQL insert statements. For the UNLOAD command only.
  • json: JavaScript Object Notation. For the UNLOAD command only.
  • json-formatted: Pretty-formatted JSON. For the UNLOAD command only.
  • loader: Oracle SQL Loader format. For the UNLOAD command only.
  • t2: T2 Metrics. For the UNLOAD command only.
  • xml: Extensible Markup Language. For the UNLOAD command only.

options represent the following clauses:

  • COLUMN_NAMES|COLUMNNAMES|NAMES {ON|OFF}: Header row with column names.
  • DELIMITER {separator}: Delimiter separating fields in the record.
  • DOUBLE [OFF]: (Import only) Embedded right enclosures are doubled. OFF indicates embedded right enclosures are not doubled, and embedded right enclosures can lead to unexpected results.
  • ENCLOSURES {enclosures|OFF}: Optional left and right enclosures.
    • OFF indicates no enclosures.
    • If 1 character is specified, sets both the left and right enclosures to this value.
    • If 2 or more characters are specified, sets the left enclosure to the first character, the right enclosure to the second character, and ignores the remaining characters.
    • To set multiple-character enclosures, use SET LOADFORMAT ENCLOSURE_LEFT and ENCLOSURE_RIGHT.
  • ENCODING {encoding|OFF|""}: Encoding of load file. OFF and "" reset to default encoding for environment.
  • LEFT|ENCLOSURE_LEFT|ENCLOSURELEFT {enclosure|OFF}: Set a left enclosure of one or more characters. If no ENCLOSURE_RIGHT is specified, it is used for both the left and right enclosures. OFF indicates no enclosures.
  • RIGHT|ENCLOSURE_RIGHT|ENCLOSURERIGHT {enclosure|OFF}: Set a right enclosure of one or more characters. OFF indicates no right enclosure.
  • ROW_LIMIT|ROWLIMIT|LIMIT {number_of_rows|OFF|""}: Maximum number of rows to read, including the header. OFF and "" indicate no limit.
  • SKIP|SKIP_ROWS|SKIPROWS {number_of_rows|OFF|""}: Number of rows to skip.
  • [SKIP_AFTER_NAMES|SKIPAFTERNAMES|AFTER] | [SKIP_BEFORE_NAMES|SKIPBEFORENAMES|BEFORE]: Skip the rows after or before the column names (header) row.
  • TERM|ROW_TERMINATOR {terminator|""|DEFAULT|CR|CRLF|LF}: Character(s) indicating the end of a row. If the file contains standard line-end characters, the terminator does not need to be specified.
    • "" or DEFAULT specifies the default (any standard terminator) for the LOAD command.
    • "" or DEFAULT specifies the environment default for the UNLOAD command.
    • CRLF specifies WINDOWS terminator, generally for the UNLOAD command.
    • LF specifies UNIX terminator, generally for the UNLOAD command.
    • CR specifies MAC terminator, generally for the UNLOAD command.

Examples

SQL> set loadformat delimited
7369,"SMITH","CLERK",7902,17-DEC-80,800,,20,5555555555554444

SQL> set loadformat delimited enclosures <> line_end {eol}
7369,<SMITH>,<CLERK>,7902,17-DEC-80,800,,20,5555555555554444{eol}

SQL> set loadformat default (restore default settings)
7369,"SMITH","CLERK",7902,17-DEC-80,800,,20,5555555555554444