access_protocol |
Specifies the type of Apache Iceberg table, such as AWS or OCI Object Storage, and what information is used to create the external table, for example information from a data catalog or from a direct metadata URI. |
See CREATE_EXTERNAL_TABLE Procedure for Apache Iceberg, for details on the access_protocol syntax. |
blankasnull |
When set to true, loads fields consisting of spaces as null. |
blankasnull : true
Default value: False |
characterset
Valid with format JSON and COPY_DATA |
Specifies the characterset of source files |
characterset: string
Default value: Database characterset |
columnpath
Only use with format JSON and COPY_DATA |
Array of JSON path expressions that correspond to the fields that need to be extracted from the JSON records. Each of the JSON path expressions in the array should follow the rules described in SQL/JSON Path Expressions. Only use with format JSON and DBMS_CLOUD.COPY_DATA. |
JSON Array of json path expressions expressed in string format. For example: ‘columnpath’ value'["$.WEATHER_STATION_ID", "$.WEATHER_STATION_NAME"]' |
compression
Option valid with JSON data |
Specifies the compression type of the source file. ZIP archiving format is not supported. Specifying the value auto checks for the compression types: gzip, zlib, zstd, bzip2. |
compression: auto\|gzip\|zlib\|zstd\|bzip2
Default value: Null value meaning no compression. |
conversionerrors |
If a row is rejected because of data type conversion errors, the related columns are stored as null or the row is rejected. |
conversionerrors : reject_record \| store_null
Default value: reject_record |
dateformat |
Specifies the date format in the source file. The format option AUTO searches for the following formats: J MM-DD-YYYYBC MM-DD-YYYY YYYYMMDD HHMISS YYMMDD HHMISS YYYY.DDD YYYY-MM-DD |
dateformat : string
Default value: Database date format |
delimiter |
Specifies the field delimiter. To use a special character as the delimiter, specify the HEX value of the ASCII code of the character. For example, the following specifies the TAB character as the delimiter: format => json_object('delimiter' value 'X''9''') |
delimiter : character
Default value \| (pipe character) |
detectfieldorder |
Specifies that the fields in the external data files are in a different order than the columns in the table. Detect the order of fields using the first row of each external data file and map it to the columns of the table. The field names in external data files are compared in case insensitive manner with the names of the table columns. This format option is applicable for the following procedures: DBMS_CLOUD.COPY_DATA
DBMS_CLOUD.CREATE_EXTERNAL_TABLE
DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE
DBMS_CLOUD.CREATE_HYBRID_PART_TABLE
Restrictions for detectfieldorder: Field names in the data file must appear in the first record line and must not contain any white spaces between the field names. The field delimiter in the field names record must be the same as the field delimiter for the data in the file. Quoted field names are not supported. The field names in data files are compared, in case insensitive manner, with the names of the external table columns. Embedded field delimiters are not allowed in the field names. The number of columns in the table must match the number of fields in the data files. This format option is not applicable for Bigdata or Oracle Data Pump formats, as those formats have precise column metadata information in the binary file format. The text formats, CSV, JSON, Parquet, or XML can benefit from this automatic field order detection when the first line contains the field names. See FIELD NAMES and the description for ALL FILES for more information. |
detectfieldorder: true
Default value: false |
enablelogs |
The format option enablelogs is used with the following DBMS_CLOUD procedures: COPY_DATA
COPY_COLLECTION
EXPORT_DATA
enablelogs specifies a boolean value, when set to TRUE, logs are generated. When set to FALSE, logs are not generated.
For example: format => JSON_OBJECT('enablelogs' value FALSE) |
enablelogs: false
Default value: true |
encryption |
The format option encryption specifies the encryption and decryption options to export and import data to and from the Object Store. Use encryption to specify the following parameters to encrypt and decrypt: user_defined_function: Specifies a fully qualified user defined function to decrypt or encrypt the specified BLOB (binary large object). It returns a decrypted or encrypted BLOB. user_defined_function is mutually exclusive with other parameters for encryption.
For example, ADMIN.DECRYPTION_CALLBACK. type: Specifies the DBMS_CRYPTO encryption algorithm to decrypt or encrypt.
type accepts values in the Block Cipher Algorithms + Block Cipher Chaining Modifiers + Block Cipher Padding Modifiers format.
Supported Block Cipher Algorithms are: Supported Block Cipher Chaining Modifiers are: DBMS_CRYPTO.CHAIN_CBC
DBMS_CRYPTO.CHAIN_CFB
DBMS_CRYPTO.CHAIN_ECB
DBMS_CRYPTO.CHAIN_OFB
Supported Block Cipher Padding Modifiers are: DBMS_CRYPTO.PAD_PKCS5
DBMS_CRYPTO.PAD_NONE
DBMS_CRYPTO.PAD_ZERO
DBMS_CRYPTO.PAD_ORCL
The Block Cipher Chaining Modifiers and Block Cipher Padding Modifiers values defaults to DBMS_CRYPTO.CHAIN_CBC and DBMS_CRYPTO.PAD_PKCS5, if you do not specify values for these parameters. The format option encryption is used with the following DBMS_CLOUD procedures: For example: format => JSON_OBJECT('encryption' value json_object ('type' value DBMS_CRYPTO.ENCRYPT_AES256 + DBMS_CRYPTO.CHAIN_CBC + DBMS_CRYPTO.PAD_PKCS5, 'credential_name' value 'ENCRYPTION_CRED')) |
encryption:valueWhere value is a JSON string that provides additional parameters for encryption: type: value
Specifies the encryption type. credential_name: value
Specifies the credential used to store the encryption key. user_defined_function: valueSpecifies a fully qualified user-defined function to decrypt or encrypt the specified BLOB (binary large object). |
endian |
The endian format option is used to explicitly specify the byte order (endianness) of the platform on which the data file was created. Valid values for endian are: Example: format => json_object ('endian' VALUE 'big')
Specifying the byte order ensures Oracle consistently interprets data types whose storage can differ across systems. The following data types are affected by the endian option: INTEGER
UNSIGNED INTEGER
FLOAT
BINARY_FLOAT
DOUBLE
BINARY_DOUBLE
VARCHAR (numeric count only)
VARRAW (numeric count only)
Any character data type in the UTF16 character set Any string specified by the recorddelimiter format option, when using the UTF16 character set Microsoft Windows-based platforms generate little-endian data, while big-endian platforms include Oracle Solaris and IBM zSeries-based Linux. If the endian format option is not specified, the data is assumed to have the same endianness as the platform where the database is running. Oracle Autonomous AI Database runs on Linux with default endian value of little. For UTF-16 data files, a byte order mark (BOM) at the beginning of the file can indicate the file's endianness; if present, this mark overrides any value specified by the endian format option. See The ORACLE_LOADER Access Driver for more information. |
endian: little
Default value: little for Autonomous AI Database. For other Oracle databases, the default value varies depending on the platform. |
header |
Skips the first row of the file because that row contains the column headings. Refer to DBMS_CLOUD.EXPORT_DATA for instructions on exporting data in CSV format with a column header row. The valid values are: false: Includes the header row as any other data row. This is the default value.
true: Skips the header row for external table and data loading.
String to define custom header names: This value is supported to maintain compatibility with data exporting using DBMS_CLOUD.EXPORT_DATA, so that the same header syntax can be used for external tables. For external table creation, specifying HEADERS= behaves the same as specifying HEADERS=TRUE.
The HEADERS=TRUE or HEADERS= option functions identically to SKIPHEADERS=1, treating the first row in the file as a column header and excluding it from data loading. Refer to the SKIPHEADERS format option in this page. Note: The HEADERS option applies only to CSV or character-delimited files. For example: format => JSON_OBJECT('type' value 'csv', 'delimiter' value ':', 'compression' value 'gzip', 'header' value true)format => JSON_OBJECT('delimiter' value ',', 'compression' value 'gzip', 'header' value 'name , age, salary')format => JSON_OBJECT('type' value 'csv', 'compression' value 'gzip', 'header' value false) |
headers: true | false | <list of delimited column names>Default value: `false` |
endquote |
Data can be enclosed between two delimiters, specified with quote and endquote. The quote and endquote characters are removed during loading when specified. For example: format => JSON_OBJECT('quote' value '(', 'endquote' value ')') |
endquote:character
Default value: Null, meaning no endquote. |
escape |
The character “” is used as the escape character when specified. |
escape : true
Default value: false |
field_list |
Identifies the fields in the source files and their data types. The default value is NULL meaning the fields and their data types are determined by the column_list parameter. For more information see field_list in Oracle Database 19c Utilities or Oracle AI Database 26ai Utilities. |
Default value: NULL |
ignoreblanklines
Option valid with JSON data |
Blank lines are ignored when set to true. |
ignoreblanklines : true
Default value: False |
ignoremissingcolumns |
If there are more columns in the field_list than there are in the source files, the extra columns are stored as null. |
ignoremissingcolumns : true
Default value False |
implicit_partition_config |
The implicit_partition_config option enables implicit partitioning when partition_type is set to "hive". Implicit partition optimization is enabled when strict_column_order is set to "true". The criteria for using this optimization includes: - There are no files with full paths with extra strings before the file uri, and between the partition column name path.
- For example, you cannot have files like
https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/oraclebigdatadb/b/test-partitions/xyz/.. or https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/oraclebigdatadb/b/test-partitions/tenant=tenant_01/xyz/dbtype=dbtype_01/year=... The xyz represents extra strings. These files are skipped if the list optimization is specified and the predicate includes the implicit partition columns after these extra strings. - The order of the partition columns in the specification must match the real file paths, and the file paths must include every column in
partition_columns. - For example, you cannot have files like
https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/oraclebigdatadb/b/test-partitions/year=2023/month=01/tenant=tenant_01/dbtype=... . It's not in the correct order of the implicit_columns specification. These files are skipped if the list specification is specified and the predicates include the out of order columns (for example, where tenant='tenant_01' and dbtype='dbtype_1' and year='2023' and ..) - You cannot have files like
https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/oraclebigdatadb/b/test-partitions/tenant=tenant_01/year=2023/month=01/day=01/myfile.csv. Note that dbtype is missing here. If optimization is enabled, these files are skipped, or the query receives an error message depending on the supplied predicates. - The file paths must use the Hive style partitioning naming.
Sub-options for the implicit_partition_config format option include: partition_type: "hive" is the only accepted value, the default is the non-Hive filename path.partition_columns: ["column1","column2",...]strict_column_order: false (default), true For example: dbms_cloud.create_external_table ( table_name => 'partitions1', credential_name => 'MY_CRED', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/oraclebigdatadb/b/test-partitions/', column_list => 'payload varchar2(100), y number, total_sales number, tenant varchar2(10), dbtype varchar(10), year varchar(4), month varchar(2) day varchar(2)', format => '{"type":"parquet", "implicit_partition_config":{ "partition_type":"hive", "strict_column_order":true, "partition_columns":["tenant","dbtype","year","month","day"] } }'); |
implicit_partition_config has the following sub-options:
partition_type: hive
Default value is non-Hive filename path. The only valid value is hive. partition_columns : array of strings
Default value: If partition_type is specified, the column names are derived through automatic discovery of the partition keys in HIVE-style partitioned data. Otherwise, the default is null and implicit partitioning is not enabled. strict_column_order: true
Default value: false |
implicit_partition_columns |
Enable implicit partitioning and specify the partition column names by using the implicit_partition_columns format option with DBMS_CLOUD.CREATE_EXTERNAL_TABLE. Implicit partitioning is enabled in the following ways: -
Use implicit_partition_columns to provide a list of partition columns and specify the implicit_partition_type. For example: format => '{"implicit_partition_type":"hive", "implicit_partition_columns":["country","year","month"]}'
-
Use implicit_partition_columns to provide a list of partition columns without providing the partition type. The partition type is automatically detected as hive or non-hive. For example: format => '{"implicit_partition_columns":["country","year","month"]}'
-
Use implicit_partition_type to provide the type of partition columns without providing a list of partition columns. The automatic discovery of the partition keys in HIVE-style partitioned data is triggered to determine column names. For example: format => '{"partition_type":"hive"}' See **implicit_partition_type** in the following row for the option description and syntax. |
implicit_partition_columns : array of strings
Default value: If implicit_partition_type is specified, the column names are derived through automatic discovery of the partition keys in HIVE-style partitioned data. Otherwise, the default is null and implicit partitioning is not enabled. |
implicit_partition_type |
Enable implicit partitioning and specify the data types of partition columns by using the implicit_partition_type format option with DBMS_CLOUD.CREATE_EXTERNAL_TABLE. See **implicit_partition_columns** in the earlier row for further information on enabling implicit partitioning and examples. |
implicit_partition_type : hive
Default value: If implicit_partition_columns is specified, the type is automatically detected as hive or non-hive. Otherwise, the default is null and implicit partitioning is not enabled. |
jsonpath
Only use with COPY_COLLECTION |
JSON path to identify the document to load. This option is valid only for JSON collection data with DBMS_CLOUD.COPY_COLLECTION. |
jsonpath: string
Default value: Null |
keyassignment Only use with COPY_COLLECTION |
Specifies whether a new collection is created as a mongo-compatible collection or as a SODA collection. When the value is set to embedded_oid, a new collection is created as a mongo-compatible collection. By default this parameter is not set, meaning a new collection is created as a SODA collection. |
keyassignment: embedded_oid
Default: keyassignment is not set |
keypath
Only use with COPY_COLLECTION |
Specifies an attribute in the data to be loaded as the '_id' value. If keypath is specified then you must also specify the keyassignment value as embedded_oid. Set the value to a path, for example, '$.mykey', to pick the value of the path as '_id' value. This parameter is optional and is only valid for loading into mongo-compatible collections. If not specified, Oracle generates a 12-byte unique system ID and populates that as the '_id' attribute, if an '_id' attribute is not already present in the data being loaded. |
keypath: string
Default: keypath is not set. When keypath is set, the default string value is NULL. |
language |
Specifies a language name (for example, FRENCH), from which locale-sensitive information can be derived. |
language: string
Default value: Null See Locale Data in Oracle AI Database Globalization Support Guide for a listing of Oracle-supported languages. |
logdir |
Specifies a string value that determines the directory object name where the logfile_table or badfile_table files are saved. By default, the logdir is not case-sensitive, but the case is reserved when the specified value is enclosed in double-quotes. For example: format => JSON_OBJECT ('logdir' value 'test_log')
The logdir format option specified in the above example saves the logfile_table or badfile_table files in the TEST_LOG directory object. format => JSON_OBJECT ('logdir' value '"test_log"')
The logdir format option specified in the above example saves the logfile_table or badfile_table files in the test_log directory object. |
logdir: string
Default value: DATA_PUMP_DIR |
logprefix |
Specifies a string value that determines the prefix for the logfile_table and badfile_table files. The log table name format is: logprefix$operation_id By default, the logprefix is in upper case, but the case is reserved when the specified value is enclosed in double-quotes. For example: format => JSON_OBJECT ('logprefix' value 'TEST')Log files then use the TEST prefix, such as: TEST$2_LOG and TEST$2_BAD. |
logprefix: string
Default value: COPY |
logretention |
Specifies a positive integer duration, in days, for which the logfile_table and badfile_table files are retained. Valid values: 0 to 99999 For example: format => JSON_OBJECT ('logretention' value 7) |
logretention: number
Default value: 2 |
maxdocsize
This option is valid only with JSON data |
Maximum size of JSON documents. |
maxdocsize: number
Default value: 1 Megabyte Maximum allowed value: 2 Gigabytes |
numericcharacters |
Specifies the characters to use as the group separator and decimal character. decimal_character: The decimal separates the integer portion of a number from the decimal portion. group_separator: The group separator separates integer groups (that is, thousands, millions, billions, and so on). |
numericcharacters: 'decimal_character group_separator'
Default value: ".," See NLS_NUMERIC_CHARACTERS in Oracle AI Database Globalization Support Guide for more information. |
numberformat |
Specifies the number format model. Number format models cause the number to be rounded to the specified number of significant digits. A number format model is composed of one or more number format elements. This is used in combination with numericcharacters. |
numberformat: number_format_model
Default value: is derived from the setting of the NLS_TERRITORY parameter See Number Format Models in SQL Language Reference for more information. |
partition_columns |
The format option partition_columns is used with DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE to specify the column names and data types of partition columns when the partition columns are derived from the file path, depending on the type of data file, structured or unstructured: When the DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE includes the column_list parameter and the data file is unstructured, such as with CSV text files, partition_columns does not include the data type. For example, use a format such as the following for this type of partition_columns specification: '"partition_columns":["state","zipcode"]'
The data type is not required because it is specified in the DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE column_list parameter. When the DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE does not include the column_list parameter and the data files are structured, such as Avro, ORC, or Parquet files, the partition_columns option includes the data type. For example, the following shows a partition_columns specification: '"partition_columns":[ {"name":"country", "type":"varchar2(10)"}, {"name":"year", "type":"number"}, {"name":"month", "type":"varchar2(10)"}]' If the data files are unstructured and the type sub-clause is specified with partition_columns, the type sub-clause is ignored. For object names that are not based on hive format, the order of the partition_columns specified columns must match the order as they appear in the object name in the file_uri_list. |
|
preview_mode |
The preview_mode format option is used with DBMS_CLOUD.CREATE_EXTERNAL_TABLE and DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE to explicitly specify that the external table should be created using the ORACLE_BIGDATA access driver. For example: `sql format => json_object ('preview_mode' VALUE 'true')`The preview_mode format option is supported only for CSV text files. The create table operation will error out when: preview_mode is set to TRUE and the source data files are not CSV text files.
preview_mode is set to TRUE and additional parameters are specified that are incompatible with the ORACLE_BIGDATA access driver or with CSV text file processing. The error message identifies the incompatible option that caused the failure.
|
preview_mode:true
Default value: false |
quote |
Specifies the quote character for the fields, the quote characters are removed during loading when specified. |
quote: character
Default value: Null meaning no quote |
readsize |
Specifies the size of the read buffer to process records. The readsize parameter must be at least as large as the largest record in the data set that's being read. For example: format => JSON_OBJECT ('readsize' value 1000) |
readsize: number
Default value: 10 Megabytes |
recorddelimiter
Option valid with JSON data |
Specifies the record delimiter. By default, DBMS_CLOUD tries to automatically find the correct newline character as the delimiter. It first searches the file for the Windows newline character "\r\n". If it finds the Windows newline character, this is used as the record delimiter for all files in the procedure. If a Windows newline character is not found, it searches for the UNIX/Linux newline character "\n" and if it finds one it uses "\n" as the record delimiter for all files in the procedure. Specify this argument explicitly if you want to override the default behavior, for example: format => json_object('recorddelimiter' VALUE '''\r\n''')
To indicate that there is no record delimiter you can specify a recorddelimiter that does not occur in the input file. For example, to indicate that there is no delimiter, specify the control character 0x01 (SOH) as a value for the recorddelimiter and set the recorddelimiter value to "0x''01''" (this character does not occur in JSON text). For example: format => '{"recorddelimiter" : "0x''01''"}' While importing from CSV files if the end of a line is NULL or not specified, your database does not copy the data successfully because detected newline cannot interpret this as a delimiter. In such cases, set recorddelimiter value to 'X''0A'''. format => '{"recorddelimiter" : 'X''0A'''}'
The recorddelimiter is set once per procedure call. If you are using the default value, detected newline, then all files use the same record delimiter, if one is detected. |
recorddelimiter: character
Default value: detected newline |
regexuri |
The format option regexuri is used with the following DBMS_CLOUD procedures: When the value of regexuri is set to TRUE, you can use wildcards as well as regular expressions in the file names in Cloud source file URIs. The characters "*" and "?" are considered wildcard characters when the regexuri parameter is set to FALSE. When the regexuri parameter is set to TRUE the characters "*" and "?" are part of the specified regular expression pattern. Regular expression patterns are only supported for the file name or subfolder path in your URIs and the pattern matching is identical to that performed by the REGEXP_LIKE function. Regular expression patterns are not supported for directory names. For external tables, this option is only supported with the tables that are created on a file in the Object Storage. For example: format => JSON_OBJECT('regexuri' value TRUE)
See REGEXP_LIKE Condition for more information on REGEXP_LIKE condition. |
regexuri: True
Default value : False |
rejectlimit |
The operation will error out after specified number of rows are rejected. |
rejectlimit: number
Default value: 0 |
removequotes |
Removes any quotes that are around any field in the source file. |
removequotes: true
Default value: False |
skipheaders |
Specifies how many rows should be skipped from the start of the file. |
skipheaders: number
Default value: 0 if not specified, 1 if specified without a value |
territory |
Specifies a territory name to further determine input data characteristics. |
territory: string
Default value: Null See Locale Data in Oracle AI Database Globalization Support Guide for a listing of Oracle-supported territories. |
timestampformat |
Specifies the timestamp format in the source file. The format option AUTO searches for the following formats: YYYY-MM-DD HH:MI:SS.FF YYYY-MM-DD HH:MI:SS.FF3 YYYY-MM-DD HH24:MI:SS.FF3 MM/DD/YYYY HH:MI:SS.FF3 |
timestampformat : string
Default value: Database timestamp format The string can contain wildcard characters such as "$". |
timestampltzformat |
Specifies the timestamp with local timezone format in the source file. The format option AUTO searches for the following formats: DD Mon YYYY HH:MI:SS.FF TZR MM/DD/YYYY HH:MI:SS.FF TZR YYYY-MM-DD HH:MI:SS+/-TZR YYYY-MM-DD HH:MI:SS.FF3 DD.MM.YYYY HH:MI:SS TZR |
timestampltzformat : string
Default value: Database timestamp with local timezone format |
timestamptzformat |
Specifies the timestamp with timezone format in the source file. The format option AUTO searches for the following formats: DD Mon YYYY HH:MI:SS.FF TZR MM/DD/YYYY HH:MI:SS.FF TZR YYYY-MM-DD HH:MI:SS+/-TZR YYYY-MM-DD HH:MI:SS.FF3 DD.MM.YYYY HH:MI:SS TZR |
timestamptzformat: string
Default value: Database timestamp with timezone format |
trimspaces |
Specifies how the leading and trailing spaces of the fields are trimmed. See the description of trim_spec. |
trimspaces: rtrim\| ltrim\| notrim\| lrtrim\| ldrtrim
Default value: notrim |
truncatecol |
If the data in the file is too long for a field, then this option will truncate the value of the field rather than reject the row. |
truncatecol:true
Default value: False |
type |
Specifies the source file type. See the description of CSV in field_definitions Clause If the type is datapump, then the only other valid format option is rejectlimit. If the type is datapump, then the only Object Stores supported are Oracle Cloud Infrastructure Object Storage and Oracle Cloud Infrastructure Object Storage Classic. See DBMS_CLOUD Package Format Options for Avro, ORC, or Parquet for type values avro, orc, or parquet. For JSON data with DBMS_CLOUD.COPY_COLLECTION type has two valid values: json (default) and ejson. For DBMS_CLOUD.COPY_COLLECTION these values both specify that the input is JSON data. The value ejson causes extended objects in the textual JSON input data to be translated to scalar JSON values in the native binary JSON collection. The value json does not perform this transformation and all objects in the input data are converted to binary JSON format. For JSON data with DBMS_CLOUD.COPY_DATA type has one valid value: json. This value specifies that the input is JSON data. |
type:csv\|csv with embedded\|csv without embedded\|avro\|datapump\|orc\|parquet **Note** : Not all DBMS_CLOUD procedures support all of these types.
csv is the same as csv without embedded. Default value: Null For JSON data there are two valid type values for use with DBMS_CLOUD.COPY_COLLECTION: json|ejson In this case the default value is json. For JSON data with DBMS_CLOUD.COPY_DATA, only json is valid. |
unpackarrays
Only use with COPY_COLLECTION |
When set to true, if a loaded document is an array, then the contents of the array are loaded as documents rather than the array itself. This only applies to the top-level array. When set to true, the entire array is inserted as a single document. This option is valid only for JSON collection data with DBMS_CLOUD.COPY_COLLECTION. |
unpackarrays: true
Default value: False |