DBMS_CLOUD Package Format Options for Avro, ORC, Parquet

The format argument in DBMS_CLOUD specifies the format of source files.

The two ways to specify the format argument are:

format => '{"format_option" : “format_value” }'  

And:

format => json_object('format_option' value 'format_value'))

Examples:

format => json_object('type' VALUE 'CSV')

To specify multiple format options, separate the values with a ",".

For example:

format => json_object('ignoremissingcolumns' value 'true', 'removequotes' value 'true', 'dateformat' value 'YYYY-MM-DD-HH24-MI-SS', 'blankasnull' value 'true')

Note:

Support for ORC format requires Oracle Database 19c. The format option type with value orc is not supported with Oracle Database 18c.
Format Option Description Syntax

type

Specifies the file type.

type : avro | orc | parquet

schema

When schema is set to first or all, the external table columns and data types are automatically derived from the ORC, Parquet or Avro file metadata.

The column names will match those found in ORC, Parquet, or Avro. The data types are converted from ORC, Parquet, or Avro data types to Oracle data types. All columns are added to the table.

The value first specifies to use the metadata from the first ORC, Parquet, or Avro file in the file_uri_list to auto generate the columns and their data types. Use first if all of the files have the same schema.

The value all specifies to use the metadata from all ORC, Parquet,or Avro files in the file_uri_list to auto generate the columns and their data types. Use all (slower) if the files may have different schemas.

Default: If column_list is specified, then the schema value, if specified is ignored. If column_list is not specified then the schema default value is first.

Note: For ORC, Parquet, or Avro format files the schema format option is not available and the column_list parameter must be specified for partitioned external tables using the DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE procedure.

schema : first | all