DBMS_CLOUD Package Format Options for Parquet and Avro

The format argument in DBMS_CLOUD specifies the format of source files.

The two ways to specify the format argument are:

format => '{"format_option" : "format_value" }'

And:

format => json_object('format_option' value 'format_value')

Example:

format => json_object('type' VALUE 'CSV')

To specify multiple format options, separate the option-value pairs with commas.

For example:

format => json_object('ignoremissingcolumns' value 'true', 'removequotes' value 'true', 'dateformat' value 'YYYY-MM-DD-HH24-MI-SS', 'blankasnull' value 'true')
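
As a minimal sketch of how the two forms relate, the query below builds a format value with json_object from the type and schema options described in Table A-2. The option values are only illustrative. The JSON text it returns carries the same options as the string literal shown in the comment.

```sql
-- Build the format value with json_object (illustrative option values).
SELECT json_object('type' VALUE 'parquet', 'schema' VALUE 'first') AS format_json
FROM dual;

-- The same options written directly as a JSON string literal:
--   format => '{"type" : "parquet", "schema" : "first"}'
```

The json_object form avoids hand-written quoting, which helps when option values are assembled from variables.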

Table A-2 DBMS_CLOUD Parquet and Avro File Format Options

Format Option Description Syntax

type

Specifies the file type.

type : parquet | avro

schema

When schema is set to first or all, the external table columns and data types are automatically derived from the Parquet or Avro file. The column names match those found in the Parquet or Avro file. The data types are converted from Parquet or Avro data types to Oracle data types. All columns are added to the table.

Specifies whether to use the metadata from the first Parquet or Avro file or from all Parquet or Avro files in the file_uri_list to automatically generate the columns and their data types. Use first if all of the files have the same schema. Use all (slower) if the files may have different schemas.

Default: If column_list is specified, then the schema value, if specified, is ignored. If column_list is not specified, then the schema default value is first.

Note:

For partitioned external tables created with the DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE procedure, the schema format option is not available and the column_list parameter must be specified.

schema : first | all
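
The interaction between schema and column_list can be sketched with DBMS_CLOUD.CREATE_EXTERNAL_TABLE. This is an illustrative sketch, not a tested script. The table names, credential name, object storage URI, and column definitions are all hypothetical placeholders, and the columns in the second call are assumed to match column names present in the Parquet files.

```sql
BEGIN
  -- Case 1: column_list is omitted, so the columns and data types are
  -- derived from the metadata of the first Parquet file (schema = first).
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'SALES_PARQUET_EXT',   -- hypothetical table name
    credential_name => 'MY_CRED',             -- hypothetical credential
    file_uri_list   => 'https://objectstorage.example.com/n/my_namespace/b/my_bucket/o/sales*.parquet',  -- placeholder URI
    format          => json_object('type' VALUE 'parquet', 'schema' VALUE 'first')
  );

  -- Case 2: column_list is supplied, so any schema value would be ignored
  -- and only the listed columns are defined on the external table.
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'SALES_PARQUET_EXT2',  -- hypothetical table name
    credential_name => 'MY_CRED',             -- hypothetical credential
    file_uri_list   => 'https://objectstorage.example.com/n/my_namespace/b/my_bucket/o/sales*.parquet',  -- placeholder URI
    column_list     => 'ITEM VARCHAR2(30), PRICE NUMBER',  -- assumed to exist in the files
    format          => json_object('type' VALUE 'parquet')
  );
END;
/
```

If the files matched by the URI might not share one schema, replacing 'first' with 'all' in the first call trades slower table creation for column derivation across every file.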