Oracle NoSQL to Parquet Data Type Mapping

Describes the mapping of Oracle NoSQL data types to Parquet data types.

NoSQL Type Parquet Type
BOOLEAN BOOLEAN
INTEGER INT32
LONG INT64
FLOAT DOUBLE
DOUBLE DOUBLE
BINARY BINARY
FIXED_BINARY BINARY
STRING BINARY(STRING)
ENUM BINARY(STRING), or BINARY(ENUM) if the logical ENUM type is configured

UUID BINARY(STRING), or FIXED_BINARY(16) if the logical UUID type is configured

TIMESTAMP(p) INT64(TIMESTAMP(p))
NUMBER DOUBLE
field_name ARRAY(T)
group field_name (LIST) {
    repeated group list {
        required T element;
    }
}
field_name MAP(T)
group field_name (MAP) {
    repeated group key_value (MAP_KEY_VALUE) {
        required binary key (STRING);
        required T value;
    }
}
field_name RECORD(K₁ T₁ N₁, K₂ T₂ N₂, ...)

where:

K = Key name

T = Type

N = Nullable or not

group field_name {
    for each field i: Nᵢ == true ? optional Tᵢ Kᵢ : required Tᵢ Kᵢ
}
JSON BINARY(STRING), or BINARY(JSON) if the logical JSON type is configured
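
The ARRAY, MAP, and RECORD rows above are templates rather than fixed types, so a small sketch may help show how they expand. The Java below is an illustration only: the class `ParquetSchemaSketch`, its `Field` helper, and the method names are hypothetical and are not part of any Oracle NoSQL or Parquet API. It simply renders the schema text from the table, choosing `optional` or `required` for each record field based on its nullability.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that renders the schema templates from the table as text.
final class ParquetSchemaSketch {

    // Hypothetical description of one RECORD field: key name (K), type (T), nullable (N).
    static final class Field {
        final String name;
        final String parquetType;
        final boolean nullable;

        Field(String name, String parquetType, boolean nullable) {
            this.name = name;
            this.parquetType = parquetType;
            this.nullable = nullable;
        }
    }

    // ARRAY(T): a LIST group wrapping a repeated group of required elements.
    static String arrayOf(String fieldName, String elementType) {
        return "group " + fieldName + " (LIST) {\n"
             + "    repeated group list {\n"
             + "        required " + elementType + " element;\n"
             + "    }\n"
             + "}";
    }

    // MAP(T): a MAP group of repeated key/value pairs; keys are always strings.
    static String mapOf(String fieldName, String valueType) {
        return "group " + fieldName + " (MAP) {\n"
             + "    repeated group key_value (MAP_KEY_VALUE) {\n"
             + "        required binary key (STRING);\n"
             + "        required " + valueType + " value;\n"
             + "    }\n"
             + "}";
    }

    // RECORD: nullable fields become optional, all others become required.
    static String recordOf(String fieldName, List<Field> fields) {
        StringBuilder sb = new StringBuilder("group " + fieldName + " {\n");
        for (Field f : fields) {
            sb.append("    ")
              .append(f.nullable ? "optional " : "required ")
              .append(f.parquetType).append(' ').append(f.name).append(";\n");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        List<Field> fields = new ArrayList<>();
        fields.add(new Field("id", "int64", false));    // LONG, not nullable
        fields.add(new Field("score", "double", true)); // DOUBLE, nullable
        System.out.println(arrayOf("counts", "int32"));
        System.out.println(mapOf("totals", "int64"));
        System.out.println(recordOf("entry", fields));
    }
}
```

In this sketch, the non-nullable `id` field is rendered as `required` and the nullable `score` field as `optional`, matching the RECORD rule in the table.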

Note:

When a NoSQL NUMBER value is converted to the Parquet DOUBLE type, precision may be lost if the value cannot be represented exactly as a Double. If the magnitude of the number is too large to represent as a Double, the value is converted to Double.NEGATIVE_INFINITY or Double.POSITIVE_INFINITY.
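
The behavior described in this note can be reproduced in plain Java. The following is a minimal sketch, assuming that a NUMBER value is available as a `java.math.BigDecimal` and that the conversion behaves like `BigDecimal.doubleValue()`. The class `NumberToDoubleSketch` and its method names are hypothetical and only illustrate the effect; they are not the converter's actual code.

```java
import java.math.BigDecimal;

// Hypothetical illustration of NUMBER -> DOUBLE conversion effects.
final class NumberToDoubleSketch {

    // Assumption: the conversion behaves like BigDecimal.doubleValue(), which
    // returns an infinity when the magnitude exceeds the range of a double.
    static double toParquetDouble(BigDecimal number) {
        return number.doubleValue();
    }

    // True when converting to double and back changes the numeric value.
    static boolean losesPrecision(BigDecimal number) {
        double d = number.doubleValue();
        if (Double.isInfinite(d)) {
            return true; // out of range: the original value is not recoverable
        }
        // new BigDecimal(double) yields the exact binary value of the double.
        return new BigDecimal(d).compareTo(number) != 0;
    }

    public static void main(String[] args) {
        System.out.println(losesPrecision(new BigDecimal("0.5")));
        System.out.println(losesPrecision(new BigDecimal("0.1")));
        System.out.println(toParquetDouble(new BigDecimal("1e400")));
        System.out.println(toParquetDouble(new BigDecimal("-1e400")));
    }
}
```

Under these assumptions, 0.5 survives the conversion exactly, 0.1 does not because its nearest Double differs slightly from the decimal value, and values such as 1e400 or -1e400 fall outside the Double range and become infinities. A check like `losesPrecision` could be run before export if exact values matter.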