3.1 About Storing to Hadoop and Hybrid Partitioned Tables

Oracle Big Data SQL resources support different methods for off-loading Oracle Database tables: storing to hybrid partitioned tables (HPT) and storing to Oracle Database tablespaces in HDFS.

You can use the Oracle Copy to Hadoop utility to efficiently move data from Oracle Database to Hadoop. That data is stored in Oracle Data Pump format and can then be queried by both Big Data SQL and native Hadoop tools on the cluster. See Use Copy to Hadoop.
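After Copy to Hadoop writes the Data Pump files to HDFS, a Hive external table can expose them to native Hadoop tools. The following is a minimal sketch only: the table name and HDFS path are hypothetical, and it assumes the Copy to Hadoop SerDe and input-format classes are available on the Hive classpath. No column list is given because the SerDe reads the column metadata from the Data Pump files themselves.

```sql
-- Sketch: Hive external table over Data Pump files written by Copy to Hadoop.
-- The table name and LOCATION are examples, not values from this guide.
CREATE EXTERNAL TABLE sales_dp
ROW FORMAT SERDE 'oracle.hadoop.hive.datapump.DPSerDe'
STORED AS
  INPUTFORMAT  'oracle.hadoop.hive.datapump.DPInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/user/oracle/sales_dp';
```

Once the Hive table is defined, the same files can be queried from Hive and, through Big Data SQL, from Oracle Database.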

The table below compares these two off-loading methods: storing data to hybrid partitioned tables and storing data to Oracle Database tablespaces in HDFS.

Table 3-1 Comparison between off-loading methods

Store to Hybrid Partitioned Tables:

- Classical internal partitioned tables are combined with Oracle external partitioned tables to form a hybrid partitioned table. Partitioning breaks tables up into smaller, more manageable pieces, which improves performance and allows data to be off-loaded to less expensive storage.
- Access to external partitions supports all existing table types and the following access drivers: ORACLE_HDFS, ORACLE_HIVE, and ORACLE_BIGDATA.
- HPT can point not only to Copy to Hadoop data pump files, but also to data in more traditional Hadoop file types such as Parquet, ORC, and CSV.
- Data can be accessed by non-Oracle Database tools. Because the data can be stored in CSV, Parquet, Avro, or ORC format, any big data technology can access it seamlessly. With HPT, you store data in the best format for your application.

See Enable Access to Hybrid Partitioned Tables.

Store to Oracle Tablespaces in HDFS:

- Oracle Database tables or partitions are stored within the tablespace in HDFS in their original Oracle-internal format.
- Access is directly through the original Oracle Database tables. External tables are not needed.
- Data is directly available to Oracle Database only; it is not accessible to other processes in Hadoop.

See Store Oracle Tablespaces in HDFS.
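To make the hybrid partitioned table option concrete, the following is a minimal DDL sketch, assuming Oracle Database 19c or later. The table name, column names, directory object, and file name are hypothetical; the ORACLE_BIGDATA access driver with Parquet data is just one of the combinations listed above.

```sql
-- Sketch: hybrid partitioned table in which recent data lives in an internal
-- partition and older, off-loaded data is an external partition over Parquet.
-- sales_hpt, sale_date, data_dir, and sales_2020.parquet are example names.
CREATE TABLE sales_hpt (
  sale_id   NUMBER,
  sale_date DATE,
  amount    NUMBER
)
EXTERNAL PARTITION ATTRIBUTES (
  TYPE ORACLE_BIGDATA
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (com.oracle.bigdata.fileformat = parquet)
)
PARTITION BY RANGE (sale_date) (
  -- External partition: historical data off-loaded to cheaper storage
  PARTITION p_2020 VALUES LESS THAN (DATE '2021-01-01')
    EXTERNAL LOCATION ('sales_2020.parquet'),
  -- Internal partition: current data in ordinary Oracle storage
  PARTITION p_2021 VALUES LESS THAN (DATE '2022-01-01')
);
```

Queries against sales_hpt then span both partitions transparently, which is the behavior the comparison above attributes to HPT.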