Changes in Oracle Big Data SQL 4.0

The following are new features and updates in Oracle Big Data SQL Release 4.0.

Support for Oracle Database 18c as well as Backward Compatibility for Oracle Database 12.2 and 12.1

To take advantage of the new capabilities in Oracle Big Data SQL 4.0, you need to use Oracle Database 18c or later. However, Oracle Database 12.1 and 12.2 remain fully supported (even though the new 4.0 capabilities are not available with these database versions). This backward compatibility enables you to install and administer release 4.0 in a mixed environment that includes both Oracle Database 18c and 12c.

Big Data SQL Query Server

Big Data SQL Query Server is a lightweight, zero-maintenance Oracle Database. It gives you an easy way to query data in Hadoop without the need for a full-fledged Oracle Database service. The service consists of the Oracle SQL query engine only. It provides no persistent storage except for certain categories of metadata that are useful to retain across sessions.

  • Installs Automatically and Requires no Maintenance

    Big Data SQL Query Server is included as part of the standard Oracle Big Data SQL installation. The only thing you need to provide is the address of an edge node where you would like the service installed. The installation itself is fully automated and requires no post-installation configuration.

  • Provides Single and Multi-User Modes

    The service provides two modes: single-user and multi-user. In single-user mode, all users connect to the Query Server as the BDSQL user, with the password specified during the installation. In multi-user mode, Hadoop cluster users log in to the Query Server using their Kerberos principals. (A sketch of both connection styles appears after this list.)

  • Works with Kerberos, Automatically Imports Kerberos Principals

    A Kerberos-secured cluster can support both single-user and multi-user modes.

    During installation on a secured cluster, the installer automatically queries the KDC to identify the Kerberos principals and then sets up externally identified users based on those principals. After installation, the administrator can manually add or remove principals. (An example of an externally identified user definition appears after this list.)

  • Resets to Initial State After Each Query Server Restart

    Each time Big Data SQL Query Server is restarted, the database instance is reset to its original state. This also happens if a fatal error occurs. The reset enables you to start again from a “clean slate.” A restart preserves external tables (both ORACLE_HIVE and ORACLE_HDFS types), associated statistics, and user-defined views. A restart deletes regular tables containing user data.

  • Can be Managed Through Hortonworks Ambari or Cloudera Manager

    Big Data SQL Query Server is automatically set up as a service in Ambari or Cloudera Manager. You can use these administrative tools to monitor and stop or start the process; view warning, error, and informational messages; and perform some Big Data SQL Query Server operations, such as statistics gathering and Hive metadata import.
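
As an illustration of the two connection styles, here is a minimal SQL*Plus sketch. The host name, port, and service name are hypothetical placeholders, not values from this document:

    -- Single-user mode: every user connects as the BDSQL user,
    -- supplying the password chosen during installation.
    CONNECT bdsql@//edgenode.example.com:1521/bdsql_svc

    -- Multi-user mode: a Hadoop user with a valid Kerberos ticket
    -- connects as an externally identified user, with no password.
    CONNECT /@//edgenode.example.com:1521/bdsql_svc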
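
The externally identified users that the installer derives from Kerberos principals follow standard Oracle Database syntax. A minimal sketch, using a hypothetical principal name (the installer performs this setup automatically):

    -- Map a database user to a Kerberos principal (hypothetical names).
    CREATE USER analyst1 IDENTIFIED EXTERNALLY AS 'analyst1@EXAMPLE.COM';
    GRANT CREATE SESSION TO analyst1;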

Query Server is provided under a limited-use license, described in Oracle Big Data SQL Licensing in the Oracle Big Data SQL Installation Guide.

New ORACLE_BIGDATA Driver for Accessing Object Stores

In addition to ORACLE_HIVE and ORACLE_HDFS, release 4.0 also includes the new ORACLE_BIGDATA driver. This driver enables you to create external tables over data within object stores in the cloud. Currently, Oracle Object Store and Amazon S3 are supported. ORACLE_BIGDATA enables you to create external tables over Parquet, Avro, and text files in these environments. For development and testing, you can also use it to access local data files through Oracle Database directory objects. The driver is written in C and does not execute any Java code.
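
As a simple illustration of the driver, here is a minimal sketch of an external table over a local text file. The directory object, file name, and columns are hypothetical placeholders:

    -- Assumes a directory object DATA_DIR that points to the location
    -- of a CSV file events.csv (hypothetical names).
    CREATE TABLE events_ext (
      event_id   NUMBER,
      event_name VARCHAR2(100)
    )
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_BIGDATA
      DEFAULT DIRECTORY DATA_DIR
      ACCESS PARAMETERS (
        com.oracle.bigdata.fileformat=csv
      )
      LOCATION ('events.csv')
    )
    REJECT LIMIT UNLIMITED;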

In release 4.0, ORACLE_BIGDATA supports the return of scalar fields from Parquet files. More complex data types, as well as multi-part Parquet files, are not supported at this time. Because the reader does not support complex data types in Parquet files, the generated column list omits any complex columns from the external table definition. Most types stored in Parquet files are not directly supported as Oracle column types; they are converted to supported Oracle data types (see the conversion tables referenced in the See Also link below).

Oracle Big Data SQL's Smart Scan, including the new aggregation offload capability, works with object stores by offloading data from the object store to processing cells on the Hadoop cluster where Oracle Big Data SQL is installed.

Authentication against object stores is accomplished through a credential object that you create using the DBMS_CREDENTIAL package. You include the name of the credential object, as well as a location URI, as parameters in the external table CREATE statement.
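
A sketch of that flow under assumed names; the credential name, credentials, table, columns, and location URI are hypothetical placeholders:

    -- 1. Create a credential object for the object store.
    BEGIN
      DBMS_CREDENTIAL.CREATE_CREDENTIAL(
        credential_name => 'OBJ_STORE_CRED',   -- hypothetical name
        username        => 'storage_user',     -- placeholder account
        password        => 'auth_token_value'  -- placeholder secret
      );
    END;
    /

    -- 2. Reference the credential object and a location URI in the
    --    external table CREATE statement.
    CREATE TABLE sales_ext (
      sale_id NUMBER,
      amount  NUMBER
    )
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_BIGDATA
      ACCESS PARAMETERS (
        com.oracle.bigdata.credential.name=OBJ_STORE_CRED
        com.oracle.bigdata.fileformat=parquet
      )
      LOCATION ('https://objectstorage.example.com/n/ns1/b/bucket1/o/sales.parquet')
    )
    REJECT LIMIT UNLIMITED;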

See Also:

Create an Oracle External Table for Object Store Access, which provides CREATE statement examples as well as tables for converting Parquet and Avro data types to Oracle data types.

Aggregation Offload

Oracle Big Data SQL can now utilize Oracle In-Memory technology to offload aggregations to the Oracle Big Data SQL cells. Oracle Big Data SQL leverages the processing power of the Hadoop cluster to distribute aggregations across the cluster nodes. The performance gains can be significantly greater than for aggregations that are not offloaded, especially when there is a moderate number of summary groupings.

Oracle Big Data SQL cells support single-table and multi-table aggregations (for example, dimension tables joining to a fact table). For multi-table aggregations, Oracle Database uses the key vector transform optimization, in which the key vectors are pushed to the cells for the aggregation process. This transformation type is useful for star-join SQL queries that use typical aggregation operators (for example, SUM, MIN, MAX, and COUNT), which are common in business queries.
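
For instance, a typical star-join aggregation of the kind that can benefit from this offload; the table and column names are hypothetical:

    -- A fact table joined to a dimension table and aggregated with SUM.
    -- The key vectors for the join, and the aggregation work itself,
    -- can be pushed down to the Oracle Big Data SQL cells.
    SELECT d.region,
           SUM(f.amount_sold) AS total_sales
    FROM   sales_fact f
           JOIN region_dim d ON f.region_id = d.region_id
    GROUP  BY d.region;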

Sentry Authorization in Oracle Big Data SQL

In addition to supporting authorization for HDFS file access, Oracle Big Data SQL supports Sentry policies, which authorize access to Hive metadata. Sentry enables fine-grained control over user access, down to the column level.
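
For illustration, a column-level policy of the kind Sentry enforces can be defined through the Hive SQL interface (for example, in Beeline); the role, group, table, and column names here are hypothetical:

    -- Allow a group to read only one column of a Hive table.
    CREATE ROLE analyst_role;
    GRANT ROLE analyst_role TO GROUP analysts;
    GRANT SELECT(customer_name) ON TABLE customers TO ROLE analyst_role;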

See Also:

Sentry Authorization in Big Data SQL in the Oracle Big Data SQL Installation Guide.

Installer Improvements

  • The Jaguar installer provides easy installation of the optional Query Server database. Several new parameters are added to the Jaguar configuration file for the installation of this component (see the configuration sketch after this list).

  • Oracle Big Data SQL now includes its own JDK. You no longer need to download it from the Oracle Technology Network. Other versions of the JDK may be present on the system; however, do not change the JDK path that Oracle Big Data SQL uses.

  • The installer now validates principals entered in the Kerberos section of the configuration file against the corresponding keytab file and flags an error if they do not match.

  • Cluster edge nodes are automatically excluded from the requirements pre-check.

  • In the installation pre-check, hardware factors (cores and memory) are validated only on nodes where Oracle Big Data SQL processing cells will be installed.

  • On the database side, the installer now validates the subnet (for InfiniBand connections), the LD_LIBRARY_PATH, and the hostnames of Hadoop systems on the other side of the connection.

  • On the database side, the uninstall operation now removes all Oracle Big Data SQL artifacts from the database server and reverts all changes to cellinit.*ora files and database parameters.

  • The Jaguar updatenodes operation is deprecated in this release. Use reconfigure instead to change cluster settings, create database-side install bundles, and expand or shrink the configuration.

  • Two new scripts help you determine readiness for installation.

    Prior to installing the Hadoop side of Oracle Big Data SQL, you can run bds_node_check.sh on each DataNode of the cluster to check if the node meets the installation prerequisites.

    Prior to installing on the Oracle Database system, you can run bds-validate-grid-patches.sh to ensure that Oracle Grid includes the patches required by the Oracle Big Data SQL release. (A usage sketch for both scripts appears after this list.)

  • The script bds_cluster_node_helper.sh, which you can run on each Hadoop node, provides status on the Oracle Big Data SQL installation on the node and also collects log data and other information useful for maintenance. There are three options for the scope of the log data collection.
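
As mentioned in the first item above, the Jaguar configuration file gains new parameters for the Query Server. The sketch below shows the general shape of such an entry; the section and parameter names are assumptions for illustration only, so consult the Oracle Big Data SQL Installation Guide for the exact syntax:

    {
      "cluster": {
        "name": "mycluster"
      },
      "edgedb": {
        "node": "edgenode.example.com",
        "enabled": "true"
      }
    }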
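
A usage sketch for the two readiness scripts described above; the working directory is an assumption, since the bundle layout is not described here:

    # On each Hadoop DataNode, before installing the Hadoop side:
    ./bds_node_check.sh

    # On the Oracle Database system, before installing the database side:
    ./bds-validate-grid-patches.sh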