Preface

The Oracle Big Data SQL User's Guide describes how to use and manage the Oracle Big Data SQL product.

Audience

This guide is intended for administrators and users of Oracle Big Data SQL, including:

  • Application developers

  • Data analysts

  • Data scientists

  • Database administrators

  • System administrators

The guide assumes that the reader has basic knowledge of Oracle Database single-node and multinode systems, the Hadoop framework, the Linux operating system, and networking concepts.

Documentation Accessibility

For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.

Access to Oracle Support

Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.

Related Documents

See the Oracle Big Data SQL Installation Guide for instructions on installing the product.

See the Oracle Big Data Appliance Owner's Guide for information about using Oracle Big Data SQL with Oracle Big Data Appliance.

Conventions

The following text conventions are used in this document:

  • boldface: Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary.

  • italic: Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.

  • monospace: Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.

  • # prompt: The pound (#) prompt indicates a command that is run as the Linux root user.

Backus-Naur Form Syntax

The syntax in this reference is presented in a simple variation of Backus-Naur Form (BNF) that uses the following symbols and conventions:

  • [ ]: Brackets enclose optional items.

  • { }: Braces enclose a choice of items, only one of which is required.

  • |: A vertical bar separates alternatives within brackets or braces.

  • ...: Ellipses indicate that the preceding syntactic element can be repeated.

  • delimiters: Delimiters other than brackets, braces, and vertical bars must be entered as shown.

  • boldface: Words appearing in boldface are keywords. They must be typed as shown. (Keywords are case-sensitive in some, but not all, operating systems.) Words that are not in boldface are placeholders for which you must substitute a name or value.

Changes in Oracle Big Data SQL 4.0

The following are new features and updates in Oracle Big Data SQL Release 4.0.

Support for Oracle Database 18c as well as Backward Compatibility for Oracle Database 12.2 and 12.1

To take advantage of the new capabilities in Oracle Big Data SQL 4.0, you must use Oracle Database 18c or later. However, use of Oracle Database 12.1 and 12.2 is fully supported (even though you can't leverage the new 4.0 capabilities with these database versions). This backward compatibility enables you to install and administer release 4.0 in a mixed environment that includes both Oracle Database 18c and 12c.

Big Data SQL Query Server

Big Data SQL Query Server is a lightweight, zero-maintenance Oracle Database. It gives you an easy way to query data in Hadoop without the need for a full-fledged Oracle Database service. The service consists of the Oracle SQL query engine only. It provides no persistent storage except for certain categories of metadata that are useful to retain across sessions.

  • Installs Automatically and Requires no Maintenance

    Big Data SQL Query Server is included as part of the standard Oracle Big Data SQL installation. The only thing you need to provide is the address of an edge node where you would like the service installed. The installation itself is fully automated and requires no post-installation configuration.

  • Provides Single and Multi-User Modes

    The service provides two modes: single-user and multi-user. In single-user mode, all users connect to the Query Server as the BDSQL user with the password specified during installation. In multi-user mode, Hadoop cluster users log in to the Query Server using their Kerberos principals (see the connection sketch following this list).

  • Works with Kerberos, Automatically Imports Kerberos Principals

    A Kerberos-secured cluster can support both single user and multi-user mode.

    During installation on a secured cluster, the installer automatically queries the KDC to identify the Kerberos principals and then sets up externally identified users based on the principals. After installation, the administrator can manually add or remove principals.

  • Resets to Initial State After Each Query Server Restart

    Each time Big Data SQL Query Server is restarted, the database instance is reset to its original state. This also happens if a fatal error occurs. This reset enables you to start again from a “clean slate.” A restart preserves external tables (both ORACLE_HIVE and ORACLE_HDFS types), associated statistics, and user-defined views. A restart deletes regular tables containing user data.

  • Can be Managed Through Hortonworks Ambari or Cloudera Manager

    Big Data SQL Query Server is automatically set up as a service in Ambari or Cloudera Manager. You can use these administrative tools to monitor the service, stop and start it, view warning, error, and informational messages, and perform some Query Server operations such as statistics gathering and Hive metadata import.
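
The following is a minimal sketch of connecting to the Query Server from SQL*Plus in each mode. The host name, port, and service name are placeholders rather than documented defaults, and the multi-user example assumes the client is configured for Kerberos external authentication.

    -- Single-user mode: every user connects as the BDSQL account created at
    -- installation time (SQL*Plus prompts for the password set during install).
    CONNECT bdsql@//edge-node.example.com:1521/bdsqlsvc

    -- Multi-user mode on a Kerberos-secured cluster: obtain a ticket with kinit,
    -- then connect externally; the session maps to the externally identified
    -- user that the installer created from your Kerberos principal.
    CONNECT /@//edge-node.example.com:1521/bdsqlsvc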

Query Server is provided under a limited use license described in Oracle Big Data SQL Licensing in the Oracle Big Data SQL Installation Guide.

New ORACLE_BIGDATA Driver for Accessing Object Stores

In addition to ORACLE_HIVE and ORACLE_HDFS, release 4.0 also includes the new ORACLE_BIGDATA driver. This driver enables you to create external tables over data within object stores in the cloud. Currently Oracle Object Store and Amazon S3 are supported. ORACLE_BIGDATA enables you to create external tables over Parquet, Avro, and text files in these environments. For development and testing, you can also use it to access local data files through Oracle Database directory objects. The driver is written in C and does not execute any Java code.

In release 4.0, ORACLE_BIGDATA supports the return of scalar fields from Parquet files. More complex data types, as well as multi-part Parquet files, are not supported at this time. Because the reader does not support complex data types in the Parquet file, the generated column list omits any complex columns from the external table definition. Most types stored in Parquet files are not directly supported as Oracle column types; they are converted according to the mapping tables referenced below.

Oracle Big Data SQL's Smart Scan, including the new aggregation offload capability, works with object stores by offloading data from the object store to processing cells on the Hadoop cluster where Oracle Big Data SQL is installed.

Authentication against object stores is accomplished through a credential object that you create using the DBMS_CREDENTIAL package. You include the name of the credential object as well as a location URI as parameters of the external table create statement.
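
As a minimal sketch of this flow, the statements below create a credential and then an external table over a Parquet file in an object store. The credential name, user name, password, table definition, and location URI are illustrative placeholders.

    -- Create a credential object holding the object store credentials.
    BEGIN
      DBMS_CREDENTIAL.CREATE_CREDENTIAL(
        credential_name => 'OBJ_STORE_CRED',
        username        => 'object_store_user',
        password        => 'auth_token_value');
    END;
    /

    -- Create an external table over a Parquet file, naming the credential
    -- object and the location URI in the access parameters.
    CREATE TABLE sales_obj_ext (
      order_id    NUMBER,
      order_total NUMBER,
      order_date  DATE)
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_BIGDATA
      ACCESS PARAMETERS (
        com.oracle.bigdata.credential.name=OBJ_STORE_CRED
        com.oracle.bigdata.fileformat=parquet)
      LOCATION ('https://objectstorage.example.com/n/mytenancy/b/mybucket/o/sales.parquet'))
    REJECT LIMIT UNLIMITED;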

See Also:

Creating an Oracle External Table for Object Store Access, which provides CREATE statement examples as well as conversion tables mapping Parquet and Avro data types to Oracle data types.

Aggregation Offload

Oracle Big Data SQL can now utilize Oracle In-Memory technology to offload aggregations to the Oracle Big Data SQL cells. Oracle Big Data SQL leverages the processing power of the Hadoop cluster to distribute aggregations across the cluster nodes. The performance gains can be significantly greater than for aggregations that do not offload, especially when there are a moderate number of summary groupings.

Oracle Big Data SQL cells support single-table and multi-table aggregations (for example, dimension tables joining to a fact table). For multi-table aggregations, Oracle Database uses the key vector transform optimization, in which the key vectors are pushed to the cells for the aggregation process. This transformation is useful for star-join SQL queries that use typical aggregation operators (for example, SUM, MIN, MAX, and COUNT), which are common in business queries.
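
As an illustration, a star-join aggregation of the following shape is a candidate for this offload. The table and column names are hypothetical and assume an external fact table over Hadoop data joined to a local dimension table.

    SELECT d.region,
           SUM(f.amount_sold) AS total_sales,
           COUNT(*)           AS order_count
    FROM   sales_ext f
    JOIN   regions_dim d ON d.region_id = f.region_id
    GROUP BY d.region;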

Sentry Authorization in Oracle Big Data SQL

In addition to supporting authorization for HDFS file access, Oracle Big Data SQL supports Sentry policies, which authorize access to Hive metadata. Sentry enables fine-grained control over user access, down to the column level.

See Also:

Sentry Authorization in Big Data SQL in the Oracle Big Data SQL Installation Guide.

Installer Improvements

  • The Jaguar installer provides easy installation of the optional Query Server database. Several new parameters are added to the Jaguar configuration file for the installation of this component.

  • Oracle Big Data SQL now includes its own JDK. You no longer need to download it from the Oracle Technology Network. Other versions of the JDK may be present, but do not change the JDK path that Oracle Big Data SQL uses.

  • The installer now validates the principals entered in the Kerberos section of the configuration file against the corresponding keytab file and flags an error if they do not match.

  • Cluster edge nodes are automatically excluded from the requirements pre-check.

  • In the installation pre-check, hardware factors (cores and memory) are validated only on nodes where Oracle Big Data SQL processing cells will be installed.

  • On the database side, the install now validates the subnet (for InfiniBand connections), the LD_LIBRARY_PATH, and the hostnames of Hadoop systems on the other side of the connection.

  • In an uninstall on the database side, the operation now removes all Oracle Big Data SQL artifacts from the database server and reverts all changes to cellinit.*ora and database parameters.

  • The Jaguar updatenodes operation is deprecated in this release. Use reconfigure instead to change cluster settings, create database-side install bundles, and expand or shrink the configuration.

  • Two new scripts help you determine readiness before installation:

    Prior to installing the Hadoop side of Oracle Big Data SQL, you can run bds_node_check.sh on each DataNode of the cluster to check if the node meets the installation prerequisites.

    Prior to installing on the Oracle Database system, you can run bds-validate-grid-patches.sh to ensure that Oracle Grid includes the patches required by the Oracle Big Data SQL release.

  • The script bds_cluster_node_helper.sh, which you can run on each Hadoop node, provides status on the Oracle Big Data SQL installation on the node and also collects log data and other information useful for maintenance. There are three options for the scope of the log data collection.