G.5 Changes in Oracle Big Data Appliance Release 4 (4.6)

The following are changes in Oracle Big Data Appliance Release 4 (4.6):

Software Updates

  • CDH (Cloudera's Distribution including Apache Hadoop) 5.8

  • CM (Cloudera Manager) 5.8.1

  • Oracle Big Data Connectors 4.6

  • Oracle Data Integrator Agent 12.2.1.1 (for Oracle Big Data Connectors)

  • Oracle R Advanced Analytics for Hadoop (ORAAH) 2.6.0 

  • Oracle's R Distribution (ORD) 3.2.0

  • Perfect Balance 2.8

  • Oracle Linux 6.8

  • Big Data Discovery 1.2.2 or 1.3.x (optional)

  • Big Data SQL 3.0.1 (optional)

  • Java JDK 8u101

New Features

  • Cloudera CDH 5.8 and Cloudera Manager 5.8.1

    See the Cloudera Enterprise 5.8.x Documentation for information about CDH and Cloudera Manager 5.8.x.

  • Oracle Big Data SQL Updates

    Oracle Big Data Appliance Release 4.6 includes Oracle Big Data SQL 3.0.1 as a Mammoth installation option. You do not need to download it from the Oracle Software Delivery Cloud to install it on Oracle Big Data Appliance 4.6.

    Oracle Big Data SQL 3.1 will be downloadable from the Oracle Software Delivery Cloud when available.

  • Networking Changes for Greater Configuration Flexibility

    This release provides more modular control over Oracle Big Data Appliance networks by storing the network configuration settings for each rack and each cluster in separate files: <rack_name>-rack-network.json and <cluster_name>-cluster-network.json. When you use the Oracle Big Data Appliance Configuration Generation Utility, these changes enable you to reconfigure the client network or private network of a cluster without affecting the configuration of other servers on the rack. They also allow you to expand a cluster or rack without affecting servers that are not part of the configuration.

    In previous releases, all such information was stored in a single network.json file. This file still exists for backward compatibility with some scripts.
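
    As a small illustration of the per-rack and per-cluster naming, the following sketch builds the two file names from a rack name and a cluster name. It assumes the cluster file follows the same hyphenated pattern as the rack file; the helper functions and the example names (bda1, cluster1) are hypothetical and are not part of any Oracle utility.

    ```shell
    #!/bin/sh
    # Hypothetical helpers: build the per-rack and per-cluster file names
    # from the patterns described above. Names passed in are invented examples.
    rack_network_file() {
        printf '%s-rack-network.json\n' "$1"
    }

    cluster_network_file() {
        # Assumption: a hyphen separates the cluster name from the suffix.
        printf '%s-cluster-network.json\n' "$1"
    }

    rack_network_file "bda1"          # prints bda1-rack-network.json
    cluster_network_file "cluster1"   # prints cluster1-cluster-network.json
    ```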

  • Encryption for Data Spills and some Intermediate Files

    Data that spills to disk outside of HDFS during the following memory-intensive processes can now be protected by encryption:

    • Spark shuffle.

    • Creation of intermediate files in MapReduce encrypted shuffle and spillage during map and reduce operations.

    • Impala SQL queries that generate extremely large result sets.

    When you enable Hadoop Network Encryption, either in Mammoth during a full Oracle Big Data Appliance installation or later with bdacli enable hadoop_network_encryption, encryption is also enabled for Spark, Impala, and MapReduce intermediate files and data spills.

    Note:

    In the case of an Oracle Big Data Appliance upgrade to Release 4.6, the upgrade does not automatically enable this new extension of encryption to data spills, regardless of whether or not Hadoop Network Encryption is already enabled. If you want this feature on a system that has been upgraded to Release 4.6, run bdacli enable hadoop_network_encryption after the upgrade.
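
    The post-upgrade step in the note can be sketched as follows. The bdacli command is the one named above. The run_enable wrapper and the DRY_RUN variable are hypothetical additions for illustration, so that the command can be reviewed before it is run on a cluster.

    ```shell
    #!/bin/sh
    # Command taken from the note above.
    CMD="bdacli enable hadoop_network_encryption"

    # Hypothetical wrapper: with DRY_RUN=1 (the default here) it only prints
    # the command; with DRY_RUN=0 it runs the command on the appliance.
    run_enable() {
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $CMD"
        else
            $CMD
        fi
    }

    run_enable
    ```

    To apply the change on an upgraded system, set DRY_RUN=0 before the script runs, or enter the bdacli command directly.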

  • New reset Command in bdacli

    The bdacli reset command selectively reconfigures Oracle Big Data Appliance networks. It pulls the new settings from the network configuration files generated by the Oracle Big Data Appliance Configuration Generation Utility. The user controls the scope of the reset (server, network, cluster) and which networks are reset.

    See "bdacli reset" in the Oracle Big Data Appliance Utilities section of this guide.

  • Support for Oracle NoSQL Database 4.0.9

Other Changes

  • Mammoth Installation Step Changes

    Some Mammoth installation steps have been reorganized and renamed. An important change to note is that the Kerberos installation now consists of two separate pre- and post-cluster configuration steps in order to enable additional security setups on the cluster. See Mammoth Installation Steps.

  • X6-2 servers are shipped with an Oracle Big Data Appliance v4.5.0 base image.