Preface

This guide describes Oracle Big Data Appliance, which is used for acquiring, organizing, and analyzing very large data sets. It includes information about hardware operations, site planning and configuration, and physical, electrical, and environmental specifications.

This preface contains the following topics:

Audience

This guide is intended for Oracle Big Data Appliance customers and those responsible for data center site planning, installation, configuration, and maintenance of Oracle Big Data Appliance.

Documentation Accessibility

For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.

Access to Oracle Support

Oracle customers who have purchased support have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.

Related Documentation

Oracle Big Data Appliance Documentation Library

In the Big Data Documentation Portal of the Oracle Help Center, you can find a link to the complete Oracle Big Data Appliance library for your release of the product. The library includes the following core documents as well as documents for products that are used in conjunction with Oracle Big Data Appliance:

  • Oracle Big Data Appliance Owner’s Guide (this guide)

  • Oracle Big Data Appliance Software User’s Guide

  • Oracle Big Data Connectors User’s Guide

  • Oracle Big Data Appliance Safety and Compliance Guide

  • Oracle Big Data Appliance Licensing Information User Manual

Note:

The Oracle Big Data Appliance Licensing Information User Manual is the consolidated reference for licensing information for Oracle and third-party software included in the Oracle Big Data Appliance product. Refer to this manual or contact Oracle Support if you have questions about licensing.

Documentation for Affiliated Products

The following Oracle libraries contain hardware information for Oracle Big Data Appliance. Links to these libraries are available through the Big Data Documentation Portal at

https://docs.oracle.com/en/bigdata/

Conventions

The following text conventions are used in this document:

Convention    Meaning

boldface      Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary.

italic        Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.

monospace     Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.

# prompt      The pound (#) prompt indicates a command that is run as the Linux root user.

Backus-Naur Form Syntax

The syntax in this reference is presented in a simple variation of Backus-Naur Form (BNF) that uses the following symbols and conventions:

Symbol or Convention    Description

[ ]           Brackets enclose optional items.

{ }           Braces enclose a choice of items, only one of which is required.

|             A vertical bar separates alternatives within brackets or braces.

...           Ellipses indicate that the preceding syntactic element can be repeated.

delimiters    Delimiters other than brackets, braces, and vertical bars must be entered as shown.

boldface      Words appearing in boldface are keywords. They must be typed as shown. (Keywords are case-sensitive in some, but not all, operating systems.) Words that are not in boldface are placeholders for which you must substitute a name or value.

Changes in Oracle Big Data Appliance Release 4 (4.8)

Release 4.8 includes the following feature changes and software updates.

Software Updates

  • CDH (Cloudera's Distribution including Apache Hadoop) 5.10.1

  • CM (Cloudera Manager) 5.10.1

  • Oracle Big Data Connectors 4.8

  • Big Data SQL 3.1 (earlier releases are not supported on Oracle Big Data Appliance 4.8)

  • Oracle Big Data Spatial & Graph 2.2.0

  • MySQL Enterprise Edition 5.7.17

  • Perfect Balance 2.10.0

  • Java JDK 8u121

  • Oracle Linux 6.8 UEK 4 (Unbreakable Enterprise Kernel Release 4)

    Mammoth v4.8.0 for Oracle Linux 5 is available (only for upgrades of clusters based on Oracle Linux 5).

  • Oracle Data Integrator Agent 12.2.1.1 (for Oracle Big Data Connectors)

  • Oracle R Advanced Analytics for Hadoop (ORAAH) 2.7.0 

  • Oracle's R Distribution (ORD) 3.2.0

  • Oracle NoSQL Database Community Edition or Oracle NoSQL Database Enterprise Edition 4.3.10 (both optional)

    Note that support is no longer available for Oracle NoSQL Database Community Edition.

  • Kudu 1.2

  • Kafka 2.1

  • Spark 2.0

  • Cloudera Key Trustee Server 5.10.1

The Cloudera parcels for Spark, Kudu, Kafka, and Key Trustee Server are included for your convenience, but are not deployed or configured by default.

See the Cloudera Documentation for information about CDH and Cloudera Manager 5.10.1.

Oracle Big Data Appliance Release 4.8 includes Oracle Big Data SQL 3.1 as a Mammoth installation option. You do not need to download this package from the Oracle Software Delivery Cloud.

New Features

In addition to software upgrades and extra Cloudera parcels, Oracle Big Data Appliance provides these new features.

  • Automated LAG Setup

    The Oracle Big Data Appliance Configuration Utility provides an option to assign groups of connectors to LAGs (Link Aggregation Groups) for cluster client network connections. LAGs can provide faster uplink times and reduce the points of potential failure in the network. See Configuring for LACP/LAG Connections in this guide for details.

  • MIT Kerberos Security Option for Oracle NoSQL Enterprise installations

    You can select MIT Kerberos security for an Oracle NoSQL cluster via the Oracle Big Data Appliance Configuration Utility.

  • Oracle Big Data SQL can Connect to the Exadata Database Machine Over Ethernet

    Ethernet is now supported for Oracle Big Data SQL networking between Oracle Exadata Database Machine and Oracle Big Data Appliance. This enables you to use Oracle Big Data SQL with these two Engineered Systems in environments where InfiniBand is not feasible, such as when the two systems are geographically distant from each other.

    The Oracle Big Data Appliance Configuration Utility now provides an option to choose between Ethernet and InfiniBand for Oracle Big Data SQL when you are setting up the configuration for a cluster.

    Ethernet connections for Oracle Big Data SQL are not supported for networking between Oracle Big Data Appliance and Oracle SPARC SuperCluster.

  • Oracle Big Data SQL Networking is Switchable Between Ethernet and InfiniBand

    If needed, you can switch between InfiniBand and Ethernet after the Mammoth installation and after the Oracle Big Data SQL installation has been completed. See Choosing Between Ethernet and InfiniBand Connections For Oracle Big Data SQL in this guide for details.

Oracle Big Data SQL 3.1 can be Enabled/Disabled Using bdacli

In Release 4.8, the Oracle Big Data Appliance Configuration Generation Utility provides an option to install Oracle Big Data SQL 3.1. If you choose this option, then the Hadoop side of Oracle Big Data SQL is enabled automatically.

If you do not choose to enable Oracle Big Data SQL in the Mammoth installation, you can enable it later using the bdacli command line interface:

# bdacli enable big_data_sql

You can also use bdacli to disable the Hadoop side of Oracle Big Data SQL:

# bdacli disable big_data_sql

Note:

For Oracle Big Data Appliance Release 4.8, the instructions in the Oracle Big Data SQL 3.1 Installation Guide for downloading, installing, and uninstalling the software on the Hadoop cluster do not apply.

  • You do not need to download Oracle Big Data SQL 3.1 from the Oracle Software Delivery Cloud. The software is already included in the Mammoth bundle.

  • Do not install Oracle Big Data SQL 3.1 on Release 4.8 using ./setup-bds install bds-config.json.

    Mammoth performs this part of the installation for you. Likewise, do not use ./setup-bds uninstall bds-config.json to uninstall the software from the Hadoop cluster. To disable Oracle Big Data SQL after the Mammoth installation (or to enable it if you did not opt to include it in the Mammoth installation), use the bdacli commands as described above.

Except for the install and uninstall commands, the instructions on using setup-bds in the Oracle Big Data SQL 3.1 Installation Guide apply to Oracle Big Data Appliance and should be used as documented. This is also true for the database side of Oracle Big Data SQL. Create the database installation bundle and install it on the nodes of the Oracle Database system as described in the guide.

In Release 4.8, Node Migration and the mammoth -e Command Require Extra Steps for Oracle Big Data SQL Support

  • If Oracle Big Data SQL is enabled, then you must disable it on the entire cluster prior to running bdacli admin_cluster migrate <node>. Re-enable it after the migration. Use the bdacli enable and disable commands.

    1. Disable Oracle Big Data SQL across the cluster: bdacli disable big_data_sql

    2. Perform the migration.

    3. Re-enable Oracle Big Data SQL: bdacli enable big_data_sql

  • The Mammoth command for extending an installation to additional servers, mammoth -e, does not make the necessary Oracle Big Data SQL changes. After mammoth -e completes, run ./setup-bds extend bds-config.json from the node where Cloudera Manager is installed.
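
The three migration steps above can be wrapped in a small script so that the disable and re-enable steps are not forgotten. The following is a minimal sketch, not an Oracle-supplied tool: the function names run and migrate_with_bds and the DRY_RUN switch are hypothetical helpers introduced here for illustration, and the sketch assumes that Oracle Big Data SQL is currently enabled on the cluster. Only the bdacli commands themselves come from this guide.

```shell
#!/bin/bash
# Sketch only: wraps a node migration in the Oracle Big Data SQL
# disable/enable steps listed above. The function names and the DRY_RUN
# switch are hypothetical helpers, not part of bdacli.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

migrate_with_bds() {
  node="$1"
  if [ -z "$node" ]; then
    echo "usage: migrate_with_bds <node>" >&2
    return 1
  fi
  # 1. Disable Oracle Big Data SQL across the cluster.
  run bdacli disable big_data_sql || return 1
  # 2. Perform the migration.
  run bdacli admin_cluster migrate "$node" || return 1
  # 3. Re-enable Oracle Big Data SQL.
  run bdacli enable big_data_sql
}
```

By default the sketch only prints the commands it would run. To execute them, set DRY_RUN=0 in the environment and call migrate_with_bds with the node name as the root user. The function stops at the first failing command, so if the migration fails, Oracle Big Data SQL stays disabled until you re-enable it yourself.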

TLS 1.0 (TLSv1) Disabled for Cloudera Manager and Hue – May Affect Client Access

In order to improve security, Oracle Big Data Appliance has disabled TLS 1.0 for Cloudera Manager and Hue and also in the system-wide Java configuration. This can affect older browsers or operating systems.

It is recommended that you reconfigure or upgrade clients using TLS 1.0 to use newer encryption, but if necessary you can re-enable TLS 1.0 for Cloudera Manager and Hue.

For details, log on to My Oracle Support and search for BDA 4.8 Disables TLSv1 by Default For Cloudera Manager/Hue/And in System-Wide Java Configurations (Doc ID 2250841.1).

Oracle Big Data Discovery Installation on Big Data Appliance 4.8 not Supported at This Time

A compatible build of Oracle Big Data Discovery is under development. This document will be updated when a compatible build is available.

Deprecated or Discontinued Features

  • The Perfect Balance automatic invocation feature is not supported in this release. If you have been using automatic invocation, please switch to the Perfect Balance API.

  • There is no longer an option to purchase support for Oracle NoSQL Database Community Edition.

Other Notices

  • Mammoth, the Oracle Big Data Appliance installation package, now consists of four ZIP files, all of which must be downloaded and extracted before running the BDAMammoth-4.8.0.run file.

  • Mammoth cluster upgrades to v4.8.0 are supported only directly from Mammoth v4.4.0 or later. To upgrade from an earlier Mammoth version, first upgrade to Mammoth v4.4.0.

  • Configuring the network on new racks with an installed BDA Base Image lower than BDA v4.5.0 requires special configuration steps. Log on to My Oracle Support and search for this document:

    Network Configuration Instructions for Shipped BDA racks with a BDA Base Image Less Than V4.5.0 (Doc ID 2135358.1).

  • Oracle strongly recommends that customers using Oracle Big Data Appliance Clusters with Microsoft Active Directory configure their clusters to use cross-realm trust to Microsoft Active Directory and not use Direct-Active-Directory configuration for the Hadoop cluster.

  • Before upgrading an Oracle Linux 5 cluster that has had HTTPS enabled for Cloudera Manager, it is necessary to follow these steps:

    1. Confirm that HTTPS is enabled for Cloudera Manager.

    2. Back up GridDefParams.pm:
      # cp /opt/oracle/BDAMammoth/bdaconfig/GridDefParams.pm /opt/oracle/BDAMammoth/bdaconfig/GridDefParams.pm.orig
      
    3. Update lines 141 and 142 of /opt/oracle/BDAMammoth/bdaconfig/GridDefParams.pm to these settings:

      our $CM_ADMIN_PROTO = "https";                                                                       
      our $CM_ADMIN_PORT = "7183";                                                                          
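
Steps 2 and 3 above can also be scripted. The following is a minimal sketch under stated assumptions: the function name set_cm_https is hypothetical, and the sketch assumes that the two settings already exist in the file as lines beginning with our $CM_ADMIN_PROTO = and our $CM_ADMIN_PORT = . The steps above identify the settings by line number (141 and 142). Matching by variable name instead is a substitution made for this sketch, so compare the result against the backup before relying on it.

```shell
#!/bin/bash
# Sketch only: applies steps 2 and 3 above to a GridDefParams.pm file.
# Assumption: the two settings already exist as lines beginning with
# "our $CM_ADMIN_PROTO = " and "our $CM_ADMIN_PORT = ". The steps above
# refer to them by line number (141 and 142); matching by variable
# name is a substitution made for this sketch.

set_cm_https() {
  file="$1"
  # Step 2: keep a copy of the original next to the edited file.
  cp -p "$file" "$file.orig" || return 1
  # Step 3: rewrite the two assignments, reading from the backup copy.
  sed -e 's/^\(our \$CM_ADMIN_PROTO = \).*/\1"https";/' \
      -e 's/^\(our \$CM_ADMIN_PORT = \).*/\1"7183";/' \
      "$file.orig" > "$file"
}
```

A possible invocation, run as root, is set_cm_https /opt/oracle/BDAMammoth/bdaconfig/GridDefParams.pm, followed by a diff of GridDefParams.pm.orig against GridDefParams.pm to confirm that only the two intended lines changed.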
      

Changes in Previous Releases of Oracle Big Data Appliance

The following summarizes the change history of Oracle Big Data Appliance.

Changes in Oracle Big Data Appliance Release 4 (4.7)

Oracle Big Data Appliance 4.7 is focused on defect fixes and software stack version updates, including major updates to the UEK kernel, MySQL, and Cloudera (Cloudera Manager and CDH). There are no other customer-visible changes in this release.

Software Updates

  • CDH (Cloudera's Distribution including Apache Hadoop) 5.9

  • CM (Cloudera Manager) 5.9

  • Oracle Big Data Connectors 4.7

  • Big Data SQL 3.0.1 or 3.1 (both optional)

  • MySQL Enterprise Edition 5.6.32

  • Perfect Balance 2.9

  • Java JDK 8u111

  • Oracle Linux 6.8 UEK 4 (Unbreakable Enterprise Kernel Release 4)

  • Oracle Data Integrator Agent 12.2.1.1 (for Oracle Big Data Connectors)

  • Oracle R Advanced Analytics for Hadoop (ORAAH) 2.7.0 

  • Oracle's R Distribution (ORD) 3.2.0

  • Oracle Big Data Discovery 1.3.2 or 1.4 (1.4.0.37.1388 or greater). This is optional.

  • Oracle NoSQL Database Community Edition or Enterprise Edition 4.2.9 (optional)

See the Cloudera Documentation for information about CDH and Cloudera Manager 5.9.

Oracle Big Data Appliance Release 4.7 includes Oracle Big Data SQL 3.0.1 as a Mammoth installation option. You do not need to download this package from the Oracle Software Delivery Cloud. Oracle Big Data SQL 3.1 is also compatible with Release 4.7 and is available for download from Oracle Software Delivery Cloud.

Changes in Oracle Big Data Appliance Release 4 (4.6)

The following are changes in Oracle Big Data Appliance Release 4 (4.6):

Software Updates

  • CDH (Cloudera's Distribution including Apache Hadoop) 5.8

  • CM (Cloudera Manager) 5.8.1

  • Oracle Big Data Connectors 4.6

  • Oracle Data Integrator Agent 12.2.1.1 (for Oracle Big Data Connectors)

  • Oracle R Advanced Analytics for Hadoop (ORAAH) 2.6.0 

  • Oracle's R Distribution (ORD) 3.2.0

  • Perfect Balance 2.8

  • Oracle Linux 6.8

  • Big Data Discovery 1.2.2 or 1.3.x (optional)

  • Big Data SQL 3.0.1 (optional)

  • Java JDK 8u101

New Features

  • Cloudera CDH 5.8 and Cloudera Manager 5.8.1

    See the Cloudera Enterprise 5.8.x Documentation for information about CDH and Cloudera Manager 5.8.x.

  • Oracle Big Data SQL Updates

    Oracle Big Data Appliance Release 4.6 includes Oracle Big Data SQL 3.0.1 as a Mammoth installation option. You do not need to download it from the Oracle Software Delivery Cloud to install it on Oracle Big Data Appliance 4.6.

    Oracle Big Data SQL 3.1 will be downloadable from the Oracle Software Delivery Cloud when available.

  • Networking Changes for Greater Configuration Flexibility

    The release provides more modular control over Oracle Big Data Appliance networks by separately storing the network configuration settings for each rack and cluster: <rack_name>-rack-network.json and <cluster-name>cluster-network.json. When you are using the Oracle Big Data Appliance Configuration Generation Utility, these changes enable you to reconfigure the client network or private network of a cluster without affecting the configuration of other servers on the rack. They also allow you to expand a cluster or rack without affecting servers that are not part of the configuration.

    In previous releases, all such information was stored in a single network.json file. This file still exists for backward compatibility with some scripts.

  • Encryption for Data Spills and some Intermediate Files

    Data spills to disk outside of HDFS during the following memory-intensive processes can now be protected by encryption.

    • Spark shuffle.

    • Creation of intermediate files in MapReduce encrypted shuffle and spillage during map and reduce operations.

    • Impala SQL queries that generate extremely large result sets.

    When you enable Hadoop Network Encryption in Mammoth during a full Oracle Big Data Appliance installation, or later via bdacli enable hadoop_network_encryption, encryption is also enabled for Spark and Impala, as well as for MapReduce intermediate files and data spills.

    Note:

    In the case of an Oracle Big Data Appliance upgrade to Release 4.6, the upgrade does not automatically enable this new extension of encryption to data spills, regardless of whether or not Hadoop Network Encryption is already enabled. If you want this feature on a system that has been upgraded to Release 4.6, run bdacli enable hadoop_network_encryption after the upgrade.

  • New reset Command in bdacli

    The bdacli reset command selectively reconfigures Oracle Big Data Appliance networks. It pulls the new settings from the network configuration files generated by Oracle Big Data Appliance Configuration Generation Utility. The user controls the scope of the reset (server, network, cluster) and which networks are reset.

    See "bdacli reset" in the Oracle Big Data Appliance Utilities section of this guide.

  • Support for Oracle NoSQL Database 4.0.9

Other Changes

  • Mammoth Installation Step Changes

    Some Mammoth installation steps have been reorganized and renamed. An important change to note is that the Kerberos installation now consists of two separate pre- and post-cluster configuration steps in order to enable additional security setups on the cluster. See Mammoth Installation Steps.

  • X6-2 servers are shipped with an Oracle Big Data Appliance v4.5.0 base image.

Changes in Oracle Big Data Appliance Release 4 (4.5)

The following are changes in Oracle Big Data Appliance Release 4 (4.5):

Software Updates

  • CDH (Cloudera's Distribution including Apache Hadoop) 5.7

  • CM (Cloudera Manager) 5.7

  • Cloudera Navigator 2.4.1

  • Java JDK 8u92

  • Oracle Big Data Connectors 4.5

  • Oracle Data Integrator Agent 12.2.1 (for Oracle Big Data Connectors)

  • Oracle NoSQL Database 4.0.5

  • Perfect Balance 2.7

  • Big Data Discovery 1.2.2 (optional)

  • Big Data SQL 3.0.1 (optional)

Hardware Updates

  • Oracle Big Data Appliance X6-2 server

    The X6-2 provides substantial increases in processing power and memory over X5-2 servers:

    • 2 x 22-core (2.2 GHz) Intel® Xeon® E5-2699 v4 processors.

    • 8 x 32 GB DDR4-2400 memory (expandable to a maximum of 768 GB per node).

    X6-2 servers are shipped with an Oracle Big Data Appliance v4.4.0 base image.

    X6-2 nodes can be mixed with X5-2 nodes (and older Release 4.5–compatible nodes) in a CDH or NoSQL cluster. The X6-2 server is not compatible as a node of an Oracle Big Data Appliance cluster in releases prior to Oracle Big Data Appliance Release 4.4.

    See the Oracle Big Data Appliance X6-2 Data Sheet for more details.

New Features

  • Cloudera CDH 5.7 and Cloudera Manager 5.7

    See the Cloudera Enterprise 5.7.x Documentation for information about CDH 5.7 and Cloudera Manager 5.7.

  • Support for Either Local or Remote Key Trustee Servers

    Oracle Big Data Appliance supports both local and remote Key Trustee Servers for HDFS Transparent Encryption. The Oracle Big Data Configuration Utility includes HDFS Transparent Encryption as a configuration option. You can either click a checkbox to automatically install and configure active and passive Key Trustee Servers locally on the Oracle Big Data Appliance or define an “off-board” configuration, including the address of the active and passive servers, the Key Trustee organization, and the authorization code. You can also enable HDFS Transparent Encryption via the bdacli utility at any time after Mammoth installation and will be prompted to make the same choice between remote or local key trustee services.

    Although Oracle Big Data Appliance supports local Key Trustee Servers, remote servers are still the recommended choice.

  • Support for Oracle Big Data SQL 3.0.1

    Oracle Big Data Appliance Release 4.5 includes Oracle Big Data SQL 3.0.1 as a Mammoth installation option. See the Oracle Big Data SQL User's Guide for Release 3.0.1 installation instructions.

  • Enhanced Networking

    Release 4.5 provides more flexibility in the configuration of networks on the Oracle Big Data Appliance. This includes support for the following options:

    • Separate networks for each cluster in a rack (both client and private networks).

    • Multiple client networks on the same BDA cluster.

    • VLAN tagging for client networks.

    • Partition keys for private InfiniBand networks.

  • Lower Minimum Size for CDH Clusters

    The minimum recommended CDH cluster size for a production environment is now five nodes. For development purposes, the Oracle Big Data Appliance Configuration Generation Utility now enables you to create three-node CDH clusters.

    Note that Oracle Big Data Appliance Starter Rack is still sold with six servers.

Changes in Oracle Big Data Appliance Release 4 (4.4)

The following are changes in Oracle Big Data Appliance Release 4 (4.4):

Software Updates

  • CDH (Cloudera's Distribution including Apache Hadoop) 5.5.1

  • CM (Cloudera Manager) 5.5.1

  • Cloudera Navigator 2.4.1

  • MySQL Database Enterprise Server - Advanced Edition 5.6

  • Oracle Big Data Connectors 4.4

  • Oracle Data Integrator Agent 12.2.1 (for Oracle Big Data Connectors)

  • Oracle NoSQL Database 3.5.2

  • Perfect Balance 2.6

Hardware Updates

  • Oracle Big Data Appliance X6-2 server

    The X6-2 provides substantial increases in processing power and memory over X5-2 servers:

    • 2 x 22-core (2.2 GHz) Intel® Xeon® E5-2699 v4 processors.

    • 8 x 32 GB DDR4-2400 memory (expandable to a maximum of 768 GB per node).

    X6-2 servers are shipped with an Oracle Big Data Appliance v4.4.0 base image.

    In Oracle Big Data Appliance Release 4.4 or later, X6-2 nodes can be mixed with X5-2 nodes (and older Release 4.4–compatible nodes) in a CDH or NoSQL cluster. The X6-2 server is not compatible as a node of an Oracle Big Data Appliance cluster in releases earlier than 4.4.

    See the Oracle Big Data Appliance X6-2 Data Sheet for more details.

New Features

  • Cloudera CDH 5.5.1 and Cloudera Manager 5.5.1

    CDH 5.5.1 is a maintenance release on top of CDH 5.5. See the Cloudera CDH 5.5 Release Notes.

    For information on Cloudera Manager 5.5 and 5.5.1, see New Features and Changes in Cloudera Manager 5.

  • Automated Installation for Cloudera Navigator

    Mammoth now provides an automated installation for Cloudera Navigator in both a full Mammoth installation and Mammoth upgrade. No user intervention is required and the installation occurs transparently. If Cloudera Navigator is not already installed, Mammoth installs the software on node 3 of the cluster, which is where other Cloudera Management services are hosted. If Cloudera Navigator is already installed, Mammoth skips this step and does not overwrite the existing installation.

    The Cloudera Navigator Metadata Server and Audit Server are automatically added to Cloudera Manager and auditing is enabled. Mammoth also enables Web UI encryption for the Audit Server.

    Mammoth does not enable the Cloudera Navigator key management components.

  • Support for Oracle Big Data SQL 3.0

    Oracle Big Data Appliance Release 4.4 includes Oracle Big Data SQL 2.0 as a Mammoth installation option. Oracle Big Data SQL 3.0 is also available for Release 4.4, as a patch. See the Oracle Big Data SQL User's Guide for Release 3.0 installation instructions.

    Note:

    If you want to install Oracle Big Data SQL 3.0, do not select Oracle Big Data SQL 2.0 in the Mammoth installation. If Oracle Big Data SQL 2.0 is installed, you must uninstall it prior to installing the 3.0 patch. The patch README file includes steps for removing 2.0 if you have previously installed it.

Release 4.4 as an Update to an Earlier Base Image

Mammoth 4.4.0 can run on top of any earlier Oracle Big Data Appliance 4.x base image and will update the base image software as needed.

Changes in Oracle Big Data Appliance Release 4 (4.3)

The following are changes in Oracle Big Data Appliance Release 4 (4.3):

Software Updates

  • CDH (Cloudera's Distribution including Apache Hadoop) 5.4.7

  • CM (Cloudera Manager) 5.4.7

  • Oracle Big Data Connectors 4.3

  • Oracle Big Data Discovery 1.1.1

  • Oracle Big Data SQL 2.0

  • Oracle NoSQL Database 1.3.4.7 (Community and Enterprise Edition)

  • Oracle Table Access for Hadoop and Spark

  • Perfect Balance 2.5

  • JDK 8u60

See Oracle Big Data Appliance Software User's Guide.

New Features

  • Automatic Installation for Oracle Big Data Discovery

    Customers can download Big Data Discovery 1.1.1 and then use the bdacli command line utility to install the software on a designated node of the primary CDH cluster.

    See Expanding an Oracle Big Data Appliance Starter Rack.

  • Oracle Table Access for Hadoop and Spark

    Oracle Table Access for Hadoop and Spark is an Oracle Big Data Appliance feature that converts Oracle Database tables into Hadoop or Spark data sources. This feature enables fast and secure access to data in the Oracle Database.

  • HDFS Transparent Encryption

    Oracle Big Data Appliance 4.3 provides the option to use HDFS Transparent Encryption. This replaces the eCryptfs on-disk encryption software provided with previous releases. Customers can enable HDFS Transparent Encryption for both new and pre-existing CDH clusters. When enabled, HDFS Transparent Encryption secures Hadoop operations running on the cluster (including HDFS, MapReduce on YARN, Spark on YARN, Hive, and HBase tasks).

    • The Oracle Big Data Appliance Configuration Generation Utility provides an option to include HDFS Transparent Encryption when a new cluster is created.

    • HDFS Transparent Encryption can be enabled or disabled on a cluster via the bdacli command line interface.

  • HTTPS / Network Encryption

    • Provides HTTPS for Cloudera Manager, Hue, Oozie, and Hadoop Web UIs.

    • Enables network encryption for other internal Hadoop data transfers, such as those made through YARN shuffle and RPC.

    Like HDFS Transparent Encryption, HTTPS / Network Encryption is an option in the Oracle Big Data Appliance Configuration Generation Utility, and can also be enabled via bdacli.

  • Zero Downtime for Upgrades, One-Off Patches, and Cluster Extensions

    In Release 4.3, Oracle Big Data Appliance leverages Cloudera’s Rolling Upgrades functionality to keep clusters operational during Mammoth upgrades, patches, and cluster extensions. This is an installation option that allows certain services on a cluster to remain continuously available while each node in the cluster is upgraded and rebooted. Zero Downtime is an option for the following tasks:

    • Upgrades of the Mammoth software (including Cloudera's Distribution Including Apache Hadoop, Cloudera Manager, and the Mammoth software itself).

    • One-off patches of Mammoth-installed software.

    • Cluster extensions. (For cluster extensions within a single rack, rolling upgrades are not optional. These extensions are always done as rolling upgrades.)

Deprecated Features

The following features are deprecated in this release, and may be desupported in a future release:

  • MapReduce 1 (MRv1)

    YARN (MRv2) supersedes MRv1. Users who want to continue to use MRv1 on Oracle Big Data Appliance versions 3.x and 4.x should contact Oracle Support before using Mammoth to patch or upgrade the software.

Desupported Features

The following features are no longer supported as of this release:

  • eCryptfs On-Disk Encryption

    This has been replaced by HDFS Transparent Encryption.

Changes in Oracle Big Data Appliance Release 4 (4.2)

The following are changes in Oracle Big Data Appliance Release 4 (4.2):

New Features

  • Software Upgrades

    • Cloudera's Distribution including Apache Hadoop 5.4.0

    • Cloudera Manager 5.4.0

    • Perfect Balance 2.4.0

    • Oracle Big Data SQL 1.1

    • Oracle NoSQL Database 3.2.5

    • Oracle Linux 6.6 and 5.11

    • JDK 8u45

    See Oracle Big Data Appliance Software User's Guide.

  • Hardware Upgrades

    • Oracle Big Data Appliance is now shipped with 8 TB disk drives

  • Elastic Configuration

    • Oracle Big Data Appliance now provides the flexibility of adding one or more servers on a starter rack using Big Data Appliance X5-2 High Capacity Nodes plus InfiniBand Infrastructure. You can add up to 12 additional servers on a starter rack.

      See Expanding an Oracle Big Data Appliance Starter Rack.

  • Automatic Installation Support

    • Spark-on-YARN is deployed automatically

    • Oracle Spatial and Graph is installed and configured automatically

  • Oracle Big Data SQL 1.1

    • Copy to BDA

      This utility enables you to copy relatively static tables from an Oracle database into Hadoop, with the purpose of improving query times.

      See Oracle Big Data Appliance Software User's Guide.

    • Oracle NoSQL Database Support

      Oracle databases on Oracle Exadata Database Machine can use Oracle Big Data SQL to connect to clusters running Oracle NoSQL Database.

    • Parquet Support

      CDH 5.2 and later versions include Hive 0.13, which supports the Apache Parquet file format. This file format is used by Cloudera Impala and other Hadoop software.

      See Oracle Big Data Appliance Software User's Guide.

Other Changes

  • Oracle Big Data Appliance X5-2

    Oracle Big Data Appliance 4.2 software supports Oracle Big Data Appliance X5-2 and earlier version server hardware.

    See "Server Components".

  • Oracle Big Data Appliance Configuration Generation Utility

    This utility generates two new configuration files:

    • network.json: Supersedes BdaDeploy.json. For software upgrades, Mammoth converts the existing BdaDeploy.json to network.json. New installations must have network.json.

    • networkexpansion.json: Supersedes BdaExpansion.json.

    See "About the Configuration Files".

  • CDH Deployment

    Mammoth uses parcels instead of RPMs to deploy CDH.

  • Apache Sentry

    Installation of Apache Sentry does not require sentry-provider.ini as a prerequisite.

  • Microsoft Active Directory Server in Mammoth

    Mammoth supports using Microsoft Active Directory directly. In Mammoth, this option is named Active Directory Kerberos.

  • Oracle Linux Support

    Oracle Linux 5 support for Oracle Big Data Appliance X5-2 servers.

  • Cloudera Navigator Trustee Server

    Cloudera Navigator Trustee Server installer package and documentation are now shipped in Mammoth. It must be manually installed on a separate server.

Deprecated Features

The following features are deprecated in this release, and may be desupported in a future release:

  • Mammoth Reconfiguration Utility

    The bdacli utility supersedes mammoth-reconfig. The mammoth-reconfig utility is only needed to change the disk encryption password.

    See "bdacli".

  • MapReduce 1 (MRv1)

    YARN (MRv2) supersedes MRv1. Users who want to continue to use MRv1 on Oracle Big Data Appliance versions 3.x and 4.x should contact Oracle Support before using Mammoth to patch or upgrade the software.

  • Disk Encryption

    A new encryption system that is more flexible and robust will replace the current system in an upcoming release.

Changes in Oracle Big Data Appliance Release 4 (4.1)

The following are changes in Oracle Big Data Appliance Release 4 (4.1):

New Features

  • Software Upgrades

    • Cloudera's Distribution including Apache Hadoop 5.3.0

    • Cloudera Manager 5.3.0

    • Perfect Balance 2.3.0

    • Oracle Big Data SQL 1.1

    • Oracle Big Data Connectors 4.1

    • Oracle Linux 6.5

    See Oracle Big Data Appliance Software User's Guide.

  • Oracle Big Data SQL 1.1

    • Copy to BDA

      This utility enables you to copy relatively static tables from an Oracle database into Hadoop, with the purpose of improving query times.

      See Oracle Big Data Appliance Software User's Guide.

    • Oracle NoSQL Database Support

      Oracle databases on Oracle Exadata Database Machine can use Oracle Big Data SQL to connect to clusters running Oracle NoSQL Database.

    • Parquet Support

      CDH 5.2 and later versions include Hive 0.13, which supports the Apache Parquet file format. This file format is used by Cloudera Impala and other Hadoop software.

      See Oracle Big Data Appliance Software User's Guide.

  • Oracle NoSQL Database

    The bdacli admin_cluster command supports Oracle NoSQL Database nodes that require repair or replacement.

    See Oracle Big Data Appliance Software User's Guide.

Other Changes

  • Oracle Big Data Appliance X5-2

    Oracle Big Data Appliance 4.1 software supports the Oracle Big Data Appliance X5-2 server hardware.

    See "Server Components".

  • Oracle Big Data Appliance Configuration Generation Utility

    This utility generates two new configuration files:

    • network.json: Supersedes BdaDeploy.json. For software upgrades, Mammoth converts the existing BdaDeploy.json to network.json. New installations must have network.json.

    • networkexpansion.json: Supersedes BdaExpansion.json.

    See "About the Configuration Files".

  • CDH Deployment

    Mammoth uses parcels instead of RPMs to deploy CDH.

  • Apache Sentry

    Installation of Apache Sentry does not require sentry-provider.ini as a prerequisite.
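The configuration file changes listed above for the Oracle Big Data Appliance Configuration Generation Utility can be illustrated with a small check. The following is a minimal sketch, not an Oracle tool: it assumes that the configuration files sit together in one directory that you supply, and the function name classify_config_files is hypothetical. It only reports which of the four file names named above are present. It does not convert files or inspect their contents.

```python
from pathlib import Path

# Legacy file name -> superseding file name, as listed in the release notes above.
SUPERSEDED_BY = {
    "BdaDeploy.json": "network.json",
    "BdaExpansion.json": "networkexpansion.json",
}


def classify_config_files(directory):
    """Report which legacy and current configuration files exist in a directory.

    Hypothetical helper for illustration only. Returns a dict of sorted lists:
      "current":     superseding files that are present
      "legacy":      superseded files that are present
      "unconverted": legacy files whose superseding file is absent
    """
    present = {p.name for p in Path(directory).iterdir() if p.is_file()}
    current = sorted(n for n in SUPERSEDED_BY.values() if n in present)
    legacy = sorted(n for n in SUPERSEDED_BY if n in present)
    unconverted = sorted(n for n in legacy if SUPERSEDED_BY[n] not in present)
    return {"current": current, "legacy": legacy, "unconverted": unconverted}
```

A nonempty "unconverted" list suggests that a legacy file has not yet been paired with its replacement, for example on a system where a software upgrade has not yet run.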

Deprecated Features

The following features are deprecated in this release, and may be desupported in a future release:

  • Mammoth Reconfiguration Utility

    The bdacli utility supersedes mammoth-reconfig. The mammoth-reconfig utility is needed only to change the disk encryption password.

    See "bdacli".

  • MapReduce 1 (MRv1)

    YARN (MRv2) supersedes MRv1. Users who want to continue to use MRv1 on Oracle Big Data Appliance versions 3.x and 4.x should contact Oracle Support before using Mammoth to patch or upgrade the software.

  • Disk Encryption

    A new encryption system that is more flexible and robust will replace the current system in an upcoming release.

Changes in Oracle Big Data Appliance Release 4 (4.0)

The following are changes in Oracle Big Data Appliance release 4 (4.0):

New Features

  • Oracle Big Data SQL 1.0.0

    Oracle Big Data SQL supports queries against vast amounts of big data stored in multiple data sources, including HDFS and Hive. You can view and analyze data from various data stores together, as if it were all stored in an Oracle database. Support for Oracle Big Data SQL includes the following new features in Oracle Database:

    • DBMS_HADOOP PL/SQL package

    • Hive static data dictionary views

    • Access drivers for Hadoop and Hive

    Oracle Big Data SQL is an installation option, which you can specify using the Oracle Big Data Appliance Configuration Generation Utility.

    You can monitor and manage Oracle Big Data SQL using the bdacli command and Cloudera Manager.

    See "bdacli" and Oracle Big Data Appliance Software User's Guide.

  • Service Migration

    The bdacli utility can migrate services from a failing critical node to a healthy noncritical node. It can also remove failing critical and noncritical nodes from a cluster, and restore them to the cluster after repairs.

    See "bdacli" and Oracle Big Data Appliance Software User's Guide.

  • Software Upgrades

    • Cloudera's Distribution including Apache Hadoop 5.1.0

    • Cloudera Manager 5.1.1

    • Perfect Balance 2.2.0

    • Oracle Data Integrator Agent 12.1.3.0 (for Oracle Big Data Connectors)

    See Oracle Big Data Appliance Software User's Guide.

  • Oracle NoSQL Database Zone Support

    The Oracle Big Data Appliance Configuration Generation Utility and the mammoth -e command support multiple zones on Oracle NoSQL Database clusters. You can add nodes to an existing zone, or create new primary or secondary zones.

    See "Oracle NoSQL Configuration" and "Mammoth Software Installation and Configuration Utility".

  • Multiple Rack Clusters

    You can now install a cluster on multiple racks using one cluster_name-config.json file.