Preface

This guide describes Oracle Big Data Appliance, which is used for acquiring, organizing, and analyzing very large data sets. It includes information about hardware operations, site planning and configuration, and physical, electrical, and environmental specifications.

This preface contains the following topics:

Audience

This guide is intended for Oracle Big Data Appliance customers and those responsible for data center site planning, installation, configuration, and maintenance of Oracle Big Data Appliance.

Documentation Accessibility

For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.

Access to Oracle Support

Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.

Related Documentation

Oracle Big Data Appliance Documentation Library

In the Big Data Documentation Portal of the Oracle Help Center, you can find a link to the complete Oracle Big Data Appliance library for your release of the product. The library includes the following core documents as well as documents for products that are used in conjunction with Oracle Big Data Appliance:

Note:

The Oracle Big Data Appliance Licensing Information User Manual is the consolidated reference for licensing information for Oracle and third-party software included in the Oracle Big Data Appliance product. Refer to this manual or contact Oracle Support if you have questions about licensing.

Documentation for Affiliated Products

The following Oracle libraries contain hardware information for Oracle Big Data Appliance. Links to these libraries are available through the Big Data Documentation Portal at

https://docs.oracle.com/en/bigdata/

Conventions

The following text conventions are used in this document:

Convention Meaning

boldface

Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary.

italic

Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.

monospace

Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.

# prompt

The pound (#) prompt indicates a command that is run as the Linux root user.

Backus-Naur Form Syntax

The syntax in this reference is presented in a simple variation of Backus-Naur Form (BNF) that uses the following symbols and conventions:

Symbol or Convention Description

[ ]

Brackets enclose optional items.

{ }

Braces enclose a choice of items, only one of which is required.

|

A vertical bar separates alternatives within brackets or braces.

...

Ellipses indicate that the preceding syntactic element can be repeated.

delimiters

Delimiters other than brackets, braces, and vertical bars must be entered as shown.

boldface

Words appearing in boldface are keywords. They must be typed as shown. (Keywords are case-sensitive in some, but not all, operating systems.) Words that are not in boldface are placeholders for which you must substitute a name or value.

Changes in Oracle Big Data Appliance Release 4 (4.10)

Oracle Big Data Appliance 4.10 includes the following software updates and feature changes.

Software Updates

  • Cloudera Enterprise 5.12.1, including:
    • CM (Cloudera Manager) 5.12.1

    • CDH (Cloudera's Distribution including Apache Hadoop) 5.12.1

    • Cloudera Key Trustee 5.12.1

  • Oracle Big Data Connectors 4.10

  • Big Data SQL 3.2

  • Oracle Big Data Spatial & Graph 2.4

  • MySQL Enterprise Edition 5.7.19

  • Perfect Balance 2.10.0

  • Java JDK 8u141

  • Oracle Linux 6.9 with UEK 4 (Unbreakable Enterprise Kernel Release 4)

    Mammoth v4.10 for Oracle Linux 5 is available (only for upgrades of clusters based on Oracle Linux 5).

  • Oracle Data Integrator Agent 12.2.1.6 (for Oracle Big Data Connectors)

  • Oracle R Advanced Analytics for Hadoop (ORAAH) 2.7.0 

  • Oracle's R Distribution (ORD) 3.2.0

  • Oracle NoSQL Database Community Edition 4.4.6 or Oracle NoSQL Database Enterprise Edition 4.5.12 (both are optional).

    Note that support is no longer available for Oracle NoSQL Database Community Edition.

  • Spark 1.6 and Spark 2

See the Cloudera Documentation for information about CDH and Cloudera Manager 5.12.1.

Oracle Big Data Appliance Release 4.10 includes Oracle Big Data SQL 3.2 as a Mammoth installation option. You do not need to download this package from the Oracle Software Delivery Cloud.

The Cloudera parcels for Kudu 1.4.0, Kafka 2.2.0, and Key Trustee Server 5.12.1 are included for your convenience, but are not deployed or configured by default.

Hardware Updates

  • Support for Oracle Big Data Appliance X7 systems

    Oracle Big Data Appliance X7 is based on the X7-2L server. The major enhancements in X7-2L are:
    • CPU update: two 24-core Intel Xeon processors

    • Larger drives: twelve 10 TB 7,200 RPM SAS drives

    • Improved boot drive setup: mirrored 150 GB M.2 SATA SSDs replace the system partitions on data drives and the USB flash drive used for recovery.

    X7 racks are equipped with the Cisco 93108TC-EX-1G Ethernet switch (instead of the Catalyst 4948E).

    There are no in-rack cabling changes for X7 systems.

X7-2L servers can coexist as nodes in the same cluster with X6-2L and earlier supported server models. X7-2L is supported by Oracle Big Data Appliance 4.10 (or later). If you install X7-2L nodes, you must also upgrade the cluster software so that all nodes run the same release of Oracle Big Data Appliance.
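The release requirement above can be checked with a version-aware comparison. The following is a minimal sketch, not part of the product tooling: the function names are hypothetical, the release strings in the loop are sample values, and it assumes GNU sort with the -V option is available. How you obtain the installed release string is site-specific and is not shown.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a cluster release is new enough for X7-2L nodes.
# MIN_RELEASE is taken from the text above; everything else is illustrative.

MIN_RELEASE="4.10"

# version_ge A B: succeeds when version A >= version B.
# GNU sort -V orders 4.10 after 4.9, which a plain string or decimal
# comparison would get wrong.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# supports_x7 RELEASE: succeeds when RELEASE is 4.10 or later.
supports_x7() {
    version_ge "$1" "$MIN_RELEASE"
}

# Example with hypothetical release strings.
for rel in 4.9.0 4.10.0 4.12.0; do
    if supports_x7 "$rel"; then
        echo "$rel: X7-2L nodes supported"
    else
        echo "$rel: upgrade the cluster software first"
    fi
done
```

A version-aware sort is used because release numbers such as 4.9 and 4.10 do not compare correctly as decimals.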

Spark 2 Deployed by Default

Spark 2 is now deployed by default on new clusters and also during upgrade of clusters where it is not already installed.

Oracle Linux 7 with UEK 4 can be Installed on Edge Nodes

Oracle Linux 7 is now supported for installation on Oracle Big Data Appliance edge nodes running on X7-2L, X6-2L, or X5-2L servers. Support for Oracle Linux 7 in this release is limited to edge nodes.

Support for Cloudera Data Science Workbench

Support for Oracle Linux 7 on edge nodes provides a way for customers to host Cloudera Data Science Workbench (CDSW) on Oracle Big Data Appliance.

CDSW is a web application that enables access from a browser to R, Python, and Scala on a secured cluster. See Cloudera's website for details.

Oracle Big Data Appliance does not include licensing or official support for CDSW. Contact Cloudera for licensing requirements.

New Features in Oracle Big Data SQL 3.2

Oracle Big Data SQL 3.2 provides a number of new features, including:
  • Enhanced CLOB processing – the workload for some CLOBs can be off-loaded to Oracle Big Data SQL processing cells in Hadoop.

  • Support for queries against Kafka topics.

  • A custom Parquet reader as well as improved processing for Parquet files.

  • Multi-user authorization – users other than the oracle user can submit queries.

  • New “Database Authentication” security option – for communications between Oracle Database and Oracle Big Data SQL cells in Hadoop.

  • Automatic upgrade of existing Oracle Big Data SQL installations. (No need to remove the old installation before installing 3.2.)

In the Oracle Big Data Appliance Configuration Utility, Oracle Big Data SQL 3.2 can be selected for installation by Mammoth. You can also install Release 3.2 later by using the bdacli utility.

Oracle Big Data SQL supports connectivity between Oracle Big Data Appliance and the Exadata Database Machine as well as several other Oracle Database platforms. See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1 in My Oracle Support) for details.

Scripts for Download and Configuration of Apache Zeppelin, Jupyter Notebook, and RStudio

The release includes scripts to assist in download and configuration of these commonly used tools.

See /opt/oracle/bda/thirdparty/README.txt for instructions.

The scripts are provided as a convenience to users. Oracle Big Data Appliance does not include official support for the installation and use of Apache Zeppelin, Jupyter Notebook, or RStudio.

Oracle Big Data Discovery 1.5 as well as 1.4 (1.4.0.37.1388 or Greater) Supported

Oracle Big Data Discovery 1.5 and 1.4.0.37.1388 are compatible with Oracle Big Data Appliance 4.10. Both releases are available for download from the Oracle Software Delivery Cloud (https://edelivery.oracle.com). The site may show Oracle Big Data Discovery 1.4.0.0.0 available for download. This is actually version 1.4.0.37.1388 or greater.

Note:

If Oracle Big Data Discovery 1.4.0.37.1388 or greater was already enabled prior to an upgrade to the current Oracle Big Data Appliance release, then after the upgrade you must update several client libraries required by Oracle Big Data Discovery. For instructions, see Document 2215083.1 in My Oracle Support (https://support.oracle.com). This manual update is not necessary if you install Oracle Big Data Discovery after installing the current Oracle Big Data Appliance release.

Earlier releases of Oracle Big Data Discovery are not supported on this release of Oracle Big Data Appliance.

Improved Configuration of Oracle's R Distribution and ORAAH

For these tools, much of the environment configuration that was previously done by the customer is now automated.

Node Migration Optimization

Node migration time has been improved by eliminating some steps.

Oracle Big Data SQL and the Node Migration and Node Reprovision Processes

In the previous release of Oracle Big Data Appliance, if Oracle Big Data SQL was enabled, you had to disable it prior to a node migration or mammoth -e (node extension). Disabling Oracle Big Data SQL is no longer necessary for these operations.

However, this requirement is still in place for node reprovision (# bdacli admin-cluster reprovision <node name>). If Oracle Big Data SQL is enabled, this command returns a message stating that it must be disabled before the reprovisioning can proceed.
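A pre-check can avoid starting a reprovision that is certain to be rejected. The following is a minimal sketch under stated assumptions: the function names and the node name are hypothetical, and the bdacli getinfo parameter used to read the Oracle Big Data SQL status (and its true/false output) is an assumption that should be verified against the bdacli reference for your release. Only the reprovision command itself is taken from the text above.

```shell
#!/usr/bin/env bash
# Sketch: refuse to start a node reprovision while Oracle Big Data SQL
# is enabled. This file only defines functions; nothing runs on load.

# reprovision_allowed STATUS: succeeds only when STATUS is "false",
# meaning Oracle Big Data SQL is reported as not enabled.
reprovision_allowed() {
    [ "$1" = "false" ]
}

# safe_reprovision NODE: read the status, then run the reprovision
# command quoted in the text. Intended to be run as root.
safe_reprovision() {
    local node="$1" status
    # ASSUMPTION: parameter name and true/false output are not confirmed
    # by this guide; check the bdacli reference for your release.
    status="$(bdacli getinfo cluster_big_data_sql_enabled)" || return 2
    if reprovision_allowed "$status"; then
        bdacli admin-cluster reprovision "$node"
    else
        echo "Oracle Big Data SQL is enabled; disable it before reprovisioning $node" >&2
        return 1
    fi
}

# Usage (hypothetical node name):
#   safe_reprovision node05
```

Keeping the decision in a separate function makes it possible to test the logic without access to a cluster.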

IMPORTANT: Oracle Big Data Appliance 4.10 is the Last Release to Support Upgrades for Oracle Linux 5 Clusters

Oracle Big Data Appliance 4.10 is the last release to provide an Oracle Linux 5 installation bundle and to support Oracle Linux 5. Migration from Oracle Linux 5 to Oracle Linux 6 will be a prerequisite for future Oracle Big Data Appliance releases.

Oracle Big Data Appliance 4.10 (like Release 4.9) includes scripts to assist with the upgrade to Oracle Linux 6, including file and configuration backup and restore. In this release, nodes using HDFS Transparent Encryption can be migrated and therefore the process is now fully supported.

Customers are encouraged to upgrade their clusters to Oracle Linux 6 in advance of the next Oracle Big Data Appliance release. The advantages are:

  • Oracle Linux 6 provides increased stability and performance, as well as access to a larger and more modern ecosystem of available applications.

  • Performing the migration in advance of the next Oracle Big Data Appliance release will reduce the time and effort required to complete the upgrade later on.

See Migrating from Oracle Linux 5 to Oracle Linux 6 in the Oracle Big Data Appliance Owner’s Guide for details.
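To identify which nodes still need this migration, the operating system major version can be read from the release file. The following is a minimal sketch, not part of the migration scripts shipped with the product: the function names are hypothetical, the sample release string is illustrative, and it assumes the release line has the usual "release N.M" form found in /etc/oracle-release.

```shell
#!/usr/bin/env bash
# Sketch: flag nodes that still run Oracle Linux 5.

# ol_major "RELEASE LINE": print the major version found in a line such
# as "Oracle Linux Server release 5.11".
ol_major() {
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p'
}

# needs_ol6_migration "RELEASE LINE": succeeds for Oracle Linux 5.
needs_ol6_migration() {
    [ "$(ol_major "$1")" = "5" ]
}

# Read the release line on this host, or fall back to a sample string
# when the file does not exist (for example, on a non-Oracle Linux host).
release_line="$(cat /etc/oracle-release 2>/dev/null || echo 'Oracle Linux Server release 5.11')"
if needs_ol6_migration "$release_line"; then
    echo "Oracle Linux 5 detected: plan the migration to Oracle Linux 6"
else
    echo "No Oracle Linux 5 migration needed on this node"
fi
```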

Previous Important Notices

The following notices were published for Oracle Big Data Appliance 4.9 and 4.8. If you did not install these releases, you may not be aware of the changes that are inherited by Oracle Big Data Appliance 4.10:

  • The network.json File is Being Replaced by Separate Files for Rack and Cluster Configuration (rack-network.json and cluster-network.json)

    In all cases where you previously used network.json, you now should instead use rack-network.json and cluster-network.json. These two files together are functionally equivalent to network.json. In the current release, network.json is still supported. In a future release, support for network.json will be dropped.

    When you generate a new configuration with the utility, copy rack-network.json and cluster-network.json to servers where you formerly would have copied network.json. For all configuration commands which formerly required network.json as a parameter, you should now specify rack-network.json and cluster-network.json together instead. For example, to re-image node 4 in a cluster, submit both files (comma-separated) as a parameter to the makebdaimage command:

    # ./makebdaimage -usbint BDABaseImage-<version>_RELEASE.iso /opt/oracle/bda/rack-network.json,/opt/oracle/bda/cluster-network.json 4 
    

    Previously, this command expected network.json as the configuration file parameter:

    # ./makebdaimage -usbint BDABaseImage-<version>_RELEASE.iso /opt/oracle/bda/network.json 4
    

    This enhancement makes it easier to change the network configuration of one cluster without affecting other servers on the same racks, and to change the admin network of a rack without affecting the client or private networks.

  • Perfect Balance Auto-Invocation Discontinued

    The Perfect Balance automatic invocation feature is no longer supported as of Perfect Balance 2.10. If you have been using automatic invocation, please switch to the Perfect Balance API.

  • Oracle NoSQL Database CE Support not Available

    There is no longer an option to purchase support for Oracle NoSQL Database Community Edition.

  • TLS 1.0 (TLSv1) Disabled for Cloudera Manager and Hue – May Affect Some Browsers and Operating Systems

    In order to improve security, Oracle Big Data Appliance has disabled TLS 1.0 for Cloudera Manager and Hue and also in the system-wide Java configuration. This can affect older browsers or operating systems.

    It is recommended that you reconfigure or upgrade clients using TLS 1.0 to use newer encryption, but if necessary you can re-enable TLS 1.0 for Cloudera Manager and Hue.

    For details, log on to My Oracle Support and search for BDA 4.8 Disables TLSv1 by Default For Cloudera Manager/Hue/And in System-Wide Java Configurations (Doc ID 2250841.1).

  • Configuring the Network on new Racks With an Installed Base Image Lower Than Oracle Big Data Appliance 4.5.0 Requires Special Configuration Steps

    Log on to My Oracle Support and search for this document: Network Configuration Instructions for Shipped BDA racks with a BDA Base Image Less Than V4.5.0 (Doc ID 2135358.1).

  • Cross-Realm Trust to Microsoft Active Directory is Recommended Over Direct-Active-Directory

    Oracle strongly recommends that customers using Oracle Big Data Appliance Clusters with Microsoft Active Directory configure their clusters to use cross-realm trust to Microsoft Active Directory and not use Direct-Active-Directory configuration for the Hadoop cluster.

  • Deprecated or Discontinued Features

    • Oracle Audit Vault and Database Firewall not Installed or Supported

      Oracle Audit Vault and Database Firewall (AVDF) has been dropped from the Mammoth installation and is no longer recommended for monitoring and auditing on Oracle Big Data Appliance 4.9. Cloudera Navigator is recommended as a replacement.

    • The Oracle Enterprise Manager Plug-In can no Longer be Enabled via bdacli

      The bdacli command for enabling the Oracle Enterprise Manager Plug-in (bdacli enable em) is no longer available.

Note:

The bdacli commands to check status and to disable AVDF and the Oracle Enterprise Manager Plug-in are still available.
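Returning to the notice above about rack-network.json and cluster-network.json: scripts that previously passed network.json to a configuration command can build the new comma-separated parameter with a small helper. The following is a minimal sketch, not part of the product tooling. The function name is hypothetical, the default directory is the one used in the makebdaimage examples, and the fallback to network.json relies on the statement that the file is still supported in the current release.

```shell
#!/usr/bin/env bash
# Sketch: build the configuration-file parameter for commands such as
# makebdaimage. This file only defines a function; nothing runs on load.

# bda_config_param [DIR]: print "rack-network.json,cluster-network.json"
# (with full paths) when both files exist in DIR, otherwise fall back to
# the legacy network.json. Fails when neither form is present.
bda_config_param() {
    local dir="${1:-/opt/oracle/bda}"
    if [ -f "$dir/rack-network.json" ] && [ -f "$dir/cluster-network.json" ]; then
        printf '%s,%s\n' "$dir/rack-network.json" "$dir/cluster-network.json"
    elif [ -f "$dir/network.json" ]; then
        printf '%s\n' "$dir/network.json"
    else
        echo "no network configuration files found in $dir" >&2
        return 1
    fi
}

# Usage, mirroring the re-image example above (run as root; the ISO name
# keeps the <version> placeholder from the text):
#   ./makebdaimage -usbint BDABaseImage-<version>_RELEASE.iso "$(bda_config_param)" 4
```

Preferring the two-file form means the same script keeps working after support for network.json is dropped.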