1 Introduction
This guide describes how to install Oracle Big Data SQL, how to reconfigure or extend the installation to accommodate changes in the environment, and, if necessary, how to uninstall the software.
This installation is done in phases. The first two phases are:
- Installation on the node of the Hadoop cluster where the cluster management server is running.
- Installation on each node of the Oracle Database system.
If you choose to enable the optional security features, then there is an additional third phase in which you activate them.
The two systems must be networked together via Ethernet or InfiniBand. (Connectivity to Oracle SuperCluster is InfiniBand only.)
Note:
For Ethernet connections between Oracle Database and the Hadoop cluster, Oracle recommends 10 Gb/s Ethernet.
The installation process starts on the Hadoop system, where you install the software manually on one node only (the node running the cluster management software). Oracle Big Data SQL leverages the administration facilities of the cluster management software to automatically propagate the installation to all DataNodes in the cluster.
The package that you install on the Hadoop side also generates an Oracle Big Data SQL installation package for your Oracle Database system. After the Hadoop-side installation is complete, copy this package to all nodes of the Oracle Database system, unpack it, and install it using the instructions in this guide. If you have enabled Database Authentication or Hadoop Secure Impersonation, you then perform the third installation step.
1.1 Supported System Combinations
Oracle Big Data SQL supports connectivity between a number of Oracle Engineered Systems and commodity servers.
The current release supports Oracle Big Data SQL connectivity for the following Oracle Database platforms/Hadoop system combinations:
- Oracle Database on commodity servers with Oracle Big Data Appliance.
- Oracle Database on commodity servers with commodity Hadoop systems.
- Oracle Exadata Database Machine with Oracle Big Data Appliance.
- Oracle Exadata Database Machine with commodity Hadoop systems.
Note:
The phrase “Oracle Database on commodity systems” refers to Oracle Database hosts that are not the Oracle Exadata Database Machine. Commodity database systems may be either Oracle Linux or RHEL-based. “Commodity Hadoop systems” refers to Hortonworks HDP systems and to Cloudera CDH-based systems other than Oracle Big Data Appliance.
1.2 Oracle Big Data SQL Master Compatibility Matrix
See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1 in My Oracle Support) for up-to-date information on Big Data SQL compatibility with the following:
- Oracle Engineered Systems.
- Other systems.
- Linux OS distributions and versions.
- Hadoop distributions.
- Oracle Database releases, including required patches.
1.3 Installing on Oracle Big Data Appliance
Each Oracle Big Data Appliance software release already includes a version of Oracle Big Data SQL that is ready to install, using the utilities available on the appliance.
You can download and install the standalone Big Data SQL bundle as described in this guide on all supported Hadoop platforms, including Big Data Appliance. But for Big Data Appliance, the recommended method is to install the Big Data SQL package included with your Big Data Appliance software. The instructions for doing this are in the Oracle Big Data Appliance Owner's Guide. You can find them in the same location in most versions of the Owner's Guide. For example, Big Data Appliance 4.14 includes Big Data SQL 3.2.1.2 and the instructions are here: 10.9.5 Installing Oracle Big Data SQL.
The advantages of installing the version of Big Data SQL included with the appliance are:
- The prerequisites to the installation are already met.
- You can add Big Data SQL to the Big Data Appliance release installation by checking a checkbox in the Big Data Appliance Configuration Generation Utility. The Mammoth utility will then automatically include Big Data SQL in the installation.
- You can also install Big Data SQL later, using the bdacli utility. This is also a simple procedure. The command is:
bdacli enable big_data_sql
- When Big Data SQL is installed by the Mammoth utility, then during an upgrade to a newer Big Data Appliance software release, Mammoth will automatically upgrade the Hadoop side of the Big Data SQL installation to the version included in the release bundle.
The limitations of installing the version of Big Data SQL included with Big Data Appliance are:
- The installation is performed for the Hadoop side only. You still need to install the database side of the product using the instructions in this guide. You also must refer to this guide if you want to modify the default installation.
- The Big Data Appliance release may not include the latest available version of Big Data SQL.
Note:
If you choose to download and install a release of Big Data SQL from the Oracle Software Delivery Cloud instead of installing the version included with Big Data Appliance, then first check the Oracle Big Data SQL Master Compatibility Matrix to confirm that your current Big Data Appliance release level supports the version that you want to install.
1.4 Prerequisites for Networking
The Oracle Big Data SQL installation has the following network dependencies.
1.4.1 Port Access Requirements
Oracle Big Data SQL requires that the following ports are open through firewalls protecting the Hadoop cluster and Oracle Database.
Table 1-1 Ports That Must be Open on Both the Hadoop Cluster and Oracle Database Servers
Port | Use |
---|---|
Ephemeral_range, i.e. 9000-65500 | UDP communication from the celliniteth.ora IP address |
5042 | Diskmon |
Table 1-2 Additional Ports That Must Be Open on the Hadoop Cluster
Hadoop Cluster Ports | Where | Use |
---|---|---|
50010 | All nodes on unsecured clusters | dfs.datanode.address |
1004 | All nodes on secured clusters | dfs.datanode.address |
50020 | All nodes | dfs.datanode.ipc.address |
8020 | NameNodes | fs.defaultFS |
8022 | NameNodes | dfs.namenode.servicerpc-address |
9083 | Hive Metastore & HiveServer2 node. | hive.metastore |
10000 | Hive Metastore & HiveServer2 node. | hive.server2.thrift.port |
88 | Kerberos KDC | TCP & UDP |
16000 | Where HDFS Encryption is enabled | KMS HTTP Port |
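As a rough pre-check of Table 1-2, the sketch below lists the Hadoop-side TCP ports for a secured or unsecured cluster and probes one of them from another host. The function names hadoop_ports and probe_port are illustrative only and are not part of Oracle Big Data SQL. The probe assumes that bash and timeout are installed, and it covers TCP only: the UDP ephemeral range, UDP on port 88, port 5042, and the conditional KMS port 16000 are left out.

```shell
# Print the Table 1-2 TCP ports for a "secured" or "unsecured" cluster.
hadoop_ports() {
  case "$1" in
    secured)   echo 1004 ;;   # dfs.datanode.address on secured clusters
    unsecured) echo 50010 ;;  # dfs.datanode.address on unsecured clusters
    *) echo "usage: hadoop_ports secured|unsecured" >&2; return 2 ;;
  esac
  # Ports that are the same for both cluster types.
  for p in 50020 8020 8022 9083 10000 88; do
    echo "$p"
  done
}

# Return 0 if a TCP connection to host ($1) and port ($2) succeeds within 3 seconds.
probe_port() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example (hypothetical host name):
#   for p in $(hadoop_ports unsecured); do
#     probe_port node1.example.com "$p" || echo "port $p not reachable"
#   done
```

Not every port is served by every node (8020 and 8022 belong to the NameNodes, for example), so a failed probe against a single host is only meaningful for the services that host actually runs.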
1.5 Prerequisites for Installation on the Hadoop Cluster
The following installed software packages, active services, tools, and environment settings are prerequisites to the Oracle Big Data SQL installation.
Platform requirements, such as supported Linux distributions and versions, as well as supported Oracle Database releases and required patches, are not listed here. See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1 in My Oracle Support) for this information.
The Oracle Big Data SQL installer checks all prerequisites before beginning the installation and reports any missing requirements on each node.
Tip:
Use bds_node_check.sh to pre-check whether or not the DataNodes of the cluster are ready for the installation.
You can manually check for the prerequisites, but the easiest way is to run bds_node_check.sh on each node. This script returns a complete readiness report. After you download the installation bundle, unzip it, and execute the run file, bds_node_check.sh will be available, along with the tools to perform the installation. See Check for Hadoop-Side Prerequisites With bds_node_check.sh for details.
Note:
- Oracle Big Data SQL 4.0 does not support single user mode for Cloudera clusters.
- The JDK is no longer a prerequisite. JDK 8u171 is included with this release of Oracle Big Data SQL.
1.5.1 Software Package Requirements for all DataNodes
The following packages must be pre-installed on all Hadoop cluster nodes before installing Oracle Big Data SQL. These are already installed on releases of Oracle Big Data Appliance that support Oracle Big Data SQL 4.0. Several additional packages are required if Query Server will be installed.
libaio
dmidecode
net-snmp
net-snmp-utils
glibc
libgcc
libstdc++
libuuid
ntp
perl
perl-libwww-perl
perl-libxml-perl
perl-XML-LibXML
perl-Time-HiRes
perl-XML-SAX
perl-Env (only for Oracle Linux 7 and RHEL 7)
rpm
curl
unzip
zip
tar
wget
uname
The following packages are required only if you install Query Server:
expect
procmail
The yum utility is the recommended method for installing these packages. All of them can be installed with a single yum command. For example (not including expect and procmail):
# yum -y install dmidecode net-snmp net-snmp-utils perl perl-libs perl-Time-HiRes perl-libwww-perl perl-libxml-perl perl-XML-LibXML perl-XML-SAX perl-Env fuse fuse-libs rpm curl unzip zip tar wget uname -y libaio gcc
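Before running yum, it can help to see which packages are actually absent. The following is a minimal sketch, assuming an RPM-based system where rpm -q reports whether a package is installed. The function name missing_packages and the PKG_QUERY variable are hypothetical helpers, not part of the Oracle Big Data SQL bundle, and the sketch does not replace bds_node_check.sh.

```shell
# Command used to query one package; override it for other package managers.
PKG_QUERY=${PKG_QUERY:-"rpm -q"}

# Print each named package that the query command reports as not installed.
missing_packages() {
  for pkg in "$@"; do
    $PKG_QUERY "$pkg" >/dev/null 2>&1 || echo "$pkg"
  done
}

# Example, using a subset of the package list above:
#   missing_packages libaio dmidecode net-snmp net-snmp-utils ntp perl curl unzip zip tar wget
```

The names it prints can be passed to a single yum -y install command.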
Special Prerequisites for the Configuration Management Server
On the node where CM or Ambari runs (usually Node 3 on Oracle Big Data Appliance), you may also need to install a compatible version of Python as well as the Python Cryptography package. See the next section to determine whether or not this is necessary. If you do need to manually install a version of Python, then add openssl-devel to the yum parameter string:
# yum -y install dmidecode net-snmp net-snmp-utils perl perl-libs perl-Time-HiRes perl-libwww-perl perl-libxml-perl perl-XML-LibXML perl-XML-SAX perl-Env fuse fuse-libs rpm curl unzip zip tar wget uname openssl-devel -y libaio gcc
Other Prerequisites
- HDFS, YARN, and Hive must be running on the cluster at Oracle Big Data SQL installation time and runtime. They can be installed as parcels or packages on Cloudera CDH and as stacks on Hortonworks HDP.
- On CDH, if you install the Hadoop services required by Oracle Big Data SQL as packages, be sure that they are installed from within CM. Otherwise, CM will not be able to manage them. This is not an issue with parcel-based installation.
1.5.2 Python Requirements for the Cluster Management Node
On the node where the CM or Ambari cluster management service is running, the Oracle Big Data SQL installer requires Python 2.7.5 or greater, but less than 3.0. You must also add the Python Cryptography package to this Python installation if it is not present.
Jaguar, the Oracle Big Data SQL installer, requires Python (>= 2.7.5 <3.0) locally on the node where you run the installer. This is the node where CM or Ambari cluster management service is running. If any installation of Python in this supported version range is already present, you can use it to run Jaguar.
- On Oracle Big Data Appliance or commodity Hadoop clusters running Oracle Linux 6 or 7:
Do not manually install Python to support the Jaguar installer. There is a compatible Python package already available on the appliance and the Jaguar installer will automatically find and use this package without prompting you.
- On commodity Hadoop clusters running Oracle Linux 6:
Install a compatible version of Python if not present.
- On Oracle Big Data Appliance or commodity Hadoop clusters running Oracle Linux 5:
Install a compatible version of Python if not present. On Oracle Big Data Appliance, install it as a secondary installation only.
Important:
On Oracle Big Data Appliance, do not overwrite the default Python installation with a newer version or switch the default to a newer version. This restriction may also apply to other supported Hadoop platforms. Consult the documentation for the CDH or HDP platform you are using.
On Oracle Linux 5 or 6 on a commodity Hadoop platform, the Jaguar installer will prompt you for the path of the compatible Python installation.
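To decide whether an existing Python installation falls inside the supported range (2.7.5 or greater, but less than 3.0), a version string can be checked as in the sketch below. The function name python_version_ok is hypothetical, and this is not how Jaguar itself detects Python. It only parses a dotted version string.

```shell
# Return 0 if a version string such as "2.7.5" satisfies >= 2.7.5 and < 3.0.
python_version_ok() {
  ver=$1
  major=${ver%%.*}
  rest=${ver#*.}
  minor=${rest%%.*}
  case "$rest" in
    *.*) patch=${rest#*.} ;;
    *)   patch=0 ;;            # no patch field, for example "2.7"
  esac
  patch=${patch%%[!0-9]*}      # drop suffixes such as "rc1" or "+"
  [ -n "$patch" ] || patch=0
  [ "$major" = "2" ] || return 1
  [ "$minor" -gt 7 ] && return 0
  [ "$minor" -eq 7 ] && [ "$patch" -ge 5 ]
}

# Example: test whichever interpreter "python" resolves to.
#   v=$(python -c 'import platform; print(platform.python_version())')
#   python_version_ok "$v" || echo "Python $v is outside the supported range"
```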
Installing the Required Python Cryptography Module
You can use Python's pip utility to install the Python Cryptography module. Use scl if Python (>= 2.7.5 <3.0) is not the default. This example upgrades pip, installs the module, and then verifies that the module can be imported.
# scl enable python27 "pip install -U pip"
# scl enable python27 "pip install cryptography"
# scl enable python27 "python -c 'import cryptography; print \"ok\";'"
You can then run the Jaguar installer.
1.5.2.1 Adding Python 2.7.5 or Greater as a Secondary Installation
Below is a procedure for adding Python 2.7.5 or greater (but less than 3.0) as a secondary installation.
Note:
If you manually install Python, first ensure that the openssl-devel package is installed:
# yum install -y openssl-devel
# pyversion=2.7.5
# cd /tmp/
# mkdir py_install
# cd py_install
# wget https://www.python.org/static/files/pubkeys.txt
# gpg --import pubkeys.txt
# wget https://www.python.org/ftp/python/$pyversion/Python-$pyversion.tgz.asc
# wget https://www.python.org/ftp/python/$pyversion/Python-$pyversion.tgz
# gpg --verify Python-$pyversion.tgz.asc Python-$pyversion.tgz
# tar xfzv Python-$pyversion.tgz
# cd Python-$pyversion
# ./configure --prefix=/usr/local/python/2.7.5
# make
# mkdir -p /usr/local/python/2.7.5
# make install
# export PATH=/usr/local/python/2.7.5/bin:$PATH
If you create a secondary installation of Python, it is strongly recommended that you apply Python updates regularly to include new security fixes.
Important: On Oracle Big Data Appliance, do not update the Mammoth-installed Python unless directed to do so by Oracle.
1.5.2.2 When You May Need to Use scl to Invoke the Correct Python Version
If there is more than one Python release on the cluster management server, then be sure that Python 2.7.5 or greater (but less than 3.0) is invoked for any operations associated with this release of Oracle Big Data SQL.
If the scl utility is available, you can use it to invoke Python 2.7.5 or greater explicitly. This is necessary if a different Python installation is the default. In that case, use scl or another method to invoke the correct Python version for scripts as well as Python-based utilities such as Jaguar, the Oracle Big Data SQL installer:
[root@myclusteradminserver:BDSjaguar] # scl enable python27 "./jaguar install bds-config.json"
There is one exception to this requirement. On Oracle Big Data Appliance clusters running Oracle Linux 6 or Oracle Linux 7, it is not necessary to use scl explicitly in order to run the Jaguar installer. In this case, you can invoke Jaguar directly, as in:
[root@myclusteradminserver:BDSjaguar] # ./jaguar install bds-config.json
Jaguar itself will silently invoke scl if it is available and if scl is required to invoke a compatible Python release in this environment.
Note that this only applies to Jaguar on Big Data Appliance. To run any other Python scripts required by Oracle Big Data SQL (even on Oracle Big Data Appliance), use scl if Python 2.7.5 is not the default.
# scl enable python27 "pip install cryptography"
1.5.3 Environment Settings
The following environment settings are required prior to the installation.
- ntp enabled
- Minimum ratio of shmmax to shmall: shmmax = shmall * PAGE_SIZE
- shmmax must be greater than physical memory.
- swappiness set between 5 and 25.
- All *.rp_filter instances disabled
- Socket buffer size equal to or greater than 4194304
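The shmmax and swappiness rules above can be checked with simple arithmetic, sketched here. The function names are hypothetical, and the sysctl keys in the comments (kernel.shmall, kernel.shmmax, vm.swappiness) are the standard Linux names, assumed here to be the settings this list refers to. Values beyond the shell's signed 64-bit range will not compare correctly.

```shell
# shmmax implied by the rule shmmax = shmall * PAGE_SIZE.
# $1 = shmall (in pages), $2 = page size (in bytes)
expected_shmmax() {
  echo $(( $1 * $2 ))
}

# Return 0 if a swappiness value lies between 5 and 25, inclusive.
swappiness_ok() {
  [ "$1" -ge 5 ] && [ "$1" -le 25 ]
}

# Example on a live Linux host:
#   expected_shmmax "$(sysctl -n kernel.shmall)" "$(getconf PAGE_SIZE)"
#   sysctl -n kernel.shmmax        # compare with the value printed above
#   swappiness_ok "$(sysctl -n vm.swappiness)" || echo "swappiness is outside 5-25"
#   sysctl -a 2>/dev/null | grep '\.rp_filter'   # every value listed should be 0
```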
1.5.4 Proxy-Related Settings
The installation process requires Internet access in order to download some packages from Cloudera or Hortonworks sites.
If a proxy is required for Internet access, then either ensure that the following are set as Linux environment variables, or enable the equivalent parameters in the Jaguar configuration file (bds-config.json):
- http_proxy and https_proxy
- no_proxy
Set no_proxy to include the following: "localhost,127.0.0.1,<Comma-separated list of the hostnames in the cluster (in FQDN format).>".
On Cloudera CDH, clear any proxy settings in Cloudera Manager administration before running the installation.
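If you set the proxy variables in the Linux environment, the no_proxy value can be assembled from the list of cluster hosts, as in this sketch. The function name build_no_proxy and the host names in the example are placeholders, not values from this guide.

```shell
# Join the fixed localhost entries and the given FQDNs into a no_proxy value.
build_no_proxy() {
  value="localhost,127.0.0.1"
  for host in "$@"; do
    value="$value,$host"
  done
  echo "$value"
}

# Example (hypothetical cluster host names):
#   export no_proxy="$(build_no_proxy node1.example.com node2.example.com node3.example.com)"
```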
See Also:
Table 2-1 describes the use of http_proxy, https_proxy, and other parameters in the installer configuration file.
1.6 Prerequisites for Installation on Oracle Database Nodes
Installation prerequisites vary, depending on the type of Hadoop system and Oracle Database system where Oracle Big Data SQL will be installed.
Patch Level
See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1) in My Oracle Support for supported Linux distributions, Oracle Database release levels, and required patches.
Note:
Be sure that the correct Bundle Patch and any one-off patches identified in the Compatibility Matrix have been pre-applied before starting this installation.
Before you begin the installation, review the additional environmental and user access requirements described below.
Packages Required for Kerberos
If you are installing on a Kerberos-enabled Oracle Database System, these packages must be pre-installed:
- krb5-workstation
- krb5-libs
Packages for the “Oracle Tablespaces in HDFS” Feature
Oracle Big Data SQL provides a method to store Oracle Database tablespaces in the Hadoop HDFS file system. The following RPMs must be installed:
- fuse
- fuse-libs
# yum -y install fuse fuse-libs
Required Environment Variables
The following are always required. Be sure that these environment variables are set correctly.
- ORACLE_SID
- ORACLE_HOME
Note:
GI_HOME (which was required in Oracle Big Data SQL 3.1 and earlier) is no longer required.
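A quick way to confirm that the required variables are set in the database owner's shell is sketched below. The function name missing_env is hypothetical. It only tests for non-empty values and does not validate that the paths or SID are correct.

```shell
# Print the name of each listed variable that is unset or empty.
# Returns 1 if any variable is missing, otherwise 0.
missing_env() {
  status=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# Example, run as the database owner:
#   missing_env ORACLE_SID ORACLE_HOME || echo "set the variables listed above first"
```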
Required Credentials
- Oracle Database owner credentials (The owner is usually the oracle Linux account.)
Big Data SQL is installed as an add-on to Oracle Database. Tasks related directly to the database instance are performed through the database owner account (oracle or other).
- Grid user credentials
In some cases where Grid infrastructure is present, it must be restarted. If the system uses Grid then you should have the Grid user credentials on hand in case a restart is required.
The Linux users grid and oracle (or other database owner) must both be in the same group (usually oinstall). This user requires permission to read all files owned by the grid user and vice versa.
All Oracle Big Data SQL files and directories are owned by the oracle:oinstall user and group.
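The shared-group requirement can be checked by comparing the group lists of the two accounts. This is a minimal sketch: share_group is a hypothetical helper, and the example assumes the accounts are named oracle and grid, which may differ on your system.

```shell
# Return 0 if two space-separated group lists have at least one group in common.
share_group() {
  for g in $1; do
    case " $2 " in
      *" $g "*) return 0 ;;
    esac
  done
  return 1
}

# Example: id -nG prints all group names of a user on one line.
#   share_group "$(id -nG oracle)" "$(id -nG grid)" || echo "oracle and grid share no group"
```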
Required Grid Infrastructure Patches
You can run the script bds-validate-grid-patches.sh to check that the Grid Infrastructure includes all of the patches that are required by the Oracle Big Data SQL installation. See Check for Required Grid Patches With bds-validate-grid-patches.sh.
1.7 Downloading Oracle Big Data SQL and Query Server
You can download Oracle Big Data SQL from the Oracle Software Delivery Cloud (also known as “eDelivery”).
There are three files to download:
- The primary BDSJaguar bundle, which contains the Jaguar installer for Oracle Big Data SQL
V982738-01.zip
- The two parts of the optional Query Server bundle
V982741-01_1of2.zip V982741-01_2of2.zip
If you want to use Query Server, then download the two parts of the Query Server bundle in addition to the primary bundle.
Note:
You cannot use Query Server apart from Oracle Big Data SQL. Query Server is also not installed separately. It can be included in the Jaguar-driven installation as described below.
1.8 Upgrading From a Prior Release of Oracle Big Data SQL
On the Oracle Database side, Oracle Big Data SQL can now be installed over a previous release with no need to remove the older software. The install script automatically detects and upgrades an older version of the software.
Upgrading the Oracle Database Side of the Installation
On the database side, you need to perform the installation only once to upgrade the database side for any clusters connected to that particular database. This is because the installations on the database side are not entirely separate. They share the same set of Oracle Big Data SQL binaries. This results in a convenience for you – if you upgrade one installation on a database instance then you have effectively upgraded the database side of all installations on that database instance.
Upgrading the Hadoop Cluster Side of the Installation
If existing Oracle Big Data SQL installations on the Hadoop side are not upgraded, these installations will continue to work with the new Oracle Big Data SQL binaries on the database side, but will not have access to the new features in this release.
1.9 Important Terms and Concepts
These are special terms and concepts in the Oracle Big Data SQL installation.
Oracle Big Data SQL Installation Directory
On both the Hadoop side and database side of the installation, the directory where you unpack the installation bundle is not a temporary directory which you can delete after running the installer. These directories are staging areas for any future changes to the configuration. You should not delete them and may want to secure them against accidental deletion.
Database Authentication Keys
Database Authentication uses a key that must be identical on both sides of the installation (the Hadoop cluster and Oracle Database). The first part of the key is created on the cluster side and stored in the .reqkey file. This file is consumed only once on the database side, to connect the first Hadoop cluster to the database. Subsequent cluster installations use the configured key and the .reqkey file is no longer required. The full key (which is completed on the database side) is stored in an .ackkey file. This key is included in the ZIP file created by the database-side installation and must be copied back to the Hadoop cluster by the user.
Request Key
By default, the Database Authentication feature is enabled in the configuration. (You can disable it by setting the parameter database_auth_enabled to “false” in the configuration file.) When this setting is true, then the Jaguar install and reconfigure operations can generate a request key (stored in a file with the extension .reqkey). This key is part of a unique GUID-key pair used for Database Authentication. This GUID-key pair is generated during the database side of the installation. The Jaguar operation creates a request key if the command line includes the --requestdb parameter along with a single database name (or a comma-separated list of names). In this example, the install operation creates three keys, one for each of three different databases:
# ./jaguar --requestdb orcl,testdb,proddb install
The request key files are written to <Oracle Big Data SQL install directory>/BDSJaguar/dbkeys. In this example, Jaguar install would generate these request key files:
orcl.reqkey
testdb.reqkey
proddb.reqkey
Prior to the database side of the installation, you copy the request key to the database node and into the path of the database-side installer, which at runtime generates the GUID-key pair.
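The mapping from a --requestdb list to request key file names can be sketched as follows. The function name reqkey_files and the BDS_INSTALL_DIR variable are hypothetical. The sketch only derives the expected names from the naming pattern shown above and does not call Jaguar.

```shell
# Print the request key file name expected for each database in a
# comma-separated list, one name per line.
reqkey_files() {
  echo "$1" | tr ',' '\n' | while read -r db; do
    if [ -n "$db" ]; then
      echo "$db.reqkey"
    fi
  done
}

# Example: confirm that each key exists before copying it to the database node.
# BDS_INSTALL_DIR stands for the Oracle Big Data SQL install directory.
#   for f in $(reqkey_files orcl,testdb,proddb); do
#     [ -f "$BDS_INSTALL_DIR/BDSJaguar/dbkeys/$f" ] || echo "missing $f"
#   done
```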
Acknowledge Key
After you copy a request key into the database-side installation directory, the database-side Oracle Big Data SQL installer generates a corresponding acknowledge key when you run it. The acknowledge key is the original request key, paired with a GUID. This key is stored in a file that is included in a ZIP archive along with other information that must be returned to the Hadoop cluster by the user.
Database Request Operation (databasereq)
The Jaguar databasereq operation is a “standalone” way to generate a request key. It lets you create one or more request keys without performing an install or reconfigure operation:
# ./jaguar --requestdb <database name list> databasereq {configuration file | null}
Database Acknowledge ZIP File
If Database Authentication or Hadoop Secure Impersonation is enabled for the configuration, then the database-side installer creates a ZIP bundle of configuration information. If Database Authentication is enabled, this bundle includes the acknowledge key file. Information required for Hadoop Secure Impersonation is also included if that option was enabled. Copy this ZIP file back to /opt/oracle/DM/databases/conf on the Hadoop cluster management server for processing.
Database Acknowledge is a third phase of the installation and is performed only when either of the security features cited above is enabled.
Database Acknowledge Operation (databaseack)
If you have opted to enable either or both of the security features (Database Authentication or Hadoop Secure Impersonation), then after copying the Database Acknowledge ZIP file back to the Hadoop cluster, run the Jaguar Database Acknowledge operation.
The setup process for these features is a “round trip” that starts on the Hadoop cluster management server, where you set the security directives in the configuration file and run Jaguar, continues to the Oracle Database system, where you run the database-side installation, and returns to the Hadoop cluster management server, where you return a copy of the ZIP file generated by the database-side installation. The last step is when you run databaseack, the Database Acknowledge operation described in the outline below. Database Acknowledge completes the setup of these security features.
Default Cluster
The default cluster is the first Oracle Big Data SQL connection installed on an Oracle Database. In this context, the term default cluster refers to the installation directory on the database node where the connection to the Hadoop cluster is established. It does not literally refer to the Hadoop cluster itself. Each connection between a Hadoop cluster and a database has its own installation directory on the database node.
An important aspect of the default cluster is that the setting for Hadoop Secure Impersonation in the default cluster determines that setting for all other cluster connections to a given database. If you run a Jaguar reconfigure operation some time after installation and use it to turn Hadoop Secure Impersonation in the default cluster on or off, this change is effective for all clusters associated with the database.
If you perform installations to add additional clusters, the first cluster remains the default. If the default cluster is uninstalled, then the next one (in chronological order of installation) becomes the default.
1.10 Installation Overview
The Oracle Big Data SQL software must be installed on all Hadoop cluster DataNodes and all Oracle Database compute nodes.
Important: About Service Restarts
On the Hadoop-side installation, the following restarts may occur.
- Cloudera Configuration Manager (or Ambari) may be restarted. This in itself does not interrupt any services.
- Hive, YARN, and any other services that have a dependency on Hive or YARN (such as Impala) are restarted.
The Hive libraries parameter is updated in order to include Oracle Big Data SQL JARs. On Cloudera installations, if the YARN Resource Manager is enabled, then it is restarted in order to set the cgroup memory limit for Oracle Big Data SQL and the other Hadoop services. On Oracle Big Data Appliance, the YARN Resource Manager is always enabled and therefore always restarted.
On the Oracle Database server(s), the installation may require a database and/or Oracle Grid infrastructure restart in environments where updates are required to Oracle Big Data SQL cell settings on the Grid nodes. See Potential Requirement to Restart Grid Infrastructure for details.
If a Previous Version of Oracle Big Data SQL is Already Installed
On commodity Hadoop systems (those other than Oracle Big Data Appliance) the installer automatically uninstalls any previous release from the Hadoop cluster.
You can install Oracle Big Data SQL on all supported Oracle Database systems without uninstalling a previous version.
Before installing this Oracle Big Data SQL release on Oracle Big Data Appliance, you must use bdacli to manually uninstall the older version if it had been enabled via bdacli or Mammoth. If you are not sure, try bdacli disable big_data_sql. If the disable command fails, then the installation was likely done with the setup-bds installer. In that case, you can install the new version of Oracle Big Data SQL without disabling the old version.
How Long Does It Take?
The table below estimates the time required for each phase of the installation. Actual times will vary.
Table 1-3 Installation Time Estimates
Installation on the Hadoop Cluster | Installation on Oracle Database Nodes |
---|---|
Eight minutes to 28 minutes. The Hadoop-side installation may take eight minutes if all resources are locally available. An additional 20 minutes or more may be required if resources must be downloaded from the Internet. | The average installation time for the database side can be estimated as follows: |
The installation process on the Hadoop side includes installation on the Hadoop cluster as well as generation of the bundle for the second phase of the installation on the Oracle Database side. The database bundle includes Hadoop and Hive clients and other software. The Hadoop and Hive client software enable Oracle Database to communicate with HDFS and the Hive Metastore. The client software is specific to the version of the Hadoop distribution (i.e. Cloudera or Hortonworks). As explained later in this guide, you can download these packages prior to the installation, set up a URL or repository within your network, and make that target available to the installation script. If instead you let the installer download them from the Internet, the extra time for the installation depends upon your Internet download speed.
Pre-installation Steps
- Check to be sure that the Hadoop cluster and the Oracle Database system both meet all of the prerequisites for installation. On the database side, this includes confirming that all of the required patches are installed. Check against these sources:
  - Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1 in My Oracle Support)
  - Section 2.1 in this guide, which identifies the prerequisites for installing on the Hadoop cluster. Also see Section 3.1, which describes the prerequisites for installing the Oracle Database system component of Oracle Big Data SQL.
- Have these login credentials available:
  - root credentials for both the Hadoop cluster and all Oracle Grid nodes. On the Grid nodes you have the option of using passwordless SSH with the root user instead.
  - oracle Linux user (or other, if the database owner is not oracle)
  - The Oracle Grid user (if this is not the same as the database owner).
  - The Hadoop configuration management service (CM or Ambari) admin password.
- On the cluster management server (where CM or Ambari is running), download the Oracle Big Data SQL installation bundle and unzip it into a permanent location of your choice. (See Downloading Oracle Big Data SQL and Query Server.)
Outline of the Installation Steps
This is an overview to familiarize you with the process. Complete installation instructions are provided in Chapters 2 and 3.
The installation always has two phases – the installation on the Hadoop cluster and the subsequent installation on the Oracle Database system. It may also include the third, “Database Acknowledge,” phase, depending on your configuration choices.
-
Start the Hadoop-Side Installation
Review the installation parameter options described in Chapter 2. The installation on the Hadoop side is where you make all of the decisions about how to configure Oracle Big Data SQL, including those that affect the Oracle Database side of the installation.
-
Edit the
bds-config.json file
provided with the bundle in order to configure the Jaguar installer as appropriate for your environment. You could also create your own configuration file using the same parameters. -
Run the installer to perform the Hadoop-side installation as described in Installing or Upgrading the Hadoop Side of Oracle Big Data SQL.
If the Database Authentication feature is enabled, then Jaguar must also output a “request key” (
.reqkey
) file for each database that will connect to the Hadoop cluster. You generate this file by including the—-requestdb
parameter in the Jaguarinstall
command (the recommended way). You can also generate the file later with other Jaguar operations that support the—-requestdb
.This file contains one half of a GUID-key pair that is used in Database Authentication. The steps to create and install the key are explained in more detail in the installation steps.
-
Copy the database-side installation bundle to any temporary directory on each Oracle Database compute node.
-
If a request key file was generated, copy over that file to the same directory.
-
Start the Database-Side Installation
Log on to the database server as the database owner. Unzip the bundle and execute the run file it contains. The run file does not install the software. It sets up an installation directory under $ORACLE_HOME.
-
As the database owner, perform the Oracle Database server-side installation. (See Installing or Upgrading the Oracle Database Side of Oracle Big Data SQL.)
In this phase of the installation, you copy the database-side installation bundle to a temporary location on each compute node. If a .reqkey file was generated for the database, then copy the file into the installation directory before proceeding. Then run the bds-database-install.sh installation program.
The database-side installer does the following:
-
Copies the Oracle Big Data SQL binaries to the database node.
-
Creates all database metadata and MTA extprocs (external processes) required to access the Hadoop cluster and configures the communication settings.
Important:
Be sure to install the bundle on each database compute node. The Hadoop-side installation automatically propagates the software to each node of the Hadoop cluster. However, the database-side installation does not work this way. You must copy the software to each database compute node and install it directly.
In Oracle Grid environments, if cell settings need to be updated, then a Grid restart may be needed. Be sure that you know the Grid password. If a Grid restart is required, then you will need the Grid credentials to complete the installation.
-
If Applicable, Perform the “Database Acknowledge” Step
If Database Authentication or Hadoop Secure Impersonation was enabled, the database-side installation generates a ZIP file that you must copy back to the Hadoop cluster management server. The file is generated in the installation directory under $ORACLE_HOME and has the following filename format:
<Hadoop cluster name>-<Number nodes in the cluster>-<FQDN of the cluster management server node>-<FQDN of this database node>.zip
Copy this file back to /opt/oracle/DM/databases/conf on the Hadoop cluster management server and then, as root, run the Database Acknowledge command from the BDSJaguar directory:
# cd <Big Data SQL install directory>/BDSJaguar
# ./jaguar databaseack <bds-config.json>
Workflow Diagrams
Complete Installation Workflow
The figure below illustrates the complete set of installation steps as described in this overview.
Note:
Before you start the steps shown in the workflow, be sure that both systems meet the installation prerequisites.
Figure 1-1 Installation Workflow
Note:
The --reqkey parameter in this diagram actually requires the full path to the file, as in /bds-database-install.sh --reqkey=/opt/tmp/orcl.reqkey.
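Because --reqkey expects a full path, it can be convenient to resolve a relative file name before building the command line. The sketch below assumes nothing about the installer beyond the --reqkey parameter shown in the note. The abs_path function and the orcl.reqkey file name are illustrative only, and the command is printed rather than run.

```shell
# Hypothetical helper: return the argument unchanged if it is already an
# absolute path, otherwise prefix it with the current working directory.
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s/%s\n' "$(pwd)" "$1" ;;
  esac
}

# Print (do not execute) an installer invocation with a full --reqkey path.
echo "bds-database-install.sh --reqkey=$(abs_path orcl.reqkey)"
```

Printing the command first gives you a chance to confirm the resolved path before you run the installer as the database owner.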
Key Generation and Installation
The figure below focuses on the three steps required to create and install the GUID-key pair used in Database Authentication. The braces around parameters of the Jaguar command indicate that one of the operations in the list is required. Each of these operations supports use of the --requestdb parameter. Note that although updatenodes is included in this list, updatenodes is deprecated in this release. You should use reconfigure instead.
Figure 1-2 Generating and Installing the GUID-Key Pair for Database Authentication
1.11 Post-Installation Checks
Validating the Installation With bdschecksw and Other Tests
-
The script bdschecksw now runs automatically as part of the installation. This script gathers and analyzes diagnostic information about the Oracle Big Data SQL installation from both the Oracle Database and the Hadoop cluster sides of the installation. You can also run this script as a troubleshooting check at any time after the installation. The script is in $ORACLE_HOME/bin on the Oracle Database server.
$ bdschecksw --help
See Running Diagnostics With bdschecksw in the Oracle Big Data SQL User’s Guide for a complete description.
-
Also see How to do a Quick Test in the user’s guide for some additional functionality tests.
Checking the Installation Log Files
You can examine these log files after the installation.
On the Hadoop cluster side:
/var/log/bigdatasql
/var/log/oracle
On the Oracle Database side:
$ORACLE_HOME/install/bds* (This is a set of files, not a directory)
$ORACLE_HOME/bigdatasql/logs
/var/log/bigdatasql
Tip:
If you make a support request, create a zip archive that includes all of these logs and include it in your email to Oracle Support.
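One way to gather the logs for a support request is to first filter the documented locations down to those that actually exist on the current host. The following is a minimal sketch: the existing_paths function and the archive name bds-logs.zip are hypothetical, and the commented zip command assumes the zip utility is installed.

```shell
# Hypothetical helper: print only those of the given paths that exist,
# one per line, so that missing log locations are skipped silently.
existing_paths() {
  for p in "$@"; do
    [ -e "$p" ] && printf '%s\n' "$p"
  done
  return 0
}

# Hadoop-side log locations from the list above. On the Oracle Database
# side, pass the database-side locations instead.
existing_paths /var/log/bigdatasql /var/log/oracle

# To bundle whatever was found (assumes the zip utility is available):
#   zip -r bds-logs.zip $(existing_paths /var/log/bigdatasql /var/log/oracle)
```

Filtering first avoids archive errors on hosts where one of the directories was never created.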
Other Post-Installation Steps to Consider
-
Read about measures you can take to secure the installation. (See Securing Oracle Big Data SQL.)
-
Learn how to modify the Oracle Big Data SQL configuration when changes occur on the Hadoop cluster and in the Oracle Database installation. (See Expanding or Shrinking an Installation.)
-
If you have used Copy to Hadoop in earlier Oracle Big Data SQL releases, learn how Oracle Shell for Hadoop Loaders can simplify Copy to Hadoop tasks. (See Additional Tools Installed.)
1.11.1 Run bds_cluster_node_helper.sh to Get Information About the Oracle Big Data SQL Installation on a Node
The script bds_cluster_node_helper.sh aggregates information about a Hadoop cluster node that is useful for Oracle Big Data SQL maintenance purposes. You can use it to:
- Show Oracle Big Data SQL status information via bdscli, the Oracle Big Data SQL command line interface.
- Collect and archive log data that is pertinent to Oracle Big Data SQL operations. There are three levels to the scope of the data collection.
- Set some parameters that control the level of debug information in logs that are collected.
You can find this script at <Oracle Big Data SQL installation directory>/BDSJaguar. It must be run as root.
Usage
# bds_cluster_node_helper.sh [OPTIONS]
Table 1-4 Parameters for bds_cluster_node_helper.sh
Parameter | Description |
---|---|
-h, --help | Show usage information. |
-v, --version | Show the Oracle Big Data Appliance release version. |
--skip-bdscli-info | Skip bdscli information gathering. Default: Runs the following bdscli commands and returns the output: |
--get-logs [--log-level=<1|2|3>] [--bundle-name=<name>] [--wrap, --envelop] | Generates a gzipped tar file of logs. Default: Options: Note: See the table below for more detail on each --get-logs sub-option. |
--set-debug=<on| |
--set-debug=<supported value> Set or remove the
|
The following table describes the bds_cluster_node_helper.sh --get-logs sub-options.
Table 1-5 Sub-Parameters for --get-logs Option of bds_cluster_node_helper.sh
bds_cluster_node_helper.sh --get-logs sub-options | Description |
---|---|
--get-logs --log-level=<1|2|3> | Specifies the log level. Default: The scope of the recovery for each log level is as follows: Example: |
--get-logs --bundle-name=<name> | Give a name to the created tar.gz bundle. Default: The customer can use this option to specify a different name. For example: |
--get-logs [--wrap | --envelop] | Prepares the bundle for email transmission. Default: Examples: These sub-options are equivalent. |
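The --get-logs sub-options in the tables above can be combined on one command line. As a minimal sketch, the wrapper below checks that the log level is one of the documented values (1, 2, or 3) and prints the resulting command instead of running it. The function name get_logs_cmd and the bundle name sr_bundle are hypothetical. Only the options listed in Table 1-4 and Table 1-5 are used.

```shell
# Hypothetical wrapper: validate the log level, then print a
# bds_cluster_node_helper.sh command line using the documented sub-options.
get_logs_cmd() {
  level="$1"   # documented values: 1, 2, or 3
  name="$2"    # name for the generated tar.gz bundle
  case "$level" in
    1|2|3) ;;
    *) echo "log level must be 1, 2, or 3" >&2; return 1 ;;
  esac
  printf 'bds_cluster_node_helper.sh --get-logs --log-level=%s --bundle-name=%s --wrap\n' "$level" "$name"
}

# Example: level 2 collection with a custom bundle name, prepared for email.
get_logs_cmd 2 sr_bundle
```

Run the printed command as root from the BDSJaguar directory. Because --wrap and --envelop are equivalent, either can be used.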
1.12 Using the Installation Quick Reference
Once you are familiar with the functionality of the Jaguar utility on the Hadoop side and bds-database-install.sh on the Oracle Database side, you may find it useful to work from the Installation Quick Reference for subsequent installations. This reference provides an abbreviated description of the installation steps. It does not fully explain each step, so users should already have a working knowledge of the process. Links to relevant details in this and other documents are included.