Oracle Big Data SQL 3.0 can connect Oracle Database to the Hadoop environment on Oracle Big Data Appliance, other systems based on CDH (Cloudera's Distribution including Apache Hadoop), HDP (Hortonworks Data Platform), and potentially other non-CDH Hadoop systems.
The procedures for installing Oracle Big Data SQL in these environments differ. To install the product in your particular environment, see the appropriate section:
Installing On Oracle Big Data Appliance and the Oracle Exadata Database Machine
See this section for installation on Oracle Big Data Appliance and Exadata servers only.
Installing Oracle Big Data SQL on Other Hadoop Systems
See this section for installation on both CDH (excluding Oracle Big Data Appliance) and non-CDH (specifically, HDP) systems.
See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1) in My Oracle Support for up-to-date information on Big Data SQL compatibility with the following:
Oracle Engineered Systems.
Other systems.
Linux OS distributions and versions.
Hadoop distributions.
Oracle Database releases, including required patches.
To use Oracle Big Data SQL on an Oracle Exadata Database Machine connected to Oracle Big Data Appliance, you must install the Oracle Big Data SQL software on both systems.
Follow these steps to install the Oracle Big Data SQL software on Oracle Big Data Appliance and Oracle Exadata Database Machine.
Note:
This procedure is not applicable to the installation of Oracle Big Data SQL on systems other than Oracle Big Data Appliance and Oracle Exadata Database Machine.
The April 2016 Proactive Bundle Patch (12.1.0.2.160419 BP) for Oracle Database must be pre-installed on each Exadata Database Machine. You may instead use the January 12.1.0.2.160119 Bundle Patch, but that older BP requires an additional one-off patch. (See the Oracle Big Data SQL Master Compatibility Matrix, Doc ID 2119369.1 in My Oracle Support, for details.)
This procedure assumes that you are running Oracle Big Data Appliance 4.5 or intend to upgrade to it. However, Oracle Big Data SQL 3.0.1 is also compatible with Oracle Big Data Appliance 4.3. If you are running v4.3 and do not intend to upgrade at this time, see the Oracle Big Data SQL Master Compatibility Matrix for patch requirements.
You can use Cloudera Manager to verify that Oracle Big Data SQL is up and running.
When you are done, if the cluster is secured by Kerberos then there are additional steps you must perform on both the cluster nodes and on the Oracle Exadata Database Machine. See Enabling Oracle Big Data SQL Access to a Kerberized Cluster.
In the case of an Oracle Big Data Appliance upgrade, the customer is responsible for upgrading the Oracle Database to a supported level before re-running the post-installation script.
When you run the bds-exa-install post-installation script, you are configuring the database to “talk” to a Hadoop cluster; that is, you are registering that cluster with the selected database. The first cluster registered with the database becomes the default (primary) cluster for the database. If you want the database to connect to additional clusters, run bds-exa-install again with the --install-as-secondary option.
If you uninstall the primary cluster’s registration by running bds-exa-install --uninstall-as-primary, key configuration information is removed. You must then rerun bds-exa-install to reregister any clusters that should remain in communication with the database: one cluster as the primary, and any others as secondaries.
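The registration sequence above can be sketched as follows (a sketch only: the script is assumed to be in the current directory, and the guard makes this a no-op on a machine without the installer):

```shell
# Hypothetical sequence of bds-exa-install runs for a database that
# connects to two Hadoop clusters.
INSTALLER=./bds-exa-install.sh

# First registration: this cluster becomes the database's primary cluster.
if [ -x "$INSTALLER" ]; then "$INSTALLER"; fi

# Each additional cluster is registered as a secondary:
if [ -x "$INSTALLER" ]; then "$INSTALLER" --install-as-secondary; fi

echo "registration sequence complete"
```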
Important
Run bds-exa-install.sh on every node of the Exadata cluster. If this is not done, you will see RPC connection errors when the BDS service is started.

To run the Oracle Big Data SQL post-installation script:
Copy the bds-exa-install.sh installation script from the Oracle Big Data Appliance to a temporary directory on the Oracle Exadata Database Machine. (Find the script on the node where Mammoth is installed, typically the first node in the cluster.) For example:
# curl -O http://bda1node07/bda/bds-exa-install.sh
Verify the name of the Oracle installation owner and set the executable bit for this user. Typically, the oracle user owns the installation. For example:
$ ls -l bds-exa-install.sh
$ chown oracle:oinstall bds-exa-install.sh
$ chmod +x bds-exa-install.sh
Set the following environment variables:
$ORACLE_HOME to <database home>
$ORACLE_SID to <correct db SID>
$GI_HOME to <correct grid home>
Note:
Instead of setting $GI_HOME in this step, you can pass the grid home to the install script, as described in step 5 (d).
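For example, the exports for this step might look like the following (the paths and SID are placeholders; substitute the values for your system):

```shell
# Example values only; adjust the homes and SID to your installation.
export ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1
export ORACLE_SID=orcl
export GI_HOME=/u01/app/12.1.0.2/grid
echo "$ORACLE_HOME $ORACLE_SID $GI_HOME"
```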
Check that TNS_ADMIN points to the directory containing the listener.ora used by the running listener. If the listener is in the default TNS_ADMIN location, $ORACLE_HOME/network/admin, there is no need to set TNS_ADMIN. If the listener is in a non-default location, TNS_ADMIN must point to it. Set it with the command:
export TNS_ADMIN=<path to listener.ora>
Perform this step only if the ORACLE_SID is in uppercase; otherwise, proceed to the next step. The install script can derive the CRS database resource name from ORACLE_SID only when the SID is lowercase. If the SID is uppercase, perform the following sequence of steps to pass the resource name to the script manually:
Run the following command to list all the resources.
$ crsctl stat res -t
From the output, note the ora.<dbresource>.db resource name.
Run the following command to verify that the correct ora.<dbresource>.db resource name is returned:
$ ./crsctl stat res ora.<dbresource>.db
The output displays the resource name as follows:

NAME=ora.<dbresource>.db
TYPE=ora.database.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on <name01>, ONLINE on <name02>
Specify --db-name=<dbresource> as an additional argument to the install script as follows:
./bds-exa-install.sh --db-name=<dbresource>
You can also set the grid home here, instead of setting $GI_HOME as described in step 3, by adding the --grid-home option to the command:
./bds-exa-install.sh --db-name=<dbresource> --grid-home=<grid home>
Note:
You can skip the next step if you performed this step.
Run the script as any user who has dba privileges (that is, a user who can connect to sys as sysdba).
If you have not run the post-installation script on this node before, some first-time operations must be performed as root in a separate run of the script. In that case, when you run bds-exa-install.sh, the script pauses and prompts you to run it as root in another shell. When the execution as root is done, return to the pending script and press Enter to continue.
This two-phase procedure is not necessary if you have already run the post-installation script on this node. In that case, use --no-root-script to bypass the prompt, as in:
./bds-exa-install.sh --no-root-script
If this is the first time that the post-installation script has been run on this node, enter bds-exa-install.sh with no parameters. The script pauses and prompts you to run it as root in another shell, as follows:
$ ./bds-exa-install.sh
bds-exa-install: root shell script : /u01/app/oracle/product/12.1.0.2/dbhome_1/install/bds-root-<cluster-name>-setup.sh
please run as root: /u01/app/oracle/product/12.1.0.2/dbhome_1/install/bds-root-<rack-name>-clu-setup.sh
A sample output is shown here:
bds-exa-install: platform is Linux
bds-exa-install: setup script started at : Sun Feb 14 20:06:17 PST 2016
bds-exa-install: bds version : bds-3.0-1.el6.x86_64
bds-exa-install: bda cluster name : mycluster1
bds-exa-install: bda web server : mycluster1bda16.us.oracle.com
bds-exa-install: cloudera manager url : mycluster1bda18.us.oracle.com:7180
bds-exa-install: hive version : hive-1.1.0-cdh5.5.1
bds-exa-install: hadoop version : hadoop-2.6.0-cdh5.5.1
bds-exa-install: bds install date : 02/14/2016 12:00 PST
bds-exa-install: bd_cell version : bd_cell-12.1.2.0.100_LINUX.X64_160131-1.x86_64
bds-exa-install: action : setup
bds-exa-install: crs : true
bds-exa-install: db resource : orcl
bds-exa-install: database type : SINGLE
bds-exa-install: cardinality : 1
bds-exa-install: root shell script : /u03/app/oracle/product/12.1.0/dbhome_1/install/bds-root-mycluster1-setup.sh
please run as root: /u03/app/oracle/product/12.1.0/dbhome_1/install/bds-root-mycluster1-setup.sh
waiting for root script to complete, press <enter> to continue checking.. q<enter> to quit
bds-exa-install: root script seem to have succeeded, continuing with setup bds
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/install
bds-exa-install: downloading JDK
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/install
bds-exa-install: installing JDK tarball
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/jdk1.8.0_66/jre/lib/security
bds-exa-install: Copying JCE policy jars
/bin/mkdir: cannot create directory `bigdata_config/mycluster1': File exists
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/jlib
bds-exa-install: removing old oracle bds jars if any
bds-exa-install: downloading oracle bds jars
bds-exa-install: installing oracle bds jars
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql
bds-exa-install: downloading : hadoop-2.6.0-cdh5.5.1.tar.gz
bds-exa-install: downloading : hive-1.1.0-cdh5.5.1.tar.gz
bds-exa-install: unpacking : hadoop-2.6.0-cdh5.5.1.tar.gz
bds-exa-install: unpacking : hive-1.1.0-cdh5.5.1.tar.gz
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/hadoop-2.6.0-cdh5.5.1/lib
bds-exa-install: downloading : cdh-ol6-native.tar.gz
bds-exa-install: creating /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/hadoop_mycluster1.env for hdfs/mapred client access
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql
bds-exa-install: creating bds property files
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/bigdata_config
bds-exa-install: created bigdata.properties
bds-exa-install: created bigdata-log4j.properties
bds-exa-install: creating default and cluster directories needed by big data external tables
bds-exa-install: note this will grant default and cluster directories to public!
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_29579.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
bds-exa-install: granted default and cluster directories to public!
bds-exa-install: mta set to use listener end point : EXTPROC1521
bds-exa-install: mta will be setup
bds-exa-install: creating /u03/app/oracle/product/12.1.0/dbhome_1/hs/admin/initbds_orcl_mycluster1.ora
bds-exa-install: mta setting agent home as : /u03/app/oracle/product/12.1.0/dbhome_1/hs/admin
bds-exa-install: mta shutdown : bds_orcl_mycluster1
bds-exa-install: registering crs resource : bds_orcl_mycluster1
bds-exa-install: using dependency db resource of orcl
bds-exa-install: starting crs resource : bds_orcl_mycluster1
CRS-2672: Attempting to start 'bds_orcl_mycluster1' on 'mycluster1bda09'
CRS-2676: Start of 'bds_orcl_mycluster1' on 'mycluster1bda09' succeeded
NAME=bds_orcl_mycluster1
TYPE=generic_application
TARGET=ONLINE
STATE=ONLINE on mycluster1bda09
bds-exa-install: patching view LOADER_DIR_OBJS
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_30123.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
bds-exa-install: creating mta dblinks
bds-exa-install: cluster name : mycluster1
bds-exa-install: extproc sid : bds_orcl_mycluster1
bds-exa-install: cdb : true
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_dropdblink_catcon_30153.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_dropdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_dropdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_dropdblink_catcon_30179.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_dropdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_dropdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_createdblink_catcon_30205.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_createdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_createdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_createdblink_catcon_30231.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_createdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_createdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_30257.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_30283.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
bds-exa-install: setup script completed all steps
For additional details see "Running the bds-exa-install Script".
In the case of a multi-instance database, repeat Step 6 for each instance.
When the script completes, Oracle Big Data SQL is available and running, and the following items exist on the database instance:
Oracle Big Data SQL directory and configuration, including JAR, environment, and properties files.
Database dba_directories.
Database dblinks.
Database big data spfile parameter.
For example, you can verify the dba_directories from the SQL prompt as follows:
SQL> select * from dba_directories where directory_name like '%BIGDATA%';
Note:
If the Oracle Big Data SQL agent stops, then you must restart it. See "Starting and Stopping the Big Data SQL Agent".

The bds-exa-install script generates a custom installation script that is run by the owner of the Oracle home directory. That secondary script installs all the files needed by Oracle Big Data SQL into the $ORACLE_HOME/bigdatasql directory. For Oracle NoSQL Database support, it installs the client library (kvclient.jar). It also creates the database directory objects and the database links for the multithreaded Oracle Big Data SQL agent.
The following is the bds-exa-install syntax:
Usage: bds-exa-install oracle-sid=<orcl> ( --version --info --root-script-only --no-root-script
       --uninstall-as-primary --uninstall-as-secondary --install-as-secondary
       --jdk-home=<dir> --grid-home=<dir> )*

Options
  --version
      Prints the script version.
  --info
      Prints information such as the cluster name, CM host, and Oracle Big Data Appliance HTTP server.
  --root-script-only
      Only generate the root script.
  --no-root-script
      Do not generate the root script. This can be used for second and subsequent runs of bds-exa-install on a node.
  --uninstall-as-primary
      Uninstall the cluster, including hadoop client JARs. In the bds-exa-install context, to "uninstall" a cluster means to unregister the cluster with the database. Note: if any other clusters should remain registered after removing the primary, then one cluster must be reinstalled as primary and any others as secondaries.
  --uninstall-as-secondary
      Attempt to uninstall the cluster as a secondary cluster.
  --install-as-secondary
      Default = false. Do not install client libraries, etc. The primary cluster will not be affected. In the bds-exa-install context, to "install" a cluster means to register the cluster with the database.
  --jdk-home=<dir>
      For example: /opt/oracle/bd_cell12.1.2.0.100_LINUX.X64_150912.1/jdk
  --grid-home=<dir>
      Oracle Grid Infrastructure home. For example: "/opt/oracle/bd_cell12.1.2.0.100_LINUX.X64_150912.1/../grid"
If you encounter problems running the install script on Exadata, perform the following steps and open an SR with Oracle Support that includes the collected details:
Collect the debug output by running the script in a debug mode as follows:
$ ./bds-exa-install.sh --db-name=<dbresource> --grid-home=<grid home> --no-root-script --debug
OR
$ ./bds-exa-install.sh --no-root-script --debug
Collect the Oracle Database version as follows:
Collect the result of opatch lsinventory from the RDBMS-RAC home.
Collect the result of opatch lsinventory from the Grid home.
Collect the result of the following SQL statement to confirm that Datapatch is set up:
SQL> select patch_id, patch_uid, version, bundle_series, bundle_id, action, status from dba_registry_sqlpatch;
Collect the information from the following environment variables:
$ORACLE_HOME
$ORACLE_SID
$GI_HOME
$TNS_ADMIN
Collect the result of running the lsnrctl status command.
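The collection steps above can be gathered by a single script (a sketch only: the opatch locations assume standard Oracle home layouts, and each command is guarded so the script degrades gracefully where a tool is missing):

```shell
#!/bin/sh
# Collect the details requested by Oracle Support into one file.
OUT=/tmp/bds_diag.txt
{
  echo "== Environment variables =="
  echo "ORACLE_HOME=$ORACLE_HOME"
  echo "ORACLE_SID=$ORACLE_SID"
  echo "GI_HOME=$GI_HOME"
  echo "TNS_ADMIN=$TNS_ADMIN"

  echo "== opatch lsinventory (RDBMS home) =="
  if [ -x "$ORACLE_HOME/OPatch/opatch" ]; then
    "$ORACLE_HOME/OPatch/opatch" lsinventory
  fi

  echo "== opatch lsinventory (Grid home) =="
  if [ -x "$GI_HOME/OPatch/opatch" ]; then
    "$GI_HOME/OPatch/opatch" lsinventory
  fi

  echo "== lsnrctl status =="
  if command -v lsnrctl >/dev/null 2>&1; then
    lsnrctl status
  fi
} > "$OUT" 2>&1
echo "Diagnostics written to $OUT"
```

Attach the resulting file, together with the dba_registry_sqlpatch query result, to the SR.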
Oracle Big Data Appliance already provides numerous security features to protect data stored in a CDH cluster on Oracle Big Data Appliance:
Kerberos authentication: Requires users and client software to provide credentials before accessing the cluster.
Apache Sentry authorization: Provides fine-grained, role-based authorization to data and metadata.
HDFS Transparent Encryption: Protects the data on disk and at rest. Data encryption and decryption is transparent to applications using the data.
Oracle Audit Vault and Database Firewall monitoring: The Audit Vault plug-in on Oracle Big Data Appliance collects audit and logging data from MapReduce, HDFS, and Oozie services. You can then use Audit Vault Server to monitor these services on Oracle Big Data Appliance.
Oracle Big Data SQL adds the full range of Oracle Database security features to this list. You can apply the same security policies and rules to your Hadoop data that you apply to your relational data.
To give Oracle Big Data SQL access to HDFS data on a Kerberos-enabled cluster, make each Oracle Exadata Database Machine that needs access a Kerberos client. Also run kinit for the oracle account on each cluster node and Exadata Database Machine to ensure that the account is authenticated by Kerberos. This procedure is required in two situations:
When enabling Oracle Big Data SQL on a Kerberos-enabled cluster.
When enabling Kerberos on a cluster where Oracle Big Data SQL is already installed.
Note:
Oracle Big Data SQL queries run on the Hadoop cluster as the owner of the Oracle Database process (that is, the oracle user). Therefore, the oracle user needs a valid Kerberos ticket in order to access data. This ticket is required for every Oracle Database instance that accesses the cluster. A valid ticket is also needed for each Big Data SQL Server process running on the Oracle Big Data Appliance. Run kinit oracle to obtain the ticket.

These steps enable the operating system user to authenticate with the kinit utility before submitting Oracle SQL Connector for HDFS jobs. The kinit utility typically uses a Kerberos keytab file for authentication without an interactive prompt for a password.
On each node of the cluster:
Log in as the oracle user.
Run kinit on the oracle account.
$ kinit oracle
Enter the Kerberos password.
Log on to the primary node and then stop and restart Oracle Big Data SQL.
$ bdacli stop big_data_sql_cluster
$ bdacli start big_data_sql_cluster
On all Oracle Exadata Database Machines that need access to the cluster:
Copy the Kerberos configuration file /etc/krb5.conf from the node where Mammoth is installed to the same path on each Oracle Exadata Database Machine.
Run kinit on the oracle account and enter the Kerberos password.
Re-run the Oracle Big Data SQL post-installation script:
$ ./bds-exa-install.sh
Avoiding Kerberos Ticket Expiration
The system should run kinit on a regular basis, before letting the Kerberos ticket expire, to enable Oracle SQL Connector for HDFS to authenticate transparently. Use cron or a similar utility to run kinit. For example, if Kerberos tickets expire every two weeks, then set up a cron job to renew the ticket weekly.
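Such a renewal job can be sketched as follows (the keytab path and schedule are assumptions; a keytab lets kinit run without an interactive password prompt):

```shell
# Crontab entry: renew the oracle user's Kerberos ticket every Sunday
# at 02:00 using a keytab (hypothetical path).
ENTRY='0 2 * * 0 /usr/bin/kinit -k -t /home/oracle/oracle.keytab oracle'

# Append the entry to the current user's crontab (no-op if crontab is unavailable):
if command -v crontab >/dev/null 2>&1; then
  { crontab -l 2>/dev/null; echo "$ENTRY"; } | crontab - || true
fi
echo "$ENTRY"
```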
The Big Data SQL agent on the database is managed by Oracle Clusterware. The agent is registered with Oracle Clusterware during Big Data SQL installation so that it starts and stops automatically with the database. To check the status, you can run mtactl status from the Oracle Grid Infrastructure home or Oracle Clusterware home:
# mtactl status bds_databasename_clustername
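If the status check shows the agent is down, the same utility can restart it (a sketch only: the agent name follows the bds_<database>_<cluster> pattern shown above, the start verb is an assumption about mtactl, and the guard keeps this a no-op where mtactl is not on the path):

```shell
# Hypothetical agent name for database orcl and cluster mycluster1.
AGENT=bds_orcl_mycluster1
if command -v mtactl >/dev/null 2>&1; then
  mtactl status "$AGENT" || mtactl start "$AGENT" || true
fi
echo "checked $AGENT"
```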
Oracle Big Data SQL is deployed using the services provided by the cluster management server. The installation process uses the management server API to register the service and start the deployment task. From there, the management server controls the process.
After installing Big Data SQL on the cluster management server, use the tools provided in the bundle to generate an installation package for the database server side.
You can download Oracle Big Data SQL from the Oracle Software Delivery Cloud.
Table 2-1 Oracle Big Data SQL Product Bundle Inventory
File | Description
---|---
setup-bds | Cluster-side installation script
bds-config.json | Configuration file
api_env.sh | Setup REST API environment script
platform_env.sh | BDS service configuration script
BIGDATASQL-1.0.jar | CSD file (in the CDH product bundle only)
bin/json-select | JSON-select utility
db/bds-database-create-bundle.sh | Database bundle creation script
db/database-install.zip | Database-side installation files
repo/BIGDATASQL-1.0.0-el6.parcel | Parcel file (in the CDH product bundle only)
repo/manifest.json | Hash key for the parcel file (in the CDH product bundle only)
BIGDATASQL-1.0.0-el6.stack | Stack file (in the HDP product bundle only)
The following are required in order to install Oracle Big Data SQL on the Hortonworks Hadoop Data Platform (HDP).
Services Running
The following services must be running at the time of the Big Data SQL installation:
HDP
Ambari
HDFS
YARN
Zookeeper
Hive
Tez
Packages
The following packages must be pre-installed before installing Big Data SQL.
JDK version 1.7 or later
Python version 2.6.
OpenSSL version 1.01 build 16 or later
System Tools
curl
rpm
scp
tar
unzip
wget
yum
Environment Settings
The following environment settings are required prior to the installation.
ntp enabled
iptables disabled
Ensure that /usr/java/default exists and is linked to the appropriate Java version. To link it to the latest Java version, perform the following as root:
$ ln -s /usr/java/latest /usr/java/default
If Oracle Big Data SQL is Already Installed
If the Ambari Web GUI shows that the Big Data SQL service is already installed, make sure that all Big Data SQL Cell components are stopped before reinstalling. (Use the actions button, as with any other service.)
The following conditions must be met when installing Oracle Big Data SQL on a CDH cluster that is not part of an Oracle Big Data Appliance.
Note:
The installation prerequisites and the procedure for installing Oracle Big Data SQL on Oracle Big Data Appliance differ from the process used for installations on other CDH systems. See Installing On Oracle Big Data Appliance and the Oracle Exadata Database Machine if you are installing on Oracle Big Data Appliance.

Services Running
The following services must be running at the time of the Oracle Big Data SQL installation:
Cloudera’s Distribution including Apache Hadoop (CDH)
HDFS
YARN
Zookeeper
Hive
Packages
The following packages must be pre-installed before installing Oracle Big Data SQL. The Oracle clients are available for download on the Oracle Technology Network.
JDK version 1.7 or later
Oracle Instant Client – 12.1.0.2 or higher, e.g. oracle-instantclient12.1-basic-12.1.0.2.0-1.x86_64.rpm
Oracle Instant JDBC Client – 12.1.0.2 or higher, e.g. oracle-instantclient12.1-jdbc-12.1.0.2.0-
PERL LibXML – 1.7.0 or higher, e.g. perl-XML-LibXML-1.70-5.el6.x86_64.rpm
Apache log4j
System Tools
unzip
finger
wget
Environment Settings
The following environment settings are required prior to the installation.
Ensure that /usr/java/default exists and is linked to the appropriate Java version. To link it to the latest Java version, perform the following as root:
$ ln -s /usr/java/latest /usr/java/default
The path to the Java binaries must exist in /usr/java/latest
.
The default path to Hadoop libraries must be in /opt/cloudera/parcels/CDH/lib/
.
If Oracle Big Data SQL is Already Installed
If the Configuration Manager shows that the Big Data SQL service is already installed, make sure that all Big Data SQL Cell components are stopped before reinstalling.
The Oracle Big Data SQL installation consists of two stages.
Cluster-side installation:
Deploys binaries across the cluster.
Configures Linux and network settings for the service on each cluster node.
Configures the service on the management server.
Acquires cluster information for configuring the database connection.
Creates the database bundle for the database-side installation.
Oracle Database server-side installation:
Copies binaries onto the database node.
Configures network settings for the service.
Inserts cluster metadata into the database.
The first step of the Oracle Big Data SQL installation is to run the installer on the Hadoop cluster management server (where Cloudera Manager runs on a CDH system, or Ambari on an HDP system). As a post-installation task on the management server, you then run the script that prepares the installation bundle for the database server.
Extract the files from the BIGDATASQL product bundle that you downloaded (either BigDataSQL-CDH-<version>.zip or BigDataSQL-HDP-<version>.zip), then configure and run the Oracle Big Data SQL installer found within the bundle. This installs Oracle Big Data SQL on the local server.
Run the database bundle creation script. This script generates the database bundle file that you will run on the Oracle Database server in order to install Oracle Big Data SQL there.
Check the parameters in the database bundle file and adjust as needed.
After you have checked and (if necessary) edited the database bundle file, copy it over to the Oracle Database server and run it as described in Installing on the Oracle Database Server.
Install Big Data SQL on the Cluster Management Server
To install Big Data SQL on the cluster management server:
Copy the appropriate zip file (BigDataSQL-CDH-<version>.zip or BigDataSQL-HDP-<version>.zip) to a temporary location on the cluster management server.
Unzip the file.
Change directories to either BigDataSQL-HDP-<version> or BigDataSQL-CDH-<version>, depending upon which platform you are working with.
Edit the configuration file.
Table 2-2 below describes the use of each configuration parameter.
For CDH, edit bds-config.json, as in this example. Any unused port will work as the web server port.
{
  "CLUSTER_NAME" : "cluster",
  "CSD_PATH" : "/opt/cloudera/csd",
  "DATABASE_IP" : "10.12.13.14/24",
  "REST_API_PORT" : "7180",
  "WEB_SERVER_PORT" : "81"
}
For HDP, edit bds-config.json as in this example:
{
  "CLUSTER_NAME" : "clustername",
  "DATABASE_IP" : "10.10.10.10/24",
  "REST_API_PORT" : "8080"
}
DATABASE_IP must be the correct network interface address for the database node where you will perform the installation. You can confirm this by running /sbin/ip -o -f inet addr show on the database node.
Obtain the cluster administrator user ID and password, and then, as root, run setup-bds, passing the configuration file name (bds-config.json) as an argument. The script prompts for the administrator credentials and then installs BDS on the management server.
$ ./setup-bds bds-config.json
Table 2-2 Configuration Parameters for setup-bds
Configuration Parameter | Use | Applies To
---|---|---
CLUSTER_NAME | The name of the cluster on the Hadoop server. | CDH, HDP
CSD_PATH | Location of Custom Service Descriptor files. This user-defined CSD path is a fallback that is used only if the default path does not exist. It does not override the default CSD_PATH. | CDH only
DATABASE_IP | The IP address of the Oracle Database server that will make connection requests. The address must include the prefix length (as in 100.112.10.36/24). Although only one IP address is specified in the configuration file, it is possible to install the database-side software on multiple database servers (as in a RAC environment) by using a command line parameter to override it. | CDH, HDP
REST_API_PORT | The port where the cluster management server listens for requests. | CDH, HDP
WEB_SERVER_PORT | A port assigned temporarily to a repository for deployment tasks during installation. This can be any port where the assignment does not conflict with cluster operations. | CDH only
Important
Be sure that the address provided for DATABASE_IP is the correct address of a network interface on the database server and is accessible from each DataNode of the Hadoop system; otherwise, the installation will fail. You can test that the database IP address replies to a ping from each DataNode. Also, currently the address string (including the prefix length) must be at least nine characters long.

If the Oracle Big Data SQL Service Immediately Fails
If Ambari or Configuration Manager reports an Oracle Big Data SQL service failure immediately after service startup, do the following.
Check the cell server (CELLSRV) log on the cluster management server for the following error at the time of failure:
ossnet_create_box_handle: failed to parse ip : <IP Address>
If the IP address in the error message is less than nine characters in length (for example, 10.0.1.4/24), then on the cluster management server, find this address in /opt/oracle/bd_cell/cellsrv/deploy/config/cellinit.ora. Edit the string by padding one or more of the octets with leading zeros to make the total at least nine characters in length, as in:
ipaddress1=10.0.1.004/24
Restart the Oracle Big Data SQL service.
The need for this workaround will be eliminated in a subsequent Oracle Big Data SQL release.
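The padding rule above can be expressed as a small helper (a hypothetical sketch, not part of the product; it zero-pads every octet, which always satisfies the nine-character minimum):

```shell
# pad_ip: zero-pad the octets of an <ipv4>/<prefix> string so the address
# portion is at least nine characters long.
pad_ip() {
  addr=${1%/*}
  prefix=${1#*/}
  set -- $(echo "$addr" | tr '.' ' ')
  printf '%03d.%03d.%03d.%03d/%s\n' "$1" "$2" "$3" "$4" "$prefix"
}

pad_ip 10.0.1.4/24    # prints 010.000.001.004/24
```

Note that the documented example pads only the final octet; either form satisfies the length requirement.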
Command Line Switches for setup-bds.sh
The setup-bds.sh script has several optional switches. These are not overrides of the configuration in bds-config.json; they change the behavior of setup-bds.sh as described in the table below. In each case, the bds-config.json filename is passed in as an argument, as in ./setup-bds.sh <switch> bds-config.json.
Table 2-3 Command Line Switches for setup-bds.sh
Switch | Description | Usage
---|---|---
--db-bundle | Recreates or updates the database bundle configuration file. This may be necessary if you need to use bds-database-create-bundle.sh to recreate or update the database bundle; for example, to account for changes in the cluster configuration settings. In that case, run setup-bds.sh with this switch to recreate the configuration file if the original file is obsolete or cannot be located. After recreating the configuration file, you can run bds-database-create-bundle.sh. | ./setup-bds.sh --db-bundle bds-config.json
--uninstall | Uninstalls Oracle Big Data SQL from the Hadoop cluster management server. See Uninstalling Oracle Big Data SQL. | ./setup-bds.sh --uninstall bds-config.json
On the cluster management server, run the database bundle creation script from the Oracle Big Data SQL download to create an installation bundle to install the product on the Oracle Database server. If some of the external resources that the script requires are not accessible from the management server, you can add them manually.
The database bundle creation script attempts to download the following:
Hadoop and Hive client tarballs from the Cloudera or Hortonworks repository web sites.
Configuration files for Yarn and Hive from the cluster management server, via Cloudera Manager (for the CDH versions) or Ambari (for the HDP versions).
For HDP only, HDFS and MapReduce configuration files from Ambari.
Change directories to BigDataSQL-CDH-<version>/db
(or BigDataSQL-HDP-<version>/db
).
Run the BDS database bundle creation script. See the table below for optional parameters that you can pass to the script in order to override any of the default settings.
$ bds-database-create-bundle.sh <optional parameters>
The message below is returned if the operation is successful.
bds-database-create-bundle: database bundle creation script completed all steps
The database bundle creation script accepts a number of optional parameters. You can set any of these parameters as necessary. Any URLs specified must be accessible from the cluster management server at the time you run bds-database-create-bundle.sh.
Table 2-4 Command Line Parameters for bds-database-create-bundle.sh
Parameter | Value |
---|---|
--hadoop-client-ws | Specifies a URL for the Hadoop client tarball download. |
--no-hadoop-client-ws | Exclude this download. |
--hive-client-ws | Specifies a URL for the Hive client tarball download. |
--no-hive-client-ws | Exclude this download. |
--yarn-conf-ws | Specifies a URL for the YARN configuration zip file download. |
--no-yarn-conf-ws | Exclude this download. |
--hive-conf-ws | Specifies a URL for the Hive configuration zip file download. |
--no-hive-conf-ws | Exclude this download. |
--ignore-missing-files | Create the bundle file even if some files are missing. |
--jdk-tar-path | Override the default JDK path. Do not specify a relative path; use --jdk-tar-path=<jdk tarfile absolute path>. |
--clean-previous | Deletes previous bundle files and directories from bds-database-install/ . If the cluster settings on the cluster management server have changed (for example, because of an extension, service node migration, or the addition or removal of security), then it is necessary to redo the installation on the database server. As part of this reinstallation, you must run --clean-previous to purge the cluster information left on the database server side by the previous installation. |
--script-only | This is useful for reinstallations on the database side when there are no cluster configuration changes to communicate to the database server and no need to refresh files (such as client tarballs) on the database side. With this switch, bds-database-create-bundle.sh generates a zip file that contains only the database installation script and does not bundle in other components, such as the tarballs. If these already exist on the database server, you can use --script-only to bypass the downloading and packaging of these large files. Do not include --clean-previous in this case. |
--hdfs-conf-ws | Specifies a URL for the HDFS configuration zip file download (HDP only). |
--no-hdfs-conf-ws | Exclude this download (HDP only). |
--mapreduce-conf-ws | Specifies a URL for the MapReduce configuration zip file download (HDP only). |
--no-mapreduce-conf-ws | Exclude this download (HDP only). |
Manually Adding Resources if Download Sites are not Accessible to the BDS Database Bundle Creation Script
If one or more of the default download sites is inaccessible from the cluster management server, there are two ways around this problem:
Download the files from another server first and then provide bds-database-create-bundle.sh
with the alternate path as an argument. For example:
$ ./bds-database-create-bundle.sh --yarn-conf-ws='http://nodexample:1234/config/yarn'
Because the script first searches locally in /bds-database-install
for resources, you can download the files on another server that has access, move the files into /bds-database-install
on the cluster management server, and then run the bundle creation script with no additional arguments. For example:
$ cp hadoop-xxxx.tar.gz bds-database-install/
$ cp hive-xxxx.tar.gz bds-database-install/
$ cp yarn-conf.zip bds-database-install/
$ cp hive-conf.zip bds-database-install/
$ cd db
$ ./bds-database-create-bundle.sh
Copying the Database Bundle to the Oracle Database Server
Use scp
to copy the database bundle you created to the Oracle Database server. In the example below, dbnode
is the database server. The Linux account and target directory here are arbitrary. Use any account authorized to scp
to the specified path.
$ scp bds-database-install.zip oracle@dbnode:/home/oracle
The next step is to log on to the Oracle Database server and install the bundle.
Oracle Big Data SQL must be installed on both the Hadoop cluster management server and the Oracle Database server. This section describes the database server installation.
Prerequisites for Installing on an Oracle Database Server
The information in this section does not apply to the installation of Oracle Big Data SQL on an Oracle Exadata Database Machine connected to Oracle Big Data Appliance.
Important
For multi-node databases, you must repeat this installation on every node of the database. For each node, you may need to modify the DATABASE_IP
parameter of the installation bundle in order to identify the correct network interface. This is described in the section If You Need to Change the Configured Database_IP Address.
Required Software
See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1) in My Oracle Support for supported Linux distributions, Oracle Database release levels, and required patches.
Note:
Be sure that the correct Bundle Patch and one-off patch have been pre-applied before starting this installation. Earlier Bundle Patches are not supported for use with Big Data SQL 3.0 at this time.
Recommended Network Connections to the Hadoop Cluster
Oracle recommends 10 Gb/s Ethernet connections between Oracle Database and the Hadoop cluster.
Extract and Run the Big Data SQL Installation Script
Perform the procedure in this section as the oracle
user, except where sudo
is indicated.
Locate the database bundle zip file that you copied over from the cluster management server.
Unzip the bundle into a temporary directory.
Change directories to bds-database-install
, which was extracted from the zip file.
Run bds-database-install.sh
. Note the optional parameters listed in Table 2-5.
If bds-database-install.sh
finds and updates /etc/oracle/cell/network-config/cellinit.ora
, then the installation is complete. If not, then the following prompt displays:
Please run as root: <temporary location>/bds-root-cluster-cellconfigdir.sh
Waiting for root script to complete.
Press <Enter> to continue checking or press q<Enter> to quit.
This prompt appears only if the network-config
directory and/or cellinit.ora
are not found at /etc/oracle/cell/network-config/cellinit.ora
. The installation is temporarily suspended so that you can take the following corrective action. (You can press q
and then Enter
to cancel the installation if you do not want to continue at this time.)
Start another shell and run the secondary script bds-root-cluster-cellconfigdir.sh
as root or via sudo.
The script will create the complete path if needed. If cellinit.ora
is missing, it will create the file and populate it with a temporary IP address that will allow the installation to continue.
When bds-root-cluster-cellconfigdir.sh
has finished, return to the original shell and press Enter
to resume the installation.
The installation will enter the correct IP address for the database and other required parameters into cellinit.ora
.
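You can anticipate whether the installer will pause with the root-script prompt by checking for the cell configuration file beforehand. A minimal sketch, assuming the standard path named above; the check_cellinit helper is illustrative, not part of the product.

```shell
# Illustrative pre-check: report whether cellinit.ora exists under the
# given cell network-config directory, so you know in advance whether
# bds-database-install.sh will ask you to run the root script.
check_cellinit() {
  conf=$1
  if [ -f "$conf/cellinit.ora" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

check_cellinit /etc/oracle/cell/network-config
```

If this reports "missing", plan on having root access (or sudo) available in a second shell before you start the installation.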
Table 2-5 Optional Parameters for bds-database-install.sh
Parameter | Function |
---|---|
--db-name | Specify the Oracle Database SID. |
--debug | Activate shell trace mode. If you report a problem, Oracle Support may want to see this output. |
--grid-home | Specify the Grid home directory. |
--info | Show information about the cluster. |
--ip-cell | Set a particular IP address for the db_cell process. See If You Need to Change the Configured Database_IP Address below. |
--install-as-secondary | Specify secondary cluster installation. |
--jdk-home | Specify the JDK home directory. |
--root-script-only | Generate the root script only. |
--uninstall-as-primary | Uninstall Oracle Big Data SQL from the primary cluster. |
--uninstall-as-secondary | Uninstall Oracle Big Data SQL from a secondary cluster. |
--version | Show the bds-database-install.sh script version. |
If You Need to Change the Configured Database_IP Address
The DATABASE_IP
parameter in the bds-config.json
file identifies the network interface of the database node. If you run bds-database-install.sh
with no parameters passed in, it will search for that IP address (with that exact prefix length) among the available network interfaces. You can pass the --ip-cell
parameter to bds-database-install.sh
in order to override the configured DATABASE_IP
setting:
$ ./bds-database-install.sh --ip-cell=10.20.30.40/24
Possible reasons for doing this are:
bds-database-install.sh
terminates with an error. The configured IP address (or length) may be wrong.
There is an additional database node in the cluster and the defined DATABASE_IP
address is not a network interface of the current node.
The connection is to a multi-node database. In this case, perform the installation on each database node. On each node, use the --ip-cell
parameter to set the correct DATABASE_IP
value.
To determine the correct value for --ip-cell
, you can list all network interfaces on a node as follows:
/sbin/ip -o -f inet addr show
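To compare the listed interfaces against the configured DATABASE_IP, you can extract just the interface name and CIDR address from that output. The sample lines below are hard-coded for illustration; on a real node, pipe /sbin/ip -o -f inet addr show into the awk filter instead.

```shell
# Illustrative filter: pull the interface name and CIDR address out of
# `ip -o -f inet addr show` oneline output. Sample output is hard-coded
# here so the filter can be demonstrated without a live node.
sample='1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 10.245.129.72/21 brd 10.245.135.255 scope global eth0'

printf '%s\n' "$sample" | awk '{print $2, $4}'
# prints:
# lo 127.0.0.1/8
# eth0 10.245.129.72/21
```

The address/prefix strings printed in the second column are in the same form as the DATABASE_IP setting, so a match (or its absence) is easy to spot.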
Oracle Big Data SQL can be uninstalled from the Hadoop cluster management server or from any Oracle Database servers connected to the cluster management server. The procedure is the same for all Hadoop platforms.
Guidelines for uninstalling Oracle Big Data SQL are as follows:
To perform a complete uninstall of Oracle Big Data SQL, remove the software from the cluster management server and from each Oracle Database server connected to the BDS service.
This is a single script execution on each server. No other manual steps are needed.
You can uninstall from the cluster management server first or from the database servers first.
Note, however, that if you uninstall from the cluster management server first, queries in process will fail.
On the database server side, uninstall from any secondary nodes before uninstalling from the primary node.
This is not critical to the uninstall process, but active queries from secondary nodes will fail if the primary node is disconnected from the service.
You can uninstall from one or more secondary database nodes without impacting the operation of the Big Data SQL service on the Hadoop cluster management server.
Uninstalling the Software from the Hadoop Cluster Management Server
From the bds-database-install
directory, run the following command as root.
# ./setup-bds.sh --uninstall bds-config.json
The script will return the following to standard output. Annotations and name alterations are included for the purposes of this example.
Big Data SQL //including the version
Big Data SQL: Loading Configuration File
Big Data SQL: Configuration File Loaded Successfully
Big Data SQL: Beginning API setup, you will be asked for your Ambari admin user and password.
Big Data SQL: REST API port is 8080
Big Data SQL: Identifying protocol for connection..
Big Data SQL: REST API version is v1
Ambari Admin User: //or Configuration Manager admin user
Password:
Big Data SQL: API setup finished successfully.
Big Data SQL: Cluster myclusterexample verified.
Big Data SQL: Big Data SQL service was detected. Beginning uninstall
Big Data SQL: Stopping Big Data SQL service in order to begin uninstall.
Big Data SQL: Executing Asynchronous Command....
Big Data SQL: Asynchronous Command Completed!
Big Data SQL: Big Data SQL service Stopped!
Big Data SQL: Cell nodes: myclusterexample-adm.myregion.mydomain.com,mybdanode08-adm.myregion.mydomain.com,mybdanode09-adm.myregion.mydomain.com
Big Data SQL: Executing Asynchronous Command...
Big Data SQL: Asynchronous Command Completed!
Big Data SQL: Deleted previous Big Data SQL service.
Big Data SQL: Beginning Cleanup of the Ambari Server Node.
Big Data SQL: Stack root directory found: /var/lib/ambari-server/resources/stacks/HDP/2.4/services
Big Data SQL: Removing Big Data SQL Stack directory
Big Data SQL: Removing Big Data SQL Log directory
Big Data SQL: Cleaning up installation...
Big Data SQL: Finished!
Uninstalling the Software from an Oracle Database Server
On any database server where you want to uninstall Oracle Big Data SQL, run the appropriate command below as the database owner (usually the oracle
user).
# ./bds-database-install.sh --uninstall-as-secondary
or
# ./bds-database-install.sh --uninstall-as-primary
If you use the --uninstall-as-secondary
switch to uninstall the software from the primary node, cleanup of database objects will be incomplete. This can be remedied by running the uninstall again. Error messages may appear if your run a second uninstall for cleanup purposes, but the cleanup should complete successfully.
The following example shows the output of bds-database-install.sh --uninstall-as-primary
. The output for --uninstall-as-secondary
is similar. In this case, the command is run on a CDH cluster, but the differences in the output on an HDP cluster are minor.
oracle@mynode42bda06$ ./bds-database-install.sh --uninstall-as-primary
bds-database-install: platform is : Linux
bds-database-install: setup script started at : Wed May 25 11:49:07 PDT 2016
bds-database-install: cluster type : cdh
bds-database-install: cluster name : myclusterexample
bds-database-install: hive version : hive-1.1.0-cdh5.7.0
bds-database-install: hadoop version : hadoop-2.6.0-cdh5.7.0
bds-database-install: bds version : Big Data SQL 3.0.1
bds-database-install: bds install date : Fri May 13 10:45:59 PDT 2016
bds-database-install: bd_cell version : bd_cell-12.1.2.0.100_LINUX.X64_151208.1100-1.x86_64 bd_cell-12.1.2.0.100_LINUX.X64_160511.1100-1.x86_64
bds-database-install: cell config dir : /etc/oracle/cell/network-config
bds-database-install: configured cell network : 10.101.4.13/20
bds-database-install: allow multiple subnets : _skgxp_ant_options=1
bds-database-install: use UDP protocol : _skgxp_dynamic_protocol=2
bds-database-install: cellaffinity.ora file : missing
bds-database-install: configured DB network : 10.245.129.72/21
bds-database-install: action : uninstall
bds-database-install: crs : true
bds-database-install: db resource : orcl
bds-database-install: database type : SINGLE
bds-database-install: cardinality : 1
bds-database-uninstall: removing: oracle-hadoop-sql.jar ora-hadoop-common.jar oraloader.jar kvclient.jar orahivedp.jar
bds-database-uninstall: removing: client jars hadoop-2.6.0-cdh5.7.0
bds-database-uninstall: removing: client jars hive-1.1.0-cdh5.7.0
bds-database-uninstall: mta setting agent home as : /u03/app/oracle/product/12.1.0/dbhome_1/hs/admin
bds-database-uninstall: stopping crs resource : bds_orcl_myclusterexample
CRS-2673: Attempting to stop 'bds_orcl_myclusterexample' on 'mynode42bda06'
CRS-2677: Stop of 'bds_orcl_myclusterexample' on 'mynode42bda06' succeeded
bds-database-uninstall: deleting crs resource : bds_orcl_myclusterexample
bds-database-uninstall: removing /u03/app/oracle/product/12.1.0/dbhome_1/hs/admin/initbds_orcl_myclusterexample.ora
bds-database-uninstall: dropping mta related db links
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_5$51.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_5$01.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
bds-database-uninstall: uninstalled : myclusterexample
Procedures for securing Oracle Big Data SQL on Hortonworks HDP and on CDH-based systems other than Oracle Big Data Appliance are not covered in this version of the guide. Please review the MOS documents referenced in this section for more information.
Please refer to MOS Document 2123125.1 at My Oracle Support for guidelines on securing Hadoop clusters for use with Big Data SQL.