3 Installing or Upgrading the Oracle Database Side of Oracle Big Data SQL
Oracle Big Data SQL must be installed on both the Hadoop cluster management server and the Oracle Database server. This section describes the installation of Oracle Big Data SQL on Oracle Database systems, including single-node and RAC systems.
3.1 Before You Start the Database-Side Installation
Review the points in this section before starting the installation.
If the current version of Oracle Big Data SQL is already installed on the Oracle Database system and you are making changes to the existing configuration, you do not need to repeat the entire installation process. See Reconfiguring an Existing Oracle Big Data SQL Installation.
Important:
-
For multi-node databases (such as Oracle RAC systems), you must repeat this installation on every compute node of the database. If this is not done, you will see RPC connection errors when the Oracle Big Data SQL service is started.
-
It is recommended that you set up passwordless SSH for root on the database nodes where Grid is running. Otherwise, you will need to supply the credentials during the installation on each node.
-
If the
diskmon
process is not already running prior to the installation, a Grid infrastructure restart will be required in order to complete the installation. -
If you set up Oracle Big Data SQL connections from more than one Hadoop cluster to the same database, be sure that the configurations for each connection are the same. Do not set up one connection to use Infiniband and another to use Ethernet. Likewise, if you enable database authentication in one configuration, then this feature must be enabled in all Oracle Big Data SQL connections between different Hadoop clusters and the same database.
-
If the database system is Kerberos-secured, then it is important to note that authentication for the database owner principal can be performed by only one KDC. Oracle Big Data SQL currently does not support multiple Kerberos tickets. If two or more Hadoop cluster connections are installed by the same database owner, all must use the same KDC.
The
KRB5_CONF
environment variable must point to only one configuration file. The configuration file must exist, and the variable itself must be set for the database owner account at system startup. -
For database running under Oracle Grid Infrastructure, if the system has more than one network interfaces of the same type (two or more Ethernet, two or more InfiniBand, or two or more Ethernet-over-InfiniBand interfaces), then the installation always selects the one whose name is first in alphabetical order.
See Also:
Details are provide in Special Considerations When a System Under Grid Infrastructure has Multiple Network Interfaces of the Same Type.
3.1.1 Check for Required Grid Patches With bds-validate-grid-patches.sh
The script bds-validate-grid-patches.sh
checks to determine whether or not patches to the Oracle Grid Infrastructure that are required by Oracle Big Data SQL have been installed.
Note:
If you want to run this script, do so before Oracle Big Data SQL is installed on the database node.- ORACLE_HOME must be set to the Grid home.
$ echo $ORACLE_HOME /u01/app/18c/gridhome
- Copy the file
bdsql_db_patches
from the Oracle Big Data SQL installation directory on the Hadoop cluster to the same directory where you will runbds-validate-grid-patches.sh
on the database node. The script usesbdsql_db_patches
to determine if the required Grid patches are present for this release of Orace Big Data SQL on the currently-installed Oracle Database release.You can find
bdsql_db_patches
at the following location on the Hadoop cluster node where you ran that side of the installation:
By default, the installation directory on the Hadoop cluster is<Oracle Big Data SQL installation directory>/BDSJaguar/bdsrepo/db/bdsql_db_patches>
/opt/oracle
. - You must run this script as the Oracle Grid Infrastructure installation owner (Grid user).
See Also:
The Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1 in My Oracle Support) provides up-to-date information on patch requirements.
3.1.2 Potential Requirement to Restart Grid Infrastructure
In certain database environments bds-database-install.sh
needs to create cellinit.ora
and/or celliniteth.ora
. In these cases, the script will attempt to propagate similar changes across all nodes in the Grid Infrastructure. To do this, the script expects that passwordless SSH is set up between oracle
and root
or it will prompt for the root password for each node during execution. If the nature of the changes requires a restart of the Grid Infrastructure, the script will also display messages indicating that grid infrastructure needs to be restarted manually. Because the installlation cannnot complete without Grid credentials if a restart is necessary, be sure that you have the Grid password at hand.
3.1.2.1 Understanding When Grid or Database Restart is Required
On the database side of Oracle Big Data SQL, the diskmon process is the agent in charge of communications with the Hadoop cluster. This is similar to its function on the Oracle Exadata Database Machine, where it manages communications between compute nodes and storage nodes.
In Grid environments, diskmon is owned by the Grid user. In a non-Grid environment, it is owned by the Oracle Database owner.
In Oracle Big Data SQL, diskmon settings are stored on cellinit.ora
and celliniteth.ora
files in the /etc/oracle/cell/network-config/
directory. The installer updates these files in accordance with the cluster connection requirements.
This is how the installer determines when Grid or Oracle Database needs to be restarted:
-
If the installer detects that no previous
cellinit.ora
orcelliniteth.ora
file exists, this means that no diskmon process is running. In this case, if the environment includes Oracle Grid then you must restart the Grid. If the environment does not include Grid, then you must restart Oracle Database. -
If previous
cellinit.ora
and/orcelliniteth.ora
file exist, this indicates that diskmon process is running. In this case, if the installer needs to make a change to these files, then only the database needs to be restarted. -
In multi-node Grid environments, diskmon works on all nodes as a single component and
cellinit.ora
andcelliniteth.ora
must be synchronized on all nodes. This task is done through SSH. If passwordless SSH is set up on the cluster, no user interaction is required. If passwordless SSH is not set up, then the script will pause for you to input the root credentials for all nodes. When the cellinit.ora and celliniteth.ora files across all nodes are synchronized, then the script will continue. Then the script finishes and in this case, you must restart the Grid infrastructure.
3.1.3 Special Considerations When a System Under Grid Infrastructure has Multiple Network Interfaces of the Same Type
The Oracle Big Data SQL installation or SmartScan operation may sometimes fail in these environments because the wrong IP address is selected for communication with the cells.
When installing Oracle Big Data SQL within Oracle Grid Infrastructure, you cannot provide the installer with a specific IP address to select for communication with the Oracle Big Data SQL cells on the nodes. Network interface selection is automatically determined in this environment. This determination is not always correct and there are instances where the database-side installation may fail, or, you may later discover that SmartScan is not working.
You can manually correct this problem, but first it is helpful to understand how the installation decides which network interfaces to select.
How the Installation Selects From Among Multiple Interfaces of the Same Type on a System Under Grid Infrastructure
-
The
diskmon
process is controlled by Oracle Grid Infrastructure and not by the database. Grid manages communications with the Oracle Big Data SQL cells. -
The Oracle Big Data SQL installer in these cases does not create
cellinit.ora
andcelliniteth.ora
, nor does it update the cell settings stored in these files. In these environments, the task is handled by Grid, because it is a cluster-wide task that must be synchronized across all nodes. -
If there are multiple network interfaces, the Grid-managed update to the cells automatically selects the first network interface of the appropriate type on each node. It selects the interface whose name is first in alphabetical order.
For example, here is a system that is under Grid Infrastructure. It has multiple InfiniBand, Ethernet, and Ethernet over InfiniBand (bondeth*) network interfaces. This is the list of interfaces:
[root@mynode ~]# ip -o -f inet addr show
1: lo inet 127.0.0.1/8
2: eth0 inet 12.17.207.156/21
3: eth1 inet 16.10.145.12/21
19: bondeth0 inet 12.17.128.15/20
20: bondeth2 inet 16.10.230.160/20
21: bondeth4 inet 192.168.1.45/20
30: bondib0 inet 192.168.31.178/21
31: bondib1 inet 192.168.129.2/21
32: bondib2 inet 192.168.199.205/21
33: bondib3 inet 192.168.216.31/21
34: bondib4 inet 192.168.249.129/21
When the Oracle Big Data SQL installer runs on this system, the following interfaces would be selected.
-
192.168.31.178/21 is selected for the InfiniBand connection configured in
cellinit.ora
.Among the InfiniBand interfaces on this system, the interface name
bondib0
is first in an ascending alphabetical sort. -
12.17.128.15/20 is selected for an Ethernet-over-InfniBand connection configured in
celliniteth.ora
.Note:
This example demonstrates an additional selection factor – Ethernet over InfiniBand takes precedence over standard Ethernet.The
bondeth0
interface name is first in this case.
How the Installation (or SmartScan) may Fail Under These Conditions
It is possible that diskmon
cannot connect to the Oracle Big Data SQL cells via the network interface selected according to the logic described above. The correct subnet (one that diskmon
can reach) may not appear first in an alphabetical sort.
How to Fix the Problem
You can manually change the IP addresses in thecellinit.ora
and celliniteth.ora
files. These files are at /etc/oracle/cell/network-config
on each node.
-
Stop the CRS process. (Be sure to do this before the cell edit. If you do not,
diskmon
may hang.) -
Edit
cellinit.ora
and/orcelliniteth.ora
. Change the IP addresses as needed. -
Restart CRS.
3.2 About the Database-Side Installation Directory
You start the database side of the installation by unpacking the database-side installation bundle and executing the run file it contains. The run file creates an installation directory under $ORACLE_HOME/BDSJaguar-4.0.0
. For example:
$ORACLE_HOME/BDSJaguar-4.0.0/cdh510-6-node1.mycluster.mydomain.com
The installation of Oracle Big Data SQL is not finished when this directory is created. The directory is a staging area that contains all of the files needed to complete the installation on the database node.
There can be Oracle Big Data SQL connections between the database and multiple Hadoop clusters. Each connection is established through a separate database-side installation and therefore creates a separate installation directory. The segments in the name of the installation directory enable you to identify the Hadoop cluster in this specific connection:
<Hadoop cluster name>-<Number nodes in the cluster>-<FQDN of the cluster management server node>
You should keep this directory. It captures the latest state of the installation and you can use it to regenerate the installation if necessary. Furthermore, if in the future you need to adjust the database-side of the installation to Hadoop cluster changes, then the updates generated by the Jaguar reconfigure
command are applied to this directory.
Consider applying permissions that would prevent the installation directory from being modified or deleted by any user other than oracle
(or other database owner).
3.3 Steps for Installing on Oracle Database Nodes
To install the database side of Oracle Big Data SQL, copy over the zip file containing the database-side installation bundle that was created on the Hadoop cluster management server, unzip it, execute the run file it contains, then run the installer.
Perform the procedure in this section as the database owner (oracle
or other ). You stage the bundle in a temporary directory, but after you unpack the bundle and execute the run file it contains, then the installation package is installed in a subdirectory under $ORACLE_HOME
. For example: $ORACLE_HOME/BDSJaguar-4.0.0/cdh510-6-node1.mycluster.mydomain.com
.
Before starting, check that ORACLE_HOME
and ORACLE_SID
are set correctly.
How Many Times Do I Run the Installation?
You must perform the installation for each instance of each database. For example, if you have a non-CBD database and a CBD database (DBA and DBB respectively, in the example below) on a single two-node RAC, then you would install Oracle Big Data SQL on both nodes as follows:
- On node
1
./bds-database-install.sh --db-resource=DBA1 --cdb=false ./bds-database-install.sh --db-resource=DBB1 --cdb=true
- On node
2
./bds-database-install.sh --db-resource=DBA2 --cdb=false ./bds-database-install.sh --db-resource=DBB2 --cdb=true
Copy Over the Components and Prepare for the Installation
Copy over and install the software on each database node in sequence from node 1 to node n
.
-
If you have not already done so, use your preferred method to copy over the database-side installation bundle from the Hadoop cluster management server to the Oracle Database server. If there are multiple bundles, be sure to select the correct bundle for cluster that you want to connect to Oracle Database. Copy it to any location that you would like to use as a temporary staging area. To perform the copy operation, it may be easiest for you to log on to the Hadoop cluster management server as
root
, navigate down to the bundle location within the Oracle Big Data SQL installation directory, and push the bundle over to the database server. Use the database owner account (usuallyoracle
) for the remote logon on the database side.In this example the bundle is copied over to/opt/tmp
. You can copy the bundle to any secure directory, but first confirm that the directory exists.# cd <Big Data SQL Install Directory>/BDSjaguar-4.0.0/db-bundles # scp bds-4.0.0-db-<cluster>-<yymmdd.hhmi>.zip oracle@<database_node>:/opt/tmp
-
If you generated a database request key, then also copy that key over to the Oracle Database server.
# cd <Big Data SQL Install Directory>/BDSjaguar-4.0.0/dbkeys # scp <database name or other name>.reqkey oracle@<database_node>:/opt/tmp
-
Log on to the database server host as
oracle
(or whichever user is the database owner) andcd
to the directory where you copied the file (or files). -
Unzip the bundle. You will see that it contains a single, compressed run file.
-
Check to ensure that the prerequisite environment variables are set –
$ORACLE_HOME
and$ORACLE_SID
. -
Run the file in order to unpack the bundle into
$ORACLE_HOME
. For example:$ ./bds-4.0.0-db-cdh510-170309.1918.run
Because you can set up independent Oracle Big Data SQL connections between an Oracle Database instance and multiple Hadoop clusters, the
run
command unpacks the bundle to a cluster-specific directory under$ORACLE_HOME/BDSJaguar-4.0.0
. For example:$ ls $ORACLE_HOME/BDSJaguar-4.0.0 cdh510-6-node1.mycluster.mydomain.com test1-3-node1.myothercluster.mydomain.com
If the
BDSJaguar-4.0.0
directory does not already exist, it is created as well. -
If you generated a database request key, then copy it into the newly created installation directory. For example:
$ cp /opt/tmp/mydb.reqkey $ORACLE_HOME/BDSJaguar-4.0.0/cdh510-6-node1.mycluster.mydomain.com
Tip:
You also have the option to leave the key file in the temporary location and use the
--reqkey
parameter in the installation command in order to tell the script where the key file is located. This parameter lets you specify a non-default request key filename and/or path.However, the install script only detects the key in the installation directory when the key filename is the same as the database name. Otherwise, even if the key is local, if you gave it a different name then you must still use
--reqkey
to identify to the install script.
Now that the installation directory for the cluster is in place, you are ready to install the database side of Oracle Big Data SQL.
Install Oracle Big Data SQL on the Oracle Database Node
Important:
The last part of the installation may require a single restart of Oracle Database under either or both of these conditions:
-
If Oracle Database does not include the Oracle Grid Infrastructure. In this case, the installation script makes a change to the
pfile
orspfile
configuration file in order to support standalone operation ofdiskmon
. -
If there are changes to the IP address and the communication protocol recorded in
cellinit.ora
. These parameters define the connection to the cells on the Hadoop cluster. For example, if this is a re-installation and the IP address for the Hadoop/Oracle Database connection changes from an Ethernet address to an InfiniBand address and/or the protocol changes (between TCP and UDP), then a database restart is required.
-
Start the database if it is not running.
-
Log on to the Oracle Database node as
oracle
and change directories to the cluster-specific installation directory under$ORACLE_HOME/BDSJaguar-4.0.0
. For example:$ cd $ORACLE_HOME/BDSJaguar-4.0.0/cdh510-6-node1.my.domain.com
-
Run
bds-database-install.sh
, the database-side Oracle Big Data SQL installer. You may need to include some optional parameters.$ ./bds-database-install.sh [options]
-
Restart the database (optional).
See Also :
The bds-database-install.sh
installer command supports parameters that are ordinarily optional, but may be required for some configurations. See the Command Line Parameter Reference for bds-database-install.sh
Extra Step If You Enabled Database Authentication or Hadoop Secure Impersonation
-
If
database_auth_enabled
orimpersonation_enabled
was set to “true” in the configuration file used to create this installation bundle, copy the ZIP file generated by the database-side installer back to the Hadoop cluster management server and run the Jaguar “Database Acknowledge” operation. This completes the set up of login authentication between Hadoop cluster and Oracle Database.Find the zip file that contains the GUID-key pair in the installation directory. The file is named according to the format below.
For example:<name of the Hadoop cluster>-<number nodes in the cluster>-<FQDN of the node where the cluster management server is running>-<FQDN of this database node>.zip
$ ls $ORACLE_HOME/BDSJaguar-4.0.0/cdh510-6-node1.my.domain.com/*.zip $ mycluster1-18-mycluster1node03.mydomain.com-myoradb1.mydomain.com.zip
-
Copy the ZIP file back to
/opt/oracle/DM/databases/conf
on the Hadoop cluster management server. -
Log on to the cluster management server as
root
, cd to/BDSjaguar-4.0.0
under the directory where Oracle Big Data SQL was installed, and rundatabaseack
(the Jaguar “database acknowledge” routine). Pass in the configuration file that was used to generate the installation bundle (bds-config.json
or other).# cd <Big Data SQL Install Directory>/BDSjaguar-4.0.0 # ./jaguar databaseack bds-config.json
-
3.3.1 Command Line Parameter Reference for bds-database-install.sh
The bds-database-install.sh script accepts a number of command line parameters. Each parameter is optional, but the script requires at least one.
Table 3-1 Parameters for bds-database-install.sh
Parameter | Function |
---|---|
--install |
Install the Oracle Big Data SQL connection to this cluster. Note: In mid-operation, this script will pause and prompt you to run a second script as root:
As root, open a session in a second terminal and run the script there. When that script is complete, return to the original terminal and press Enter to resume the Because Oracle Big Data SQL is installed on the database side as a regular user (not a superuser), tasks that must be done as root and/or the Grid user require the installer to spawn shells to run other scripts under those accounts while In some earlier releases, the |
--version |
Show the bds-database-install.sh script version.
|
--info |
Show information about the cluster, such as the cluster name, cluster management server host, and the web server. |
--grid-home |
Specifies the Oracle Grid home directory. |
--crs |
Use
The installer always checks to verify that Grid is running. If Grid is not running, then the installer assumes that Grid is not installed and that the database is single-instance. It then automatically sets the If |
--cdb |
Create database objects on all PDBs for CDB databases.
|
--db-resource |
Deprecated. Oracle Big Data SQL scripts will run on the database specified by
|
--db-name |
Deprecated. Oracle Big Data SQL scripts will run on the database specified by
|
--reqkey |
This parameter tells the installer the name and location of the request key file. Database Authentication is enabled by default in the configuration. Unless you do not intend to enable this feature, then as one of the steps to complete the configuration, you must use
If the request key filename is provided without a path, then the key file is presumed to be in the installation directory. The installer will find the key if the filename is the same as the database name. For example, in this case the request key file is in the install directory and the name is the same as the database name.
This file is consumed only once on the database side in order to connect the first Hadoop cluster to the database. Subsequent cluster installations to connect to the same database use the configured key. You do not need to resubmit the |
--uninstall |
Uninstall the Oracle Big Data SQL connection to this cluster. |
--reconfigure |
Reconfigures bd_cell network parameters, Hadoop configuration files, and Big Data SQL parameters. The Oracle Big Data SQL installation to connect to this cluster must already exist. Run reconfigure when a change has occurred on the Hadoop cluster side, such as a change in the DataNode inventory. |
--databaseack-only |
Create the Database Acknowledge zip file. (You must then copy the zip file back to |
--mta-restart |
MTA extproc restart. (Oracle Big Data SQL must already be installed on the database server.) |
--mta-setup |
Set up MTA extproc with no other changes. (Oracle Big Data SQL must already be installed on the database server.) |
--mta-destroy |
Destroy MTA extproc and make no other changes. (Oracle Big Data SQL must already be installed on the database server.) |
--aux-run-mode |
Because Oracle Big Data SQL is installed on the database side as a regular user (not a superuser), tasks that must be done as The
Mode options are:
|
--root-script |
Enables or disables the startup root script execution.
|
--no-root-script |
Skip root script creation and execution. |
--root-script-name |
Set a name for the root script (the default name is based on the PID). |
--pdb-list-to-install |
For container-type databases, Oracle Big Data SQL is by default set up on all open PDBs. This parameter limits the setup to the specified list of PDBs. |
--restart-db |
If a database restart is required, then by default the install script prompts the user to choose between doing the restart now or later. Setting --restart-db=yes tells the script in advance to proceed with any needed restart without prompting the user. If --restart-db=no then the prompt is displayed and the installation waits for a response. This is useful for unattended executions.
|
--skip-db-patches-check |
To skip the patch validation, add this parameter when you run the installer:
By default, database patch requirements are checked when you run |
The --root-script-only
parameter from previous releases is
obsolete.
3.4 Granting User Access to Oracle Big Data SQL
In Oracle Big Data SQL releases prior to 3.1, access is granted to the PUBLIC
user group. In the current release, you must do the following for each user who needs access to Oracle Big Data SQL:
- Grant the
BDSQL_USER
role. - Grant read privileges on the BigDataSQL configuration directory object.
- Grant read privileges on the Multi-User Authorization security table to enable impersonation.
For example, to grant access to user1
:
SQL> grant BDSQL_USER to user1;
SQL> grant read on directory ORACLE_BIGDATA_CONFIG to user1;
SQL> grant read on BDSQL_USER_MAP to user1;
See Also:
Use the DBMS_BDSQL PL/SQL Package in the Oracle Big Data SQL User's Guide to indirectly provide users access to Hadoop data. The Multi-User Authorization feature that this package implements uses Hadoop Secure Impersonation to enable theoracle
account to execute tasks on behalf of other designated users.
3.5 Enabling Object Store Access
If you want to use Oracle Big Data SQL to query object stores, certain database properties and Network ACL values must be set on the Oracle Database side of the installation. The installation provides two SQL scripts you can use to do this.
As of Oracle Big Data SQL 4.0, the ORACLE_BIGDATA driver enables you to create external tables over data within object stores in the cloud. Currently, Oracle Object Store and Amazon S3 are supported. You can create external tables over Parquet, Avro, and text files in these stores. The first step is set up access to the object stores as follows.
Run set_parameters_cdb.sql and allow_proxy_pdb.sql to Enable Object Store Access
- After you run
bds-database-install.sh
to execute the database-side installation, find these two SQL script files under $ORACLE_HOME, in the cluster subdirectory under the BDSJaguar directory:set_parameters_cdb.sql allow_proxy_pdb.sql
- Open and read each of these files. Confirm that the configuration is correct.
Important:
Because there are security implications, carefully check that the HTTP server setting and other settings are correct. - In CBD ROOT, run
set_parameters_cdb.sql
. - In each PDB that needs access to object stores, run
allow_proxy_pdb.sql
.
In a RAC database, you only need to run these scripts on one instance of the database.
3.6 Maintaining WebLogic Server
The WebLogic Server is included as component of Oracle Big Data SQL. Download and run the patching tool described here to keep WebLogic Server up-to-date with the official security patches provided by Oracle.
Web Logic Server is deployed on all DataNodes where Big Data SQL is installed. It is embedded in the bd_cells (Big Data SQL processing cells) running on these nodes. At the time of Big Data SQL 4.0 release, the WebLogic Server 10.3.6 component is up-to-date with the 2019 April PSU. To maintain WebLogic Server, you should run the patching tool after the Big Data SQL installation and then again later as newer PSUs become available.
The patching tool is available as patch 31188867 on My Oracle Support. Click on the Patches and Updates tab and enter "31188867" into the search field. Installation instructions are in the readme file provided with the patch. However, also review Document 2662568.1 in My Oracle Support. It provides some information that is not in the readme file.