This section describes the security features provided by Oracle Big Data SQL, measures you can take to secure Big Data SQL, and pointers to the information you need to configure Oracle Big Data SQL within secured environments.
In Oracle Big Data SQL, network traffic between the database and the Hadoop cluster is no longer guaranteed to be over a private InfiniBand network; it can occur over a client network. This traffic is not currently secured. Therefore, when operating a secured Hadoop cluster (for example, Kerberos-enabled with RPC encryption), Oracle Big Data SQL requires either that all members of the client network be trusted, or that private network connectivity be used exclusively for communication between the Hadoop nodes and the Oracle Database instances. This private network is commonly referred to as the Big Data SQL interconnect network. The interconnect network must be a private network with only trusted users, use at least one switch, and use 10 Gigabit Ethernet adapters. Ensure that only the nodes in the Hadoop cluster and the Oracle RAC cluster can access the interconnect network. Do not use the interconnect network for other user communication.
Installer File Security
The new Jaguar installer incorporates the following best practices for secure Linux installers and applications:
No persistent or temporary world-writable files are created.
No setuid or setgid files are used.
In addition, the installer works with hardened Oracle Database environments as well as hardened CDH and HDP clusters, as described in the Cloudera CDH and Hortonworks HDP security documentation.
The Jaguar installer provides these password security measures:
Passwords for the Ambari and Cloudera Manager servers (management servers) are not passed on the command line and are not saved in any persistent file during or after the installation.
Passwords are not logged to any log or trace files.
Security of Related Software
Oracle Big Data SQL relies on other installed software, including third-party projects. It is the customer's responsibility to ensure that such software is kept up to date with the latest security patches. This software includes (but is not limited to):
It is generally a good security practice to ensure that HDFS file access permissions are minimized in order to prevent unauthorized write/read access. This is true regardless of whether or not the Hadoop cluster is secured by Kerberos.
Please refer to MOS Document 2123125.1 at My Oracle Support for detailed guidelines on securing Hadoop clusters for use with Oracle Big Data SQL.
If Kerberos is enabled on the Hadoop system, you must configure Oracle Big Data SQL on the database server to work with Kerberos. This requires a Kerberos client on each database node where Oracle Big Data SQL is installed. Also, the OS account that owns the database (oracle or another account) must be provisioned as a user principal.
You must configure Oracle Big Data SQL to use Kerberos in environments where user access is Kerberos-controlled.
There are two situations when this is required:
When enabling Oracle Big Data SQL on a Kerberos-enabled cluster.
When enabling Kerberos on a cluster where Oracle Big Data SQL is already installed.
Oracle Big Data SQL processes run on the nodes of the Hadoop cluster as the oracle Linux user. On the Oracle Database server, the owner of the Oracle Database process is also (usually) the oracle Linux user. When Kerberos is enabled on the Hadoop system, the following is required in order to give this user access to HDFS: the oracle Linux user needs to be able to authenticate as a principal in the Kerberos database on the Kerberos Key Distribution Center (KDC) server. The principal name in Kerberos does not have to be oracle. However, the principal must have access to the underlying Hadoop data being requested by Oracle Big Data SQL.
The following are required on all Oracle Database nodes and all Hadoop cluster nodes running Oracle Big Data SQL:
Kerberos client software installed.
A copy of the Kerberos configuration file from the KDC.
A copy of the Kerberos keytab file generated on the KDC for the oracle Linux user.
A valid Kerberos ticket for the oracle Linux user.
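The prerequisites above can be checked with a few standard commands. This is a minimal sketch run as the oracle user on each node; the keytab path /home/oracle/oracle.keytab is the example path used later in this section:

```shell
# Kerberos client software installed?
which klist kinit || echo "Kerberos client tools are missing"

# Kerberos configuration file copied from the KDC?
ls -l /etc/krb5.conf

# Keytab present for the oracle user? (example path)
ls -l /home/oracle/oracle.keytab

# Valid Kerberos ticket? klist -s exits nonzero if no valid ticket exists.
klist -s && echo "ticket is valid" || echo "no valid ticket - run kinit"
```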
Installing the Kerberos Client
If the Kerberos client is not installed, see Installing a Kerberos Client on the Oracle Database Nodes for instructions on installing the Kerberos client.
Creating a Kerberos Principal for the oracle User
On the Kerberos Key Distribution Center (KDC) server, become root and use kadmin.local to add a principal for the oracle user.
Within kadmin.local, type:
add_principal <user>@<realm>
quit
You have the option to include the password, as in:
add_principal <user>@<realm> -pw <password>
quit
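As an illustration, the steps above can be combined into a single non-interactive session on the KDC. The realm EXAMPLE.COM and the password below are placeholders, not values from this guide:

```shell
# Hypothetical end-to-end session on the KDC, run as root.
# EXAMPLE.COM and the password are illustrative only.
kadmin.local <<'EOF'
add_principal -pw Welcome1 oracle@EXAMPLE.COM
quit
EOF
```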
Creating a Kerberos Keytab for the oracle User
On the KDC, become root and run the following:
xst -norandkey -k /home/oracle/oracle.keytab oracle
quit
This creates the oracle.keytab file for the Kerberos oracle user in the /home/oracle directory. Ensure that oracle.keytab is owned by the oracle Linux user and is readable by that user only.
$ chown oracle oracle.keytab
$ chmod 400 oracle.keytab
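You can verify the generated keytab and test authentication with it. A sketch, assuming the keytab path used above:

```shell
# List the principals stored in the keytab.
klist -kt /home/oracle/oracle.keytab

# Authenticate as the oracle principal using the keytab (no password
# prompt), then confirm that a ticket was obtained.
kinit -kt /home/oracle/oracle.keytab oracle
klist
```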
Kerberos Tasks Automated by Oracle Big Data SQL
The following Kerberos tasks are now automated:
Distributing the keytab and Kerberos configuration files.
The Oracle Big Data SQL installation can now be configured to automatically distribute the keytab and Kerberos configuration files for the oracle user (or other database owner) to the Hadoop DataNodes and Oracle Database compute nodes. This is done if the principal name and keytab file location parameters are set in the Jaguar configuration file. This automation is performed on both the Hadoop and Oracle Database sides.
On Oracle Big Data Appliance, the keytab file distribution is done by default for the oracle account, and you do not need to add the principal and keytab file path for this account to the configuration file.
Acquiring a Kerberos Ticket for designated principals.
For oracle and other principals that were listed in the Jaguar configuration file, the installation acquires a Kerberos ticket on each Hadoop DataNode and Oracle Database compute node.
The installation also automatically sets up cron jobs in the Hadoop cluster and on Oracle Database that run kinit to obtain a new ticket for each principal in the configuration four times daily.
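Conceptually, the renewal jobs the installer creates resemble the following crontab fragment. The six-hour schedule, keytab path, and principal name here are illustrative assumptions, not the installer's exact configuration:

```shell
# Hypothetical crontab entry for the oracle user: refresh the Kerberos
# ticket from the keytab every six hours (four times daily).
# m h dom mon dow  command
0 */6 * * * /usr/bin/kinit -kt /home/oracle/oracle.keytab oracle
```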
Cleaning up After Ticket Expirations
When the bd_cell process is running on the nodes of a secured Hadoop cluster but the Kerberos ticket is no longer valid, the cell goes into quarantine status. You should drop all such quarantines.
Check that the oracle user has a valid Kerberos ticket on all Hadoop cluster nodes.
On each cluster node, become oracle and run the following:
$ bdscli
In the bdscli shell, type:
drop quarantine <id>
Type exit to exit bdscli.
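A sketch of such a session follows. The list quarantine command and the quarantine ID 1 are assumptions for illustration; check the bdscli reference for your release for the exact syntax:

```shell
# Hypothetical cleanup session on one Hadoop cluster node, run as oracle.
bdscli <<'EOF'
list quarantine
drop quarantine 1
exit
EOF
```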
If the Hadoop system is Kerberos-secured, then Oracle Big Data SQL requires a Kerberos client. The client must be installed on each compute node of the database.
For commodity servers, download the Kerberos client software from a repository of your choice. If the database server is an Oracle Exadata Database Machine, download and install the software from the Oracle repository as shown below. The process should be similar for downloads from non-Oracle repositories.
Log on to the database server as root and use yum to install the krb5-workstation packages. Download from the Oracle Linux 6 or Oracle Linux 5 repository, as appropriate.
Check that the appropriate Oracle repository ID (public-yum-ol5 or public-yum-ol6) is installed.
# yum repolist
# yum --disablerepo="*" --enablerepo="public-yum-ol6" list available
Install the Kerberos packages.
# yum install krb5-libs krb5-workstation
Copy the /etc/krb5.conf file from the Key Distribution Center (KDC) to the same path on the database server.
These steps must be performed for each Oracle Database node.
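The steps above can be scripted for repeatability across nodes. A sketch, assuming an Oracle Linux 6 node and SSH access to a KDC host named kdc01 (both assumptions):

```shell
# Run as root on each Oracle Database node.

# 1. Confirm the Oracle yum repository is available.
yum repolist
# 2. Install the Kerberos client packages.
yum install -y krb5-libs krb5-workstation
# 3. Copy the Kerberos configuration from the KDC to the same path here.
#    (kdc01 is a placeholder for your KDC's hostname.)
scp root@kdc01:/etc/krb5.conf /etc/krb5.conf
```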
You must also register the oracle Linux user (or other Linux user) and password in the KDC for the cluster, as described in Enabling Oracle Big Data SQL Access to a Kerberized Cluster.
On the Oracle Database server, you can use the Oracle Secure External Password Store (SEPS) to manage database access credentials for Oracle Big Data SQL.
This is done by creating an Oracle wallet for the oracle Linux user (or other database owner). An Oracle wallet is a password-protected container used to store authentication and signing credentials, including private keys, certificates, and trusted certificates needed by SSL.
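As an illustration, a SEPS wallet is typically created and populated with the mkstore utility. The wallet location /home/oracle/wallet, the TNS alias bdsql_db, and the user name scott below are placeholders:

```shell
# Create a password-protected wallet (you are prompted for a wallet password).
mkstore -wrl /home/oracle/wallet -create

# Store a database credential keyed by a TNS alias; the password is
# prompted for interactively. Clients can then connect as "/@bdsql_db"
# without embedding the password in scripts.
mkstore -wrl /home/oracle/wallet -createCredential bdsql_db scott
```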
If your Hadoop system is an Oracle Big Data Appliance, the following tools to strengthen security are already available.
Kerberos authentication: Requires users and client software to provide credentials before accessing the cluster.
Apache Sentry authorization: Provides fine-grained, role-based authorization to data and metadata.
HTTPS/Network Encryption: Provides HTTPS for Cloudera Manager, Hue, Oozie, and Hadoop Web UIs. It also enables network encryption for other internal Hadoop data transfers, such as those made through YARN shuffle and RPC.
The Database Authentication feature described in this guide prevents unauthorized and potentially malicious processes (which can originate from anywhere) from connecting to Oracle Big Data SQL cell server processes in the DataNodes of the Hadoop cluster.
When Ethernet is selected for the connection between Oracle Databases and Oracle Big Data SQL, then by default this secured authentication framework is set up automatically during the installation. Database Authentication is also available as a configuration option for InfiniBand connections.
Multi-User Authorization gives you the ability to use Hadoop Secure Impersonation to direct the oracle account to execute tasks on behalf of other designated users. This enables HDFS data access based on the user that is currently executing the query, rather than the singular oracle account.
Administrators set up the rules for identifying the query user (the currently connected user) and for mapping this user to the user that is impersonated. Because there are numerous ways in which users can connect to Oracle Database, this user may be a database user, a user sourced from LDAP, from Kerberos, or other sources. Authorization rules on the files apply to the query user and audits will identify the user as the query user.
See Also: The DBMS_BDSQL PL/SQL Package in the Oracle Big Data SQL User's Guide describes the Multi-User Authorization security table and the procedures for adding user maps to the table and removing them from the table.