This chapter describes how to use Oracle Big Data Appliance Configuration Generation Utility.
Oracle Big Data Appliance Configuration Generation Utility acquires information from you, such as IP addresses and software preferences, that is required for deploying Oracle Big Data Appliance. After guiding you through a series of pages, the utility generates a set of configuration files. These files help automate the deployment process and ensure that Oracle Big Data Appliance is configured to your specifications.
Choose the option that describes the type of hardware installation you are configuring:
One or more new Big Data Appliance racks being installed: You enter all new data for this choice.
One or more Big Data Appliance racks being added to an existing group of Big Data Appliances: This choice activates the Import button, so that you can select the BdaDeploy.json file that was used to configure the last rack in the group.
One or two in-rack expansion kits being added to a Big Data Appliance starter rack: This choice activates the Import button, so that you can select the BdaDeploy.json file that was last used to configure the rack (either the starter rack or one in-rack expansion kit).
An in-process configuration using a saved master.xml configuration file: This choice activates the Import button, so that you can select the master.xml file and continue the configuration.
Figure 4-1 shows the Customer Details page of the Oracle Big Data Appliance Configuration Generation Utility.
Figure 4-1 Oracle Big Data Appliance Configuration Generation Utility
Note:
Oracle Big Data Appliance uses Cloudera's Distribution including Apache Hadoop (CDH). A Hadoop cluster on Oracle Big Data Appliance is called a CDH cluster.
The terms appliance and rack refer to Oracle Big Data Appliance, Oracle Big Data Appliance X3-2, and Oracle Big Data Appliance X4-2.
The following procedures explain how to install Oracle Big Data Appliance Configuration Generation Utility and generate the configuration files.
To configure Oracle Big Data Appliance:
Download Oracle Big Data Appliance Configuration Generation Utility from Oracle Technology Network at
http://www.oracle.com/technetwork/database/bigdata-appliance/downloads/index.html
The file is named BDAConfigurator-version.zip. The system must run Oracle JRE 1.6 or later.
Extract the files in BDAConfigurator-version.zip. This example extracts the files on a Linux system for version 2.0:
$ unzip BDAConfigurator-2.0.zip
Archive:  BDAConfigurator-2.0.zip
   creating: BDAConfigurator-2.0/
  inflating: BDAConfigurator-2.0/exagen.jar
  inflating: BDAConfigurator-2.0/oracle_ice.jar
  inflating: BDAConfigurator-2.0/passwd.jar
  inflating: BDAConfigurator-2.0/orai18n-utility.jar
     .
     .
     .
Change to the BDAConfigurator-version directory.
Run Oracle Big Data Appliance Configuration Generation Utility.
On Linux:
$ sh bdaconf.sh
On Microsoft Windows, double-click bdaconf.cmd in Windows Explorer, or run the file from the command line:
C:\ bdaconf.cmd
On the Welcome page, select a configuration type.
Click Import if the button is activated, and select a previously saved configuration file (either BdaDeploy.json or master.xml, depending on the configuration type).
Follow the steps of the wizard. On the Complete page, click Create Files.
Validate the network configuration.
Send the generated bda.zip file to your Oracle representative.
Oracle Big Data Appliance Configuration Generation Utility generates the following files to use when you configure the system. You can select the directory where they are saved.
This is the basic structure of the directory for a full rack or a starter rack:
company_name /
   bda-timestamp.zip
   bda-install-preview.html
   bda-preinstall-checkip.sh
   rack_name /
      BdaDeploy.json
   cluster_name /
      mammoth-rack_name.params
   master.xml
The directory for an expansion kit has a slightly different structure:
company_name /
   bda-timestamp.zip
   bda-install-preview.html
   bda-preinstall-checkip.sh
   rack_name /
      rack_name-BdaExpansion.json
   master.xml
bda-timestamp.zip: Contains a copy of the configuration files. If an Oracle customer service representative will perform the installation, then send this file to Oracle before the installation date. Otherwise, transfer the file to a USB drive for copying to Oracle Big Data Appliance.
bda-install-preview.html: Provides a report that lists all the details of the configuration. You can view this report in a browser. Check it carefully to ensure that all of the settings are correct.
bda-preinstall-checkip.sh: Runs a series of tests to ensure that the specified names and IP addresses for Oracle Big Data Appliance were added correctly to the name server, and that they do not conflict with the existing network configuration.
mammoth-rack_name.params: Contains all the information for a cluster, including the network configuration, port numbers, user names, and passwords. The configuration utility creates a separate parameter file for each cluster. If several clusters are being configured, then each parameter file is located in a separate subdirectory.
If an in-rack expansion kit is being configured as an addition to an existing cluster, then the configuration utility does not generate a parameter file; the Mammoth utility generates it.
master.xml: Contains all the configuration settings in XML format so that Oracle Big Data Appliance Configuration Generation Utility can read it. To alter the configuration of an Oracle Big Data Appliance deployment, you can load this file, enter the changes, and regenerate the configuration files.
This file is not used for the actual configuration of Oracle Big Data Appliance.
BdaDeploy.json: Contains the network configuration for a full rack, a starter rack, or a starter rack with one in-rack expansion kit. It contains information about all the servers, switches, and PDUs.
rack_name-BdaExpansion.json: Contains the network configuration for one or two in-rack expansion kits. It contains information about all the servers, but no information about the switches and PDUs. This file is generated only when an expansion kit is being installed and configured.
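Before sending or copying the files, it can be useful to confirm that the expected output is present. The following is a minimal sketch, not part of the utility, that checks a subset of the files listed above for a full rack or starter rack. The function name `missing_outputs` and the rack name used in the usage note are hypothetical, and the pattern `bda-*.zip` assumes only that the timestamp appears between `bda-` and `.zip`.

```python
from pathlib import Path


def missing_outputs(company_dir, rack_name):
    """Return the expected output files that are absent from company_dir.

    Checks only files whose location is stated in the directory listing
    for a full rack or starter rack; it does not inspect their contents.
    """
    root = Path(company_dir)
    missing = []

    # The archive name contains a timestamp, so match it with a wildcard.
    if not any(root.glob("bda-*.zip")):
        missing.append("bda-timestamp.zip")

    expected = (
        "bda-install-preview.html",
        "bda-preinstall-checkip.sh",
        f"{rack_name}/BdaDeploy.json",
    )
    for relative in expected:
        if not (root / relative).is_file():
            missing.append(relative)
    return missing
```

For example, `missing_outputs("example_company", "bda1")` returns an empty list when all of the checked files exist under the `example_company` directory.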
Validating the network settings before the Oracle Big Data Appliance hardware arrives at your site is a critical step. Network problems can cause extended delays in the installation.
To validate the network configuration settings:
Copy the bda-preinstall-checkip.sh file generated by Oracle Big Data Appliance Configuration Generation Utility to a Linux host on the same network that Oracle Big Data Appliance will use.
Log in to the Linux host and run bda-preinstall-checkip.sh:
$ sh bda-preinstall-checkip.sh
This script checks your existing network for conflicts with the Oracle Big Data Appliance IP address pool.
Correct any network problems discovered by the script before installation begins. Network problems during the installation can cause extensive delays.
Run the network connections to the planned location for Oracle Big Data Appliance.
Inform your Oracle representative when you have completed these steps.
The following table describes the customer details fields.
Table 4-1 Customer Details Page
The rack name is used in the assignment of standardized host names for all Oracle Big Data Appliance servers. The host name for all servers on the client network is in this format:
racknameNodeNN.domain
In this syntax:
NN is the position number of the server node in the rack (01 to 18).
domain is the domain name.
Host names must have fewer than 38 characters, which can be ASCII letters (a to z and A to Z), numbers (0 to 9), and hyphens (-) only. Do not begin or end the name with a hyphen.
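The naming rules above can be checked before you enter names in the utility. This is a minimal sketch, assuming the rules apply to the short host name (without the domain, since periods are not in the allowed character list). The function name `is_valid_host_name` is hypothetical.

```python
import re

# Letters, digits, and hyphens only; the first and last characters
# must not be hyphens.
_HOST_NAME = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")

# "Fewer than 38 characters" means at most 37.
MAX_LENGTH = 37


def is_valid_host_name(name: str) -> bool:
    """Return True if the short host name satisfies the rules above."""
    return len(name) <= MAX_LENGTH and _HOST_NAME.fullmatch(name) is not None
```

With this sketch, `is_valid_host_name("bda1node01-adm")` is accepted, while names such as `-bda1node01` or `bda1_node01` are rejected.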
You can change the suffixes used for the different network interfaces.
You can enter server host names that do not follow the naming conventions on the Review and Edit Details Page.
Oracle recommends that for a cluster of multiple racks, you use the cluster name as the rack name.
For example, in a three-rack cluster, if the cluster name is cluster1 and the domain name is example.com, then the fully qualified host name of the server at the bottom of the first rack is cluster101node01.example.com. For the top server in the third rack of this cluster, the host name is cluster103node18.example.com.
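The client host name format can be sketched as a small function. This is an illustration only: it assumes the racks in the example are named cluster101 through cluster103 (inferred from the host names shown) and uses the lowercase `node` that appears in those host names. The function name `client_host_name` is hypothetical.

```python
def client_host_name(rack_name: str, position: int, domain: str) -> str:
    """Build a fully qualified client network host name.

    position is the server's position in the rack (1 to 18) and is
    written as a two-digit number.
    """
    if not 1 <= position <= 18:
        raise ValueError("position must be between 1 and 18")
    return f"{rack_name}node{position:02d}.{domain}"


# Bottom server of the first rack and top server of the third rack.
print(client_host_name("cluster101", 1, "example.com"))
print(client_host_name("cluster103", 18, "example.com"))
```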
The host names on the other networks have a short extension that follows the unit number. If you retained the default extensions, then use these formats to connect to Oracle Big Data Appliance after it is connected to the network:
For short host names over the administration network:
racknameNodeNN-adm
For the private InfiniBand network host names:
racknameNodeNN-priv
For the Oracle Integrated Lights Out Manager (ILOM) host names:
racknameNodeNN-ilom
For the switch host names:
racknamesw-ibM
In this syntax:
NN is the position number of the server in the rack (01 to 18).
M is 1, 2, or 3, depending on the switch location in the rack.
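As an illustration of these formats, the following sketch lists the default short host names for one server and for the three switches in a rack. The function names, the dictionary keys, and the rack name `bda1` are hypothetical, and the lowercase `node` spelling is an assumption based on the earlier example host names.

```python
# Default extensions described above; the dictionary keys are labels
# chosen for this sketch.
DEFAULT_SUFFIXES = {"admin": "-adm", "infiniband": "-priv", "ilom": "-ilom"}


def server_host_names(rack_name: str, position: int, suffixes=DEFAULT_SUFFIXES) -> dict:
    """Return the short host names of one server on each non-client network."""
    base = f"{rack_name}node{position:02d}"
    return {network: base + suffix for network, suffix in suffixes.items()}


def switch_host_names(rack_name: str) -> list:
    """Return the host names of the three switches (M is 1, 2, or 3)."""
    return [f"{rack_name}sw-ib{m}" for m in (1, 2, 3)]


print(server_host_names("bda1", 5))
print(switch_host_names("bda1"))
```

If you changed the default extensions in the utility, pass a different `suffixes` mapping.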
The hardware page identifies one or more racks that you want to deploy at the same time. The racks must be cabled together.
For example, if you are deploying three full racks, then add Full Rack three times to your deployment.
The following table describes the hardware selection choices.
Table 4-2 Hardware Selection Page
The rack details page identifies the optional network connections for an Oracle Big Data Appliance rack.
The following table describes the rack detail fields.
Rack Detail Field | Description |
---|---|
Rack Name | Enter the name of the rack. |
Number of 10 GbE Connections | The two Sun Network QDR InfiniBand Gateway switches in every Oracle Big Data Appliance must have an equal number of 10 GbE links to the client network. Each gateway switch supports up to eight 10-GbE links, for a total of 16 links. Each server is assigned to one 10-GbE link on each gateway switch. Therefore, an Oracle Big Data Appliance full rack or starter rack with an extension kit can use up to 16 10-GbE links. A starter rack can use up to 12 10-GbE links. Oracle recommends using as many 10 GbE links to the Oracle Big Data Appliance rack as the data center can support. The additional links increase the network bandwidth available between Oracle Big Data Appliance and the client network, and reduce the impact if a link fails. Oracle uses this information to ensure that the correct number of cables is ordered or available at the installation site. It also lets the Oracle field engineer plan how many 10 GbE links to connect to the Oracle Big Data Appliance rack. |
The networking page identifies the number of IP addresses required for each network. The administration, client Ethernet, and InfiniBand networks are required. You must allocate the specified number of IP addresses for them.
Each IP address pool initially contains a range of consecutive IP addresses. If some IP addresses in the range are not available, then you can change individual addresses on the Review and Edit Details Page.
The values that you enter on this page are used to add the Oracle Big Data Appliance servers to your existing client Ethernet network. Client applications typically access Oracle Big Data Appliance using this network.
The following table describes the client network fields.
Table 4-4 Client Ethernet Page
Client Ethernet Field | Description |
---|---|
Starting IP Address for Pool | The first IP address on the client network available for use by the Oracle Big Data Appliance servers. |
Pool Size | The required number of IP addresses. All Oracle Big Data Appliance servers require an IP address on the client network. The pool size is calculated for the racks that you identified on the Hardware Selection Page. |
Ending IP Address for Pool | The last IP address on the client network assigned to Oracle Big Data Appliance. This address is automatically calculated from the starting IP address and the pool size. Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If they are not available, then you can either assign a different range or manually change individual IP addresses on the Review and Edit Details Page. |
Subnet Mask | The subnet mask for the client network. |
Gateway | The IP address for the client network gateway. The gateway IP address is generated automatically, so verify that it is correct. |
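To preview the pool before you run the utility, you can compute the ending address yourself. This sketch assumes that the pool is a run of consecutive addresses, so the ending address is the starting address plus the pool size minus one; the utility's exact calculation is not documented here. The function names and sample addresses are hypothetical.

```python
import ipaddress


def ending_address(start: str, pool_size: int) -> str:
    """Return the last address of a consecutive pool beginning at start."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return str(ipaddress.IPv4Address(start) + (pool_size - 1))


def pool_fits_subnet(start: str, pool_size: int, subnet_mask: str) -> bool:
    """Return True if the last pool address is in the same subnet as start.

    Only subnet membership is checked; whether each address is actually
    free on the network must still be verified separately.
    """
    network = ipaddress.IPv4Network(f"{start}/{subnet_mask}", strict=False)
    last = ipaddress.IPv4Address(start) + (pool_size - 1)
    return last in network


print(ending_address("10.10.10.1", 18))
print(pool_fits_subnet("10.10.10.250", 18, "255.255.255.0"))
```

A pool that does not fit in the subnet needs a different starting address or individual changes on the Review and Edit Details Page.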
The values that you enter on this page are used to add the Oracle Big Data Appliance servers, switches, and PDUs to your existing administration network. Each server has two network interfaces for administration. One interface provides access to the operating system, and the other provides access to Oracle Integrated Lights Out Manager (ILOM).
The following table describes the administration network fields.
Table 4-5 Administration Network Page
Administration Network Field | Description |
---|---|
Starting IP Address for Pool | The first IP address on the administration network available for use by Oracle Big Data Appliance. |
Pool Size | The required number of IP addresses on the administration network. The IP addresses for a rack are assigned in this order: Oracle Big Data Appliance servers (6, 12, or 18), Oracle ILOMs (6, 12, or 18), Ethernet switch, spine switch, leaf switches (2), and PDUs (2). |
Ending IP Address for Pool | The last IP address on the administration network assigned to Oracle Big Data Appliance. The value in this field is automatically calculated from the starting IP address and the pool size. Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If they are not available, then you can either assign a different range or manually change individual IP addresses on the Review and Edit Details Page. |
Subnet Mask | The subnet mask for the administration network. |
Gateway | The IP address for the gateway. The gateway IP address is generated automatically, so verify that it is correct. |
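The assignment order in the Pool Size row implies a pool of 2n + 6 addresses for a rack with n servers (n servers, n ILOMs, one Ethernet switch, one spine switch, two leaf switches, and two PDUs), which is 42 for a full rack. The following sketch lays out that order for a single rack, assuming consecutive addresses. The device labels, function name, and sample starting address are hypothetical.

```python
import ipaddress


def admin_assignments(start: str, servers: int) -> list:
    """Return (device, address) pairs in the documented assignment order."""
    if servers not in (6, 12, 18):
        raise ValueError("a rack has 6, 12, or 18 servers")
    devices = (
        [f"server {i:02d}" for i in range(1, servers + 1)]
        + [f"ILOM {i:02d}" for i in range(1, servers + 1)]
        + ["Ethernet switch", "spine switch"]
        + ["leaf switch 1", "leaf switch 2"]
        + ["PDU 1", "PDU 2"]
    )
    first = ipaddress.IPv4Address(start)
    return [(device, str(first + offset)) for offset, device in enumerate(devices)]


plan = admin_assignments("10.20.0.1", 18)
print(len(plan))   # pool size for one full rack
print(plan[0], plan[18], plan[-1])
```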
The InfiniBand network connects the Oracle Big Data Appliance servers within a rack. It can also connect multiple racks to form a multirack Hadoop cluster, or to provide access to Oracle Big Data Appliance from Oracle Exadata Database Machine.
The following table describes the InfiniBand network fields.
Table 4-6 InfiniBand Network Page
InfiniBand Network Field | Description |
---|---|
Starting IP Address for Pool | The first IP address on the private InfiniBand network available for use by the Oracle Big Data Appliance servers. The default is 192.168.10.1. |
Pool Size | The required number of IP addresses. All Oracle Big Data Appliance servers require an IP address on the InfiniBand network. The pool size is calculated for the deployment that you identified on the Hardware Selection Page. |
Ending IP Address for Pool | The last IP address assigned to the InfiniBand network for this deployment. This address is automatically calculated from the starting IP address and the pool size. Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If they are not available, then you can either assign a different range or manually change individual IP addresses on the Review and Edit Details Page. |
Subnet Mask | The subnet mask for the InfiniBand network. The default is 255.255.252.0. |
BDA will be connected via InfiniBand to any Oracle engineered systems | Select this option if you are connecting this rack to another rack through the InfiniBand fabric. When connecting multiple racks, ensure that all systems use the same netmask on their InfiniBand networks and that all InfiniBand IP addresses are in the same subnet. For example, if you connect an Oracle Big Data Appliance rack to Oracle Exadata Database Machine, then you must use the same netmask on the InfiniBand networks for both systems. Moreover, after you apply this netmask to the InfiniBand IP addresses of the Exadata database servers, the Exadata storage servers, and the Oracle Big Data Appliance servers, all IP addresses are in the same subnet. |
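The same-subnet condition in the example can be checked with a few lines of code. This is a sketch under stated assumptions: the default netmask 255.255.252.0 comes from the table above, while the function name and the sample addresses are hypothetical.

```python
import ipaddress


def same_subnet(addresses, netmask: str = "255.255.252.0") -> bool:
    """Return True if every address falls in one subnet under netmask."""
    networks = {
        ipaddress.IPv4Network(f"{address}/{netmask}", strict=False)
        for address in addresses
    }
    return len(networks) == 1


# Hypothetical InfiniBand addresses for appliance and Exadata servers.
print(same_subnet(["192.168.10.1", "192.168.11.200", "192.168.8.5"]))
print(same_subnet(["192.168.10.1", "192.168.12.1"]))
```

With the default netmask, addresses from 192.168.8.0 through 192.168.11.255 share a subnet with 192.168.10.1, so the second call reports a mismatch.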
The client and administration networks typically use the same Domain Name System (DNS) and Network Time Protocol (NTP) servers. If they are different on your networks, then enter the values for the client network first.
The following table describes the general network properties.
Table 4-7 General Network Properties Page
The next table describes the network properties for the administration network if they are different from the client network. Typically, these properties are the same for both networks. They default to the values you entered for the general network properties.
Table 4-8 Advanced Network Properties
Advanced Network Properties Field | Description |
---|---|
Advanced Network Configuration | Select this option if the client and administration networks are isolated on your system and use different DNS and NTP servers, different domains, or both. You can then complete the fields for the administration network. |
Admin DNS Servers | Up to six IP addresses for the administration Domain Name System server, if they are different from the client network. |
Admin NTP Servers | Up to six IP addresses for the administration Network Time Protocol server, if they are different from the client network. |
Admin Search Domains | Up to six domain names in which the Oracle Big Data Appliance administration network operates, if they are different from the client network. |
Use this page to review and modify the network configuration settings.
If you specified a range of IP addresses for any of the networks that includes addresses already in use, then replace those IP addresses on this page. Otherwise, the network configuration of Oracle Big Data Appliance will fail, causing unnecessary delays. When you are done making changes, click Regenerate using changed base values.
Use the define clusters page to identify the number of clusters to create and the servers that compose each cluster. You can configure clusters for either CDH or Oracle NoSQL Database.
You can configure multiple clusters in a single rack, or a single cluster can span multiple racks. Each CDH cluster must have at least six servers, and each Oracle NoSQL Database cluster must have at least three servers. Thus, a starter rack supports one CDH cluster, a starter rack with one in-rack expansion supports up to two CDH clusters, and a full rack supports up to three CDH clusters.
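The cluster limits follow from dividing the number of servers by the six-server minimum. The short sketch below works through that arithmetic, taking the rack sizes as 6, 12, and 18 servers (the server counts listed for the administration network pool). The function name is hypothetical.

```python
CDH_MINIMUM_SERVERS = 6


def max_cdh_clusters(server_count: int) -> int:
    """Return how many six-server CDH clusters fit in server_count servers."""
    return server_count // CDH_MINIMUM_SERVERS


for label, servers in [
    ("starter rack", 6),
    ("starter rack with one in-rack expansion", 12),
    ("full rack", 18),
]:
    print(f"{label}: up to {max_cdh_clusters(servers)} CDH cluster(s)")
```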
The following table describes the cluster definition choices.
Table 4-9 Define Clusters Page
Define Clusters Field | Description |
---|---|
Number of clusters to create | Select the number of clusters. For each cluster, a new tab appears on the page, and a new page appears in Oracle Big Data Appliance Configuration Generation Utility. Be sure to complete all tabs before continuing to the next page. |
Cluster Name | Enter a unique name for the cluster. The name must begin with a letter and can consist of alphanumeric characters, underscores (_), and dashes (-). |
Cluster Type | Choose the type of cluster: |
Unassigned Servers | From the list on the left, select the servers for the cluster and move them to the list of assigned servers on the right. |
Assigned Servers | Lists the servers selected for the cluster. A CDH cluster must have a minimum of six servers, and an Oracle NoSQL Database cluster must have a minimum of three servers. All clusters must be composed of multiples of three servers. |
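The naming and sizing rules in this table can be combined into a simple pre-check. This is a minimal sketch, assuming that "alphanumeric" means ASCII letters and digits. The function name, the type labels `CDH` and `NoSQL`, and the message texts are hypothetical.

```python
import re

# Minimum servers per cluster type, as stated in the table above.
MIN_SERVERS = {"CDH": 6, "NoSQL": 3}

# Begins with a letter; then letters, digits, underscores, or dashes.
_CLUSTER_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def cluster_problems(name: str, cluster_type: str, server_count: int) -> list:
    """Return a list of rule violations; an empty list means none found."""
    problems = []
    if _CLUSTER_NAME.fullmatch(name) is None:
        problems.append("invalid name")
    minimum = MIN_SERVERS[cluster_type]
    if server_count < minimum:
        problems.append(f"needs at least {minimum} servers")
    if server_count % 3 != 0:
        problems.append("server count must be a multiple of three")
    return problems


print(cluster_problems("cluster1", "CDH", 6))
print(cluster_problems("1cluster", "NoSQL", 4))
```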
Select the software to install on this cluster. The fields displayed on this page depend on the type of cluster being configured:
You are done with the software configuration. The Mammoth utility configures the software for the new servers the same as for the other servers in the cluster.
The cluster page for new Oracle NoSQL Database clusters has the following sections:
You can install either Community Edition or Enterprise Edition of Oracle NoSQL Database.
The following table describes these choices.
Table 4-10 Installed Components
Component Field | Description |
---|---|
Choose between Community Edition and Enterprise Edition:
|
The cluster page for new CDH clusters has the following sections:
The following table describes the user name, groups, and password fields for a new CDH cluster. Passwords are optional, but you must enter them during the software installation if you do not provide them here.
Table 4-11 User/Groups for a New CDH Cluster
ASR monitors the health of Oracle Big Data Appliance hardware and automatically submits a service request when it detects a fault. Although you can opt out of this program, Oracle recommends that you enable ASR.
ASR Manager must be installed and configured to run on a separate server outside of Oracle Big Data Appliance before the software is installed and configured on Oracle Big Data Appliance. The software installation fails with an error if Enable Auto Service Request is selected, but ASR Manager is not accessible using the specified host address and port number. The Mammoth utility does not install ASR Manager.
The software on Oracle Big Data Appliance must be able to connect to ASR Manager. ASR Manager must be able to route to the Internet, either directly or through a proxy, to send event information that automatically opens service requests.
The following table describes the Auto Service Request fields.
Table 4-12 Auto Service Request
ASR Field | Description |
---|---|
Enable Auto Service Request | Select this option to support Auto Service Request. |
ASR Manager Host Name | The fully qualified name or the IP address of a Linux server on the network where ASR will be installed. |
ASR Manager Port | The port number for ASR Manager. The default port is 162. |
ASR Root Password | Password for |
You can install Oracle Big Data Connectors on a CDH cluster. You must have a separate license for this product. The following table describes the installed components field.
Table 4-13 Installed Components
Kerberos authentication is a security option for CDH clusters. It is included with your Oracle Big Data Appliance license.
To use a key distribution center (KDC) elsewhere on the network (that is, not on Oracle Big Data Appliance), you must complete several steps before installing the software. See "Installation Prerequisites."
The following table describes the Kerberos fields.
Kerberos Field | Description |
---|---|
Enable Kerberos-based authentication? | Select this option to support Kerberos on Oracle Big Data Appliance. |
 | Choose Yes to set up a key distribution center (KDC) on Oracle Big Data Appliance. Otherwise, a KDC must always be available on the network to all clients. |
Kerberos KDC database password | A password for the KDC database, if it is being created on Oracle Big Data Appliance. |
Non-BDA key distribution center hosts | List the fully qualified names or the IP addresses of the KDCs, available on the same network, that can serve as either the primary or backup KDC for Oracle Big Data Appliance. |
Kerberos realm | Enter the name of the realm for Oracle Big Data Appliance, such as |
Enable network encryption | Select this option to protect your data as it travels over the network. |
You can configure CDH clusters on Oracle Big Data Appliance to automatically encrypt and decrypt data stored on disk. On-disk encryption does not affect user access to Hadoop data, although it can have a minor impact on performance.
Oracle Big Data Appliance supports two types of on-disk encryption:
Password-based encryption encodes Hadoop data based on a password, which is the same for all servers in a cluster. If a disk is removed from a server, then the encrypted data remains protected until you install the disk in a server (the same server or a different one), start up the server, and provide the password. If a server is powered off and removed from an Oracle Big Data Appliance rack, then the encrypted data remains protected until you restart the server and provide the password. You must enter the password after every startup of every server to enable access to the data.
You can change the password at any time. See Oracle Big Data Appliance Software User's Guide for enabling a password-encrypted server after restarting it.
TPM encryption encodes Hadoop data using the Trusted Platform Module (TPM) chip on the server motherboard. If a disk is removed from a server, then the data is unreadable until the disk is reinstalled in the same server. If a server is removed from an Oracle Big Data Appliance rack, then the data is still accessible; the data is automatically encrypted and decrypted for normal use while the disk resides in the same server.
The following table describes the encryption fields.
Disk Encryption Field | Description |
---|---|
Enable Disk Encryption | Select this option to encrypt data on disk and at rest. |
Use TPM Encryption | Encrypts Hadoop data using the Trusted Platform Module (TPM) chip on the server motherboard. Otherwise, password-based encryption is used. See the previous descriptions of the encryption methods. |
Password to Use for Disk Encryption | The password used to encrypt the data. A valid password consists of 1 to 64 printable ASCII characters. It cannot contain whitespace characters (such as spaces, tabs, or carriage returns), single or double quotation marks, or backslashes (\). This field is unavailable when you select TPM encryption, which does not use a password. |
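The password rules can be expressed as a short check. This sketch treats "printable ASCII without whitespace" as the character range from `!` (0x21) through `~` (0x7E), which is an interpretation of the rule above rather than the utility's own validation code. The function name is hypothetical.

```python
# Characters that are printable but explicitly disallowed:
# single quote, double quote, and backslash.
FORBIDDEN = set("'\"\\")


def is_valid_disk_password(password: str) -> bool:
    """Return True if password meets the disk encryption password rules."""
    if not 1 <= len(password) <= 64:
        return False
    # "!" through "~" covers printable ASCII and excludes spaces, tabs,
    # carriage returns, and all non-ASCII characters.
    return all("!" <= ch <= "~" and ch not in FORBIDDEN for ch in password)


print(is_valid_disk_password("S3cure-Pass!"))
print(is_valid_disk_password("has space"))
```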
You can configure CDH clusters on Oracle Big Data Appliance as secured targets for Oracle Audit Vault and Database Firewall. The Audit Vault plug-in on Oracle Big Data Appliance collects audit and logging data from MapReduce, HDFS, and Oozie services. You can then use Audit Vault Server to monitor these services on Oracle Big Data Appliance.
Oracle Audit Vault and Database Firewall Server Release 12.1.1 or later must be up and running on a separate server on the same network as Oracle Big Data Appliance before you perform the actual configuration.
The following table describes the Audit Vault fields.
Audit Vault Field | Description |
---|---|
Enable Audit Vault | Select this option to support Oracle Audit Vault and Database Firewall on Oracle Big Data Appliance. |
Audit Vault server | The IP address of the Audit Vault server. |
Audit Vault port | The port number that Audit Vault Server listens on. |
Audit Vault database service name | The database service name for Audit Vault Server. |
Audit Vault admin user | The name of the Audit Vault administration user. |
Audit Vault admin user password | The password for the administration user. |
The Mammoth utility deploys and validates agents on Oracle Big Data Appliance that Enterprise Manager uses to monitor the appliance. Mammoth does not install Oracle Enterprise Manager Cloud Control.
Before you can configure Oracle Big Data Appliance for the Enterprise Manager system monitoring plugin, you must install and configure Enterprise Manager to run on a separate server outside of Oracle Big Data Appliance. The Oracle Big Data Appliance software installation fails with an error if you choose the Enterprise Manager option, but Enterprise Manager is not installed and accessible using the specified host address, port numbers, and so forth.
The following table describes the Enterprise Manager Cloud Control fields.
Table 4-17 Oracle Enterprise Manager Cloud Control
Cloud Control Field | Description |
---|---|
Enable Oracle Enterprise Manager Cloud Control Agent | Select this option to use the Oracle Enterprise Manager system monitoring plugin. |
OMS Host Name | The fully qualified name or the IP address of the server where Oracle Management Server (OMS) is installed with the plugin for Oracle Big Data Appliance. |
OMS HTTPS Console Port | The port number for the Oracle Enterprise Manager Cloud Control web interface. To obtain the HTTPS port numbers, use an |
OMS HTTPS Upload Port | The HTTP upload port number for Oracle Enterprise Manager Cloud Control web interface. |
EM Super Admin User | A Cloud Control user with super-administration privileges to perform administration |
EM Super Admin Password | Password for the Cloud Control user name. |
EM Agent Registration Password | The password for validating the Oracle Management agents on Oracle Big Data Appliance. The Agent Registration password is part of the security setup of Enterprise Manager. To obtain the password in Enterprise Manager, click Setup at the top right of the window, Security, and then Registration Passwords. |
Cloud Control SYS password | The |
Inventory location | The full path of the |
Cloudera Manager sends email alerts when it detects a problem in the CDH cluster.
The following table describes the email alert fields.
Table 4-18 Email Alerting Page
Email Alerting Field | Description |
---|---|
SMTP Server | The fully qualified name or the IP address of the existing SMTP server that the company uses on its internal network. Required. |
Uses SSL | Select Yes if a Secure Sockets Layer (SSL) connection is required. |
SMTP Port | The port number used by the email server. |
Requires Authentication | Select this option if your SMTP server requires authentication. You can then enter a user name and a password. |
SMTP User Name | User name for Cloudera Manager to log in to the SMTP server. This field is hidden when authentication is not selected. |
SMTP Password | Password for the user name. This field is hidden when authentication is not selected. |
Recipient Addresses | The email addresses of users who need to get alerts from Cloudera Manager. Enter each email address on a separate line. Required. The field to the right indicates the number of email addresses entered in the dialog box. |
You have now set all the installation and configuration options. Click Back to return to a page and change its settings. The Back button does not clear the pages; your settings remain unless you change them.
The text box on this page provides a place for you to record any notes that might be useful at a later date. They are saved in a file named master.xml, which you can use to reload these configuration settings into Oracle Big Data Appliance Configuration Generation Utility.
To generate the configuration files, click Create Files and click Yes in response to the prompt. An operating system window automatically opens in the directory where the files are saved.