4 Using Oracle Big Data Appliance Configuration Generation Utility

This chapter describes how to use Oracle Big Data Appliance Configuration Generation Utility.

4.1 Overview of Oracle Big Data Appliance Configuration Generation Utility

Oracle Big Data Appliance Configuration Generation Utility acquires information from you, such as IP addresses and software preferences, that is required for deploying Oracle Big Data Appliance. After guiding you through a series of pages, the utility generates a set of configuration files. These files help automate the deployment process and ensure that Oracle Big Data Appliance is configured to your specifications.

Choose the option that describes the type of hardware installation you are configuring:

  • Click One or more new Big Data Appliance racks being installed if you do not have a previously generated master.xml configuration file to start with.

The other three choices activate the Import button, so that you can select the master.xml file from a previous configuration. After selecting one of these options, import the master.xml file.

  • Click One or more Big Data Appliance racks being added to an existing group of Big Data Appliances if you need to configure new rack hardware and do not intend to add servers or make any cluster changes at this time.

  • Click One or more nodes being added to a Big Data Appliance starter rack or partially filled rack for in-rack or cluster expansion – adding new nodes to the existing racks and/or to existing clusters.

  • Click An in-process configuration using a saved master.xml configuration file for general editing purposes.

Click Next to go to the Customer Details page. If you are working from an existing master.xml file, fields are pre-populated and selectors are preset throughout the pages of the utility based on the data in the file.

Figure 4-1 shows the Customer Details page of the Oracle Big Data Appliance Configuration Generation Utility.

Figure 4-1 Oracle Big Data Appliance Configuration Generation Utility

Note:

  • Oracle Big Data Appliance uses Cloudera's Distribution including Apache Hadoop (CDH). A Hadoop cluster on Oracle Big Data Appliance is called a CDH cluster.

  • The terms appliance and rack refer to Oracle Big Data Appliance, Oracle Big Data Appliance X3-2, Oracle Big Data Appliance X4-2, Oracle Big Data Appliance X5-2, and Oracle Big Data Appliance X6-2.

4.2 Generating the Configuration Files

The following procedures explain how to install Oracle Big Data Appliance Configuration Generation Utility and generate the configuration files.

To configure Oracle Big Data Appliance:

  1. Download Oracle Big Data Appliance Configuration Generation Utility from Oracle Technology Network at

    http://www.oracle.com/technetwork/database/bigdata-appliance/downloads/index.html

    The file is named BDAConfigurator-version.zip. The utility requires Oracle JRE 1.7 or later.

  2. Extract the files in BDAConfigurator-version.zip. For example:

    $ unzip BDAConfigurator-<version>.zip
    Archive:  BDAConfigurator-<version>.zip
       creating: BDAConfigurator-<version>/
      inflating: BDAConfigurator-<version>/exagen.jar
      inflating: BDAConfigurator-<version>/oracle_ice.jar
      inflating: BDAConfigurator-<version>/passwd.jar
      inflating: BDAConfigurator-<version>/orai18n-utility.jar
         .
         .
         .
    
  3. Change to the BDAConfigurator-version directory.

  4. Run Oracle Big Data Appliance Configuration Generation Utility.

    • On Linux:

      $ sh bdaconf.sh
      
    • On Microsoft Windows, double-click bdaconf.cmd in Windows Explorer, or run the file from the command line:

      C:\> bdaconf.cmd
      
  5. On the Welcome page, select a configuration type.

  6. Click Import if the button is activated, and select a previously saved master.xml configuration file.

  7. Follow the steps of the wizard. On the Complete page, click Create Files.

  8. Validate the network configuration.

    See "Validating the Network Settings."

  9. Send the generated bda-timestamp.zip file to your Oracle representative.

4.3 About the Configuration Files

Oracle Big Data Appliance Configuration Generation Utility generates the following files to use when you configure the system. You can select the directory where they are saved.

This is the basic structure of the directory:

company_name /
   bda-timestamp.zip
   bda-install-preview.html
   bda-preinstall-checkip.sh
   rack_name /
      rack_name-network.json
      rack_name-rack-network.json
   cluster_name /
      cluster_name-config.json
      cluster_name-cluster-network.json
   master.xml

bda-timestamp.zip

Contains a copy of the configuration files. If an Oracle customer service representative will perform the installation, then send this file to Oracle before the installation date. Otherwise, transfer the file to a USB drive for copying to Oracle Big Data Appliance.

bda-install-preview.html

Provides a report that lists all the details of the configuration. You can view this report in a browser. Check it carefully to ensure that all of the settings are correct.

bda-preinstall-checkip.sh

Runs a series of tests to ensure the specified names and IP addresses for Oracle Big Data Appliance were added correctly to the name server, and they do not conflict with the existing network configuration.

rack_name-network.json

Contains the network configuration for a full rack, a starter rack, or a starter rack with additional servers. It contains information about all the servers, switches, and PDUs.

cluster_name-config.json

Contains all the information for a cluster, including the network configuration, port numbers, user names, and passwords. The configuration utility creates a separate parameter file for each cluster. If several clusters are being configured, then each parameter file is located in a separate subdirectory.

If additional servers are configured over a starter rack as an addition to an existing cluster, then the configuration utility does not generate a parameter file; the Mammoth utility generates it.

rack_name-rack-network.json

Contains the Administrative Network information for the entire rack.

cluster_name-cluster-network.json

Contains the client and private InfiniBand network information for the cluster created.

master.xml

Contains all the configuration settings in XML format so that Oracle Big Data Appliance Configuration Generation Utility can read it. To alter the configuration of an Oracle Big Data Appliance deployment, you can load this file, enter the changes, and regenerate the configuration files.

This file is used only by Oracle Big Data Appliance Configuration Generation Utility. It is not used for the actual configuration of Oracle Big Data Appliance.
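
As an illustration of the directory structure described in this section, the following Python sketch lists the relative paths that you would expect for a given company name, set of racks, and set of clusters. The function name expected_files and the sample names are assumptions made for this example, and the literal string timestamp stands in for the value that the utility places in the zip file name.

```python
from pathlib import PurePosixPath


def expected_files(company, racks, clusters, timestamp="timestamp"):
    """Return the relative paths described in the directory structure.

    All arguments are illustrative. `timestamp` is a placeholder for the
    value that the utility embeds in the zip file name.
    """
    root = PurePosixPath(company)
    paths = [
        root / f"bda-{timestamp}.zip",
        root / "bda-install-preview.html",
        root / "bda-preinstall-checkip.sh",
    ]
    for rack in racks:
        paths.append(root / rack / f"{rack}-network.json")
        paths.append(root / rack / f"{rack}-rack-network.json")
    for cluster in clusters:
        paths.append(root / cluster / f"{cluster}-config.json")
        paths.append(root / cluster / f"{cluster}-cluster-network.json")
    paths.append(root / "master.xml")
    return [str(p) for p in paths]


if __name__ == "__main__":
    for path in expected_files("example", ["bda1"], ["cluster1"]):
        print(path)
```

You can compare such a list with the contents of the output directory to confirm that a file exists for every rack and cluster that you defined.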

4.4 Validating the Network Settings

Validating the network settings before the Oracle Big Data Appliance hardware arrives at your site is a critical step. Network problems can cause extended delays in the installation.

To validate the network configuration settings:

  1. Copy the bda-preinstall-checkip.sh file generated by Oracle Big Data Appliance Configuration Generation Utility to a Linux host on the same network that Oracle Big Data Appliance will use.

  2. Log in to the Linux host and run bda-preinstall-checkip.sh:

    $ sh bda-preinstall-checkip.sh
    

    This script checks your existing network for conflicts with the Oracle Big Data Appliance IP address pool.

  3. Correct any network problems discovered by the script before installation begins. Network problems during the installation can cause extensive delays.

  4. Run the network connections to the planned location for Oracle Big Data Appliance.

  5. Inform your Oracle representative when you have completed these steps.
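
If you want to spot-check a few name server entries yourself before running the generated script, a forward lookup is one way to do it. The following Python sketch is not the implementation of bda-preinstall-checkip.sh and does not replace it. The function name check_dns, the injectable resolve parameter, and the sample host name and address are assumptions made for this example.

```python
import socket


def check_dns(expected, resolve=socket.gethostbyname):
    """Compare planned host-to-address mappings with what DNS returns.

    `expected` maps each host name to the IPv4 address planned for it.
    Returns a list of human-readable problems; an empty list means that
    every name resolved to its planned address.
    """
    problems = []
    for host, planned in expected.items():
        try:
            actual = resolve(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems.append(f"{host}: lookup failed ({exc})")
            continue
        if actual != planned:
            problems.append(f"{host}: expected {planned}, resolved to {actual}")
    return problems


if __name__ == "__main__":
    # Hypothetical entry; replace it with names and addresses from your plan.
    for line in check_dns({"bda1node01.example.com": "10.0.0.10"}):
        print(line)
```

The resolve parameter exists only so that the lookup can be replaced with a stub when testing the check without network access.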

4.5 Customer Details Page

The following table describes the customer details fields. Generated Names shows the names created from the entered values.

Table 4-1 Customer Details Page

Customer Details Field Description

Customer Name

The name of your enterprise. Required.

Region

The geographic area where Oracle Big Data Appliance will be installed.

Time Zone

The time zone for your installation. You must select the appropriate region before selecting the time zone.

Rack Base Name

A maximum of 10 alphanumeric characters for the name of the Oracle Big Data Appliance rack.

Rack Start Index

A digit that uniquely identifies the rack. It is a suffix of the rack base name.

Server Base Name

Base name for all servers. A two-digit suffix uniquely identifies each server.

The rack name and server base name are used to generate the host names for all network interfaces: eth0, bondib0, bondeth0, and Oracle ILOM. For example, a rack base name of bda, a rack start index of 1, and a server base name of node results in host names of bda1node01, bda1node02, and so forth.

Admin Name Suffix

Suffix to the basic host name to form the eth0 host names.

Private Name Suffix

Suffix to the basic host name to form the bondib0 host name.

ILOM Name Suffix

Suffix to the basic host name to form the Oracle ILOM name.

Switch Base Name

Suffix to the rack name to form the base name for all switches. For example, a rack base name of bda, a rack start index of 1, and a switch base name of sw result in switch names of bda1sw-ip, bda1sw-ib1, and so forth.

Important: Ensure that the fully constructed switch names (such as bda1sw-ib1) do not exceed 20 characters.

Domain Name

Name of the domain in which Oracle Big Data Appliance operates. Required.

4.5.1 Using Standardized Host Names

The rack name is used in the assignment of standardized host names for all Oracle Big Data Appliance servers. The host name for all servers on the client network is in this format:

racknameNodeNN.domain

In this syntax:

  • NN is the position number of the server node in the rack (01 to 18).

  • domain is the domain name.

Host names must have fewer than 16 characters, which can be ASCII lower-case letters (a to z), numbers (0 to 9), and hyphens (-) only. Do not begin or end the name with a hyphen.
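
The rules in the preceding paragraph can be expressed as a short check. The following Python sketch is a minimal illustration, assuming that the length limit applies to the short host name without the domain. The function name is_valid_host_name is hypothetical and is not part of the utility.

```python
import re

# Lowercase ASCII letters, digits, and hyphens; no hyphen at either end.
_HOST_NAME_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_host_name(name):
    """Return True if `name` follows the host name rules stated above."""
    return len(name) < 16 and _HOST_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("bda1node01", "-bda1node01", "BDA1node01"):
        print(candidate, is_valid_host_name(candidate))
```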

4.5.2 Using Customized Host Names

You can change the suffixes used for the different network interfaces.

You can enter server host names that do not follow the naming conventions on the Review and Edit Details Page.

4.5.3 Naming Multirack Clusters

Oracle recommends that for a cluster of multiple racks, you use the cluster name as the rack name.

For example, in a three-rack cluster, if the cluster name is cluster1 and the domain name is example.com, then the fully qualified host name of the server at the bottom of the first rack is cluster101node01.example.com. For the top server in the third rack of this cluster, the host name is cluster103node18.example.com.
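
The naming pattern in this example can be sketched in Python as follows. The function name multirack_host_name is hypothetical, and the two-digit rack number is inferred from the example host names above rather than from a stated rule.

```python
def multirack_host_name(cluster, rack_number, position, domain,
                        server_base="node"):
    """Build a fully qualified host name for a server in a multirack cluster."""
    return f"{cluster}{rack_number:02d}{server_base}{position:02d}.{domain}"


if __name__ == "__main__":
    # Bottom server of the first rack and top server of the third rack.
    print(multirack_host_name("cluster1", 1, 1, "example.com"))
    print(multirack_host_name("cluster1", 3, 18, "example.com"))
```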

4.5.4 Connecting to Oracle Big Data Appliance Over the Networks

The host names on the other networks have a short extension that follows the unit number. If you retained the default extensions, then use these formats to connect to Oracle Big Data Appliance after it is connected to the network:

  • For short host names over the administration network:

    racknameNodeNN-adm

  • For the private InfiniBand network host names:

    racknameNodeNN-priv

  • For the Oracle Integrated Lights Out Manager (ILOM) host names:

    racknameNodeNN-ilom

  • For the switch host names:

    racknamesw-ibM

In this syntax:

  • NN is the position number of the server in the rack (01 to 18).

  • M is 1, 2, or 3, depending on the switch location in the rack.
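
To see how these formats expand for a whole rack, consider the following Python sketch. It assumes the default suffixes listed above and a lowercase server base name of node, as in the earlier bda1node01 example. The function names host_names and switch_names are hypothetical.

```python
def host_names(rack_name, server_base="node", servers=18):
    """Build the default host names for each server position in a rack."""
    names = []
    for position in range(1, servers + 1):
        base = f"{rack_name}{server_base}{position:02d}"
        names.append({
            "client": base,
            "admin": f"{base}-adm",
            "private": f"{base}-priv",
            "ilom": f"{base}-ilom",
        })
    return names


def switch_names(rack_name, switch_base="sw"):
    """Build the InfiniBand switch host names, where M is 1, 2, or 3."""
    return [f"{rack_name}{switch_base}-ib{m}" for m in (1, 2, 3)]


if __name__ == "__main__":
    print(host_names("bda1")[0])
    print(switch_names("bda1"))
```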

4.6 Hardware Selection Page

The hardware page identifies one or more racks that you want to deploy at the same time. The racks must be cabled together.

For example, if you are deploying three full racks, then add Full Rack three times to your deployment.

The following table describes the hardware selection choices.

Table 4-2 Hardware Selection Page

Hardware Selection Field Description

Select interconnected hardware to deploy

Lists the available hardware configurations. Choose one or more racks. You can choose the same type of rack multiple times.

  • Full rack: Contains 18 servers.

  • Starter rack: Contains 6 servers.

  • Partially filled rack: Contains any number of servers between 7 and 17.

This is your deployment

Lists the hardware selected for your site.

Will you use non-Oracle PDUs?

Oracle strongly recommends that the PDUs that are shipped with Oracle Big Data Appliance be used to supply power to its servers and switches.

However, if your data center has specific requirements that the Oracle-supplied PDUs do not meet, then you can use other PDUs to supply power to the Oracle Big Data Appliance servers and switches. Oracle does not support customer-supplied PDUs, and Oracle Enterprise Manager Cloud Control does not monitor them.

4.7 Rack Details

The rack details page identifies the optional network connections for an Oracle Big Data Appliance rack.

The following table describes the rack detail fields.

Table 4-3 Rack Details Page

Rack Detail Field Description

Rack Name

Enter the name of the rack.

Number of 10 GbE Connections

The two Sun Network QDR InfiniBand Gateway switches in every Oracle Big Data Appliance must have an equal number of 10 GbE links to the client network. Each gateway switch supports up to eight 10-GbE links, for a total of 16 links. Each server is assigned to one 10-GbE link on each gateway switch.

Therefore, an Oracle Big Data Appliance full rack or starter rack with an extension kit can use up to 16 10-GbE links. A starter rack can use up to 12 10-GbE links.

Oracle recommends using as many 10 GbE links to the Oracle Big Data Appliance rack as the data center can support. The additional links increase the network bandwidth available between Oracle Big Data Appliance and the client network, and reduce the impact if a link fails.

Oracle uses this information to ensure that the correct number of cables is ordered or available at the installation site. It also lets the Oracle field engineer plan how many 10 GbE links to connect to the Oracle Big Data Appliance rack.

BDA will be connected via InfiniBand to any Oracle engineered systems

Select this option if you are connecting this rack to another rack through the InfiniBand fabric. When connecting multiple racks, ensure the following:

  • The InfiniBand IP addresses of all servers are unique, including the servers in other Oracle engineered systems.

  • All InfiniBand IP addresses are on the same network.

For example, if you connect an Oracle Big Data Appliance rack to Oracle Exadata Database Machine, then you must use the same netmask on the InfiniBand networks for both systems. Moreover, after you apply this netmask to the InfiniBand IP addresses of the Exadata database servers, the Exadata storage servers, and the Oracle Big Data Appliance servers, all IP addresses are in the same subnet.
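
The two InfiniBand requirements above, unique addresses and a common subnet under one netmask, can be checked with the Python standard library. The following is a minimal sketch. The function name check_infiniband_addresses and the sample addresses are assumptions made for this example.

```python
import ipaddress


def check_infiniband_addresses(addresses, netmask):
    """Return (all_unique, same_subnet) for the given IPv4 addresses.

    `addresses` should include the servers of every connected system.
    """
    parsed = [ipaddress.ip_address(addr) for addr in addresses]
    networks = {
        ipaddress.ip_interface(f"{addr}/{netmask}").network
        for addr in addresses
    }
    all_unique = len(set(parsed)) == len(parsed)
    same_subnet = len(networks) == 1
    return all_unique, same_subnet


if __name__ == "__main__":
    sample = ["192.168.10.1", "192.168.11.5", "192.168.8.20"]
    print(check_infiniband_addresses(sample, "255.255.252.0"))
```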

4.8 Networking Page

The networking page identifies the number of IP addresses required for each network. The administration, client Ethernet, and InfiniBand networks are required. You must allocate the specified number of IP addresses for them.

Each IP address pool initially contains a range of consecutive IP addresses. If some IP addresses in the range are not available, then you can change individual addresses on the Review and Edit Details Page.
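
Because a pool starts as a run of consecutive addresses, its members and its last address follow directly from the starting address and the pool size. The following Python sketch shows that arithmetic. The function names and the sample addresses are assumptions made for this example.

```python
import ipaddress


def pool_addresses(start, size):
    """Return `size` consecutive IPv4 addresses beginning at `start`."""
    first = ipaddress.IPv4Address(start)
    return [str(first + offset) for offset in range(size)]


def pool_end(start, size):
    """Return the last address of a pool of `size` consecutive addresses."""
    return str(ipaddress.IPv4Address(start) + (size - 1))


if __name__ == "__main__":
    print(pool_addresses("10.0.0.254", 3))
    print(pool_end("10.0.0.10", 42))
```

Listing the addresses in this way makes it easier to compare the proposed range with addresses that are already in use on your network.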

4.9 Administration Network Page

The values that you enter on this page are used to add the Oracle Big Data Appliance servers, switches, and PDUs to your existing administration network. Each server has two network interfaces for administration. One interface provides access to the operating system, and the other provides access to Oracle Integrated Lights Out Manager (ILOM).

Note:

Each network must be on a separate subnet from the other networks.

The following table describes the administration network fields.

Table 4-4 Administration Network Page

Administration Network Field Description

Starting IP Address for Pool

The first IP address on the administration network available for use by Oracle Big Data Appliance.

Pool Size

The required number of IP addresses on the administration network.

The IP addresses for a rack are assigned in this order: Oracle Big Data Appliance servers, Oracle ILOMs, Ethernet switch, spine switch, leaf switches (2), and PDUs (2).

Ending IP Address for Pool

The last IP address on the administration network assigned to Oracle Big Data Appliance. The value in this field is automatically calculated from the starting IP address and the pool size.

Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If they are not available, then you can either assign a different range or manually change individual IP addresses on the Review and Edit Details Page.

Subnet Mask

The subnet mask for the administration network.

Gateway

The IP address for the gateway.

The gateway IP address is generated automatically, so verify that it is correct.
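
The assignment order described for the Pool Size field can be illustrated with a short Python sketch. It assumes a full rack of 18 servers and one address for each device, and the device labels are invented for this example. The pool size that the utility displays is authoritative.

```python
import ipaddress


def assign_admin_addresses(start, servers=18):
    """Map device labels to consecutive addresses in the stated order.

    The labels are illustrative; only the ordering comes from the text:
    servers, Oracle ILOMs, Ethernet switch, spine switch, two leaf
    switches, and two PDUs.
    """
    devices = [f"server{n:02d}" for n in range(1, servers + 1)]
    devices += [f"ilom{n:02d}" for n in range(1, servers + 1)]
    devices += ["ethernet-switch", "spine-switch",
                "leaf-switch-1", "leaf-switch-2", "pdu-1", "pdu-2"]
    first = ipaddress.IPv4Address(start)
    return {name: str(first + i) for i, name in enumerate(devices)}


if __name__ == "__main__":
    for device, address in assign_admin_addresses("10.0.0.10").items():
        print(device, address)
```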

4.10 General Network Properties Page

The client and administration networks typically use the same Domain Name System (DNS) and Network Time Protocol (NTP) servers. If they are different on your networks, then enter the values for the client network first.

The following table describes the general network properties.

Table 4-5 General Network Properties Page

General Network Properties Field Description

DNS Servers

Up to six IP addresses for the DNS servers. At least one DNS server must be accessible on the client network.

DNS servers are not required on the administration network, although Oracle Big Data Appliance uses them if they are available.

NTP Servers

Up to six IP addresses for the NTP servers. Both the client network and the administration network must have access to at least one NTP server.

The NTP servers for the administration network can be different from the NTP servers for the client network. If they are, then identify the NTP servers for the client network in this field.

Search Domains

Up to six domain names in which Oracle Big Data Appliance operates, such as example.com and us.example.com.

Are administration host name entries in DNS?

Select Yes or No:

  • Yes: Administration host name DNS entries are validated during preinstall checks and network configuration.

  • No: Administration host name DNS entries are not validated during preinstall checks and network configuration.

The next table describes the network properties for the administration network if they are different from the client network. Typically, these properties are the same for both networks. They default to the values you entered for the general network properties.

Table 4-6 Advanced Network Properties

Advanced Network Properties Field Description

Advanced Network Configuration

Select this option if the client and administration networks are isolated on your system and use different DNS and NTP servers, different domains, or both. You can then complete the fields for the administration network.

Admin Domain Name

The name of the administration domain.

Admin DNS Servers

Up to six IP addresses for the administration Domain Name System servers, if they are different from the client network.

Admin NTP Servers

Up to six IP addresses for the administration Network Time Protocol servers, if they are different from the client network.

Admin Search Domains

Up to six domain names in which the Oracle Big Data Appliance administration network operates, if they are different from the client network.

4.11 Review and Edit Details Page

Use this page to review and modify the network configuration settings.

If you specified a range of IP addresses for any of the networks that includes addresses already in use, then replace those IP addresses on this page. Otherwise, the network configuration of Oracle Big Data Appliance will fail, causing unnecessary delays. When you are done making changes, click Regenerate using changed base values.

4.12 Define Clusters Page

Use the define clusters page to identify the number of clusters to create and the servers that compose each cluster. You can configure clusters for either CDH or Oracle NoSQL Database on this page. A cluster can occupy a single rack or span multiple racks.

Note:

In a multi-cluster environment, it is not required that all clusters uniformly run the same version of Oracle Big Data Appliance. Multiple clusters running different versions of the software on different racks or the same rack are supported configurations.

The minimum cluster size for each type is as follows:

  • CDH cluster – five servers minimum is recommended for production in order to support high availability. Clusters as small as three servers can be created, but these are generally recommended for development only. However, the five-server minimum for production clusters is a guideline, not a rule, and has no implications for Oracle support.

  • NoSQL Database cluster – three servers minimum

For each rack contributing servers to a cluster of either type, the minimum number of servers is three.
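
The size rules above can be summarized in a small check. The following Python sketch is an illustration only. The function name check_cluster, the type labels CDH and NoSQL, and the message wording are assumptions made for this example.

```python
MIN_SERVERS = {"CDH": 3, "NoSQL": 3}   # smallest cluster that can be created
RECOMMENDED_PRODUCTION = {"CDH": 5}    # a guideline, not a rule
MIN_PER_RACK = 3                       # per rack contributing servers


def check_cluster(cluster_type, servers_per_rack):
    """Return a list of errors and warnings for a planned cluster.

    `servers_per_rack` maps a rack name to the number of servers that
    the rack contributes to the cluster.
    """
    messages = []
    total = sum(servers_per_rack.values())
    minimum = MIN_SERVERS[cluster_type]
    if total < minimum:
        messages.append(f"error: {cluster_type} cluster needs at least "
                        f"{minimum} servers, got {total}")
    for rack, count in servers_per_rack.items():
        if count < MIN_PER_RACK:
            messages.append(f"error: rack {rack} contributes {count} "
                            f"servers; minimum is {MIN_PER_RACK}")
    recommended = RECOMMENDED_PRODUCTION.get(cluster_type)
    if recommended and minimum <= total < recommended:
        messages.append(f"warning: fewer than {recommended} servers; "
                        "generally recommended for development only")
    return messages


if __name__ == "__main__":
    print(check_cluster("CDH", {"bda1": 3}))
```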

The following table describes the cluster definition choices.

Table 4-7 Define Clusters Page

Define Clusters Field Description

Number of clusters to create

Select the number of clusters. For each cluster, a new tab appears on the page, and a new page appears in Oracle Big Data Appliance Configuration Generation Utility. Be sure to complete all tabs before continuing to the next page.

Cluster Name

Enter a unique name for the cluster. The name must begin with a letter and can consist of alphanumeric characters and dashes (-). Underscores (_) and other non-alphanumeric characters are not accepted.

Cluster Type

Choose the type of cluster:

  • CDH cluster: Installs Cloudera's Distribution including Apache Hadoop and optional software on a cluster of new servers.

  • NoSQL DB cluster: Installs Oracle NoSQL Database on a cluster of new servers.

  • Adding to existing cluster: Installs the same software on the new servers as the rest of the cluster.

Unassigned Servers

From the list on the left, select the servers for the cluster and move them to the list of assigned servers on the right.

Assigned Servers

Lists the servers selected for the cluster. A CDH cluster must consist of at least three servers. A minimum of five is recommended for a production environment. Oracle NoSQL Database clusters must consist of a minimum of three servers for development or production.
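
The Cluster Name rules in the preceding table can be checked before you enter the names. The following Python sketch assumes that letters are ASCII and that either case is accepted, which the table does not state. The function names are hypothetical.

```python
import re

# A letter first, then any mix of letters, digits, and dashes.
_CLUSTER_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def is_valid_cluster_name(name):
    """Return True if `name` follows the Cluster Name rules."""
    return _CLUSTER_NAME_RE.fullmatch(name) is not None


def duplicate_cluster_names(names):
    """Return the names that appear more than once, in sorted order."""
    return sorted({n for n in names if names.count(n) > 1})


if __name__ == "__main__":
    for candidate in ("cluster1", "my-cluster", "my_cluster", "1cluster"):
        print(candidate, is_valid_cluster_name(candidate))
```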

4.13 Cluster Page

Select the software to install on this cluster. The fields displayed on this page depend on the type of cluster being configured:

4.13.1 A New Oracle NoSQL Database Cluster

For each new Oracle NoSQL Database cluster, the Configuration Utility provides a page where you configure the networking for the cluster. On this page, provide the following general information about the cluster’s client network:

  • Cluster Name

  • DNS Servers

  • NTP Servers

  • Search Domains

  • Domain Name

  • Region

  • Time Zone

Also provide the configuration details requested in the following subsections of the page:

4.13.1.1 User and Groups

The following table describes the user name, groups, and password fields for a new Oracle NoSQL Database cluster. Passwords are optional, but you must enter them during the software installation if you do not provide them here.

Table 4-8 User and Groups for a New Oracle NoSQL Database Cluster

User/Groups Field Description

OS password for root user

The root password on all servers in the cluster.

OS password for oracle user

The oracle password on all servers in the cluster. Oracle applications run under this identity.

oracle user ID

The ID number of the oracle user. It must match the oracle UID of a connected Oracle Exadata Database Machine. The UID ensures that the oracle user can log in from Oracle Big Data Appliance to the correct account in Oracle Database. Required.

oinstall group ID

The ID number of the Oracle Inventory Group (oinstall). It must match the oinstall group ID of a connected Oracle Exadata Database Machine.

dba group ID

The ID number of the dba group. It must match the dba group ID of a connected Oracle Exadata Database Machine.

4.13.1.2 Client Network

The following table describes the fields of the Client Network panel on the Cluster Page of Oracle Big Data Appliance Configuration Generation Utility.

Note that the client and private InfiniBand networks within a cluster may not use the same subnet. However, subnets may be shared by client networks on different clusters.
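
One way to confirm that a planned client network and the private InfiniBand network do not share a subnet is to compare the two networks with the Python standard library. The following sketch treats any overlap between the two networks as a conflict, which is a slightly stricter reading of the rule above. The function name subnets_conflict and the sample values are assumptions made for this example.

```python
import ipaddress


def subnets_conflict(client_ip, client_mask, ib_ip, ib_mask):
    """Return True if the client and InfiniBand networks overlap."""
    client = ipaddress.ip_interface(f"{client_ip}/{client_mask}").network
    infiniband = ipaddress.ip_interface(f"{ib_ip}/{ib_mask}").network
    return client.overlaps(infiniband)


if __name__ == "__main__":
    print(subnets_conflict("10.1.1.5", "255.255.255.0",
                           "192.168.10.1", "255.255.252.0"))
```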

Table 4-9 Client Network

Client Network Field Description

Starting IP Address for Pool

The first IP address on the client network you are configuring.

Pool Size

The required number of IP addresses (one for each node in the current cluster).

Ending IP Address for Pool

The last IP address on the client network. This is calculated automatically based on the starting address and pool size.

Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If not all are available, then you can either assign a different range or manually change individual IP addresses on the Client and InfiniBand Network page.

Subnet Mask

The subnet mask of the client network on the current cluster.

Gateway

The IP address of the gateway for the client network on the current cluster.

VLAN ID

Note that the default VLAN ID value is NO.

Domain

FQDN of the domain hosting the network.

Connectors

You must check at least one connector from the eight connectors for each cluster’s client network. This provides each cluster with at least two connectors. (One connector is dedicated to failover and is located on the secondary InfiniBand switch.) Make the 10 GbE connections to the same ports on both switches.

Add a Client to this Cluster

Add another client network to this cluster. Click this button to create another set of input fields. Add the necessary identifiers to these fields in order to configure the additional client network.

Remove a Client from this Cluster

Removes the last additional client from the list of client networks.

4.13.1.3 InfiniBand Network

The InfiniBand network connects the Oracle Big Data Appliance servers within a rack. It can also connect multiple racks to form a multirack Hadoop cluster, or to provide access to Oracle Big Data Appliance from Oracle Exadata Database Machine.

Although you can configure multiple client networks within a cluster, each cluster can support only one InfiniBand network.

The following table describes the InfiniBand network fields.

Table 4-10 InfiniBand Network Page

InfiniBand Network Field Description

Starting IP Address for Pool

The first IP address on the private InfiniBand network available for use by the Oracle Big Data Appliance servers. The default is 192.168.10.1.

Pool Size

The required number of IP addresses. All Oracle Big Data Appliance servers require an IP address on the InfiniBand network. The pool size is calculated for the deployment that you identified on the Hardware Selection Page.

Ending IP Address for Pool

The last IP address assigned to the InfiniBand network for this deployment. This address is automatically calculated from the starting IP address and the pool size.

Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If they are not available, then you can either assign a different range or manually change individual IP addresses on the Review and Edit Details Page.

Subnet Mask

The subnet mask for the InfiniBand network. The default is 255.255.252.0.

PKey

The unique ID of the InfiniBand partition used by this network.

PName

The partition name.

PType

The partition membership type (FULL or LIMITED).

MTU

Maximum Transmission Unit for this network.
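
The default starting address and subnet mask in this table imply a particular network and address capacity. The following Python sketch works that out and checks whether a pool of a given size fits. The function names describe_network and pool_fits are hypothetical. The two defaults are taken from the table above.

```python
import ipaddress

DEFAULT_START = "192.168.10.1"   # default starting IP address for the pool
DEFAULT_MASK = "255.255.252.0"   # default InfiniBand subnet mask


def describe_network(start=DEFAULT_START, mask=DEFAULT_MASK):
    """Summarize the network that contains `start` under `mask`."""
    net = ipaddress.ip_interface(f"{start}/{mask}").network
    return {
        "network": str(net),
        "first_host": str(net.network_address + 1),
        "last_host": str(net.broadcast_address - 1),
        "usable_hosts": net.num_addresses - 2,
    }


def pool_fits(size, start=DEFAULT_START, mask=DEFAULT_MASK):
    """Return True if `size` consecutive addresses from `start` stay
    inside the network and stop before the broadcast address."""
    net = ipaddress.ip_interface(f"{start}/{mask}").network
    last = ipaddress.IPv4Address(start) + (size - 1)
    return last in net and last != net.broadcast_address


if __name__ == "__main__":
    print(describe_network())
    print(pool_fits(18))
```

With the defaults, the network is 192.168.8.0/22, so a pool that starts at 192.168.10.1 can hold up to 510 consecutive addresses before it reaches the broadcast address.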

4.13.1.4 Installed Components

You can install either Community Edition or Enterprise Edition of Oracle NoSQL Database.

The following table describes these choices.

Table 4-11 Installed Components

Component Field Description

Oracle NoSQL Database Edition

Choose between Community Edition and Enterprise Edition:

  • Community Edition is included in the license for Oracle Big Data Appliance.

  • Enterprise Edition requires a separate license. You must have this license to install Enterprise Edition on Oracle Big Data Appliance.

4.13.1.5 Oracle NoSQL Configuration

Oracle NoSQL Database 12c Release 1.3.0.5 and later versions support secondary zones, which are composed of nodes that function only as replicas. You can use the secondary zones on Oracle Big Data Appliance to maintain extra copies of the data for increased redundancy and read capacity, or to provide low-latency read access to data at a distant location.

The following table describes the modifiable configuration settings.

Table 4-12 Oracle NoSQL Configuration

Configuration Field Description

Oracle NoSQL store name

A unique name for the KV store on Oracle Big Data Appliance. The default value is BDAKV1.

Oracle NoSQL primary zone name

The name of the primary zone for all nodes in the cluster. The default value is BDAKV1_PRIMARY_ZN.

Oracle NoSQL primary zone replication factor

A replication factor of 1 or more for the primary zone. The default value is 3.

4.13.2 A New CDH Cluster

For each new cluster, the Configuration Utility provides a page where you configure the networking for the cluster. On this page, provide the following general information about the cluster’s client network:

  • Cluster Name

  • DNS Servers

  • NTP Servers

  • Search Domains

  • Domain Name

  • Region

  • Time Zone

Also provide the configuration details requested in the following subsections of the page:

4.13.2.1 User/Groups

The following table describes the user name, groups, and password fields for a new CDH cluster. Passwords are optional, but you must enter them during the software installation if you do not provide them here.

Table 4-13 User/Groups for a New CDH Cluster

User/Groups Field Description

OS password for root user

The root password on all servers in the cluster.

OS password for oracle user

The oracle password on all servers in the cluster. Oracle applications run under this identity.

oracle user ID

The ID number of the oracle user. It must match the oracle UID of a connected Oracle Exadata Database Machine. The UID ensures that the oracle user can log in from Oracle Big Data Appliance to the correct account in Oracle Database. Required.

oinstall group ID

The ID number of the Oracle Inventory Group (oinstall). It must match the oinstall group ID of a connected Oracle Exadata Database Machine.

dba group ID

The ID number of the dba group. It must match the dba group ID of a connected Oracle Exadata Database Machine.

Cloudera Manager admin password

The password for the admin user for Cloudera Manager. Available only for CDH clusters.

MySQL admin password

The password for the MySQL Database administration user. Available only for CDH clusters.

4.13.2.2 Client Network

The following table describes the fields of the Client Network panel on the Cluster Page of Oracle Big Data Appliance Configuration Generation Utility.

Note that the client and private InfiniBand networks within a cluster may not use the same subnet. However, subnets may be shared by client networks on different clusters.

Table 4-14 Client Network

Client Network Field Description

Starting IP Address for Pool

The first IP address on the client network you are configuring.

Pool Size

The required number of IP addresses (one for each node in the current cluster).

Ending IP Address for Pool

The last IP address on the client network. This is calculated automatically based on the starting address and pool size.

Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If not all are available, then you can either assign a different range or manually change individual IP addresses on the Client and InfiniBand Network page.

Subnet Mask

The subnet mask of the client network on the current cluster.

Gateway

The IP address of the gateway for the client network on the current cluster.

VLAN ID

Note that the default VLAN ID value is NO.

Domain

FQDN of the domain hosting the network.

Connectors

You must check at least one connector from the eight connectors for each cluster’s client network. This provides each cluster with at least two connectors. (One connector is dedicated to failover and is located on the secondary InfiniBand switch.) Make the 10 GbE connections to the same ports on both switches.

Add a Client to this Cluster

Adds another client network to this cluster. Click this button to create another set of input fields, and then enter the necessary identifiers in these fields to configure the additional client network.

Remove a Client from this Cluster

Removes the last additional client from the list of client networks.
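
The pool arithmetic and the subnet restriction described for the client network can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not how the utility itself is implemented; the function names and the sample addresses are hypothetical.

```python
import ipaddress


def pool_end(start: str, size: int) -> ipaddress.IPv4Address:
    """Return the last address of a contiguous pool of `size` addresses."""
    if size < 1:
        raise ValueError("pool size must be at least 1")
    # The starting address is the first member, so the end is start + size - 1.
    return ipaddress.IPv4Address(start) + (size - 1)


def subnets_overlap(addr_a: str, mask_a: str, addr_b: str, mask_b: str) -> bool:
    """Return True if the two subnets share any addresses."""
    # strict=False lets a host address stand in for its enclosing network.
    net_a = ipaddress.IPv4Network(f"{addr_a}/{mask_a}", strict=False)
    net_b = ipaddress.IPv4Network(f"{addr_b}/{mask_b}", strict=False)
    return net_a.overlaps(net_b)


# Hypothetical client pool: six nodes starting at 10.20.30.11.
print(pool_end("10.20.30.11", 6))

# Hypothetical client subnet compared with the default InfiniBand settings.
print(subnets_overlap("10.20.30.11", "255.255.255.0",
                      "192.168.10.1", "255.255.252.0"))
```

A result of True from `subnets_overlap` for a client network and the InfiniBand network of the same cluster would indicate a configuration that the restriction above rules out.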

4.13.2.3 InfiniBand Network

The InfiniBand network connects the Oracle Big Data Appliance servers within a rack. It can also connect multiple racks to form a multirack Hadoop cluster, or to provide access to Oracle Big Data Appliance from Oracle Exadata Database Machine.

Although you can configure multiple client networks within a cluster, each cluster can support only one InfiniBand network.

The following table describes the InfiniBand network fields.

Table 4-15 InfiniBand Network Page

InfiniBand Network Field Description

Starting IP Address for Pool

The first IP address on the private InfiniBand network available for use by the Oracle Big Data Appliance servers. The default is 192.168.10.1.

Pool Size

The required number of IP addresses. All Oracle Big Data Appliance servers require an IP address on the InfiniBand network. The pool size is calculated for the deployment that you identified on the Hardware Selection Page.

Ending IP Address for Pool

The last IP address assigned to the InfiniBand network for this deployment. This address is automatically calculated from the starting IP address and the pool size.

Ensure that all of the IP addresses are available for use in the pool defined by the starting and ending addresses. If they are not available, then you can either assign a different range or manually change individual IP addresses on the Review and Edit Details Page.

Subnet Mask

The subnet mask for the InfiniBand network. The default is 255.255.252.0.

PKey

The unique ID of the InfiniBand partition used by this network.

PName

The partition name.

PType

The partition membership type (FULL or LIMITED).

MTU

Maximum Transmission Unit for this network.
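
As a worked example of the InfiniBand defaults, the following Python sketch derives the subnet implied by the default starting address (192.168.10.1) and default subnet mask (255.255.252.0), and checks whether a pool fits inside it. The function name and the pool size of 18 are illustrative assumptions; the utility calculates the actual pool size from your hardware selection.

```python
import ipaddress


def ib_pool_fits(start: str, size: int, mask: str) -> bool:
    """Return True if `size` consecutive addresses from `start` are usable hosts."""
    network = ipaddress.IPv4Network(f"{start}/{mask}", strict=False)
    first = ipaddress.IPv4Address(start)
    last = first + (size - 1)
    # Exclude the network address and the broadcast address of the subnet.
    return network.network_address < first and last < network.broadcast_address


# Subnet implied by the documented defaults.
default_net = ipaddress.IPv4Network("192.168.10.1/255.255.252.0", strict=False)
print(default_net)                    # 192.168.8.0/22
print(default_net.broadcast_address)  # 192.168.11.255

# Illustrative pool of 18 addresses starting at the default address.
print(ib_pool_fits("192.168.10.1", 18, "255.255.252.0"))  # True
```

With these defaults, a pool that starts at 192.168.10.1 can hold at most 510 addresses before it reaches the broadcast address of the subnet.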

4.13.2.4 Big Data SQL

You can install Oracle Big Data SQL on a CDH cluster. You must have a separate license for this product. The following table describes the Oracle Big Data SQL fields.

Table 4-16 Big Data SQL

Big Data SQL Field Description

Is Big Data SQL licensed?

Oracle Big Data SQL supports queries from Oracle Database against data stored in Hive and HDFS on Oracle Big Data Appliance. This technology requires a separate license. If you have a license, choose Yes.

Install Big Data SQL?

Choose Yes to install Oracle Big Data SQL during the initial software installation. Oracle Big Data Appliance must be connected over the InfiniBand network to Oracle Exadata Database Machine. Choose No to defer this installation to a later date.

4.13.2.5 Big Data Connectors

You can install Oracle Big Data Connectors on a CDH cluster. You must have a separate license for this product. The following table describes the installed components field.

Table 4-17 Installed Components

Component Field Description

Are Big Data Connectors licensed?

Oracle Big Data Connectors facilitate data access between data stored in the CDH cluster and Oracle Database. The connectors require a separate license. If you have a license, choose Yes. This choice activates the other fields.

Install Oracle Data Integrator Agent?

The agent supports Oracle Data Integrator, which is one of the Oracle Big Data Connectors. To configure Oracle Data Integrator for use immediately, choose Yes.

You must have a license for Oracle Big Data Connectors.

MySQL password for Oracle Data Integrator Agent?

The password for the Oracle Data Integrator user in MySQL Database.

4.13.2.6 MIT Kerberos

MIT Kerberos authentication is a security option for CDH clusters. It is included with your Oracle Big Data Appliance license.

To use a key distribution center (KDC) elsewhere on the network (that is, not on Oracle Big Data Appliance), you must complete several steps before installing the software. See "Installation Prerequisites."

The following table describes the MIT Kerberos fields.

Table 4-18 MIT Kerberos

Kerberos Field Description

Enable MIT Kerberos-based authentication?

Select this option to support MIT Kerberos on Oracle Big Data Appliance.

Set up key distribution center on BDA

Choose Yes to set up a key distribution center (KDC) on Oracle Big Data Appliance. Otherwise, a KDC must always be available on the network to all clients.

Kerberos KDC database password

A password for the KDC database, if it is being created on Oracle Big Data Appliance.

Non-BDA key distribution center hosts

List the fully qualified names or the IP addresses of the KDCs, available on the same network, that can serve as either the primary or backup KDC for Oracle Big Data Appliance.

Important:

The first external KDC server in the list (the one at the top of the list) must be the same server identified as the Kerberos Admin Server in krb5.conf.

Kerberos realm

Enter the name of the realm for Oracle Big Data Appliance, such as EXAMPLE.COM.

Encrypt HDFS Data Transport

Enables encryption of data transfer between DataNodes and clients, and among DataNodes.

Encrypt Hadoop Services

Enables SSL encryption for HDFS, MapReduce, and YARN web interfaces, as well as encrypted shuffle for MapReduce and YARN. It also enables authentication for access to the web console for the HDFS, MapReduce, and YARN roles.

Enable Sentry Authorization

Select this option to use Apache Sentry to provide fine-grained authorization to data stored in Hadoop.
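
The requirement that the first external KDC in the list match the Kerberos admin server in krb5.conf can be checked before you enter the hosts. The following Python sketch is a minimal illustration under stated assumptions: the function names and host names are hypothetical, the parser handles only a simple realm block without nested braces, and it strips a trailing port number from the admin_server value.

```python
import re


def admin_server_from_krb5(conf_text: str, realm: str):
    """Return the admin_server host for `realm` in krb5.conf text, or None."""
    # Locate the block of the form:  REALM = { ... }
    block = re.search(
        rf"^\s*{re.escape(realm)}\s*=\s*\{{(.*?)\}}", conf_text, re.S | re.M
    )
    if not block:
        return None
    match = re.search(r"^\s*admin_server\s*=\s*(\S+)", block.group(1), re.M)
    if not match:
        return None
    # Drop an optional ":port" suffix and normalize case for comparison.
    return match.group(1).split(":")[0].lower()


def first_kdc_is_admin_server(kdc_hosts, conf_text: str, realm: str) -> bool:
    """Return True if the first listed KDC is the realm's admin server."""
    admin = admin_server_from_krb5(conf_text, realm)
    if not kdc_hosts or admin is None:
        return False
    return kdc_hosts[0].strip().lower() == admin


# Hypothetical krb5.conf fragment for the example realm.
SAMPLE_CONF = """
[realms]
 EXAMPLE.COM = {
  kdc = kdc1.example.com
  kdc = kdc2.example.com
  admin_server = kdc1.example.com:749
 }
"""

print(first_kdc_is_admin_server(
    ["kdc1.example.com", "kdc2.example.com"], SAMPLE_CONF, "EXAMPLE.COM"))
```

Reversing the order of the two hosts in the list makes the check return False, which corresponds to the ordering that the Important note warns against.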

4.13.2.7 Active Directory Kerberos

Active Directory Kerberos authentication is a Windows-based security option for CDH clusters. It is included with your Oracle Big Data Appliance license.

The following table describes the Active Directory Kerberos fields.

Table 4-19 Active Directory Kerberos

Kerberos Field Description

Enable Active Directory Kerberos-based authentication?

Select this option to support Active Directory Kerberos on Oracle Big Data Appliance.

Active Directory Host

The fully qualified name or IP address of the Active Directory host.

Kerberos realm

Enter the name of the realm for Oracle Big Data Appliance, such as EXAMPLE.COM.

Active Directory Domain

Domain distinguished name where all the accounts are created on the Active Directory, such as ou=bda,DC=example,DC=com.

Active Directory admin user

Name of the Active Directory admin user that can create, delete, and modify accounts on the Active Directory.

Active Directory admin password

Password for the Active Directory admin user.

Encrypt HDFS Data Transport

Enables encryption of data transfer between DataNodes and clients, and among DataNodes.

Encrypt Hadoop Services

Enables SSL encryption for HDFS, MapReduce, and YARN web interfaces, as well as encrypted shuffle for MapReduce and YARN. It also enables authentication for access to the web console for the HDFS, MapReduce, and YARN roles.

Enable Sentry authorization

Select this option to use Apache Sentry to provide fine-grained authorization to data stored in Hadoop.

4.13.2.8 HDFS Transparent Encryption

HDFS Transparent Encryption is a security option for data at rest in CDH clusters.

HDFS Transparent Encryption requires that you enable Kerberos. It works with either Active Directory Kerberos or MIT Kerberos. (See "Installation Prerequisites.")

The following table describes the HDFS Transparent Encryption fields.

Table 4-20 HDFS Transparent Encryption

HDFS Transparent Encryption Field Description

Enable HDFS Transparent Encryption.

Select this option to enable encryption of Hadoop data at rest on the Oracle Big Data Appliance.

Set up Key Trustee Servers on the BDA

Select this option to set up Key Trustee Servers internally on the Oracle Big Data Appliance cluster. If you enable this option, the remaining key trustee setup parameters are not needed. Mammoth will configure the installation.

Active Navigator Key Trustee Server

Manages encryption keys, certificates, and passwords. In the current implementation, this must be a server external to the Oracle Big Data Appliance.

Passive Navigator Key Trustee Server

The passive Navigator Key Trustee Server is a backup to ensure that this service does not become a single point of failure.

Key Trustee Organization

Key Trustee organization name configured by the Key Trustee Server administrator. This is the organization that keys are stored against.

Key Trustee Authorization Code

The authorization code that corresponds to the Navigator Key Trustee organization. This code is used by Cloudera Manager to authenticate against the Key Trustee Server.

4.13.2.9 Audit Vault

You can configure CDH clusters on Oracle Big Data Appliance as secured targets for Oracle Audit Vault and Database Firewall. The Audit Vault plug-in on Oracle Big Data Appliance collects audit and logging data from MapReduce, HDFS, and Oozie services. You can then use Audit Vault Server to monitor these services on Oracle Big Data Appliance.

Oracle Audit Vault and Database Firewall Server Release 12.1.1 must be up and running on a separate server on the same network as Oracle Big Data Appliance before you perform the actual configuration. No other version of Oracle Audit Vault and Database Firewall Server is supported on Oracle Big Data Appliance.

The following table describes the Audit Vault fields.

Table 4-21 Audit Vault

Audit Vault Field Description

Enable Audit Vault

Select this option to support Oracle Audit Vault and Database Firewall on Oracle Big Data Appliance.

Audit Vault server

The IP address of the Audit Vault server.

Audit Vault port

The port number that Audit Vault Server listens on.

Audit Vault database service name

The database service name for Audit Vault Server.

Audit Vault admin user

The name of the Audit Vault administration user.

Audit Vault admin user password

The password for the administration user.

4.13.2.10 Auto Service Request

ASR monitors the health of Oracle Big Data Appliance hardware and automatically submits a service request when it detects a fault. Although you can opt out of this program, Oracle recommends that you enable ASR.

ASR Manager must be installed and configured to run on a separate server outside of Oracle Big Data Appliance before the software is installed and configured on Oracle Big Data Appliance. The software installation fails with an error if Enable Auto Service Request is selected, but ASR Manager is not accessible using the specified host address and port number. The Mammoth utility does not install ASR Manager.

The software on Oracle Big Data Appliance must be able to connect to ASR Manager. ASR Manager must be able to route to the Internet, either directly or through a proxy, to send event information that automatically opens service requests.

The following table describes the Auto Service Request fields.

Table 4-22 Auto Service Request

ASR Field Description

Enable Auto Service Request

Select this option to support Auto Service Request.

ASR Manager Host Name

The fully qualified name or the IP address of a Linux server on the network where ASR will be installed.

ASR Manager Port

The port number for ASR Manager. The default port is 162.

ASR Root Password

Password for root on the ASR Manager host.

4.13.2.11 Enterprise Manager Cloud Control

The Mammoth utility deploys and validates agents on Oracle Big Data Appliance that Enterprise Manager uses to monitor the appliance. Mammoth does not install Oracle Enterprise Manager Cloud Control.

Before you can configure Oracle Big Data Appliance for the Enterprise Manager system monitoring plugin, you must install and configure Enterprise Manager to run on a separate server outside of Oracle Big Data Appliance. The Oracle Big Data Appliance software installation fails with an error if you choose the Enterprise Manager option, but Enterprise Manager is not installed and accessible using the specified host address, port numbers, and so forth.

The following table describes the Enterprise Manager Cloud Control fields.

Table 4-23 Enterprise Manager Cloud Control

Cloud Control Field Description

Enable Oracle Enterprise Manager Cloud Control Agent

Select this option to use the Oracle Enterprise Manager system monitoring plugin.

OMS Host Name

The fully qualified name or the IP address of the server where Oracle Management Server (OMS) is installed with the plugin for Oracle Big Data Appliance.

OMS HTTPS Console Port

The port number for the Oracle Enterprise Manager Cloud Control web interface.

To obtain the HTTPS port numbers, use an emctl status oms -details command from the Enterprise Manager host.

OMS HTTPS Upload Port

The HTTPS upload port number for the Oracle Enterprise Manager Cloud Control web interface.

EM Super Admin User

A Cloud Control user with super-administration privileges to perform administration emcli commands. Typically, this user is sysman.

EM Super Admin Password

Password for the Cloud Control user name.

EM Agent Registration Password

The password for validating the Oracle Management agents on Oracle Big Data Appliance.

The Agent Registration password is part of the security setup of Enterprise Manager. To obtain the password in Enterprise Manager, click Setup at the top right of the window, then Security, and then Registration Passwords.

Cloud Control SYS Password

The SYS password for the Cloud Control repository.

Inventory Location

The full path of the oraInventory directory for the system where Oracle Enterprise Manager is installed.
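
Because the software installation fails if Enterprise Manager is not accessible at the specified host and ports, it can be useful to confirm basic reachability before you generate the configuration files. The following Python sketch only tests whether a TCP connection can be opened; it does not verify that the listener is actually Oracle Management Server. The function name is hypothetical.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling `tcp_port_reachable("oms.example.com", 7802)` would test a console port, where both the host name and the port number are placeholders for the values that the emctl status oms -details command reports on your Enterprise Manager host.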

4.13.2.12 Email Alerting

Cloudera Manager sends email alerts when it detects a problem in the CDH cluster.

The following table describes the email alert fields.

Table 4-24 Email Alerting Page

Email Alerting Field Description

SMTP Server

The fully qualified name or the IP address of the existing SMTP server that the company uses on its internal network. Required.

Uses SSL

Select Yes if a Secure Sockets Layer (SSL) connection is required.

SMTP Port

The port number used by the email server.

Requires Authentication

Select this option if your SMTP server requires authentication. You can then enter a user name and a password.

SMTP User Name

User name for Cloudera Manager to log in to the SMTP server.

This field is hidden when authentication is not selected.

SMTP Password

Password for the user name.

This field is hidden when authentication is not selected.

Recipient Addresses

The valid email addresses of users who need to get alerts from Cloudera Manager. Enter each email address on a separate line. Required.

The field to the right indicates the number of email addresses entered in the dialog box.
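
The one-address-per-line format of the Recipient Addresses field can be illustrated with a short Python sketch that splits the text, skips blank lines, and counts the entries. The function name is hypothetical, and the plausibility test is deliberately loose; it is an assumption for illustration, not the validation that the utility performs.

```python
def parse_recipients(field_text: str) -> list:
    """Return the email addresses entered one per line, skipping blank lines."""
    recipients = []
    for line in field_text.splitlines():
        address = line.strip()
        if not address:
            continue
        local, sep, domain = address.partition("@")
        # Loose plausibility check: something@domain.tld with no spaces.
        if not sep or not local or "." not in domain or " " in address:
            raise ValueError(f"not a plausible email address: {address!r}")
        recipients.append(address)
    return recipients


# Hypothetical field contents with one address per line.
entered = "ops@example.com\n\n dba@example.com \n"
addresses = parse_recipients(entered)
print(len(addresses), addresses)
```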

4.13.3 Adding to an Existing Cluster

You cannot use Oracle Big Data Appliance Configuration Generation Utility to add servers to an existing cluster. Use the Mammoth utility to do this.

See "Adding Servers to a Cluster."

4.14 Client Network Config Summary

This page provides an opportunity to view all of the client network IP addresses on the rack and make any final changes before generating the configuration files.

You can also click Reset all IPs using default values to void all client IP addresses and re-generate IP address assignments from the default client network IP address pool.

4.15 Complete Page

You have now set all the installation and configuration options. Click Back to return to a page and change its settings. The Back button does not clear the pages; your settings remain unless you change them.

The text boxes on this page provide a place for you to identify the Hadoop components and any third-party applications that you plan to use on the cluster, and to record any notes that might be useful at a later date. This information is saved in a file named master.xml, which you can use to reload these configuration settings into Oracle Big Data Appliance Configuration Generation Utility.

To generate the configuration files, click Create Files and click Yes in response to the prompt. An operating system window automatically opens in the directory where the files are saved.