Oracle® Big Data Appliance Owner's Guide
Release 1 (1.1)

Part Number E36161-06

8 Configuring Oracle Big Data Appliance

This chapter describes how to configure the system, accounts, and software for Oracle Big Data Appliance. Many of the procedures in this chapter use Oracle Integrated Lights Out Manager (Oracle ILOM) and the dcli utility.

8.1 Configuring the KVM Switch

The KVM configuration consists of these procedures:

8.1.1 Starting the KVM Switch

To start the KVM switch:

  1. Ensure that all connected components are powered off.

  2. Pull the KVM tray out from the front of the rack, and open it using the handle.

  3. Touch the touch pad.

  4. Toggle between the host and KVM interface by pressing the Ctrl key on the left side twice, similar to double-clicking with a mouse. You see the main user interface page.

  5. In the navigator on the left, select Target Devices under Unit View. In the main display area, verify that 18 target devices are listed with Action set to KVM Session.

    The sessions are numbered from the bottom of the rack to the top.

  6. If 18 sessions are not shown:

    1. In the navigator under Appliances, expand Ports, and then select IQ Adaptors.

    2. In the main display area, choose the Port table heading to sort the sessions by port number.

    3. Note any missing sessions, so that you can fix them later.

    4. In the navigator, choose Target Devices to return to the Target Devices page.

8.1.2 Connecting the KVM Switch to the Management Network

To connect the KVM switch to the management network:

  1. In the navigator under User Accounts, select Local.

  2. Under Users, choose Admin.

  3. Set the password for the Admin account to welcome1, and then choose Save. Do not modify any other parameters.

  4. Under Appliance Settings, expand Network, and then choose IPv4. The Network Information page appears.

  5. Enter values for Address, Subnet, and Gateway, and then choose Save.

  6. Under Appliance Settings, choose DNS to display the DNS Information page.

  7. Enter the IP addresses of the DNS servers, and then choose Save.

  8. Under Network, choose General to display the Appliance General Network Settings page.

  9. Connect the KVM LAN1 Ethernet port to the management network.

  10. To verify that the port has been configured correctly, ensure that the Media Access Control (MAC) address on the Network Settings page matches the label next to the LAN1/LAN2 ports at the rear of the KVM switch.

  11. Under Appliance, select Overview to display the Unit Maintenance page.

  12. Enter a name for the KVM switch, and then choose Save.

  13. To reboot the KVM switch, choose Reboot under Overview and Yes to confirm.

8.1.3 Checking the KVM Firmware Version

You may need to upgrade the KVM firmware to the recommended version.

To check the KVM firmware version: 

  1. In the navigator under Appliance Settings, select Versions. There are two version numbers, Application and Boot. Compare the displayed versions with these recommended versions:

    • Application: 1.10.2.17762

    • Boot: 1.9.16473

    If the application firmware version is earlier than 1.10.2, then you should upgrade it. To upgrade the firmware, continue with this procedure. Otherwise, you are done.

  2. Download the firmware from this website to a USB flash drive:

    http://www.avocent.com/Pages/GenericTwoColumn.aspx?id=12541

  3. Plug the flash drive into the KVM USB port and open a browser session.

  4. Log in to the KVM as Admin with password welcome1.

  5. Under Appliance, select Overview.

  6. From the Tools list, select Upgrade Firmware.

  7. Select the connection method, such as FTP or HTTP.

  8. Enter the file name of the downloaded firmware.

  9. Click Upgrade.

    The upgrade process takes 5 to 10 minutes, including an automatic reboot.

  10. Confirm the firmware version by selecting Versions under Appliance Settings.
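
The version check in Step 1 can be sketched in Python. This is a minimal illustration, assuming that the version strings are dotted integers as they appear on the Versions page. The function names are hypothetical and are not part of any KVM tool.

```python
def parse_version(text):
    """Split a dotted version string such as '1.10.2.17762' into integers."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed, minimum="1.10.2"):
    """Return True if the installed version is earlier than the minimum."""
    have = parse_version(installed)
    want = parse_version(minimum)
    # Compare only as many components as the minimum specifies, so that
    # a build suffix such as .17762 does not affect the result.
    return have[:len(want)] < want


if __name__ == "__main__":
    for version in ("1.9.5.100", "1.10.2.17762"):
        print(version, "upgrade" if needs_upgrade(version) else "ok")
```

Comparing integers rather than strings matters here, because a plain string comparison would rank 1.10.2 before 1.9.5.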

8.1.4 Configuring the KVM Switch to Access the Servers

To configure the KVM switch to access the servers:

  1. Under Unit View, select Target Devices to display the Target Devices page.

  2. Power on the server. The power button is on the front panel.

  3. Click the server name in the Name column to display the Unit Overview page.

  4. Click Overview and overwrite the name with the Oracle standard naming format of customer prefix, node type, and number. For example, bda1node03 identifies the third server from the bottom of the bda1 rack.

  5. Click Save.

  6. Repeat Steps 2 through 5 for each server in the rack. Each server boots through BIOS, and boots the operating system with the default factory IP configuration.
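
The naming format in Step 4 can be sketched as follows. This is an illustration only, assuming a rack of 18 servers numbered from the bottom and a two-digit node number as in bda1node03. The function name is hypothetical.

```python
def server_names(rack_prefix, count=18):
    """Return the server names for a rack, numbered from bottom to top."""
    return [f"{rack_prefix}node{number:02d}" for number in range(1, count + 1)]


if __name__ == "__main__":
    for name in server_names("bda1"):
        print(name)
```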

8.1.5 Accessing a Server by Using the KVM Switch

To access a server by using the KVM switch:

  1. Under Unit View, select Target Devices to display the Target Devices page.

  2. Click the system name in the Name column.

  3. Click KVM Session to open a session with the server.

8.2 Configuring the Cisco Ethernet Switch

The Cisco Catalyst 4948 Ethernet switch supplied with Oracle Big Data Appliance is minimally configured during installation. These procedures configure the Cisco Ethernet switch into one large virtual LAN.

The Cisco Ethernet switch configuration consists of these topics and procedures:

8.2.1 Scope of the Configuration

This configuration disables IP routing and sets the following:

  • Host name

  • IP address

  • Subnet mask

  • Default gateway

  • Domain name

  • Name server

  • NTP server

  • Time

  • Time zone

8.2.2 Prerequisites for Configuring the Ethernet Switch

To avoid disrupting the customer network, observe these prerequisites:

  • Do not connect the Cisco Ethernet switch until the network administrator has verified the running configuration and made any necessary changes.

  • Do not connect the Cisco Ethernet switch to the customer network until the IP addresses on all components have been configured in Oracle Big Data Appliance. This sequence prevents any duplicate IP address conflicts, which are possible due to the default addresses set in the components when shipped.

  • Configure the Cisco Ethernet switch with the network administrator.

8.2.3 Configuring the Ethernet Switch on the Customer Network

To configure the Ethernet switch on the customer network:

  1. Connect a serial cable from the Cisco switch console to a laptop or similar device. An RJ45 to DB9 serial cable is included with the Cisco documentation package.

  2. Ensure that the terminal session is recorded on the laptop by logging the output. The output can be used as a record that the switch has been configured correctly. The default serial port speed is 9600 baud, 8 bits, no parity, 1 stop bit, and no handshake.

    Switch con0 is now available
    Press RETURN to get started.
    
  3. Change to enable mode using the following command. The default password is welcome1.

    Switch> enable
    Password:
    
  4. Configure the network for a single VLAN. The following is an example of the configuration:

    Switch# configure terminal
    Enter configuration commands, one per line. End with CNTL/Z.
    Switch(config)# interface vlan 1
    Switch(config-if)# ip address 10.7.7.34 255.255.255.0
    Switch(config-if)# end
    Switch# *Jan 23 15:54:00.506: %SYS-5-CONFIG_I:Configured from console by console
    Switch# write memory
    Building configuration...
    Compressed configuration from 2474 bytes to 1066 bytes [OK ]
    
  5. If the network does not require IP routing on the switch, then disable the default IP routing setting and configure the default gateway. This method is preferred. Consult the network administrator if in doubt.

    Switch# configure terminal
    Enter configuration commands, one per line. End with CNTL/Z.
    Switch(config)# no ip routing
    Switch(config)# ip default-gateway 10.7.7.1
    Switch(config)# end 
    *Jan 23 15:54:00.506: %SYS-5-CONFIG_I:Configured from console by console
    Switch# write memory
    Building configuration...
    Compressed configuration from 3600 bytes to 1305 bytes[OK]
    
  6. If the network requires IP routing on the switch, then keep the default IP routing setting and configure the default gateway as follows:

    Switch# configure terminal
    Enter configuration commands, one per line. End with CNTL/Z.
    Switch(config)# ip route 0.0.0.0 0.0.0.0 10.7.7.1
    Switch(config)# end
    *Jan 23 15:55:02.506: %SYS-5-CONFIG_I:Configured from console by console
    Switch# write memory
    Building configuration...
    Compressed configuration from 2502 bytes to 1085 bytes [OK ]
    
  7. Set the host name of the switch using the standard Oracle Big Data Appliance naming convention of rack_namesw-ip. This example uses the name bda1sw-ip.

    Switch# configure terminal
    Enter configuration commands, one per line. End with CNTL/Z.
    Switch(config)# hostname bda1sw-ip
    bda1sw-ip(config)# end
    *Jan 23 15:57:50.886: %SYS-5-CONFIG_I: Configured from console by console
    bda1sw-ip# write memory
    Building configuration...
    Compressed configuration from 3604 bytes to 1308 bytes[OK]
    bda1sw-ip#
    

    The system host name appears in the prompt.
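
The commands in Steps 4 through 7 can be collected into one sequence before you type them at the console. The following Python sketch only builds the text of the commands; it does not connect to the switch. It assumes that the default gateway must lie in the same subnet as the VLAN 1 address. The function name and parameters are hypothetical.

```python
import ipaddress


def switch_commands(switch_ip, netmask, gateway, ip_routing=False, hostname=None):
    """Build the IOS command sequence for a single-VLAN configuration."""
    network = ipaddress.ip_network(f"{switch_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError(f"gateway {gateway} is outside {network}")

    commands = [
        "configure terminal",
        "interface vlan 1",
        f"ip address {switch_ip} {netmask}",
        "exit",  # leave interface mode, stay in global configuration mode
    ]
    if ip_routing:
        # Keep IP routing enabled and add a default route (Step 6).
        commands.append(f"ip route 0.0.0.0 0.0.0.0 {gateway}")
    else:
        # Disable IP routing and set the default gateway (Step 5).
        commands += ["no ip routing", f"ip default-gateway {gateway}"]
    if hostname:
        commands.append(f"hostname {hostname}")
    commands += ["end", "write memory"]
    return commands


if __name__ == "__main__":
    for line in switch_commands("10.7.7.34", "255.255.255.0", "10.7.7.1",
                                hostname="bda1sw-ip"):
        print(line)
```

Checking the gateway against the subnet before generating the commands catches a mistyped address before it reaches the switch.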

8.2.4 Setting Up Telnet Access

Telnet access is optional. The following procedure describes how to enable and disable remote telnet access.

Note:

Oracle Big Data Appliance ships with a version of the Cisco Ethernet switch software that supports telnet but not SSH. To obtain support for SSH, you must install an update. See My Oracle Support Information Center: Oracle Big Data Appliance (ID 1445762.2).

To set up telnet access to the Ethernet switch:

  1. Set the password for telnet access if necessary; it should already be set when you receive Oracle Big Data Appliance.

    bda1sw-ip# configure terminal
    Enter configuration commands, one per line. End with CNTL/Z.
    bda1sw-ip(config)# enable password welcome1
    bda1sw-ip(config)# enable secret welcome1 
    The enable secret you have chosen is the same as your enable password.
    This is not recommended. Re-enter the enable secret.
    bda1sw-ip(config)# end
    bda1sw-ip# write memory 
    *Jan 23 15:57:50.886: %SYS-5-CONFIG_I:Configured from console by console
    Building configuration...
    Compressed configuration from 2502 bytes to 1085 bytes [OK ]
    
  2. Set up telnet access. In this example, the first login output shows that the password is not set and telnet access is disabled. If the login command returns nothing, then the password is set and telnet access is available.

    bda1sw-ip# configure terminal
    Enter configuration commands, one per line. End with CNTL/Z.
    bda1sw-ip(config)# line vty 0 15
    bda1sw-ip(config-line)# login
    %Login disabled on line 1,until 'password' is set
    %Login disabled on line 2,until 'password' is set
    %Login disabled on line 3,until 'password' is set
    ...
    bda1sw-ip(config-line)# password welcome1
    bda1sw-ip(config-line)# login
    bda1sw-ip(config-line)# end
    bda1sw-ip# write memory
    *Jan 23 15:58:53.630: %SYS-5-CONFIG_I: Configured from console by console
    Building configuration...
    Compressed configuration from 3604 bytes to 1308 bytes[OK]
    
  3. To disable telnet access and prevent remote access, follow this example:

    bda1sw-ip# configure terminal
    Enter configuration commands, one per line. End with CNTL/Z.
    bda1sw-ip(config)# line vty 0 15
    bda1sw-ip(config-line)# no password
    bda1sw-ip(config-line)# login
    %Login disabled on line 1, until 'password' is set
    %Login disabled on line 2, until 'password' is set
    %Login disabled on line 3, until 'password' is set
    ...
    bda1sw-ip(config-line)# end
    bda1sw-ip# write memory
    *Jan 23 15:58:53.630: %SYS-5-CONFIG_I: Configured from console by console
    Building configuration...
    Compressed configuration from 3604 bytes to 1308 bytes[OK]
    

8.2.5 Identifying the DNS Servers

Configure up to three Domain Name System (DNS) servers, replacing the values shown here with valid ones for the site:

bda1sw-ip# configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
bda1sw-ip(config)# ip domain-name us.example.com
bda1sw-ip(config)# ip name-server 10.7.7.3
bda1sw-ip(config)# ip name-server 172.28.5.5
bda1sw-ip(config)# ip name-server 10.8.160.1
bda1sw-ip(config)# end 
*Jan 23 16:01:35.010: %SYS-5-CONFIG_I:Configured from console by console
bda1sw-ip# write memory
Building configuration...
Compressed configuration from 3662 bytes to 1348 bytes[OK]
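
As a sketch, the limit of three DNS servers can be checked before the commands are entered. The following Python example only builds the command text and validates the addresses; it does not connect to the switch. The function name is hypothetical.

```python
import ipaddress


def name_server_commands(domain_name, name_servers):
    """Build the IOS commands that set the domain name and DNS servers."""
    if not 1 <= len(name_servers) <= 3:
        raise ValueError("expected one to three DNS servers")
    commands = ["configure terminal", f"ip domain-name {domain_name}"]
    for server in name_servers:
        # ip_address raises ValueError if the address is malformed.
        address = ipaddress.ip_address(server)
        commands.append(f"ip name-server {address}")
    commands += ["end", "write memory"]
    return commands


if __name__ == "__main__":
    servers = ["10.7.7.3", "172.28.5.5", "10.8.160.1"]
    for line in name_server_commands("us.example.com", servers):
        print(line)
```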

8.2.6 Setting the Clock and Time Zone

The Cisco Ethernet switch keeps internal time in coordinated universal time (UTC) format.

To set the local time and time zone, ordering is important. The following is an example of setting the local time to the U.S. Eastern time zone:

bda1sw-ip# configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
bda1sw-ip(config)# clock timezone EST -5
bda1sw-ip(config)# clock summer-time EDT recurring
bda1sw-ip(config)# end
bda1sw-ip# clock set 15:00:00 January 23 2012
bda1sw-ip# write memory
Building configuration...
Compressed configuration from 3778 bytes to 1433 bytes[OK]
bda1sw-ip# show clock
15:00:18.819 EST Mon Jan 23 2012

Following are descriptions of the commands for setting the clock and time zone:

  • To use UTC, enter this command in global configuration mode:

    no clock timezone
    
  • To use a time zone:

    clock timezone zone hours-offset [minutes-offset]
    

    In this command, zone is the time zone to display when standard time is in effect, hours-offset is the hours offset from UTC, and minutes-offset is the minutes offset from UTC.

  • To set summer time hours:

    clock summer-time zone recurring [week day month hh:mm week day month \
    hh:mm [offset]]
    

    In this command, zone is the time zone to be displayed when summer time (daylight saving time) is in effect, week is the week of the month (1 to 5 or last), day is the day of the week, month is the month, hh:mm is the time in 24-hour format, and offset is the number of minutes to add during summer time. The default offset is 60 minutes.

  • To manually set the clock to any time:

    clock set hh:mm:ss month day year
    

    In this command, hh:mm:ss is the hour, minute, and second in 24-hour format, day is the day of the month, month is the month, and year is the year. The time specified is relative to the configured time zone.

See Also:

Cisco IOS Configuration Fundamentals Command Reference at

http://www.cisco.com/en/US/docs/ios/12_2/configfun/command/reference/frf012.html
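
The relationship between the configured offset and UTC can be illustrated with a short Python sketch. It assumes the fixed offset of -5 hours from the clock timezone EST -5 example and a date in January, when summer time is not in effect. The names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset matching "clock timezone EST -5".
EST = timezone(timedelta(hours=-5), "EST")


def to_utc(local_time):
    """Convert a time zone aware local time to UTC."""
    return local_time.astimezone(timezone.utc)


if __name__ == "__main__":
    local = datetime(2012, 1, 23, 15, 0, 0, tzinfo=EST)
    print("local:", local.isoformat())
    print("UTC:  ", to_utc(local).isoformat())
```

A local time of 15:00 EST corresponds to 20:00 UTC, which is why a switch that keeps UTC internally can display a different hour after the time zone is set.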

8.2.7 Configuring the NTP Servers

Configure up to two NTP servers. The following example shows the NTP server synchronized to local time when the Cisco switch is connected to the network and has access to NTP.

bda1sw-ip# configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
bda1sw-ip(config)# ntp server 10.196.23.254 prefer
bda1sw-ip(config)# ntp server 192.168.9.19
bda1sw-ip(config)# end
Jan 23 20:00:41.235: %SYS-5-CONFIG_I:Configured from console by console
bda1sw-ip# write memory
Building configuration...
Compressed configuration from 3870 bytes to 1487 bytes [OK ]
bda1sw-ip# show ntp status
output varies by network
bda1sw-ip# show clock
15:00:57.919 EST Mon Jan 23 2012

8.2.8 Disabling the Spanning Tree

Ask the network administrator whether the network requires the spanning tree to be enabled before connecting the Cisco Ethernet switch.

The spanning tree is enabled by default on switch-to-switch uplink port 48. If this is correct, then you can skip this procedure.

To disable the spanning tree: 

  1. If the spanning tree must be disabled, then enter these commands:

    bda1sw-ip# configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    bda1sw-ip(config)# no spanning-tree vlan 1
    bda1sw-ip(config)# end
    Jan 23 20:01:15.083: %SYS-5-CONFIG_I: Configured from console by console
    bda1sw-ip# write memory
    Building configuration...
    Compressed configuration from 2654 bytes to 1163 bytes[OK]
    
  2. To verify the disabling of the spanning tree:

    bda1sw-ip# show spanning-tree vlan 1
    Spanning tree instance(s) for vlan 1 does not exist.
    

8.2.9 Verifying the Ethernet Configuration

To verify the Cisco Ethernet switch configuration:

  1. Verify the configuration by entering the following command:

    bda1sw-ip# show running-config
    

    The following is an example of the output:

    Building configuration...
    Current configuration :2654 bytes
    !
    version 12.2
    no service pad
    service timestamps debug datetime msec
    service timestamps log datetime msec
    no service password-encryption
    service compress-config
    .
    .
    .
    

    If any setting is incorrect, then repeat the appropriate step. To erase a setting, enter no in front of the same command. For example, to erase the default gateway, enter these commands:

    bda1sw-ip# configure terminal
    bda1sw-ip(config)# no ip default-gateway 10.7.7.1
    bda1sw-ip(config)# end
    bda1sw-ip# write memory
    
  2. Save the current configuration by entering this command:

    bda1sw-ip# copy running-config startup-config
    
  3. Exit from the session with this command:

    bda1sw-ip# exit
    bda1sw-ip con0 is now available
    
  4. Disconnect the cable from the Cisco console.

  5. To check the configuration, attach a laptop to port 48 and ping the IP address of the internal management network.

Caution:

Do not connect the Cisco Ethernet switch to the management network until after the system is configured with the customer's IP addresses and the switch configuration is complete.

8.3 Configuring the InfiniBand Leaf and Spine Switches

Oracle Big Data Appliance has two Sun Network QDR InfiniBand Gateway leaf switches and one Sun Datacenter InfiniBand Switch 36 spine switch. To configure the switches, follow these procedures for each one:

8.3.1 Configuring an InfiniBand Switch

To configure an InfiniBand switch:

  1. Connect to the switch using a serial or an Ethernet connection.

  2. Log in as ilom-admin with password welcome1.

    The switch has a Linux-like operating system and an Oracle ILOM interface that is used for configuration.

  3. Change to the /SP/network directory.

    cd /SP/network
    
  4. Enter these commands to configure the switch:

    set pendingipaddress=ip_address 
    set pendingipnetmask=ip_netmask
    set pendingipgateway=ip_gateway
    set pendingipdiscovery=static
    set commitpending=true
    

    In these commands, ip_address, ip_netmask, and ip_gateway represent the appropriate settings on your network.

  5. Enter a show command to view the changes. If any values are wrong, reenter the set commands ending with set commitpending=true.

    -> show
    
    /SP/network
       Targets:
            interconnect
            ipv6
            test
    
       Properties:
            commitpending = (Cannot show property)
            dhcp_ser_ip = none
            ipaddress = 10.135.42.24
            ipdiscovery = static
            ipgateway = 10.135.40.1
            ipnetmask = 255.255.255.0
            macaddress = 00:21:28:E7:B3:34
            managementport = SYS/SP/NET0
            outofbandmacaddress = 00:21:28:E7:B3:33
            pendingipaddress = 10.135.42.23
            pendingipdiscovery = static
            pendingipgateway = 10.135.42.1
            pendingipnetmask = 255.255.248.0
            pendingmanagementport = /SYS/SP/NET0
            sidebandmacaddress = 00:21:28:E7:B3:35
            state = enabled
    
       Commands:
            cd
            set
            show
    
    ->
    
  6. Set and verify the switch host name, replacing hostname with the valid name of the switch, such as bda1sw-ib2. Do not include the domain name.

    -> set /SP hostname=hostname
    -> show /SP hostname
    
     /SP
        Properties:
            hostname = bda1sw-ib2
    
  7. Set the DNS server name and the domain name:

    -> set /SP/clients/dns auto_dns=enabled
    -> set /SP/clients/dns nameserver=ip_address
    -> set /SP/clients/dns searchpath=domain_name
    

    In these commands, ip_address is one to three comma-separated IP addresses of the name servers in the preferred search order, and domain_name is the full DNS domain name, such as us.example.com.

  8. Verify the settings:

    -> show /SP/clients/dns
     /SP/clients/dns
        Targets:
     
        Properties:
            auto_dns = enabled
            nameserver = 10.196.23.245, 172.32.202.15
            retries = 1
            searchpath = us.example.com
            timeout = 5
     
       Commands:
            cd
            set
            show
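
The set commands in Step 4 can be prepared in advance from the site values. The following Python sketch only builds the command text; it does not log in to the switch. It assumes that the gateway must lie in the subnet defined by the address and netmask. The function name is hypothetical.

```python
import ipaddress


def ilom_network_commands(ip_address, netmask, gateway):
    """Build the Oracle ILOM commands that assign a static network address."""
    interface = ipaddress.ip_interface(f"{ip_address}/{netmask}")
    if ipaddress.ip_address(gateway) not in interface.network:
        raise ValueError(f"gateway {gateway} is outside {interface.network}")
    return [
        "cd /SP/network",
        f"set pendingipaddress={ip_address}",
        f"set pendingipnetmask={netmask}",
        f"set pendingipgateway={gateway}",
        "set pendingipdiscovery=static",
        "set commitpending=true",  # must come last to apply the pending values
    ]


if __name__ == "__main__":
    for line in ilom_network_commands("10.135.42.23", "255.255.248.0", "10.135.42.1"):
        print(line)
```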
    

8.3.2 Setting the Time Zone and Clock on an InfiniBand Switch

To set the time zone on an InfiniBand switch:

  1. Check the current time setting:

    -> show /SP/clock
    

    If the setting is not accurate, continue with these steps.

  2. Set the time zone, replacing zone_identifier with the time zone in the Configuration Template, such as America/New_York:

    -> set /SP/clock timezone=zone_identifier
    
  3. Check the current time setting:

    -> show /SP/clock
    

    If the setting is not accurate, continue with these steps.

  4. Set the SP clock manually, replacing MMddHHmmCCyy with the month, day, hour, minute, century, and year.

    -> set /SP/clock datetime=MMddHHmmCCyy
    
  5. Check the current time setting:

    -> show /SP/clock
    
  6. Configure the Network Time Protocol (NTP), replacing ip_address with the server address. Server 1 is the primary NTP server and Server 2 is the secondary server.

    -> set /SP/clients/ntp/server/1 address=ip_address
    -> set /SP/clients/ntp/server/2 address=ip_address
    
  7. Enable the NTP servers:

    -> set /SP/clock usentpserver=enabled
    

    Note:

    Properly synchronized clocks are required for the Mammoth Utility software installation to succeed. If NTP is not used on the network, then configure the first Oracle Big Data Appliance server as an NTP server.

  8. Verify the settings:

    -> show /SP/clients/ntp/server/1
    -> show /SP/clients/ntp/server/2
    -> show /SP/clock
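
The datetime value in Step 4 is easy to mistype. As a sketch, the following Python function formats a date and time in the month, day, hour, minute, century, and year order that the step describes. The function name is hypothetical.

```python
from datetime import datetime


def ilom_datetime(moment):
    """Format a datetime as MMddHHmmCCyy, for example 012315002012."""
    # %m = month, %d = day, %H = hour (24-hour), %M = minute, %Y = four-digit year
    return moment.strftime("%m%d%H%M%Y")


if __name__ == "__main__":
    print(ilom_datetime(datetime(2012, 1, 23, 15, 0)))
```

For 3:00 p.m. on January 23, 2012, the value is 012315002012.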
    

8.3.3 Checking the Health of an InfiniBand Switch

To check the health of an InfiniBand leaf or spine switch:

  1. Open the Fabric Management shell:

    -> show /SYS/Fabric_Mgmt
    

    The prompt changes from -> to FabMan@hostname->.

  2. Check the firmware version, which should be 2.0.5-2 or later. See My Oracle Support Information Center: Oracle Big Data Appliance (ID 1445762.2) for the current version.

    FabMan@bda1sw-ib2-> version
    SUN DCS gw version: 2.0.5-2
    Build time: Nov 29 2011 16:05:05
    FPGA version: 0x34
    SP board info:
    Manufacturing Date: 2011.05.31
    Serial Number: "NCD6Q0126"
    Hardware Revision: 0x0006
    Firmware Revision: 0x0000
    BIOS version: SUN0R100
    BIOS date: 06/22/2010
    FabMan@bda1sw-ib2->
    
  3. Check the overall health of the switch and correct any issues:

    FabMan@bda1sw-ib2-> showunhealthy
    OK - No unhealthy sensors
    
  4. Check the environment. Ensure that all tests return OK and PASSED, and correct any issues before continuing. This example shows a problem with PSU1 caused by a loose power cord. See the line starting with WARNING PSU.

    FabMan@bda1sw-ib2-> env_test
    Environment test started:
    Starting Environment Daemon test:
    Environment daemon running
    Environment Daemon test returned OK
    Starting Voltage test
    Voltage ECB OK
    Measured 3.3V Main = 3.25 V
    Measured 3.3V Standby = 3.37 V
    Measured 12V = 11.97 V
    Measured 5V = 4.99 V
    Measured VBAT = 3.09 V
    Measured 1.0V = 1.01 V
    Measured I4 1.2V = 1.22 V
    Measured 2.5V = 2.52 V
    Measured V1P2 DIG = 1.19 V
    Measured V1P2 ANG = 1.18 V
    Measured 1.2V BridgeX = 1.22 V
    Measured 1.8V = 1.78 V
    Measured 1.2V Standby = 1.20 V
    Voltage test returned OK
    Starting PSU test:
    PSU 0 present OK
    WARNING PSU 1 present AC Loss
    PSU test returned 1 faults
    Starting Temperature test:
    Back temperature 30
    Front temperature 29
    SP temperature 36
    Switch temperature 52,
              .
              .
              .
    
  5. Verify a priority setting of 5 for the InfiniBand Gateway leaf switches or 8 for the InfiniBand Switch 36 spine switch:

    FabMan@bda1sw-ib2-> setsmpriority list
    Current SM settings:
    smpriority 5
    controlled_handover TRUE
    subnet_prefix 0xfe80000000000000
    

    If smpriority is correct, then you can skip the next step.

  6. To correct the priority setting:

    1. Stop the InfiniBand Subnet Manager:

      FabMan@bda1sw-ib2-> disablesm
      
    2. Set the priority to 5 for the InfiniBand Gateway leaf switches or 8 for the InfiniBand Switch 36 spine switch. This example is for a leaf switch:

      FabMan@bda1sw-ib2-> setsmpriority 5
      
    3. Restart the InfiniBand Subnet Manager:

      FabMan@bda1sw-ib2-> enablesm
      
  7. If you are connecting this Oracle Big Data Appliance rack to an Oracle Exadata Database Machine or an Oracle Exalogic Elastic Cloud rack:

    1. Verify that the Exadata InfiniBand switches and the Exalogic spine switch are running firmware version 1.3.3_2 or later.

    2. Ensure that the subnet manager runs only on the switches with the highest firmware version.

    3. On systems running earlier firmware versions, disable the subnet manager. Log in to the switch as root and run the disablesm command as described previously.

    For example, if Oracle Big Data Appliance has the highest firmware version, then make its spine switch the master and its gateway switches the failover. Then, on the other engineered system, disable the subnet manager on any InfiniBand switch that has a lower firmware version than the version on Oracle Big Data Appliance.

  8. Exit the Fabric Management shell:

    FabMan@bda1sw-ib2-> exit
    ->
    
  9. Exit the Oracle ILOM shell:

    -> exit
    
  10. Log in to the switch as root and restart it to ensure that all changes take effect:

    reboot
    
  11. Repeat these steps for the other InfiniBand switches.
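
If you log the console session, the env_test output from Step 4 can be scanned for problems. The following Python sketch assumes the line formats shown in the example output, where a problem appears either as a line starting with WARNING or as a test summary that does not end with "returned OK". The function name and the sample text are for illustration only.

```python
SAMPLE_OUTPUT = """\
Environment Daemon test returned OK
Starting Voltage test
Measured 12V = 11.97 V
Voltage test returned OK
Starting PSU test:
PSU 0 present OK
WARNING PSU 1 present AC Loss
PSU test returned 1 faults
"""


def find_problems(env_test_output):
    """Return the lines of env_test output that indicate a problem."""
    problems = []
    for line in env_test_output.splitlines():
        text = line.strip()
        if text.startswith("WARNING"):
            problems.append(text)
        elif " returned " in text and not text.endswith("returned OK"):
            problems.append(text)
    return problems


if __name__ == "__main__":
    for problem in find_problems(SAMPLE_OUTPUT):
        print(problem)
```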

8.4 Configuring the Power Distribution Units

The power distribution unit (PDU) configuration consists of these procedures:

8.4.1 Connecting the PDUs to the Network

The power distribution units (PDUs) are configured with a static IP address to connect to the network for monitoring. Ensure that you have the following network information before connecting the PDUs:

  • Static IP address

  • Subnet mask

  • Default gateway

To connect the PDUs to the network: 

  1. Use a web browser to access the PDU metering unit by entering the factory default IP address for the unit. The address of PDU A is 192.168.1.210, and the address of PDU B is 192.168.1.211.

    The Current Measurement page opens.

  2. Click Net Configuration in the upper left of the page.

  3. Log in as the admin user on the PDU metering unit. The default password is admin. Change this password after configuring the network.

  4. Confirm that the DHCP Enabled option is not selected.

  5. Enter the following network settings for the PDU metering unit:

    • IP address

    • Subnet mask address

    • Default gateway

  6. Click Submit to set the network settings and reset the PDU metering unit.

  7. Repeat Steps 1 through 6 for the second PDU.

8.4.2 Verifying the PDU Firmware Version

To verify the PDU firmware version:

  1. Select Module Info. If the output displays a firmware version of 1.04 or later, then you are done. Otherwise, continue this procedure to update the firmware version.

  2. Download the latest firmware version from My Oracle Support:

    1. Log in at http://support.oracle.com.

    2. Select the Patches & Updates tab.

    3. For Patch Search, click Product or Family (Advanced).

    4. For Product, select Sun Rack II PDU.

    5. For Release, select Sun Rack II PDU 1.0.4.

    6. Click Search to see the Patch Search Results page.

    7. Click the patch name, such as 12871297.

    8. Download the file.

  3. Unzip the file on your local system.

  4. Return to the PDU metering unit Network Configuration page.

  5. Scroll down to Firmware Update.

  6. Click Browse, select the MKAPP_V1.04.DL file, and click Submit.

  7. Click Browse, select the HTML_V1.04.DL file, and click Submit.

  8. Click Module Info to verify the version number.

  9. Click Net Configuration, and then click Logout.

8.4.3 Configuring the Threshold Settings for the PDUs

The PDU current can be monitored directly. Configure the threshold settings to monitor the PDUs. The configurable threshold values for each metering unit module and phase are Info Low, Pre Warning, and Alarm.

See Also:

Sun Rack II Power Distribution Units User's Guide for information about configuring and monitoring PDUs at

http://docs.oracle.com/cd/E19844-01/index.html

Table 8-1 lists the threshold values for the Oracle Big Data Appliance rack using a single-phase, low-voltage PDU.

Table 8-1 Threshold Values for Single-Phase, Low-Voltage PDU

PDU      Module/Phase       Info Low Threshold  Pre Warning Threshold  Alarm Threshold
A        Module 1, phase 1  0                   18                     23
A        Module 1, phase 2  0                   22                     24
A        Module 1, phase 3  0                   18                     23
B        Module 1, phase 1  0                   18                     23
B        Module 1, phase 2  0                   22                     24
B        Module 1, phase 3  0                   18                     23


Table 8-2 lists the threshold values for the Oracle Big Data Appliance rack using a three-phase, low-voltage PDU.

Table 8-2 Threshold Values for Three-Phase, Low-Voltage PDU

PDU      Module/Phase       Info Low Threshold  Pre Warning Threshold  Alarm Threshold
A and B  Module 1, phase 1  0                   32                     40
A and B  Module 1, phase 2  0                   34                     43
A and B  Module 1, phase 3  0                   33                     42


Table 8-3 lists the threshold values for the Oracle Big Data Appliance rack using a single-phase, high-voltage PDU.

Table 8-3 Threshold Values for Single-Phase, High-Voltage PDU

PDU      Module/Phase       Info Low Threshold  Pre Warning Threshold  Alarm Threshold
A        Module 1, phase 1  0                   16                     20
A        Module 1, phase 2  0                   20                     21
A        Module 1, phase 3  0                   16                     20
B        Module 1, phase 1  0                   16                     20
B        Module 1, phase 2  0                   20                     21
B        Module 1, phase 3  0                   16                     20


Table 8-4 lists the threshold values for the Oracle Big Data Appliance rack using a three-phase, high-voltage PDU.

Table 8-4 Threshold Values for Three-Phase, High-Voltage PDU

PDU      Module/Phase       Info Low Threshold  Pre Warning Threshold  Alarm Threshold
A and B  Module 1, phase 1  0                   18                     21
A and B  Module 1, phase 2  0                   18                     21
A and B  Module 1, phase 3  0                   17                     21
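
The threshold values in Tables 8-1 through 8-4 can be kept in a lookup table for use in a monitoring script. The following Python sketch copies the values from the tables, which are the same for PDU A and PDU B in each table. How the metering unit applies the thresholds is not described here, so the classification rule (a reading at or above a threshold triggers that level) is an assumption, as are the key names and the function name.

```python
# (Info Low, Pre Warning, Alarm) for module 1, keyed by phase number.
THRESHOLDS = {
    "single-phase low-voltage": {1: (0, 18, 23), 2: (0, 22, 24), 3: (0, 18, 23)},
    "three-phase low-voltage": {1: (0, 32, 40), 2: (0, 34, 43), 3: (0, 33, 42)},
    "single-phase high-voltage": {1: (0, 16, 20), 2: (0, 20, 21), 3: (0, 16, 20)},
    "three-phase high-voltage": {1: (0, 18, 21), 2: (0, 18, 21), 3: (0, 17, 21)},
}


def classify(pdu_type, phase, reading):
    """Classify a current reading against the thresholds (assumed semantics)."""
    info_low, pre_warning, alarm = THRESHOLDS[pdu_type][phase]
    if reading >= alarm:
        return "alarm"
    if reading >= pre_warning:
        return "pre-warning"
    if reading < info_low:
        return "info-low"
    return "normal"


if __name__ == "__main__":
    print(classify("single-phase low-voltage", 2, 23))
```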


8.5 Configuring the Oracle Big Data Appliance Servers

Before configuring the network, ensure that the Oracle Big Data Appliance servers are set up correctly.

Note:

  • When you use the KVM switch and press the Esc key, the BIOS may receive two Esc characters and prompt to exit. Select CANCEL.

  • If you must connect to the Oracle ILOM serial management port, then the baud rate setting on the servers changes from the default Oracle ILOM setting of 9600 to 115200 baud, 8 bits, no parity, and 1 stop bit.

To check the Oracle Big Data Appliance servers: 

  1. Power on all servers by pressing the power button on the front panel of each server. The servers take 5 to 10 minutes to run through the normal startup tests.

  2. To configure the KVM switch with the server names:

    1. Under Unit View, select Target Devices.

    2. Click the system name in the Name column.

    3. Click Overview and overwrite the name with the appropriate name from the Installation Template. For example, bda1node01 identifies the first server (node01) in a rack named bda1. The servers are numbered from bottom to top, as shown in "Oracle Big Data Appliance Rack Layout".

    4. Repeat these steps for all 18 servers. Each server restarts with the factory default IP configuration.

  3. Connect to a server using either the KVM switch or a laptop:

    • KVM switch: Under Unit View, select Target Devices. Left-click the system name, and then click the KVM session.

    • Laptop: Open an SSH connection using PuTTY or a similar utility. Enter the default IP address of the server.

  4. Log in as the root user to the first server. The password is welcome1.

  5. Verify that the /opt/oracle/bda/rack-hosts-infiniband file exists. If not, create the file with the default IP addresses listed one per line. All dcli commands are sent by default to the servers listed in this file. See "Factory Network Settings".

  6. Set up passwordless SSH for root by entering the setup-root-ssh command, as described in "Setting Up Passwordless SSH".

  7. Verify that SSH keys are distributed across the rack:

    dcli "hostname ; date"
    
  8. If the command runs without prompting for a password, then the keys are distributed and you can continue to the next step. If you are prompted for a password, then press Ctrl+C several times to cancel the command, and generate the root SSH keys across the rack, replacing password with a valid password:

    setup-root-ssh -p password
    

    Enter the dcli command in Step 7 again to verify the keys.

  9. Verify that the InfiniBand ports are up, two on each server (36 total):

    # dcli ibstatus | grep phys
    192.168.10.1: phys state: 5: LinkUp
    192.168.10.1: phys state: 5: LinkUp
    ..
    192.168.10.18: phys state: 5: LinkUp
    192.168.10.18: phys state: 5: LinkUp
    
  10. Verify that the InfiniBand ports are running at 40 Gbps (4X QDR):

    # dcli ibstatus | grep rate
    192.168.10.1: rate: 40 Gb/sec (4X QDR)
    192.168.10.1: rate: 40 Gb/sec (4X QDR)
    ..
    192.168.10.18: rate: 40 Gb/sec (4X QDR)
    192.168.10.18: rate: 40 Gb/sec (4X QDR)
    
  11. Verify that Oracle ILOM does not detect any faults:

    # dcli 'ipmitool sunoem cli "show faulty"'
    

    The output should appear as follows for each server:

    bda1node02-adm.example.com: Connected. Use ^D to exit.
    bda1node02-adm.example.com: -> show faulty
    bda1node02-adm.example.com: Target      | Property        | Value
    bda1node02-adm.example.com:-------------+---------------------+-----------
    bda1node02-adm.example.com:
    bda1node02-adm.example.com: -> Session closed
    bda1node02-adm.example.com: Disconnected
    
  12. Save the hardware profile output from each system in a file for review, replacing filename with a file name of your choice:

    # dcli bdacheckhw > filename
    
  13. Check the hardware profile output file using commands like the following. In these examples, the file name is all-bdahwcheck.out.

    • To verify that there are no failures in the hardware profile:

      grep -v SUCCESS ~/all-bdahwcheck.out
      
    • To verify 24 cores:

      grep cores  ~/all-bdahwcheck.out 
      
    • To verify 48 GB of memory:

      grep memory ~/all-bdahwcheck.out
      
    • To verify six fans:

      grep fans ~/all-bdahwcheck.out
      
    • To verify that the status is OK for both power supplies:

      grep supply ~/all-bdahwcheck.out
      
    • To verify that disks 0 to 11 are all the same model, online, and spun up, with no alerts:

      grep disk ~/all-bdahwcheck.out | grep "model\|status" | more
      
    • To verify that the host channel adapter model is Mellanox Technologies MT26428 ConnectX VPI PCIe 2.0:

      grep Host ~/all-bdahwcheck.out | grep model
      
  14. Save the RAID configuration in a file, replacing filename with a file name of your choice:

    dcli MegaCli64 -ldinfo -lall -a0 | grep "Virtual Drive\|State" > filename
    
  15. Verify that 12 virtual drives (0 to 11) are listed for each server. In this example, the RAID configuration is stored in a file named all-ldstate.out.

    less ~/all-ldstate.out
    
  16. Save the software profile output from each system into a file for review, replacing filename with a file name of your choice:

    dcli bdachecksw > filename
    
  17. Verify that the partition setup and software versions are correct. In this example, the software profile is stored in a file named all-bdaswcheck.out.

    less ~/all-bdaswcheck.out
    
  18. Verify that the system boots in this order (USB, RAID Slot 0, PXE):

    dcli 'biosconfig -get_boot_order' | grep DEV | more
    
    <BOOT_DEVICE_PRIORITY>
         <DEVICE_NAME>USB:02.82;01  Unigen PSA4000</DEVICE_NAME>
         <DEVICE_NAME>RAID:Slot0.F0:(Bus 13 Dev 00)PCI RAID Adapter</DEVICE_NAME>
         <DEVICE_NAME>PXE:IBA GE Slot 0100 v1331</DEVICE_NAME>
         <DEVICE_NAME>PXE:IBA GE Slot 0101 v1331</DEVICE_NAME>
         <DEVICE_NAME>PXE:IBA GE Slot 0700 v1331</DEVICE_NAME>
         <DEVICE_NAME>PXE:IBA GE Slot 0701 v1331</DEVICE_NAME>
    </BOOT_DEVICE_PRIORITY>
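
The port count in Step 9 can also be checked mechanically instead of by eye. The following is a minimal sketch and not part of the Oracle Big Data Appliance software: check_ib_links is a hypothetical helper, and it assumes that you saved the output of dcli ibstatus | grep phys in a file whose lines have the host: phys state: n: state form shown in Step 9.

```shell
# Hypothetical helper (not an Oracle utility).
# Reads saved `dcli ibstatus | grep phys` output and prints every server
# that does not report the expected number of ports in the LinkUp state.
# Usage: check_ib_links FILE [EXPECTED_PORTS]   (default: 2 ports per server)
check_ib_links() {
    awk -v want="${2:-2}" '
        /phys state:/ {
            host = $1
            sub(/:$/, "", host)            # drop the colon that follows the host
            seen[host] = 1
            if ($0 ~ /LinkUp/) up[host]++
        }
        END {
            bad = 0
            for (h in seen) {
                if (up[h] + 0 != want + 0) {
                    printf "%s: %d port(s) up, expected %d\n", h, up[h], want
                    bad = 1
                }
            }
            exit bad                       # nonzero when any server is short
        }' "$1"
}
```

For example, after dcli ibstatus | grep phys > ~/ib-phys.out, the command check_ib_links ~/ib-phys.out prints nothing and returns 0 when every listed server has two ports up. A server that is missing from the file entirely is not detected, so also confirm that all 18 servers appear.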
    

8.6 Configuring the Network

The Oracle Big Data Appliance Configuration Utility generates the BdaDeploy.json file, which is used to configure the administrative network and the private InfiniBand network. See "Generating the Configuration Files" if you do not have this file.

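The network configuration consists of these procedures:

  • Verifying the Factory Software Image

  • Copying the Configuration Files to Oracle Big Data Appliance

  • Starting the Network Configuration

  • Connecting to the Network

  • Completing the Network Configuration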

8.6.1 Verifying the Factory Software Image

To verify that the factory software image is installed correctly and the servers are operating correctly, check that the BDA_IMAGING_SUCCEEDED and BDA_REBOOT_SUCCEEDED files are in the /root directory of each server. If you see a BDA_IMAGING_FAILED or BDA_REBOOT_FAILED file in the output, then check the /root/bda_imaging_status file on that server for more information. Do not proceed with network configuration until all problems are resolved.

The dcli utility requires passwordless SSH for root, as described in "Setting Up Passwordless SSH".

# dcli ls -1 /root | grep BDA
IP address BDA_IMAGING_SUCCEEDED
IP address BDA_REBOOT_SUCCEEDED
     .
     .
     .

You can also confirm the image version:

# dcli imageinfo
Big Data Appliance Image Info
 
IMAGE_VERSION             : 1.0.2
IMAGE_CREATION_DATE       : Sun Mar 4 11:39:36 PST 2012
IMAGE_LABEL               : BDA_MAIN_LINUX.X64_120303
KERNEL_VERSION            : 2.6.32-200.21.1.el5uek
BDA_RPM_VERSION           : bda-1.0.2-1
OFA_RPM_VERSION           : ofa-2.6.32-200.21.1.el5uek-1.5.5-4.0.55.4
JDK_VERSION               : jdk-1.6.0_29-fcs
     .
     .
     .
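
The check for the BDA_IMAGING_SUCCEEDED and BDA_REBOOT_SUCCEEDED files can be scripted when reading 36 lines by eye is error prone. The following is a minimal sketch under stated assumptions: check_imaging_status is a hypothetical helper, not an Oracle utility, and it assumes that the output of dcli ls -1 /root | grep BDA was saved to a file in which each line starts with the server address followed by a colon.

```shell
# Hypothetical helper (not an Oracle utility).
# Reads saved `dcli ls -1 /root | grep BDA` output and prints each server
# that lacks either success file or that has any *_FAILED file.
# Usage: check_imaging_status FILE
check_imaging_status() {
    awk '
        { host = $1; sub(/:$/, "", host); seen[host] = 1 }
        $2 == "BDA_IMAGING_SUCCEEDED" { img[host] = 1 }
        $2 == "BDA_REBOOT_SUCCEEDED"  { boot[host] = 1 }
        $2 ~ /_FAILED$/               { failed[host] = 1 }
        END {
            bad = 0
            for (h in seen)
                if (!(h in img) || !(h in boot) || (h in failed)) {
                    print h ": check /root/bda_imaging_status"
                    bad = 1
                }
            exit bad
        }' "$1"
}
```

The function returns 0 only when every server in the file shows both success files and no failure file. It cannot report a server that produced no output at all, so compare the number of servers listed with the number in the rack.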

8.6.2 Copying the Configuration Files to Oracle Big Data Appliance

To copy the configuration files to Oracle Big Data Appliance:

  1. Copy the configuration files to a USB flash drive.

  2. Use the KVM switch to open a console session to the first server. The first server is the lowest server in the rack. See Figure C-1.

  3. Log in as the root user on the first server. The initial password is welcome1.

  4. Plug the USB drive into the USB port of the first server. The port is on the right front of the server. Information like the following is displayed on the console:

    # scsi 0:0:0:0: Direct-Access     CBM      USB 2.0          Q: 0 ANSI:2
    sd 0:0:0:0: Attached scsi generic sg14 type 0
    sd 0:0:0:0: [sdn] 7954432 512-byte logical blocks: (4.07 GB/3.79 GiB)
    sd 0:0:0:0: [sdn] Write Protect is off
    sd 0:0:0:0: [sdn] Assuming drive cache: write through
    sd 0:0:0:0: [sdn] Assuming drive cache: write through
    sd 0:0:0:0: [sdn] Assuming drive cache: write through
    sd 0:0:0:0: [sdn] Attached SCSI removable disk
    
  5. Enter the showusb command to locate the USB drive. The command returns with the mapped device or, if no USB drive is connected, with no output.

    # showusb
    /dev/sdn1
    
  6. Create a directory on the server:

    # mkdir /mnt/usb
    
  7. Mount the device using the device name given in Step 5. The following is an example of the command.

    # mount -t vfat /dev/sdn1 /mnt/usb
    
  8. Verify the location of the file on the USB flash drive:

    # ls /mnt/usb
    BdaDeploy.json
    bin
    boot
    .
    .
    .
    
  9. Copy BdaDeploy.json from the USB flash drive to the /opt/oracle/bda directory on the server:

    # cd /mnt/usb
    # cp BdaDeploy.json /opt/oracle/bda
    

    Note:

    If mammoth-rack_name.params is also on the drive, you can copy it to /opt/oracle/BDAMammoth for use in Chapter 11.

  10. Unmount the USB flash drive and remove the device:

    # umount /mnt/usb
    # rmdir /mnt/usb
    
  11. Remove the USB flash drive from the server.

8.6.3 Starting the Network Configuration

The networksetup-one script sets up the host names and Oracle ILOM names for all servers and configures the administrative network and the private InfiniBand network.

To start the network configuration: 

  1. Log in as the root user on the first server. The initial password is welcome1.

    # ssh root@192.168.10.1
    
  2. Begin the network configuration:

    # cd /opt/oracle/bda/network
    # ./networksetup-one
    

    Example 8-1 shows sample output from the script.

Example 8-1 Sample Output from networksetup-one

# ./networksetup-one
networksetup-one: check syntax and static semantics of /opt/oracle/bda/BdaDeploy.json
networksetup-one: passed
networksetup-one: ping servers on ship admin network
networksetup-one: passed
networksetup-one: test ssh to servers on ship admin network
hello from node02
hello from node03
     .
     .
     .
networksetup-one: passed
networksetup-one: copy /opt/oracle/bda/BdaDeploy.json to servers
BdaDeploy.json  0% 0  0.0KB/s   --:-- ETABdaDeploy.json 100% 4304  4.2KB/s 00:00
BdaDeploy.json  0% 0  0.0KB/s   --:-- ETABdaDeploy.json 100% 4304  4.2KB/s 00:00
     .
     .
     .
networksetup-one: passed
networksetup-one: executing network settings on all servers
networksetup-one: wait a few seconds for the network to restart on 192.168.1.2
     .
     .
     .
bda1node02.example.com BdaUserConfigNetwork: reset network
bda1node03.example.com BdaUserConfigNetwork: reset network
bda1node04.example.com BdaUserConfigNetwork: reset network
     .
     .
     .
networksetup-one: deploying this server
networksetup-one: network will restart momentarily, pardon our dust
bda1node01.example.com BdaUserConfigNetwork: reset network
networksetup-one: generate dcli bda host file lists
networksetup-one: ping server ips on admin network
networksetup-one: passed
networksetup-one: passed
networksetup-one: test ssh server ips on admin network
hello from bda1node02.example.com
hello from bda1node03.example.com
hello from bda1node04.example.com
     .
     .
     .
networksetup-one: passed

8.6.4 Connecting to the Network

Before completing the network configuration, you must connect the administrative and client networks to the data center.

To connect Oracle Big Data Appliance to the network: 

  • Connect the 1 GbE administrative network by connecting the Cisco Ethernet switch to the data center.

  • Connect the 10 GbE client network by connecting the two Sun Network QDR InfiniBand Gateway Switch leaf switches to the data center.

8.6.5 Completing the Network Configuration

The networksetup-two script completes some steps started by networksetup-one that require a network connection. It also configures the default VLAN and all required VNICs for the 10 GbE client network. It then verifies all network connections and displays a message if it discovers any unexpected ones, including those caused by cabling mistakes.

The 10 GbE ports of the Sun Network QDR InfiniBand Gateway Switches must be connected to the data center.

To complete the network configuration: 

  1. Ensure that both the administrative network and the client network are connected to Oracle Big Data Appliance.

    Note:

    This procedure fails if the networks are not connected. See "Connecting to the Network".

  2. Run the following script to complete the network setup:

    ./networksetup-two
    

Example 8-2 shows sample output from the script.

Example 8-2 Sample Output from networksetup-two

# ./networksetup-two
networksetup-two: check syntax and static semantics of /opt/oracle/bda/BdaDeploy.json
networksetup-two: passed
networksetup-two: ping server ips on admin network
networksetup-two: passed
networksetup-two: test ssh server ips on admin network
hello from bda1node02.example.com
hello from bda1node03.example.com
hello from bda1node04.example.com
.
.
.
networksetup-two: passed
networksetup-two: run connected network post script on each server
networksetup-two: post network setup for 10.133.42.253
networksetup-two: post network setup for 10.133.42.254
networksetup-two: post network setup for 10.133.43.1
.
.
.
networksetup-two: post network setup for this node
networksetup-two: ping admin servers by name on admin network
networksetup-two: passed
networksetup-two: verify infiniband topology
networksetup-two: passed
networksetup-two: start setup client network (10gigE over Infiniband)
networksetup-two: ping both gtw leaf switches
networksetup-two: passed
networksetup-two: verify existence of gateway ports
networksetup-two: passed
networksetup-two: ping server ips on admin network
networksetup-two: passed
networksetup-two: ping servers by name on admin network
networksetup-two: passed
networksetup-two: test ssh server ips on admin network
hello from bda1node02.example.com
hello from bda1node03.example.com
.
.
.
networksetup-two: passed
networksetup-two: check existence of default vlan for port 0A-ETH-1 on bda1sw-ib2
networksetup-two: no default vlan for port, create it
spawn ssh root@10.133.43.36 createvlan 0A-ETH-1 -vlan -1 -pkey default
networksetup-two: verify default vlan for port 0A-ETH-1 for bda1sw-ib2
.
.
.
networksetup-two: passed
networksetup-two: apply eoib on each server
networksetup-two: wait a few seconds for the network to restart on 10.133.42.253
networksetup-two: wait a few seconds for the network to restart on 10.133.42.254
.
.
.
check and delete vNIC for bda1node02 eth9 on switch bda1sw-ib2
check and delete vNIC for bda1node02 eth9 on switch bda1sw-ib3
create vNIC eth9 bda1node02 using switch bda1sw-ib3
vNIC created
check and delete vNIC for bda1node02 eth8 on switch bda1sw-ib2
.
.
.
networksetup-two: ping server ips on client network
networksetup-two: passed
networksetup-two: test ssh server ips on client network
hello from bda1node02.example.com
hello from bda1node03.example.com
.
.
.
networksetup-two: passed
networksetup-two: end setup client network

8.7 Reinstalling the Base Image

The operating system and various utilities are factory installed on Oracle Big Data Appliance, as described in "Oracle Big Data Appliance Management Software". You may need to reinstall this base image if, for example, you want to return Oracle Big Data Appliance to its original state, or you want to upgrade the base image to a more recent version before using the Mammoth Utility to install the Oracle Big Data Appliance software.

Following is the procedure for reimaging an entire rack.

Caution:

When you reinstall the base image on a server, all files on that server are erased.

To reinstall the base image on all servers in a rack: 

  1. If the Oracle Big Data Appliance software was installed previously on the rack, then save the /opt/oracle/BDAMammoth/mammoth-rack_name.params file to a safe place outside Oracle Big Data Appliance.

  2. Download the zip file with the correct version of the base image and copy it to node01 (bottom server). See My Oracle Support Information Center: Oracle Big Data Appliance (ID 1445762.2) for the download location. You can copy the file to any directory, such as /tmp.

    The name of the file is in the format BDABaseImage-version.zip (for example, BDABaseImage-1.1.0.zip).

    Note:

    You can also download BDABaseImage-version.zip to a safe location. It contains the version of the Mammoth Utility that you must run to install the end-user software after reimaging.

  3. Establish an SSH connection to node01 and log in as root.

  4. Locate the appropriate configuration file and verify that it reflects the intended network configuration. Edit the file and copy it to /opt/oracle/bda/ as needed.

    • To reimage to the custom network settings, copy BdaDeploy.json.

    • To reimage to the factory default network settings, copy BdaShip.json.

  5. Ensure that passwordless SSH is set up:

    dcli hostname
    

    This command should run without errors and return the host names of all 18 Oracle Big Data Appliance servers. If not, then follow the steps in "Setting Up Passwordless SSH". Do not continue until the dcli hostname command runs successfully on all servers.

  6. Verify that at least 4 GB are available in the root (/) partition of node01:

    # df -h /
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/md2              161G  8.2G  145G   6% /
    
  7. Extract all files from the zip file, for example:

    # unzip BDABaseImage-1.1.0.zip
    Archive:  BDABaseImage-1.1.0.zip
      inflating: README.txt
       creating: BDABaseImage-1.1.0/
      inflating: BDABaseImage-1.1.0/BDABaseImage-1.1.0.iso
      inflating: BDABaseImage-1.1.0/makebdaimage
     extracting: BDABaseImage-1.1.0/BDABaseImage-1.1.0.md5sum
      inflating: BDABaseImage-1.1.0/reimagerack
    
  8. Change to the BDABaseImage-version directory created in the previous step, for example:

    cd BDABaseImage-1.1.0
    
  9. Complete one of the following procedures:

    • To reimage Oracle Big Data Appliance to the custom network settings specified in /opt/oracle/bda/BdaDeploy.json:

      ./reimagerack
      
    • To reimage an appliance that still has the factory settings:

      1. Ensure that /opt/oracle/bda/BdaDeploy.json does not exist.

      2. Enter the ./reimagerack command.

    • To restore the factory network settings on a rack configured with custom network settings:

      1. Copy /opt/oracle/bda/BdaDeploy.json to a safe location outside Oracle Big Data Appliance.

      2. Disconnect the rack from the network.

      3. Reimage the rack:

        ./reimagerack deploy ship
        

    The reimagerack utility creates an ISO image, copies it to the internal USB drive of each server in the rack, reboots each server, and initializes the installation.

  10. Run the Mammoth Utility, as described in Chapter 11.
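
The free-space check on the root partition of node01 can be turned into a guard that stops a script before the zip file is extracted. The following is a minimal sketch: has_free_gb is a hypothetical helper name, and it relies only on the POSIX df -P -k output format, in which the fourth column of the second line is the available space in 1024-byte blocks.

```shell
# Hypothetical helper (not an Oracle utility).
# Succeeds when the file system that holds PATH has at least MIN_GB
# gigabytes available.
# Usage: has_free_gb PATH MIN_GB
has_free_gb() {
    # -P forces one line per file system; -k reports 1024-byte blocks.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}
```

For example, has_free_gb / 4 || echo "Less than 4 GB free in /" prints the message only when the root partition is too full to continue.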

8.8 Checking the Health of the Network

Following are commands that you can run at any time to check the health of the Oracle Big Data Appliance network. This section also contains commands that you may need if the health checks fail.

8.8.1 bdacheckcluster

Checks the health of a CDH cluster, including the software, hardware, and network.

This example shows the output from the utility:

# bdacheckcluster
SUCCESS: Mammoth configuration file is valid.
SUCCESS: All cluster host names are pingable
SUCCESS: All cluster hosts passed checks on last reboot
INFO: Starting cluster host hardware checks
SUCCESS: All cluster hosts pass hardware checks
INFO: Starting cluster host software checks
SUCCESS: All cluster hosts pass software checks
SUCCESS: All ILOM hosts are pingable
SUCCESS: All client interface IPs are pingable
SUCCESS: All admin eth0 interface IPs are pingable
SUCCESS: All private Infiniband interface IPs are pingable
Warning: Permanently added 'bda1node01-master' (RSA) to the list of known hosts.
SUCCESS: Puppet master is running on bda1node01-master
SUCCESS: Puppet running on all cluster hosts
SUCCESS: Cloudera SCM server is running on bda1node02
SUCCESS: Cloudera SCM agent running on all cluster hosts
SUCCESS: Name Node is running on bda1node01
SUCCESS: Secondary Name Node is running on bda1node02
SUCCESS: Job Tracker is running on bda1node01
SUCCESS: Data Nodes running on all cluster hosts
SUCCESS: Task Trackers running on all cluster slave hosts
SUCCESS: Hadoop filesystem is healthy.
SUCCESS: MySQL server is running on MySQL master node bda1node03
SUCCESS: MySQL server is running on MySQL backup node bda1node02
SUCCESS: Big Data Appliance cluster health checks succeeded
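
When the output of a health check is long, it can help to filter out the lines that report success. The following is a minimal sketch, assuming that the output of a utility such as bdacheckcluster was saved to a file and that routine lines begin with SUCCESS: or INFO: as in the example above. The check_report name is hypothetical.

```shell
# Hypothetical helper (not an Oracle utility).
# Prints every nonblank line of a saved health-check report that does not
# begin with "SUCCESS:" or "INFO:", and returns 1 if any such line exists.
# Usage: check_report FILE
check_report() {
    problems=$(grep -Ev '^(SUCCESS|INFO):|^[[:space:]]*$' "$1" || true)
    if [ -n "$problems" ]; then
        printf '%s\n' "$problems"
        return 1
    fi
}
```

Applied to the sample output above, the function prints only the line that begins with Warning:, which is a message from SSH and not a failed check. Treat the result as a list of lines to review, not as a verdict.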

8.8.2 bdacheckhw

Checks the hardware profile of the server. See "Configuring the Oracle Big Data Appliance Servers" for tips about using this utility.

This example shows the output from the utility:

# bdacheckhw
SUCCESS: Correct system model : SUN FIRE X4270 M2 SERVER
SUCCESS: Correct processor info : Intel(R) Xeon(R) CPU X5675 @ 3.07GHz
SUCCESS: Correct number of types of CPU : 1
SUCCESS: Correct number of CPU cores : 24
SUCCESS: Sufficient GB of memory (>=48): 48
SUCCESS: Correct GB of swap space : 24
SUCCESS: Correct BIOS vendor : American Megatrends Inc.
SUCCESS: Sufficient BIOS version (>=08080102): 08100102
SUCCESS: Recent enough BIOS release date (>=05/23/2011) : 10/11/2011
SUCCESS: Correct ILOM version : 3.0.16.10.a r68533
SUCCESS: Correct number of fans : 6
SUCCESS: Correct fan 0 status : ok
SUCCESS: Correct fan 1 status : ok
     .
     .
     .

8.8.3 bdacheckib

Checks the InfiniBand cabling between the servers and switches of a single rack, when entered with no options. The network must be configured with custom settings as described by /opt/oracle/bda/BdaDeploy.json.

Run this command after connecting as root to any server.

The bdacheckib command has these options:

-s

The same as running without options except that the network must still be configured with the factory default settings. You can use this option as soon as Oracle Big Data Appliance arrives at the site, even before the switches are configured.

-m json_file

Verifies that the InfiniBand switch-to-switch cabling among multiple racks is correct. To create json_file, see the -g option.

-g

Generates a sample JSON file named sample-multi-rack.json. Use this file as an example of the format required by the -m option.

This example checks the switch-to-server InfiniBand cables:

[root@node01 network]# bdacheckib
LINK bda1sw-ib3.15A  ...  bda1node02.HCA-1.2 UP
LINK bda1sw-ib3.15B  ...  bda1node01.HCA-1.2 UP
LINK bda1sw-ib3.14A  ...  bda1node04.HCA-1.2 UP
LINK bda1sw-ib3.14B  ...  bda1node03.HCA-1.2 UP
     .
     .
     .

The next example generates the JSON file and shows the output.

[root@bda1node01 bda]# bdacheckib -g
[root@bda1node01 bda]# cat sample-multi-rack.json
# This json multirack spec is generated. The array elements are sorted
# alphabetically.  A properly arranged json spec representing racks from left to right
# can be used as input to bdacheckib (bdacheckib -m multi-rack.json)
# Note commas separating rack elements are optional.
[
{"SPINE_NAME": "dm01sw-ib1", "LEAF1_NAME": "dm01sw-ib2", "LEAF2_NAME": "dm01sw-ib3"}
{"SPINE_NAME": "bda1sw-ib1", "LEAF1_NAME": "bda1sw-ib2", "LEAF2_NAME": "bda1sw-ib3"}
{"SPINE_NAME": "bda2sw-ib1", "LEAF1_NAME": "bda2sw-ib2", "LEAF2_NAME": "bda2sw-ib3"}
]

The final example checks all the racks on the InfiniBand network using the edited JSON file created in the previous example:

# bdacheckib -m sample-multi-rack.json
 
Verifying rack #1
 leaf: dm01sw-ib2
   LINK ...  to rack2 UP
   LINK ...  to rack2 UP
   LINK ...  to rack1 UP
   LINK ...  to rack2 UP
   LINK ...  to rack3 UP
   LINK ...  to rack3 UP
   LINK ...  to rack1 UP
   LINK ...  to rack1 UP
 leaf: dm01sw-ib3
   LINK ...  to rack2 UP
   LINK ...  to rack2 UP
   LINK ...  to rack1 UP
   LINK ...  to rack2 UP
   LINK ...  to rack1 UP
   LINK ...  to rack3 UP
   LINK ...  to rack3 UP
   LINK ...  to rack1 UP
 
Verifying rack #2
 leaf: bda1sw-ib2
   LINK ...  to rack1 UP
   LINK ...  to rack1 UP
     .
     .
     .
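
The comments in the generated file state that the rack entries must be arranged from left to right before the file is used with the -m option. The following is a minimal sketch of writing such a file by hand. The make_multirack_spec function is hypothetical, and it assumes that the switches in every rack follow the racksw-ib1, racksw-ib2, and racksw-ib3 naming pattern seen in the sample and that the array is closed with a bracket. Compare the result with the file that bdacheckib -g generates at your site before relying on it.

```shell
# Hypothetical helper (not an Oracle utility).
# Writes a multirack spec in the format of sample-multi-rack.json.
# Pass the rack names in physical order, from left to right.
# Assumes the spine switch is <rack>sw-ib1 and the leaf switches are
# <rack>sw-ib2 and <rack>sw-ib3, as in the generated sample.
# Usage: make_multirack_spec RACK [RACK ...] > multi-rack.json
make_multirack_spec() {
    echo '['
    for rack in "$@"; do
        printf '{"SPINE_NAME": "%ssw-ib1", "LEAF1_NAME": "%ssw-ib2", "LEAF2_NAME": "%ssw-ib3"}\n' \
            "$rack" "$rack" "$rack"
    done
    echo ']'
}
```

For example, make_multirack_spec dm01 bda1 bda2 > multi-rack.json writes the three entries of the sample in the order given on the command line.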

8.8.4 bdachecknet

Checks whether the network configuration is working properly. Run this command after connecting as root to any server.

This example shows the output from the utility:

[root@node01 network]# bdachecknet
bdachecknet: check syntax and static semantics of /opt/oracle/bda/BdaDeploy.json
bdachecknet: passed
bdachecknet: ping test private infiniband ips (bondib0 40gbs)
bdachecknet: passed
bdachecknet: ping test admin ips (eth0 1gbs)
bdachecknet: passed
bdachecknet: ping test client access ips (bondeth0 10gbs Eoib)
bdachecknet: passed
bdachecknet: test admin network resolve and reverse resolve
bdachecknet: passed
bdachecknet: test admin name array matches ip array
bdachecknet: passed
bdachecknet: test client network (eoib) resolve and reverse resolve
bdachecknet: passed
bdachecknet: test client name array matches ip array
bdachecknet: passed
bdachecknet: test ntp servers
bdachecknet: passed
bdachecknet: test arp -a
bdachecknet: passed

8.8.5 bdachecksw

Checks the software profile of the server. See "Configuring the Oracle Big Data Appliance Servers" for tips about using this utility.

This example shows the output from the utility:

# bdachecksw
SUCCESS: Correct OS disk sda partition info : 1 ext3 raid 2 ext3 raid 3 linux-swap 4 ext3 primary
SUCCESS: Correct OS disk sdb partition info : 1 ext3 raid 2 ext3 raid 3 linux-swap 4 ext3 primary
SUCCESS: Correct data disk sdc partition info : 1 ext3 primary
SUCCESS: Correct data disk sdd partition info : 1 ext3 primary
SUCCESS: Correct data disk sde partition info : 1 ext3 primary
SUCCESS: Correct data disk sdf partition info : 1 ext3 primary
SUCCESS: Correct data disk sdg partition info : 1 ext3 primary
SUCCESS: Correct data disk sdh partition info : 1 ext3 primary
SUCCESS: Correct data disk sdi partition info : 1 ext3 primary
SUCCESS: Correct data disk sdj partition info : 1 ext3 primary
SUCCESS: Correct data disk sdk partition info : 1 ext3 primary
SUCCESS: Correct data disk sdl partition info : 1 ext3 primary
SUCCESS: Correct software RAID info : /dev/md2 level=raid1 num-devices=2 /dev/md0 level=raid1 num-devices=2
SUCCESS: Correct mounted partitions : /dev/md0 /boot ext3 /dev/md2 / ext3 /dev/sda4 /u01 ext4 /dev/sdb4 /u02 ext4 /dev/sdc1 /u03 ext4 /dev/sdd1 /u04 ext4 /dev/sde1 /u05 ext4 /dev/sdf1 /u06 ext4 /dev/sdg1 /u07 ext4 /dev/sdh1 /u08 ext4 /dev/sdi1 /u09 ext4 /dev/sdj1 /u10 ext4 /dev/sdk1 /u11 ext4 /dev/sdl1 /u12 ext4
SUCCESS: Correct swap partitions : /dev/sdb3 partition /dev/sda3 partition
SUCCESS: Correct Linux kernel version : Linux 2.6.32-200.21.1.el5uek
SUCCESS: Correct Java Virtual Machine version : HotSpot(TM) 64-Bit Server 1.6.0_29
SUCCESS: Correct puppet version : 2.6.11
SUCCESS: Correct MySQL version : 5.5.17
SUCCESS: All required programs are accessible in $PATH
SUCCESS: All required RPMs are installed and valid
SUCCESS: Big Data Appliance software validation checks succeeded

8.8.6 bdadiag

Collects diagnostic information about an individual server and returns the name of the compressed file in /tmp where it stored the data. You must be connected to the server as root.

Following are the bdadiag options, which instruct bdadiag to collect additional diagnostics. You can enter the options together on the command line to collect the most information.

hadoop

Collects the CDH cluster logs for Hadoop and the Cloudera Manager logs.

hdfs

Collects the output of a complete Hadoop Distributed File System (HDFS) check.

osw

Collects Oracle OS Watcher logs, which include historical operating system performance and monitoring data.

This example shows the output from the utility:

# bdadiag
 
Big Data Appliance Diagnostics Collection Tool v1.0.3
 
Checking installed rpms
 
Generating diagnostics tarball and removing temp directory
 
==============================================================================
Done. The report files are bzip2 compressed in /tmp/bda1node09_bdadiag_2012_04_10_14_08.tar.bz2
==============================================================================

The logs are organized in subdirectories, including the following:

asr
ilom
install
messages
raid
sysconfig

8.8.7 bdaid

Returns information about an individual server. If you need to contact Oracle Support about an issue with Cloudera's Distribution including Apache Hadoop, you should run this command first. You must be connected to the server as root.

This example shows the output from the utility:

# bdaid
Server Hostname           : bda1node09
Rack Serial Number        : AK00023713
Server Serial Number      : 1137FMM06Y
Cluster Name              : Cluster 1
Appliance Name            : bda1

8.8.8 bdaimagevalidate

Validates the hardware and software by running bdacheckhw, and then bdachecksw.

8.8.9 bdaredoclientnet

Re-creates the virtual NICs (VNICs) for all servers in the rack and spreads them across the available 10 GbE ports. You must run this utility after changing the number of 10 GbE connections to a Sun Network QDR InfiniBand Gateway Switch. The bdaredoclientnet utility performs the following subset of tasks done by the networksetup-two script during the initial configuration of Oracle Big Data Appliance:

  • Verifies that the administrative network is working, the InfiniBand cabling is correct, and the InfiniBand switches are available

  • Determines how many 10 GbE connections are available and connects them to the InfiniBand Gateway switches

  • Deletes all VNICs and re-creates them

  • Connects to each server and updates the configuration files

  • Restarts the client network and verifies that it can connect to each server using the newly configured client network

To re-create the VNICs in a rack: 

  1. Verify that /opt/oracle/bda/BdaDeploy.json exists on all servers and correctly describes the custom network settings. This command identifies files that are missing or have different date stamps:

    dcli ls -l /opt/oracle/bda/BdaDeploy.json
    
  2. Connect to node01 (bottom of rack) using either the administrative network or the KVM switch. Do not connect over the client network, because the bdaredoclientnet utility shuts it down during this procedure.

  3. Remove passwordless SSH:

    /opt/oracle/bda/bin/remove-root-ssh
    

    See "Setting Up Passwordless SSH" for more information about this command.

  4. Change directories:

    cd /opt/oracle/bda/network
    
  5. Run the utility:

    bdaredoclientnet
    

    The output is similar to that shown in Example 8-2.

  6. Restore passwordless SSH (optional):

    /opt/oracle/bda/bin/setup-root-ssh
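
The comparison of BdaDeploy.json files in Step 1 can be summarized by script. The following is a minimal sketch under stated assumptions: summarize_deploy_files is a hypothetical helper, it expects the saved output of the dcli ls -l command from Step 1, and it assumes that each line consists of the server name and a colon followed by a standard ls -l listing (permissions, links, owner, group, size, month, day, time, path). The ls date format can vary with locale settings.

```shell
# Hypothetical helper (not an Oracle utility).
# Groups servers by the size and date stamp of BdaDeploy.json and reports
# servers where the file is missing. Returns 0 only when the file exists
# everywhere with a single size and date stamp.
# Usage: summarize_deploy_files FILE
summarize_deploy_files() {
    awk '
        /No such file/ {
            host = $1
            sub(/:$/, "", host)
            print "missing on " host
            bad = 1
            next
        }
        NF >= 10 {
            stamp = $6 " " $7 " " $8 " " $9    # size, month, day, time
            if (!(stamp in count)) n++
            count[stamp]++
        }
        END {
            for (s in count) print count[s] " server(s): " s
            exit (bad || n != 1)
        }' "$1"
}
```

A single output line such as 18 server(s): followed by one size and date stamp indicates that all copies match. Several lines, or any missing on line, identify the servers to examine.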
    

8.8.10 bdaserials

Returns the serial numbers and media access control (MAC) addresses for most components of the Oracle Big Data Appliance server that you are connected to.

This example shows the output from the utility:

# bdaserials
Rack serial number :
System serial number : 1137FMM0BY
System UUID : 080020FF-FFFF-FFFF-FFFF-7E97D6282100
Motherboard serial number : 0338MSL-1131BA2194
Chassis serial number : 1137FMM0BY
Memory serial numbers : 87948175 87949173 87948163 8794816B 87948130 87948176
Infiniband HCA serial number : 1388FMH-1122501437
Disk controller serial number : SV11713731
Hard disk serial numbers :
SEAGATE ST32000SSSUN2.0T061A1125L6M89X
SEAGATE ST32000SSSUN2.0T061A1125L6LFH0
SEAGATE ST32000SSSUN2.0T061A1125L6M94J
SEAGATE ST32000SSSUN2.0T061A1125L6LLEZ
SEAGATE ST32000SSSUN2.0T061A1125L6M5S2
SEAGATE ST32000SSSUN2.0T061A1125L6LSD4
SEAGATE ST32000SSSUN2.0T061A1127L6M58L
SEAGATE ST32000SSSUN2.0T061A1127L6R40S
SEAGATE ST32000SSSUN2.0T061A1125L6M3WX
SEAGATE ST32000SSSUN2.0T061A1125L6M65D
SEAGATE ST32000SSSUN2.0T061A1127L6NW3K
SEAGATE ST32000SSSUN2.0T061A1127L6N4G1
 
MAC addresses :
bondeth0 Ethernet : CE:1B:4B:85:2A:63
bondib0 InfiniBand : 80:00:00:4A:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
bond0 Ethernet : 00:00:00:00:00:00
eth0 Ethernet : 00:21:28:E7:97:7E
eth1 Ethernet : 00:21:28:E7:97:7F
eth2 Ethernet : 00:21:28:E7:97:80
eth3 Ethernet : 00:21:28:E7:97:81
eth8 Ethernet : CE:1B:4B:85:2A:63
eth9 Ethernet : CE:1B:4C:85:2A:63
ib0 InfiniBand : 80:00:00:4A:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
ib1 InfiniBand : 80:00:00:4B:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00

8.8.11 iblinkinfo

Lists all InfiniBand connections in the InfiniBand network. Run this command as root from any server.

This example shows two Oracle Big Data Appliances and one Oracle Exadata Database Machine on the InfiniBand network:

[root@bda1node01 network]# iblinkinfo
Switch 0x002128df348ac0a0 SUN IB QDR GW switch bda1sw-ib2 10.133.43.36:
  149  1[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  130  2[ ] "SUN IB QDR GW switch bda1sw-ib2 10.133...
  149  2[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  127  1[ ] "SUN IB QDR GW switch bda1sw-ib2 10.133...
  149  3[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  111  2[ ] "SUN IB QDR GW switch bda1sw-ib2 10.133...
  149  4[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  109  1[ ] "SUN IB QDR GW switch bda1sw-ib2 10.133...
  149  5[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  143  1[ ] "bda1node02 BDA 192.168.41.20 HCA-1" ( )
  149  6[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  137  1[ ] "bda1node01 BDA 192.168.41.19 HCA-1" ( )
  149  7[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  141  1[ ] "bda1node04 BDA 192.168.41.22 HCA-1" ( )
  149  8[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  123  1[ ] "bda1node03 BDA 192.168.41.21 HCA-1" ( )
  149  9[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  151  1[ ] "bda1node06 BDA 192.168.41.24 HCA-1" ( )
  149 10[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  112  1[ ] "bda1node05 BDA 192.168.41.23 HCA-1" ( )
  149 11[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  139  1[ ] "bda1node07 BDA 192.168.41.25 HCA-1" ( )
  149 12[ ] ==(                Down/Disabled)==>        [ ] "" ( )
  149 13[ ] ==(                Down/Disabled)==>        [ ] "" ( )
  149 14[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>   85  9[ ] "SUN DCS 36P QDR dm01sw-ib1 10.133.40.203" ( )
  149 15[ ] ==(                Down/Disabled)==>        [ ] "" ( )
          .
          .
          .
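When the iblinkinfo output is long, it can help to summarize it by script. The following Python code is a minimal sketch that counts active and down ports in captured iblinkinfo text. It assumes the line layout shown in the example above. The function name summarize_links and the SAMPLE text are illustrative and are not part of the appliance software.

```python
import re

# Hypothetical helper, not an Oracle utility.
# Assumes each link line looks like: "<lid> <port>[ ] ==( <state> )==> ..."
LINK_RE = re.compile(r'^\s*(\d+)\s+(\d+)\[.*?\]\s+==\((.*?)\)==>')

def summarize_links(text):
    """Return (active_ports, down_ports) as lists of local port numbers."""
    active, down = [], []
    for line in text.splitlines():
        m = LINK_RE.match(line)
        if not m:
            continue  # skip switch header lines and ellipses
        port = int(m.group(2))
        if 'LinkUp' in m.group(3):
            active.append(port)
        else:
            down.append(port)
    return active, down

# Sample lines copied from the example output above.
SAMPLE = '''\
Switch 0x002128df348ac0a0 SUN IB QDR GW switch bda1sw-ib2 10.133.43.36:
  149 11[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>  139  1[ ] "bda1node07 BDA 192.168.41.25 HCA-1" ( )
  149 12[ ] ==(                Down/Disabled)==>        [ ] "" ( )
  149 14[ ] ==( 4X 10.0 Gbps Active/  LinkUp)==>   85  9[ ] "SUN DCS 36P QDR dm01sw-ib1 10.133.40.203" ( )
'''

active, down = summarize_links(SAMPLE)
print("active ports:", active)
print("down ports:", down)
```

In practice, the text would come from running iblinkinfo as root and saving its output to a file. The parsing rule may need adjusting if the output format differs on your system.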

8.8.12 imagehistory

Displays a history of operating system upgrades.

This example shows that the appliance was imaged with version 1.0.3 and has not been upgraded since:

IMAGE_VERSION             : 1.0.3
IMAGE_CREATION_DATE       : Sun Apr 1 20:00:43 PDT 2012
IMAGING_START_DATE        : Wed Apr 4 16:57:59 UTC 2012
IMAGING_END_DATE          : Wed Apr 4 10:45:48 PDT 2012

8.8.13 imageinfo

Displays information about the operating system image that is currently running on Oracle Big Data Appliance.

This example identifies the 1.0.3 image:

# imageinfo
Big Data Appliance Image Info
 
IMAGE_VERSION             : 1.0.3
IMAGE_CREATION_DATE       : Sun Apr 1 20:00:43 PDT 2012
IMAGE_LABEL               : BDA_1.0.3_LINUX.X64_RELEASE
KERNEL_VERSION            : 2.6.32-200.21.1.el5uek
BDA_RPM_VERSION           : bda-1.0.3-1
OFA_RPM_VERSION           : ofa-2.6.32-200.21.1.el5uek-1.5.5-4.0.55.4
JDK_VERSION               : jdk-1.6.0_29-fcs
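The imageinfo output is a list of NAME : value pairs, so it is straightforward to read by script, for example to confirm that every server reports the same image version. The following Python code is a sketch under that assumption. The function name parse_image_info is hypothetical, and the sample text is copied from the example above.

```python
def parse_image_info(text):
    """Return a dict built from 'NAME : value' lines; other lines are ignored."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, because date values contain colons.
        key, sep, value = line.partition(':')
        if sep and key.strip().isupper():
            info[key.strip()] = value.strip()
    return info

# Sample lines copied from the example output above.
SAMPLE = '''\
Big Data Appliance Image Info

IMAGE_VERSION             : 1.0.3
IMAGE_CREATION_DATE       : Sun Apr 1 20:00:43 PDT 2012
KERNEL_VERSION            : 2.6.32-200.21.1.el5uek
'''

info = parse_image_info(SAMPLE)
print(info['IMAGE_VERSION'])
```

The same function should also handle imagehistory output, which uses the same NAME : value layout.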

8.8.14 listlinkup

Shows the Ethernet Bridge ports with active links. Run this command after connecting as root to a Sun Network QDR InfiniBand Gateway Switch.

This example shows three active ports (0A-ETH-1, 0A-ETH-3, and 0A-ETH-4) out of the eight available ports on switch bda1sw-ib3:

[root@bda1sw-ib3 ~]# listlinkup | grep Bridge
  Bridge-0 Port 0A-ETH-1 (Bridge-0-2) up (Enabled)
  Bridge-0 Port 0A-ETH-2 (Bridge-0-2) down (Enabled)
  Bridge-0 Port 0A-ETH-3 (Bridge-0-1) up (Enabled)
  Bridge-0 Port 0A-ETH-4 (Bridge-0-1) up (Enabled)
  Bridge-1 Port 1A-ETH-1 (Bridge-1-2) down (Enabled)
  Bridge-1 Port 1A-ETH-2 (Bridge-1-2) down (Enabled)
  Bridge-1 Port 1A-ETH-3 (Bridge-1-1) down (Enabled)
  Bridge-1 Port 1A-ETH-4 (Bridge-1-1) down (Enabled)
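To extract only the ports with active links from saved listlinkup output, a short script can help. The following Python code is a minimal sketch that assumes the Bridge line format shown in the example above. The function name ports_with_link is illustrative and is not a switch command.

```python
import re

# Assumed line format: "Bridge-0 Port 0A-ETH-1 (Bridge-0-2) up (Enabled)"
BRIDGE_RE = re.compile(
    r'^\s*(Bridge-\d+)\s+Port\s+(\S+)\s+\((\S+)\)\s+(up|down)\s+\((\w+)\)'
)

def ports_with_link(text):
    """Return the names of Ethernet bridge ports whose link state is 'up'."""
    up = []
    for line in text.splitlines():
        m = BRIDGE_RE.match(line)
        if m and m.group(4) == 'up':
            up.append(m.group(2))
    return up

# Sample lines copied from the example output above.
SAMPLE = '''\
  Bridge-0 Port 0A-ETH-1 (Bridge-0-2) up (Enabled)
  Bridge-0 Port 0A-ETH-2 (Bridge-0-2) down (Enabled)
  Bridge-0 Port 0A-ETH-3 (Bridge-0-1) up (Enabled)
  Bridge-0 Port 0A-ETH-4 (Bridge-0-1) up (Enabled)
  Bridge-1 Port 1A-ETH-1 (Bridge-1-2) down (Enabled)
'''

print(ports_with_link(SAMPLE))
```

For the sample text, the function returns the three active ports named in the description: 0A-ETH-1, 0A-ETH-3, and 0A-ETH-4.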

8.8.15 showvlan

Lists the VLANs configured on the switch. Run this command after connecting as root to a Sun Network QDR InfiniBand Gateway Switch.

This example shows the default VLAN, which has an ID of 0, on switch bda1sw-ib3:

[root@bda1sw-ib3 ~]# showvlan
   Connector/LAG  VLN   PKEY
   -------------  ---   ----
   0A-ETH-1        0    ffff
   0A-ETH-3        0    ffff
   0A-ETH-4        0    ffff
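The showvlan table can be read by script to check which VLAN ID and partition key each connector uses. The following Python code is a sketch that assumes the three-column layout shown in the example above. The function name parse_showvlan and the dictionary keys are hypothetical.

```python
def parse_showvlan(text):
    """Return one dict per data row of a showvlan table."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have three fields and a numeric VLAN ID;
        # this skips the column headings and the dashed separator.
        if len(fields) == 3 and fields[1].isdigit():
            rows.append({
                'connector': fields[0],
                'vlan': int(fields[1]),
                'pkey': fields[2],
            })
    return rows

# Sample text copied from the example output above.
SAMPLE = '''\
   Connector/LAG  VLN   PKEY
   -------------  ---   ----
   0A-ETH-1        0    ffff
   0A-ETH-3        0    ffff
   0A-ETH-4        0    ffff
'''

for row in parse_showvlan(SAMPLE):
    print(row['connector'], row['vlan'], row['pkey'])
```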

8.8.16 showvnics

Lists the virtual network interface cards (VNICs) created for the switch. Run this command after connecting as root to a Sun Network QDR InfiniBand Gateway Switch.

This example shows the VNICs created in a round-robin process for switch bda1sw-ib3:

[root@bda1sw-ib3 ~]# showvnics
ID  STATE FLG IOA_GUID          NODE                            IID  MAC               VLN PKEY   GW
--- ----- --- ----------------- ----------------------------    ---- ----------------- --- ----   --------
561 UP      N 0021280001CF4C23  bda1node13 BDA 192.168.41.31    0000 CE:4C:23:85:2B:0A NO  ffff   0A-ETH-1
564 UP      N 0021280001CF4C53  bda1node16 BDA 192.168.41.34    0000 CE:4C:53:85:2B:0D NO  ffff   0A-ETH-1
567 UP      N 0021280001CF4B58  bda1node01 BDA 192.168.41.19    0000 CE:4B:58:85:2A:FC NO  ffff   0A-ETH-1
555 UP      N 0021280001CF2A5C  bda1node07 BDA 192.168.41.25    0000 CE:2A:5C:85:2B:04 NO  ffff   0A-ETH-1
552 UP      N 0021280001CF4C74  bda1node04 BDA 192.168.41.22    0000 CE:4C:74:85:2B:01 NO  ffff   0A-ETH-1
558 UP      N 0021280001CF179B  bda1node10 BDA 192.168.41.28    0000 CE:17:9B:85:2B:07 NO  ffff   0A-ETH-1
     .
     .
     .
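To see how the VNICs are distributed across the gateway connectors, the showvnics output can be grouped by its last column. The following Python code is a minimal sketch. It assumes that data rows begin with a numeric ID, that the GW connector is the last field, and that the host name is the fifth field, as in the example above. The function name vnics_by_gateway is illustrative.

```python
from collections import defaultdict

def vnics_by_gateway(text):
    """Map each gateway connector to the host names of its VNICs."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip the column headings, dashed separator, and ellipses.
        if len(fields) >= 10 and fields[0].isdigit():
            host = fields[4]       # first word of the NODE column
            gateway = fields[-1]   # GW column
            groups[gateway].append(host)
    return dict(groups)

# Sample rows copied from the example output above.
SAMPLE = '''\
561 UP      N 0021280001CF4C23  bda1node13 BDA 192.168.41.31    0000 CE:4C:23:85:2B:0A NO  ffff   0A-ETH-1
564 UP      N 0021280001CF4C53  bda1node16 BDA 192.168.41.34    0000 CE:4C:53:85:2B:0D NO  ffff   0A-ETH-1
     .
'''

for gateway, hosts in vnics_by_gateway(SAMPLE).items():
    print(gateway, len(hosts), hosts)
```

Because the NODE column contains spaces, the sketch indexes the gateway from the end of the line instead of relying on a fixed column count.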