8 Connecting Multiple Oracle Big Data Appliance Racks

This chapter describes how to combine multiple Oracle Big Data Appliance racks into one large cluster.

8.1 Extending a Rack by Adding Another Rack

When creating a multirack Hadoop cluster or providing access to Oracle Big Data Appliance from an Oracle Exadata Database Machine, you must connect multiple racks to each other. Racks can be cabled together with no downtime.

During the cabling procedure, note the following:

  • There is some performance degradation while you are cabling the racks together. This degradation results from reduced network bandwidth and from data retransmission caused by packet loss when a cable is unplugged.

  • The environment is not a high-availability environment because one leaf switch must be off. All traffic goes through the remaining leaf switch.

  • Only the existing rack is operational, and any new rack is powered down.

  • The software running on the systems must tolerate InfiniBand restarts without problems.

  • Before any cabling, the new racks must be configured with the IP addresses that they will use in the expanded system. Duplicate IP addresses are not allowed.

  • The existing spine switch is set to priority 10 during the cabling procedure. This setting gives the spine switch a higher priority than any other switch in the network fabric. The spine switch is first to take the Subnet Manager Master role whenever a new Subnet Manager Master is set during the cabling procedure.
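
The requirement that no IP address is duplicated can be checked before any cabling begins. The following is a minimal sketch, not an Oracle-supplied tool: the function name find_duplicate_ips, the host names, and the addresses are all hypothetical and serve only to illustrate the check.

```python
from collections import Counter
from ipaddress import ip_address


def find_duplicate_ips(planned):
    """Return a sorted list of addresses assigned to more than one host.

    `planned` maps a host name to the address it will use in the expanded
    system. The names and addresses used here are hypothetical examples.
    """
    # Parse each entry with ip_address() so that a malformed address raises
    # ValueError instead of being compared as an arbitrary string.
    counts = Counter(ip_address(addr) for addr in planned.values())
    return sorted(str(ip) for ip, seen in counts.items() if seen > 1)


# Hypothetical address plan for an existing rack (r1) and a new rack (r2).
planned = {
    "r1-node01": "192.168.10.1",
    "r1-node02": "192.168.10.2",
    "r2-node01": "192.168.10.2",   # clashes with r1-node02
    "r2-node02": "192.168.10.20",
}
duplicates = find_duplicate_ips(planned)
```

An empty result means that the plan contains no clashes. Any address that is returned must be reassigned before the racks are cabled together.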

8.2 Cabling Two Racks Together

The following procedure describes how to cable two racks together. This procedure assumes that the racks are adjacent to each other. In the procedure, the existing rack is R1, and the new rack is R2.

To cable two racks together: 

  1. Set the priority of the current, active Subnet Manager Master to 10 on the spine switch, as follows:

    1. Log in to any InfiniBand switch on the active system.

    2. Use the getmaster command to verify that the Subnet Manager Master is running on the spine switch.

    3. Log in to the spine switch.

    4. Use the disablesm command to stop the Subnet Manager.

    5. Use the setsmpriority 10 command to set the priority to 10.

    6. Use the enablesm command to restart the Subnet Manager.

    7. Repeat substep 2 to ensure that the Subnet Manager Master is running on the spine switch.

  2. Ensure that the new rack is near the existing rack. The InfiniBand cables must be able to reach the servers in each rack.

  3. Completely shut down the new rack (R2).

  4. Cable the leaf switch in the new rack according to Table E-2.

  5. Power off leaf switch R1 IB2. This causes all servers to fail over their InfiniBand traffic to R1 IB3.

  6. Disconnect all interswitch links between R1 IB2 and R1 IB3.

  7. Cable leaf switch R1 IB2 according to Table E-1.

  8. Power on leaf switch R1 IB2.

  9. Wait for 3 minutes for R1 IB2 to become completely operational.

    To check the switch, log in to it and run the ibswitches command. The output should show three switches: R1 IB1, R1 IB2, and R1 IB3.

  10. Power off leaf switch R1 IB3. This causes all servers to fail over their InfiniBand traffic to R1 IB2.

  11. Cable leaf switch R1 IB3 according to Table E-1.

  12. Power on leaf switch R1 IB3.

  13. Wait for 3 minutes for R1 IB3 to become completely operational.

    To check the switch, log in to it and run the ibswitches command. The output should show three switches: R1 IB1, R1 IB2, and R1 IB3.

  14. Power on all the InfiniBand switches in R2.

  15. Wait for 3 minutes for the switches to become completely operational.

    To check the switches, log in to any one of them and run the ibswitches command. The output should show six switches: R1 IB1, R1 IB2, R1 IB3, R2 IB1, R2 IB2, and R2 IB3.

  16. Ensure that the Subnet Manager Master is running on R1 IB1 by running the getmaster command from any switch.

  17. Power on all servers in R2.

  18. Log in to spine switch R1 IB1, and lower its priority to 8 as follows:

    1. Use the disablesm command to stop the Subnet Manager.

    2. Use the setsmpriority 8 command to set the priority to 8.

    3. Use the enablesm command to restart the Subnet Manager.

  19. Ensure that the Subnet Manager Master is running on one of the spine switches by running the getmaster command from any switch.

After cabling the racks together, proceed to configure the racks.
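
The ibswitches checks in the procedure above amount to counting the switches that are listed and comparing the total with three per rack. The following is a minimal sketch of that comparison, applied to output that has been captured as text. It assumes that each switch appears on its own line beginning with the word Switch. The helper names and the sample text are hypothetical, and the sample is a simplified placeholder rather than real command output.

```python
def count_switches(ibswitches_output):
    """Count the switches listed in captured ibswitches output.

    Assumption: each switch is reported on one line starting with "Switch".
    """
    return sum(
        1
        for line in ibswitches_output.splitlines()
        if line.strip().startswith("Switch")
    )


def fabric_is_complete(ibswitches_output, racks, switches_per_rack=3):
    """Return True when every expected switch (IB1, IB2, IB3 per rack) is listed."""
    return count_switches(ibswitches_output) == racks * switches_per_rack


# Simplified placeholder text, not real output from a switch.
sample = """\
Switch : <guid> ports 36 "R1 IB1"
Switch : <guid> ports 36 "R1 IB2"
Switch : <guid> ports 36 "R1 IB3"
"""

one_rack_ok = fabric_is_complete(sample, racks=1)   # three switches expected
two_racks_ok = fabric_is_complete(sample, racks=2)  # six switches expected
```

With the sample text, the check passes for one rack and fails for two, which corresponds to the state before the switches in R2 are powered on.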

8.3 Cabling Several Racks Together

The following procedure describes how to cable several racks together. This procedure assumes that the racks are adjacent to each other. In the procedure, the existing racks are R1, R2,... Rn, the new rack is Rn+1, and the Subnet Manager Master is running on R1 IB1.

To cable several racks together: 

  1. Set the priority of the current, active Subnet Manager Master to 10 on the spine switch, as follows:

    1. Log in to any InfiniBand switch on the active system.

    2. Use the getmaster command to verify that the Subnet Manager Master is running on the spine switch.

    3. Log in to the spine switch.

    4. Use the disablesm command to stop the Subnet Manager.

    5. Use the setsmpriority 10 command to set the priority to 10.

    6. Use the enablesm command to restart the Subnet Manager.

    7. Repeat substep 2 to ensure that the Subnet Manager Master is running on the spine switch.

  2. Ensure that the new rack is near the existing rack. The InfiniBand cables must be able to reach the servers in each rack.

  3. Completely shut down the new rack (Rn+1).

  4. Cable the leaf switch in the new rack according to the appropriate table in Appendix E. For example, if rack Rn+1 is R4, then use Table E-9.

  5. Complete the following procedure for each of the original racks. In these steps, Rx represents a rack number from R1 to Rn.

    1. Power off leaf switch Rx IB2. This causes all servers to fail over their InfiniBand traffic to Rx IB3.

    2. Cable leaf switch Rx IB2 according to Appendix E.

    3. Power on leaf switch Rx IB2.

    4. Wait for 3 minutes for Rx IB2 to become completely operational.

      To check the switch, log in to it and run the ibswitches command. The output should show n*3 switches for IB1, IB2, and IB3 in racks R1, R2,... Rn.

    5. Power off leaf switch Rx IB3. This causes all servers to fail over their InfiniBand traffic to Rx IB2.

    6. Cable leaf switch Rx IB3 according to Appendix E.

    7. Power on leaf switch Rx IB3.

    8. Wait for 3 minutes for Rx IB3 to become completely operational.

      To check the switch, log in to the switch and enter the ibswitches command. The output should show n*3 switches for IB1, IB2, and IB3 in racks R1, R2,... Rn.

      All racks should now be rewired according to Appendix E.

  6. Power on all the InfiniBand switches in Rn+1.

  7. Wait for 3 minutes for the switches to become completely operational.

    To check the switches, log in to any one of them and run the ibswitches command. The output should show (n+1)*3 switches for IB1, IB2, and IB3 in racks R1, R2,... Rn+1.

  8. Ensure that the Subnet Manager Master is running on R1 IB1 by entering the getmaster command from any switch.

  9. Power on all servers in Rn+1.

  10. Log in to spine switch R1 IB1, and lower its priority to 8 as follows:

    1. Enter the disablesm command to stop the Subnet Manager.

    2. Enter the setsmpriority 8 command to set the priority to 8.

    3. Enter the enablesm command to restart the Subnet Manager.

  11. Ensure that the Subnet Manager Master is running on one of the spine switches by entering the getmaster command from any switch.

  12. Ensure that the Subnet Manager is running on every spine switch by entering the following command from any switch:

    ibdiagnet -r

    Each spine switch should show as running in the Summary Fabric SM-state-priority section of the output. If a spine switch is not running, then log in to the switch and enable the Subnet Manager by entering the enablesm command.

  13. If there are now four or more racks, then log in to the leaf switches in each rack and disable the Subnet Manager by entering the disablesm command.
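
The per-rack loop in step 5 and the rack-count rules in steps 7 and 13 can be written out as a checklist before work begins. The following is a minimal sketch under that reading of the procedure. The function names and the wording of each checklist entry are hypothetical, and the code only generates text; it does not contact any switch.

```python
def recabling_plan(existing_racks):
    """Return the ordered leaf-switch actions of step 5 for racks R1..Rn.

    Within each rack, IB2 is handled before IB3, so that one leaf switch
    always remains powered on to carry the InfiniBand traffic.
    """
    expected = existing_racks * 3  # IB1, IB2, and IB3 in each existing rack
    steps = []
    for x in range(1, existing_racks + 1):
        for leaf in ("IB2", "IB3"):
            switch = f"R{x} {leaf}"
            steps.append(f"power off {switch}")
            steps.append(f"cable {switch} per Appendix E")
            steps.append(f"power on {switch}")
            steps.append(f"wait 3 minutes, then expect {expected} switches")
    return steps


def expected_switches(racks):
    """Number of switches ibswitches should list when `racks` racks are up."""
    return racks * 3


def leaf_sm_should_be_disabled(total_racks):
    """Step 13: with four or more racks, disable the Subnet Manager on leaf switches."""
    return total_racks >= 4


# Example: two existing racks (n = 2) and one new rack (Rn+1 = R3).
plan = recabling_plan(2)
after_expansion = expected_switches(3)
```

For two existing racks, the plan contains 16 entries, starting with R1 IB2 and ending with the wait after R2 IB3 is powered on. After the new rack is added, nine switches are expected, and because the total is fewer than four racks, step 13 does not apply.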