Oracle® Exadata Database Machine Maintenance Guide
12c Release 1 (12.1)

E51951-21

4 Maintaining Other Components of Oracle Exadata Racks

Notes:

  • For ease of reading, the name "Oracle Exadata Rack" is used when information refers to both Oracle Exadata Database Machine and Oracle Exadata Storage Expansion Rack.

  • All procedures in this chapter are applicable to Oracle Exadata Database Machine and Oracle Exadata Storage Expansion Rack.

Replacing a Power Distribution Unit

A power distribution unit (PDU) can be replaced while Oracle Exadata Rack is online. The second PDU in the rack maintains the power to all components in the rack except for the LCD monitor in Oracle Exadata Database Machine X2-2. The LCD monitor is a non-critical item that is powered from the PDU-B side of the rack. PDU-A is on the left, and PDU-B is on the right when viewing the rack from the rear.

Reviewing the PDU Replacement Guidelines

Before replacing a PDU, the following guidelines should be reviewed to ensure the procedure is safe and does not disrupt availability:

  • Unlatching the InfiniBand cables while removing or inserting PDU-A may cause a loss of service due to nodes being removed from the cluster. This could cause the rack to be unavailable. Care should be taken when handling the InfiniBand cables, which are normally latched securely. Do not place excessive tension on the InfiniBand cables by pulling them.

  • Unhooking the wrong power feeds causes the rack to shut down. Trace the power cables running from the PDU that will be replaced to the power source, and only unplug those feeds.

  • Allow time to unpack and repack the PDU replacement parts. Note how the power cords are coiled in the packaging so the failed unit can be repacked the same way.

  • Removal of the side panel lessens the amount of time needed to replace the PDU. However, it is not necessary to remove the side panel to replace the PDU.

  • Use of a cordless drill or power screwdriver lessens the amount of time needed to replace the PDU. Allow more time for the replacement if using the hand wrench tool provided with the replacement rack. If using a screwdriver, then ensure that there are Torx T30 and T25 bits.

  • It may be necessary to remove the server cable arms to move the power cables. If that is the case, then twist the plug connection and flex the cable arm connector to avoid having to unclip the cable arm. If it is necessary to unclip the cable arm, then support the cables with one hand, remove the power cord, and then clip the cable arm. Do not leave the cable arm hanging.

  • When removing the T30 screws from the L-bracket, do not remove the T25 screws or nuts that attach the PDU to the bracket until the PDU is out of the rack.

Replacing a PDU

The following procedure describes how to replace a PDU:

  1. If the PDU monitor is not the reason for the PDU replacement, then use it to identify the PDU network settings as follows:

    1. Press the reset button until it starts to count from 5 to 0. While it is counting down, release the button, and then press it once.

      Note:

      You must press the reset button for 20 seconds in order for the countdown to begin.
    2. Record the network settings, firmware version, and so on, displayed on the LCD screen as the monitor restarts.

      Note:

      If the PDU monitor is not working, then retrieve the network settings by connecting to the PDU over the network, or from the network administrator.
  2. Turn off all the PDU breakers.

  3. Unplug the PDU power plugs from the AC outlets.

    Note:

    • If the power cords use overhead routing, then put the power plugs in a location where they will not fall or hit anyone.

    • If the rack is on a raised floor, then move the power cords out through the floor cutout. It may be necessary to maneuver the rack over the cutout in order to move the power cords out.

  4. Do the following procedure for a PDU-B replacement when there is no side panel access, and the rack does not have an InfiniBand cable harness:

    Note:

    Do not unstrap any cables attached to the cable arms.
    1. Unscrew the T25 screws holding the square cable arms to the rack.

    2. Move the InfiniBand cables to the middle, out of the way.

  5. Unplug all power cables going from the servers and switches to the PDU. Keep the power cables together in group bundles.

  6. Remove the T30 screws from the top and bottom of the L-bracket, and note where the screws go.

  7. Note where the PDU sits in the rack frame. It is usually 1 inch back from the rack frame to allow access to the breaker switches.

  8. Angle and maneuver the PDU out of the rack.

  9. Hold the PDU or lay it down, if there is enough room, while maneuvering the AC power cords through the rack. It may be necessary to cut the cable ties that hold the AC cord flush with the bottom side of the PDU.

  10. Pull the cords as near to the bottom or top of the rack as possible where there is more room between the servers to get the outlet plug through the routing hole.

  11. Remove the smaller Torx T25 screws, and loosen the nut on the top and bottom to remove the PDU from the L-bracket. The nut does not have to be removed.

  12. Attach the L-bracket to the new PDU.

  13. Lay the new PDU next to the rack.

  14. Route the AC cords through the rack, and to where the outlets are.

    Note:

    Do not cable tie the AC cord to the new PDU at this time.
  15. Place the new PDU in the rack by angling and maneuvering it until the L-brackets sit on the top and bottom rails.

  16. Line up the holes and slots so that the PDU sits about 1 inch back from the rack frame.

  17. Attach the power cords using the labels on the cords as a guide. For example, G5-0 indicates PDU group 5 outlet 0 on the PDU.

  18. Attach the InfiniBand cable holders if they were removed in step 4. Oracle recommends screwing the holders in by hand at first to avoid stripping the screws.

  19. Attach the AC power cords to the outlets.

  20. Turn on the breakers.

  21. Cable and program the PDU monitor for the network, as needed.

Resetting a Non-Responsive ILOM

The Integrated Lights Out Manager (ILOM) may become unresponsive. If this happens, then manual intervention is needed to reset the Service Processor (SP) on the ILOM. The following procedures describe how to reset the ILOM:

Resetting the ILOM Using SSH

The following procedure describes how to reset the ILOM by connecting to it using SSH:

  1. Connect to the ILOM using SSH from another machine.

  2. Enter the following command at the ILOM prompt:

    reset /SP
    

Resetting the ILOM Using the ILOM Remote Console

If it is not possible to connect to the ILOM using SSH, then log in to the ILOM remote console. The following procedure describes how to reset the ILOM using the remote console.

  1. Log in to the ILOM remote console.

  2. Select Reset SP from the Maintenance tab.

  3. Click Reset SP.

Resetting the ILOM Using IPMItool

If you could not connect to the ILOM using SSH or the remote console, then log in to the local host or another host on the ILOM network, and use IPMItool. The following procedure describes how to reset the ILOM using IPMItool:

  1. Log in to the local host or another host on the ILOM network.

  2. Run the following IPMItool command:

    • Using local host:

      $ ipmitool mc reset cold
      Sent cold reset command to MC
      
    • Using another host:

      $ ipmitool -H ILOM_host_name -U ILOM_user mc reset cold
      Sent cold reset command to MC
      

      In the preceding command, ILOM_host_name is the host name being used, and ILOM_user is the user name for the ILOM.

Resetting the ILOM Using the SP Reset Pin on Oracle Exadata Database Machine X2-2 Servers and Exadata Storage Servers

If you could not connect to the ILOM using SSH, the remote console, or IPMItool on the Oracle Exadata Database Machine X2-2 server or Exadata Storage Server, then press the SP reset pin. The following procedure describes how to reset the ILOM using the SP reset pin.

  1. Obtain a small, non-conductive stick.

  2. Go to the rear of the rack.

  3. Locate the SP reset pin opening. The SP reset pin opening is the first opening to the right of the NET MGT port.

  4. Insert the stick into the opening and press the pin.

Removing the SP from Sun Fire X4800 Oracle Database Servers and Sun Server X2-8 Oracle Database Servers

If you could not reset the ILOM on the Sun Fire X4800 Oracle Database Server or Sun Server X2-8 Oracle Database Server using SSH, the remote console or IPMItool, then remove the SP from the server, and put it back. Messages are displayed at the operating system level. These messages can be ignored. The fans will speed up because there is no fan control.

See Also:

Sun Fire X4800 Server Service Manual at

http://docs.oracle.com/cd/E19140-01/html/821-0282/index.html

Unplugging the ILOM Power Supply

If you could not reset the ILOM using the preceding options, then unplug the power supply, and then plug it back in. This action power cycles the server as well as the ILOM.

Configuring Service Processor and ILOM Network Settings

The following procedure describes how to configure the service processor (SP) and ILOM network settings:

  1. Log in to the SP as the root user using SSH.

  2. Use the version command to check the SP/ILOM firmware release. The following is an example of the output from the command:

    -> version
    SP firmware 3.2.4.10
    SP firmware build number: 93199
    SP firmware date: Sat Oct  4 18:42:56 EDT 2014
    SP filesystem version: 0.2.10
    

    Note:

    The ipmitool utility can be used to log in to the server SP. This is useful when the SP/ILOM is not accessible from the management network. The following command is used to connect to the SP:
    # ipmitool sunoem cli
    Connected. Use ^D to exit.
    -> version
    SP firmware 3.2.4.10
    SP firmware build number: 93199
    SP firmware date: Sat Oct  4 18:42:56 EDT 2014
    SP filesystem version: 0.2.10
    
  3. Configure the DNS server settings using the set command as follows:

    cd /SP/clients/dns/  
        /SP/clients/dns
    show
         /SP/clients/dns
            Targets:
            Properties:
                auto_dns = enabled
                nameserver = 0.0.0.0
                retries = 1
                searchpath =
                timeout = 5
            Commands:
                cd
                set
                show
    set nameserver=192.68.0.2
    set searchpath=yourdomain.com
    
  4. Configure the NTP server settings using the set command as follows:

    cd /SP/clients/ntp/server/1/
    /SP/clients/ntp/server/1
    show
     /SP/clients/ntp/server/1
        Targets:
        Properties:
            address = 0.0.0.0
        Commands:
            cd
            set
            show
    set address=192.68.0.1
    

    Note:

    Two NTP servers can be configured. Set the first NTP server using the set command, and then use the path /SP/clients/ntp/server/2 to configure the second server.
  5. Use the set command to configure the network settings as follows:

    cd /SP/network
       /SP/network
    show
       /SP/network
        Targets:
            interconnect
            ipv6
            test
        Properties:
            commitpending = (Cannot show property)
            dhcp_clientid = none
            dhcp_server_ip = none
            ipaddress = 0.0.0.0
            ipdiscovery = dhcp
            ipgateway = 0.0.0.0
            ipnetmask = 0.0.0.0
            managementport = MGMT
            pendingipaddress = 0.0.0.0
            pendingipdiscovery = dhcp
            pendingipgateway = 0.0.0.0
            pendingipnetmask = 0.0.0.0
            pendingmanagementport = MGMT
            pendingvlan_id = (none)
            state = enabled
            vlan_id = (none)
        Commands:
            cd
            set
            show
    
  6. Configure the corresponding pendingip* settings for the ipaddress, ipdiscovery, ipgateway, ipnetmask, and vlan_id, and then commit the pending settings using the following command:

    set commitpending=true
    
  7. Disconnect from the command line interface after the network configuration is complete.

    Note:

    Use ^D to exit the session when using the ipmitool.
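
The network steps above can be prepared ahead of time. The following is a minimal sketch, not part of the documented procedure, that only prints the ILOM CLI commands for a static address so they can be reviewed and then entered in the SP session. The function name ilom_network_commands is hypothetical, and the sketch assumes that static is the pendingipdiscovery value used when the address is not obtained from DHCP.

```shell
#!/bin/sh
# Sketch: print the ILOM CLI commands that stage and commit a static
# address under /SP/network. Nothing is sent to the SP.
# Assumption: "static" is the pendingipdiscovery value for a fixed address.
ilom_network_commands() {
    ip=$1
    netmask=$2
    gateway=$3
    if [ -z "$ip" ] || [ -z "$netmask" ] || [ -z "$gateway" ]; then
        echo "usage: ilom_network_commands IP NETMASK GATEWAY" >&2
        return 1
    fi
    # The pending* values take effect only after commitpending is set.
    printf '%s\n' \
        "cd /SP/network" \
        "set pendingipdiscovery=static" \
        "set pendingipaddress=$ip" \
        "set pendingipnetmask=$netmask" \
        "set pendingipgateway=$gateway" \
        "set commitpending=true" \
        "show"
}
```

For example, ilom_network_commands 10.0.0.5 255.255.255.0 10.0.0.1 prints seven lines, ending with a show command so that the committed values can be compared with the values that were entered. The addresses are placeholders.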

Changing from 1 GbE Connections to 10 GbE Connections

Sun Server X4-2 Oracle Database Servers, Sun Server X3-2 Oracle Database Servers, Sun Fire X4170 M2 Oracle Database Servers, Oracle Server X6-8 Database Servers, Oracle Server X5-8 Database Servers, Sun Server X4-8 Oracle Database Servers, Sun Server X3-8 Oracle Database Servers, and Sun Fire X4800 Oracle Database Servers have 10 Gigabit Ethernet (GbE) network cards. The 1 GbE connections can be changed to 10 GbE connections. When changing the connections, note the following:

  • To prevent a single point of failure for a bonded 10 GbE interface on Oracle Exadata Database Machine X2-8 Full Rack, use different ports on the Network Express Modules (NEMs) on the two cards, such as NEM0 NET1 and NEM1 NET0.

  • The 10 GbE interfaces are identified as eth4 and eth5 on Sun Fire X4170 M2 Oracle Database Servers, and as eth8 through eth15 on Sun Fire X4800 Oracle Database Servers. Oracle recommends using the following on Oracle Exadata Database Machine X2-8 Full Racks:

    • BONDETH0 using interfaces eth9 and eth15

    • 10 GbE NEM0(left)/NET1

    • 10 GbE NEM1(right)/NET3

  • Oracle Clusterware is shut down, and the database server is restarted during the procedure.

Task 1: Verify ping Functionality

Verify the functionality of the ping command before making any changes using the following commands. By verifying the ping command first, you know what the results should be after changing the interfaces. Similar commands can be used to check other servers that connect to Oracle Exadata Database Machine.

# grep "^nameserver" /etc/resolv.conf
nameserver ip_address_1
nameserver ip_address_2

# ping -c 2 ip_address_1
PING ip_address_1 (ip_address_1) 56(84) bytes of data.
64 bytes from ip_address_1: icmp_seq=1 ttl=57 time=1.12 ms
64 bytes from ip_address_1: icmp_seq=2 ttl=57 time=1.05 ms
 
--- ip_address_1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 1.054/1.087/1.120/0.033 ms

If the test is not successful, showing 100% packet loss, then you should expect similar results when this same verification is run in "Task 4: Verify the 10 GbE Interfaces". If the test is successful, showing 0% packet loss, then you must see similar results after changing the 10 GbE connections.
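
To make the before-and-after comparison less error prone, the baseline can be recorded in a file. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the packet-loss line is assumed to have the same format as the ping output shown above.

```shell
#!/bin/sh
# Print the addresses on "nameserver" lines of a resolv.conf-style file.
nameservers_from() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Read ping output on stdin and print the packet-loss percentage.
packet_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Print one "address loss%" line per name server as a baseline.
ping_baseline() {
    for ns in $(nameservers_from "$1"); do
        loss=$(ping -c 2 "$ns" 2>/dev/null | packet_loss)
        echo "$ns ${loss:-unknown}%"
    done
}
```

Running ping_baseline > /tmp/ping_before.txt before the change, and again into a second file after Task 4, allows the two files to be compared with diff. The file names are placeholders.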

Task 2: Back up the Current Interface Files

The following procedure describes how to back up the current interface files:

  1. Log in as the root user.

  2. Create hidden directories for the current and new 10 GbE files in the /etc/sysconfig/network-scripts directory similar to the following:

    # cd /etc/sysconfig/network-scripts
    # mkdir .Pre_10GigE_Settings
    # mkdir .Post_10GigE_Settings
    

    Note:

    Linux startup scripts search for files that begin with ifcfg, and assume files beginning with ifcfg are used for network setup. Placing the backup files in hidden directories prevents them from being used to set up the network interface.
  3. Identify the connected 10 GbE interfaces using the following command. Run the command for each 10 GbE interface.

    # ethtool interface
    

    In the preceding command, interface is the 10 GbE interface. The interface is eth4 and eth5 for Sun Fire X4170 M2 Oracle Database Servers, and eth8 through eth15 for Sun Fire X4800 Oracle Database Servers.

    The following is an example of the output from the command. The speed should be 10000Mb/s, Link detected should be yes, and Duplex should be full.

    # ethtool eth9
    Settings for eth9:
            Supported ports: [ FIBRE ]
            Supported link modes:  1000baseT/Full 
                                   10000baseT/Full 
            Supports auto-negotiation: No
            Advertised link modes:  1000baseT/Full 
                                    10000baseT/Full 
            Advertised auto-negotiation: No
            Speed: 10000Mb/s
            Duplex: Full
            Port: FIBRE
            PHYAD: 0
            Transceiver: external
            Auto-negotiation: on
            Supports Wake-on: umbg
            Wake-on: umbg
            Current message level: 0x00000007 (7)
            Link detected: yes
    
  4. Verify the current bonded interface using the following command. An example of the output from the command is also shown.

    # grep -i bondeth0 ifcfg-eth*
    
    ifcfg-eth1:MASTER=bondeth0
    ifcfg-eth2:MASTER=bondeth0
    
  5. Copy the 1 GbE interface files to the .Pre_10GigE_Settings directory using a command similar to the following:

    # cp -p ifcfg-eth1 ifcfg-eth2 ./.Pre_10GigE_Settings/.
    
  6. Copy the 10 GbE interface files to the .Pre_10GigE_Settings directory using a command similar to the following:

    # cp -p ifcfg-eth9 ifcfg-eth15 ./.Pre_10GigE_Settings/.
    
  7. Copy the files from the .Pre_10GigE_Settings directory to the .Post_10GigE_Settings directory using a command similar to the following:

    # cp -p ./.Pre_10GigE_Settings/* ./.Post_10GigE_Settings/.
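
The link check in step 3 can be scripted so that each interface is tested against the three expected values. The following is a minimal sketch; the function name check_10gbe_link is hypothetical, and it assumes the ethtool output uses the Speed, Duplex, and Link detected labels shown in the example for step 3.

```shell
#!/bin/sh
# Read `ethtool <interface>` output on stdin. Print the three values that
# matter, and succeed only if the link is up at 10000Mb/s, full duplex.
check_10gbe_link() {
    awk '
        /^[ \t]*Speed:/         { speed  = $2 }
        /^[ \t]*Duplex:/        { duplex = $2 }
        /^[ \t]*Link detected:/ { link   = $3 }
        END {
            ok = (speed == "10000Mb/s" && duplex == "Full" && link == "yes")
            printf "speed=%s duplex=%s link=%s\n", speed, duplex, link
            exit (ok ? 0 : 1)
        }'
}
```

A possible use is a loop such as: for i in eth4 eth5; do ethtool "$i" | check_10gbe_link || echo "$i: not ready"; done. Substitute the interface names that apply to the server model.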
    

Task 3: Edit the 10 GbE Interface Settings

The following procedure describes how to edit the ifcfg configuration files:

  1. Edit the ifcfg configuration files as shown in the following before-and-after listings. The files must be edited in the ./.Post_10GigE_Settings/ directory.

    File name: ifcfg-eth1

    Before modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth1
      USERCTL=no
      ONBOOT=yes
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      HWADDR=00:21:28:44:d2:5e
      MASTER=bondeth0
      SLAVE=yes

    After modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth1
      USERCTL=no
      ONBOOT=no
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      HWADDR=00:21:28:44:d2:5e

    File name: ifcfg-eth2

    Before modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth2
      USERCTL=no
      ONBOOT=yes
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      HWADDR=00:21:28:44:d2:f2
      MASTER=bondeth0
      SLAVE=yes

    After modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth2
      USERCTL=no
      ONBOOT=no
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      HWADDR=00:21:28:44:d2:f2

    File name: ifcfg-eth4 on Oracle Exadata Database Machine X2-2, or ifcfg-eth9 on Oracle Exadata Database Machine X2-8 Full Rack

    Before modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth_interface
      ONBOOT=no
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      HWADDR=00:1b:21:66:4b:c0

    After modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth_interface
      USERCTL=no
      ONBOOT=yes
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      HWADDR=00:1b:21:66:4b:c0
      MASTER=bondeth0
      SLAVE=yes

    In the preceding syntax, eth_interface is eth4 for Oracle Exadata Database Machine X2-2, or eth9 for Oracle Exadata Database Machine X2-8 Full Rack.

    File name: ifcfg-eth5 on Oracle Exadata Database Machine X2-2, or ifcfg-eth15 on Oracle Exadata Database Machine X2-8 Full Rack

    Before modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth_interface2
      ONBOOT=no
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      HWADDR=00:1b:21:66:4b:c1

    After modification:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=eth_interface2
      USERCTL=no
      ONBOOT=yes
      BOOTPROTO=none
      HOTPLUG=no
      IPV6INIT=no
      MASTER=bondeth0
      SLAVE=yes
      HWADDR=00:1b:21:66:4b:c1

    In the preceding syntax, eth_interface2 is eth5 for Oracle Exadata Database Machine X2-2, or eth15 for Oracle Exadata Database Machine X2-8 Full Rack.

  2. Copy the edited files to the /etc/sysconfig/network-scripts directory using the following command:

    # cp -fp /etc/sysconfig/network-scripts/.Post_10GigE_Settings/ifcfg-eth* \
      /etc/sysconfig/network-scripts/.
    
  3. Restart the database server using the console.

  4. Monitor the boot sequence to ensure no errors occurred during bondeth0 initialization.
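
The edits in step 1 follow two patterns: the 1 GbE files are detached from the bond, and the 10 GbE files are attached to it. The following is a minimal sketch of both patterns. The function names are hypothetical, and the sketch assumes that the order of the assignments in an ifcfg file is not significant, because the attach function appends its three settings at the end of the file rather than placing them where the listings show them.

```shell
#!/bin/sh
# Detach a 1 GbE interface file from the bond: set ONBOOT=no and drop the
# MASTER= and SLAVE= lines. The result is written to stdout.
detach_from_bond() {
    awk '
        /^(MASTER|SLAVE)=/ { next }
        /^ONBOOT=/         { print "ONBOOT=no"; next }
        { print }' "$1"
}

# Attach a 10 GbE interface file to the bond: set ONBOOT=yes and make sure
# USERCTL=no, MASTER=<bond>, and SLAVE=yes each appear exactly once.
attach_to_bond() {
    bond=${2:-bondeth0}
    awk -v bond="$bond" '
        /^(USERCTL|MASTER|SLAVE)=/ { next }
        /^ONBOOT=/                 { print "ONBOOT=yes"; next }
        { print }
        END { print "USERCTL=no"; print "MASTER=" bond; print "SLAVE=yes" }' "$1"
}
```

Both functions leave the input file untouched. In the .Post_10GigE_Settings directory, a command such as attach_to_bond ifcfg-eth9 > ifcfg-eth9.new can be reviewed with diff before the new file replaces the original.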

Task 4: Verify the 10 GbE Interfaces

The following procedure describes how to verify the 10 GbE interfaces:

  1. Log in as the root user.

  2. Use the cat command to review the /proc/net/bonding/bondeth0 file. The following is an example of the command and output from the command:

    # cat /proc/net/bonding/bondeth0
    
    Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)
     
    Bonding Mode: fault-tolerance (active-backup)
    Primary Slave: None
    Currently Active Slave: eth9
    MII Status: up
    MII Polling Interval (ms): 100
    Up Delay (ms): 5000
    Down Delay (ms): 5000
     
    Slave Interface: eth9
    MII Status: up
    Link Failure Count: 0
    Permanent HW addr: 00:1b:21:66:4b:c0
     
    Slave Interface: eth15
    MII Status: up
    Link Failure Count: 0
    Permanent HW addr: 00:1b:21:66:4b:c1
    

    In the output, verify the slave interfaces are correct, and the MII statuses for the slave interface are up.

  3. Use the netstat -nr command to check the routing table. The routing table should not have changed. The following is an example of the command and output:

    # netstat -nr
    
    Kernel IP routing table
    Destination   Gateway      Genmask         Flags   MSS Window  irtt Iface
    scan_subnet 0.0.0.0        255.255.255.0   U         0 0          0 bondeth0
    192.168.80.0  0.0.0.0      255.255.254.0   U         0 0          0 bondib0
    192.168.80.0  0.0.0.0      255.255.254.0   U         0 0          0 bondib1
    192.168.80.0  0.0.0.0      255.255.254.0   U         0 0          0 bondib2
    192.168.80.0  0.0.0.0      255.255.254.0   U         0 0          0 bondib3
    mgmt_subnet 0.0.0.0        255.255.254.0   U         0 0          0 eth0
    0.0.0.0       scan_gw      0.0.0.0         UG        0 0          0 bondeth0
    
  4. Use the following commands to check the default gateway. The gateway is the SCAN network gateway, and should use bondeth0 on the 10 GbE interfaces.

    # grep GATEWAY /etc/sysconfig/network
    GATEWAY=gw_address
    GATEWAYDEV=bondeth0 
    
    # ping -c 2 gw_address
    PING gw_address (gw_address) 56(84) bytes of data.
    64 bytes from gw_address: icmp_seq=1 ttl=57 time=1.12 ms
    64 bytes from gw_address: icmp_seq=2 ttl=57 time=1.05 ms
    
    --- gw_address ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1002ms
    rtt min/avg/max/mdev = 1.054/1.087/1.120/0.033 ms
    

    In the preceding commands and output, gw_address is the IP address of the default gateway.

  5. If the name servers were responding to the ping command in "Task 1: Verify ping Functionality", then use the following commands to check the name servers. Similar commands can be used to check other servers that connect to Oracle Exadata Database Machine.

    # grep "^nameserver" /etc/resolv.conf
    nameserver ip_address_1
    nameserver ip_address_2
    
    # ping -c 2 ip_address_1
    PING ip_address_1 (ip_address_1) 56(84) bytes of data.
    64 bytes from ip_address_1: icmp_seq=1 ttl=57 time=1.12 ms
    64 bytes from ip_address_1: icmp_seq=2 ttl=57 time=1.05 ms
     
    --- ip_address_1 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1002ms
    rtt min/avg/max/mdev = 1.054/1.087/1.120/0.033 ms
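
The bonding check in step 2 can also be scripted. The following is a minimal sketch; the function name check_bond_slaves is hypothetical, and it assumes the Slave Interface and MII Status labels appear as in the example output for step 2.

```shell
#!/bin/sh
# Read /proc/net/bonding/<bond> text on stdin. Print "interface status"
# for each slave, and fail if any interface named as an argument is
# missing from the bond or does not have an MII status of up.
check_bond_slaves() {
    awk -v want="$*" '
        /^[ \t]*Slave Interface:/ { slave = $3; next }
        /^[ \t]*MII Status:/ && slave != "" {
            status[slave] = $3
            print slave, $3
            slave = ""
        }
        END {
            n = split(want, names, " ")
            for (i = 1; i <= n; i++)
                if (status[names[i]] != "up") bad = 1
            exit (bad ? 1 : 0)
        }'
}
```

For example, check_bond_slaves eth9 eth15 < /proc/net/bonding/bondeth0 lists both slaves and returns a nonzero status if either one is absent or down. The bond-level MII Status line that precedes the first slave is ignored.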
    

Maintaining the InfiniBand Network

The InfiniBand network connects the database servers and Exadata Storage Servers through the BONDIB0 interface to the InfiniBand switches in the rack. This section describes how to perform maintenance on the InfiniBand switches.

Backing Up and Restoring Switch Settings

The procedure for backing up and restoring switch settings depends on the firmware on the switch. The 1.1.3-2 firmware has Integrated Lights Out Manager (ILOM), which provides backup and restore capability. The 1.0.1 firmware does not have ILOM. You can either upgrade to the 1.1.3-2 firmware and then use the procedure in "Backing Up Settings on a Switch with 1.1.3-2 Firmware", or you can manually perform the backup and restore of individual files.

Backing Up Settings on a Switch with 2.1.3-4 Firmware

The following procedure describes how to back up a switch with 2.1.3-4 firmware. The backup only needs to be done once after the switch has been initially configured with the right settings.

  1. Navigate to the switch ILOM URL in a browser. For example: http://dbm002-i1.us.example.com.

  2. Log in as the ilom-admin user.

  3. Select the Maintenance tab.

  4. Select the Backup/Restore tab.

  5. Select the Backup operation and the Browser method.

  6. Enter a passphrase. This is used to encrypt sensitive information, such as user passwords, in the backup.

  7. Click Run, and save the resulting XML file in a secure location.

  8. Log in to the Sun Datacenter InfiniBand Switch 36 switch as the root user.

  9. Use the scp command to copy the following files:

    • root SSH keys: /root/.ssh/authorized_keys

    • nm2user SSH keys (if it exists): /home/nm2user/.ssh/authorized_keys

    • host file: /etc/hosts

  10. Save the output from the version command.

Backing Up Settings on a Switch with 1.1.3-2 Firmware

The following procedure describes how to back up a switch with 1.1.3-2 firmware. The backup only needs to be done once after the switch has been initially configured with the right settings.

  1. Navigate to the switch ILOM URL in a browser. For example: http://dbm002-i1.us.example.com.

  2. Log in as the ilom-admin user.

  3. Select the Maintenance tab.

  4. Select the Backup/Restore tab.

  5. Select the Backup operation and the Browser method.

  6. Enter a passphrase. This is used to encrypt sensitive information, such as user passwords, in the backup.

  7. Click Run, and save the resulting XML file in a secure location.

  8. Log in to the Sun Datacenter InfiniBand Switch 36 switch as the root user.

  9. Use the scp command to copy the following files:

    • Network configuration: /etc/sysconfig/network-scripts/ifcfg-eth0

    • DNS information: /etc/resolv.conf

    • NTP information: /etc/ntp.conf

    • Time zone information: /etc/localtime

    • openSM settings: /etc/opensm/opensm.conf

    • Host name: /etc/sysconfig/network

    • root SSH keys: /root/.ssh/authorized_keys

    • nm2user SSH keys (if it exists): /home/nm2user/.ssh/authorized_keys

  10. Run the hostname command, and then save the output. This is done in case the host name is not set in the /etc/sysconfig/network file.

  11. Save the passwords for the root and nm2user accounts.

  12. Run the nm2version command, and then save the output.
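
The file copies in step 9 can be listed before they are run. The following is a minimal sketch that only prints one scp command per file. It assumes the copies are pulled from an administration host that has root SSH access to the switch, which is a different direction from running scp on the switch itself. The function name and the flattened destination file names are hypothetical.

```shell
#!/bin/sh
# Print (do not run) one scp command per file to back up from the switch.
# $1 is the switch host name; $2 is the destination directory (default ".").
switch_backup_commands() {
    switch=$1
    dest=${2:-.}
    for f in \
        /etc/sysconfig/network-scripts/ifcfg-eth0 \
        /etc/resolv.conf \
        /etc/ntp.conf \
        /etc/localtime \
        /etc/opensm/opensm.conf \
        /etc/sysconfig/network \
        /root/.ssh/authorized_keys \
        /home/nm2user/.ssh/authorized_keys
    do
        # Flatten the path into a file name so that the two
        # authorized_keys files do not overwrite each other.
        name=$(printf '%s' "$f" | sed -e 's|^/||' -e 's|/|_|g')
        echo "scp -p root@$switch:$f $dest/$name"
    done
}
```

The printed commands can be reviewed and then run one at a time. The nm2user file may not exist on every switch, so a failure on that one command is not necessarily an error.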

Backing Up Settings on a Switch with 1.0.1 Firmware

The following procedure describes how to back up the settings on a switch with 1.0.1 firmware:

  1. Log in to the switch as the root user. If you do not have the password for the root user, then contact Oracle Support Services.

  2. Make copies of the following files:

    • Network configuration: /etc/sysconfig/network-scripts/ifcfg-eth0

    • DNS information: /etc/resolv.conf

    • NTP information: /etc/ntp.conf

    • Time zone information: /etc/localtime

    • openSM settings: /etc/opensm/opensm.conf

    • Host name: /etc/sysconfig/network

    • root SSH keys: /root/.ssh/authorized_keys

    • nm2user SSH keys (if it exists): /home/nm2user/.ssh/authorized_keys

  3. Run the hostname command and save the output, in case the host name is not set in the /etc/sysconfig/network file.

  4. Save the passwords for the root and nm2user accounts.

  5. Run the nm2version command and save the output.

Restoring Settings on a Switch with 2.1.3-4 Firmware

The following procedure describes how to restore the settings on a switch with 2.1.3-4 firmware:

  1. Run the version command, and ensure that the switch is at the right firmware level. If not, then upgrade the switch to the correct firmware level.

  2. Navigate to the switch ILOM URL in a browser. For example: http://dbm002-i1.us.example.com.

  3. Log in as the ilom-admin user.

  4. Select the Maintenance tab.

  5. Select the Backup/Restore tab.

  6. Select the Restore operation and the Browser method.

  7. Click Browse, and select the XML file that contains the switch configuration backup.

  8. Enter the passphrase that was used during the backup.

  9. Click Run to restore the configuration.

  10. Log in to the Sun Datacenter InfiniBand Switch 36 switch as the root user.

  11. Restore the following files from the backup:

    • root SSH keys: /root/.ssh/authorized_keys

    • nm2user SSH keys (if it exists): /home/nm2user/.ssh/authorized_keys

    • host file: /etc/hosts

  12. Restart openSM from the switch CLI using the following commands:

    disablesm
    enablesm
    
  13. Log in as the root user.

  14. Restart the switch.

Restoring Settings on a Switch with 1.1.3-2 Firmware

The following procedure describes how to restore the settings on a switch with 1.1.3-2 firmware:

  1. Run the version command, and ensure that the switch is at the right firmware level. If not, then upgrade the switch to the correct firmware level.

  2. Navigate to the switch ILOM URL in a browser. For example: http://dbm002-i1.us.example.com.

  3. Log in as the ilom-admin user.

  4. Select the Maintenance tab.

  5. Select the Backup/Restore tab.

  6. Select the Restore operation and the Browser method.

  7. Click Browse, and select the XML file that contains the switch configuration backup.

  8. Type in the passphrase that was used during the backup.

  9. Click Run to restore the configuration.

  10. Log in to the Sun Datacenter InfiniBand Switch 36 switch as the root user.

  11. Restore the following files from the backup:

    • Network configuration: /etc/sysconfig/network-scripts/ifcfg-eth0

    • DNS information: /etc/resolv.conf

    • NTP information: /etc/ntp.conf

    • Time zone information: /etc/localtime

    • openSM settings: /etc/opensm/opensm.conf

    • Host name: /etc/sysconfig/network

    • root SSH keys: /root/.ssh/authorized_keys

    • nm2user SSH keys (if it exists): /home/nm2user/.ssh/authorized_keys

  12. Restore the host name by adding the following line to the /etc/sysconfig/network file, if it is not already in the file:

    HOSTNAME=switch_host_name
    
  13. Restore the passwords of the root and nm2user users using the passwd command.

  14. Run the following commands in the order shown to restart the services and openSM:

    service network restart 
    service ntpd restart 
    disablesm 
    enablesm 
    
  15. Log in as the root user.

  16. Restart the switch.

Restoring Settings on a Switch with 1.0.1 Firmware

The following procedure describes how to restore the settings to a switch with 1.0.1 firmware:

  1. Log in to the switch as the root user. If you do not have the password for the root user, then contact Oracle Support Services.

  2. Ensure that the switch is at the right firmware level. If not, then upgrade the switch to the correct firmware level.

  3. Restore the following files from the backup:

    • Network configuration: /etc/sysconfig/network-scripts/ifcfg-eth0

    • DNS information: /etc/resolv.conf

    • NTP information: /etc/ntp.conf

    • Time zone information: /etc/localtime

    • openSM settings: /etc/opensm/opensm.conf

    • Host name: /etc/sysconfig/network

    • root SSH keys: /root/.ssh/authorized_keys

    • nm2user SSH keys (if it exists): /home/nm2user/.ssh/authorized_keys

  4. Restore the host name by adding a HOSTNAME=switch_host_name line to the /etc/sysconfig/network file, if not already present.

  5. Restore the passwords of the root and nm2user users using the passwd command.

  6. Run the following commands in the order shown to restart the services and openSM:

    service network restart 
    service ntpd restart 
    disablesm 
    enablesm 
    
  7. Log in as the root user.

  8. Restart the switch.
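
The file restore in step 3 can be scripted. The following is a minimal sketch, not an Oracle-supplied tool: the restore_switch_files function name is hypothetical, and it assumes the backup was saved as a directory tree that mirrors the absolute paths on the switch. The optional second argument is a destination root that is useful for a trial run into a scratch directory.

```shell
# Hypothetical helper: copy the files listed in the procedure from a backup
# tree into place. Assumes backup_dir mirrors the absolute paths on the switch.
restore_switch_files() {
    local backup_dir="$1"
    local dest_root="${2:-}"    # empty means restore into the live file system
    local f
    for f in \
        /etc/sysconfig/network-scripts/ifcfg-eth0 \
        /etc/resolv.conf \
        /etc/ntp.conf \
        /etc/localtime \
        /etc/opensm/opensm.conf \
        /etc/sysconfig/network \
        /root/.ssh/authorized_keys \
        /home/nm2user/.ssh/authorized_keys
    do
        if [ -e "$backup_dir$f" ]; then
            mkdir -p "$dest_root$(dirname "$f")"
            cp -p "$backup_dir$f" "$dest_root$f"
            echo "restored $f"
        else
            # For example, the nm2user keys exist only on some switches.
            echo "skipped $f (not in backup)"
        fi
    done
}
```

For example, restore_switch_files /tmp/switch_backup /tmp/trial copies whatever the backup contains into /tmp/trial and reports each file as restored or skipped. Both directory names are placeholders.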

Correcting InfiniBand Card Failures on Oracle Exadata Database Machine X3-8 Full Rack, and Oracle Exadata Database Machine X2-8 Full Rack Database Servers

InfiniBand bonding (BONDIB0 through BONDIB3) for database servers in Oracle Exadata Database Machine X3-8 Full Rack and Oracle Exadata Database Machine X2-8 Full Rack uses both ports on the same card for each of the four InfiniBand cards. If both ports on a single card are disabled, such as when the ports fail or the cables are removed, then the Oracle Clusterware stack halts. The following procedure describes how to isolate the card, and then restart Oracle Clusterware after correcting the InfiniBand card problem:

  1. Isolate the failed InfiniBand card as follows:

    1. Stop Oracle Clusterware as the root user using the following command. If Oracle Clusterware is already down, then go to step 1b.

      # crsctl stop crs
      
    2. Edit the cellinit.ora file to remove the affected IP address.

    3. Start Oracle Clusterware as the root user using the following command:

      # crsctl start crs
      
  2. Correct the problem with the InfiniBand card.

  3. Return the card to service as follows:

    1. Stop Oracle Clusterware as the root user using the following command:

      # crsctl stop crs
      
    2. Add the IP address to the cellinit.ora file.

    3. Start Oracle Clusterware as the root user using the following command:

      # crsctl start crs
      

Replacing a Failed Sun Datacenter InfiniBand Switch 36 Switch

The following procedure describes how to replace a failed Sun Datacenter InfiniBand Switch 36 switch:

  1. Power off both power supplies on the switch by removing the power plugs.

  2. Disconnect the cables from the switch. All InfiniBand cables should have labels at both ends indicating their locations. If there are any cables that do not have labels, then label them before disconnecting them.

  3. Remove the switch from the rack.

  4. Install the new switch in the rack.

    See Also:

    Oracle Exadata Database Machine Extending and Multi-Rack Cabling Guide for information about installing a Sun Datacenter InfiniBand Switch 36 switch

  5. Power on the switch by plugging in the power plugs.

  6. Restore the switch settings using the backup, as described in "Backing Up and Restoring Switch Settings".

  7. Disable the Subnet Manager using the disablesm command.

  8. Connect the cables to the new switch. Make sure to connect each cable to the correct port.

  9. Run the following command on any of the servers:

    # /opt/oracle.SupportTools/ibdiagtools/verify-topology
    

    The preceding command verifies that the right number of database servers are connected to the right number of Exadata Storage Servers.

  10. Run the following command on any host to verify that there are no errors on any of the links in the fabric:

    ibdiagnet -c 5000 -r
    
  11. Enable the Subnet Manager using the enablesm command.

    Note:

    If the replaced switch was the spine switch, then manually fail the Master Subnet Manager back to the switch by disabling the Subnet Managers on the other switches until this spine switch becomes the master, then re-enable the Subnet Manager on all the other switches. Refer to Oracle Exadata Database Machine Installation and Configuration Guide.

Verifying InfiniBand Network Configuration

The following procedure describes how to verify the InfiniBand network configuration.

  1. Verify the proper OpenFabrics Enterprise Distribution (OFED) software and InfiniBand HCA firmware versions are being used on the database servers.

    The OFED software and InfiniBand HCA firmware versions are automatically maintained on Exadata Cell.

    See Also:

    My Oracle Support note 888828.1 for the current releases, and instructions about how to check the installed releases

  2. Verify the InfiniBand topology using the following command from a database server or Exadata Storage Server:

    # /opt/oracle.SupportTools/ibdiagtools/verify-topology
    

    If any errors occur, then contact Oracle Support Services.

    See Also:

    "Using the verify-topology Utility" for additional information about the verify-topology utility

Using the verify-topology Utility

Oracle Exadata Database Machine includes the verify-topology utility. This utility can be used to identify the following network connection problems:

  • Missing InfiniBand cable

  • Missing InfiniBand connection

  • Incorrectly-seated cable

  • Cable connected to the wrong endpoint

The utility is available in the ibdiagtools directory on all servers. To view the options for the verify-topology utility, use the following command:

./verify-topology -h

[ DB Machine Infiniband Cabling Topology Verification Tool ]
Usage: ./verify-topology 
    [-v|--verbose]
    [-r|--reuse (cached maps)]
    [-m|--mapfile]
    [-ibn|--ibnetdiscover (specify location of ibnetdiscover output)]
    [-ibh|--ibhosts (specify location of ibhosts output)]
    [-ibs|--ibswitches (specify location of ibswitches output)]
    [-t|--topology [torus | fattree | halfrack] default is fattree]

The following is an example of the output when using the verify-topology utility. In the example, the error shows the cables are connected incorrectly. Both cables from the server are going to the same InfiniBand switch. If the switch fails, then the server loses connectivity to the InfiniBand network.

[ DB Machine Infiniband Cabling Topology Verification Tool ]

Bad link:Switch 0x21283a8371a0a0 Port 11A - Sun Port 11B
        Reason : 2.5 Gbps Speed found. Could be 10 Gbps
        Possible cause : Cable isn't fully seated in

Bad link:Switch 0x21283a89eba0a0 Port 11B - Sun Port 11A
        Reason : 2.5 Gbps Speed found. Could be 10 Gbps
        Possible cause : Cable isn't fully seated in

Is every external switch connected to every internal switch..........[SUCCESS]
Are any external switches connected to each other....................[SUCCESS]
Are any hosts connected to spine switch..............................[SUCCESS]
Check if all hosts have 2 CAs to different switches..................[ERROR]
Node trnA-db01 has 1 endpoints. (Should be 2)
Port 2 of this node is not connected to any switch

--------fattree End Point Cabling verifation failed-----

Leaf switch check: cardinality and even distribution.................[ERROR]

Internal QDR Switch 0x21283a8371a0a0 has fewer than 4 compute nodes
It has only 3 links belonging to compute nodes
Check if each rack has an valid internal ring........................[SUCCESS]
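
When the output is long, it can help to list only the checks that failed. The following is a minimal sketch that assumes the utility output has been captured to a file or pipe in the format shown in the preceding example. The failed_topology_checks function name is hypothetical, and the sketch is not part of the ibdiagtools directory.

```shell
# Print the name of each check that verify-topology marked [ERROR].
# Reads captured utility output on standard input.
failed_topology_checks() {
    # Keep only the [ERROR] lines, then strip the dot leaders and the marker.
    grep -F '[ERROR]' | sed -e 's/\.*\[ERROR\][[:space:]]*$//'
}
```

Applied to the preceding example output, the function prints the two failing checks: "Check if all hosts have 2 CAs to different switches" and "Leaf switch check: cardinality and even distribution". The detail lines that follow each failed check are not included, so review the full output before taking action.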

Verifying InfiniBand Network Operation

Verify that the InfiniBand network is operating properly after hardware maintenance on any component in the InfiniBand network, including replacing an InfiniBand HCA on a server, an InfiniBand switch, or an InfiniBand cable, or whenever operation of the InfiniBand network is suspected to be substandard. The following procedure describes how to verify network operation:

Note:

The following procedure can be used any time the InfiniBand network is performing below expectations.

  1. Perform the steps in "Verifying InfiniBand Network Configuration".

  2. Run the ibdiagnet command to verify InfiniBand network quality using the following command:

    # ibdiagnet -c 1000
    

    All errors reported by this command should be investigated. This command generates a small amount of network traffic, and may be run while normal workload is running.

  3. Run the ibqueryerrors.pl command to report on switch port error counters and port configuration information:

    # ibqueryerrors.pl -rR -s RcvSwRelayErrors,XmtDiscards,XmtWait,VL15Dropped
    

    The -s option suppresses the listed counters, so RcvSwRelayErrors, XmtDiscards, XmtWait, and VL15Dropped errors are ignored when using the preceding command.

    Notes:

    • The InfiniBand counters are cumulative and the errors may have occurred at any time in the past. If there are errors reported, then Oracle recommends clearing the InfiniBand counters using the ibclearcounters command. After running the command, let the system run for a few minutes under load, and then run the ibqueryerrors.pl command again.

    • Some counters, such as SymbolErrors or RcvErrors, can increment when servers are rebooted. Small values for these counters that are less than the LinkDowned counter are generally not a problem. The LinkDowned counter indicates the number of times the port has gone down, usually for valid reasons such as a reboot, and is not usually an error indicator by itself.

    • Any links reporting high, persistent errors, especially SymbolErrors, LinkRecovers, RcvErrors, or LinkIntegrityErrors, may indicate a bad or loose cable or port.

    • If there are persistent, high InfiniBand network error counters, then investigate and correct the problem.

  4. If there is no load running on any portion of the InfiniBand network, such as no databases running, then run the infinicheck command to perform full InfiniBand network configuration, connectivity and performance evaluation.

    Note:

    This command evaluates full network maximum throughput and should not be run when there is workload running on any system on the InfiniBand network.

    This command relies on a fully-configured system. Run the first command to clear the files that were created during the last run of the commands.

    # /opt/oracle.SupportTools/ibdiagtools/infinicheck -z 
    
    # /opt/oracle.SupportTools/ibdiagtools/infinicheck
    

    The following is an example of the output from the command:

    Verifying User Equivalance of user=root to all hosts.
    (If it isn't setup correctly, an authentication prompt will appear to push keys
     to all the nodes)
     
     Verifying User Equivalance of user=root to all cells.
    (If it isn't setup correctly, an authentication prompt will appear to push keys
     to all the nodes)
     
     
                        ####  CONNECTIVITY TESTS  ####
                        [COMPUTE NODES -> STORAGE CELLS]
                               (30 seconds approx.)
    [SUCCESS]..............Connectivity verified
     
    [SUCCESS]....... All hosts can talk to all storage cells
     
            Verifying Subnet Masks on Hosts and Cells
    [SUCCESS] ......... Subnet Masks is same across the network
     
            Checking for bad links in the fabric
    [SUCCESS].......... No bad fabric links found
     
                        [COMPUTE NODES -> COMPUTE NODES]
                               (30 seconds approx.)
    [SUCCESS]..............Connectivity verified
     
    [SUCCESS]....... All hosts can talk to all other nodes
     
     
                        ####  PERFORMANCE TESTS  ####
     
                        [(1) Every COMPUTE NODE to its STORAGE CELL]
                              (15 seconds approx.)
    [SUCCESS]........ Network Bandwidth looks OK.
    .......... To view only performance results run ./infinicheck -d -p
     
                        [(2) Every COMPUTE NODE to another COMPUTE NODE]
                              (10 seconds approx.)
    [SUCCESS]........ Network Bandwidth looks OK.
    ...... To view only performance results run ./infinicheck -d -p
     
                        [(3) Every COMPUTE NODE to ALL STORAGE CELLS]
                      (45 seconds approx.) (looking for SymbolErrors)
     
    [SUCCESS]....... No port errors found
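
The counter check described in the notes for step 3 (clear the counters, let the system run under load, and then query again) can be sketched as follows. The recheck_ib_counters function name, the default wait of 300 seconds, and the DRY_RUN switch are illustrative assumptions. The two InfiniBand commands are the ones named in the procedure.

```shell
# Sketch: clear the cumulative counters, wait under load, then query again.
# The default wait and the DRY_RUN switch are illustrative assumptions.
recheck_ib_counters() {
    local wait_seconds="${1:-300}"
    # When DRY_RUN is set, each command is printed instead of being run.
    local run="${DRY_RUN:+echo}"
    $run ibclearcounters
    $run sleep "$wait_seconds"
    $run ibqueryerrors.pl -rR -s RcvSwRelayErrors,XmtDiscards,XmtWait,VL15Dropped
}
```

Running DRY_RUN=1 recheck_ib_counters 600 prints the three commands without touching the fabric, which is a convenient way to review the sequence before running it as the root user.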
    

Understanding the Network Subnet Manager Master

The Subnet Manager manages all operational characteristics of the InfiniBand network, such as the following:

  • Discover the network topology

  • Assign a local identifier to all ports connected to the network

  • Calculate and program switch forwarding tables

  • Monitor changes in the fabric

The InfiniBand network can have more than one Subnet Manager, but only one Subnet Manager is active at a time. The active Subnet Manager is the Master Subnet Manager. The other Subnet Managers are the Standby Subnet Managers. If a Master Subnet Manager is shut down or fails, then a Standby Subnet Manager automatically becomes the Master Subnet Manager.

Each Subnet Manager has a priority that can be configured. When there is more than one Subnet Manager on the InfiniBand network, the Subnet Manager with the highest priority becomes the Master Subnet Manager. On Oracle Exadata Database Machine, the Subnet Managers on leaf switches should be configured as priority 5, and the Subnet Managers on spine switches should be configured as priority 8.

The following guidelines determine where Subnet Managers run on Oracle Exadata Database Machine:

  • Only run Subnet Managers on the Sun Datacenter InfiniBand Switch 36 switches in Oracle Exadata Database Machine. Running Subnet Manager on any other device is not supported.

  • In Exadata-only configurations, when the InfiniBand network consists of one, two or three racks cabled together, all switches should run Subnet Manager. The Master Subnet Manager should be run on a spine switch. If the network has only leaf switches, as in Oracle Exadata Database Machine Quarter Racks, then the Master Subnet Manager runs on a leaf switch.

    In multirack configurations using different types of racks such as Exadata plus Exalogic, see My Oracle Support note 1682501.1.

  • When the InfiniBand network consists of four or more racks cabled together, then only spine switches should run Subnet Manager. The leaf switches should disable Subnet Manager.
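
The placement guidelines and priorities above can be summarized as a small decision function. This is a sketch for Exadata-only configurations and does not cover mixed-rack fabrics. The sm_plan function name and its output wording are hypothetical, and the function only reports the recommended state. It does not change any switch.

```shell
# Sketch of the placement guidelines for Exadata-only fabrics.
# Arguments: number of racks cabled together, switch role (spine or leaf).
sm_plan() {
    local racks="$1"
    local role="$2"
    case "$role" in
        spine)
            # Spine switches run Subnet Manager at priority 8.
            echo "Subnet Manager enabled, priority 8" ;;
        leaf)
            if [ "$racks" -ge 4 ]; then
                # With four or more racks, only spine switches run it.
                echo "Subnet Manager disabled"
            else
                echo "Subnet Manager enabled, priority 5"
            fi ;;
        *)
            echo "unknown role: $role" >&2
            return 1 ;;
    esac
}
```

For example, sm_plan 2 leaf reports that the Subnet Manager is enabled at priority 5, and sm_plan 4 leaf reports that it is disabled.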

Upgrading InfiniBand Switch Firmware

The patchmgr utility is used to upgrade and downgrade the InfiniBand switches. The minimum switch firmware release that can use the patchmgr utility is release 1.3.3-2. Switch firmware is upgraded in a rolling manner. If a spine switch is present in the rack, then the spine switch is upgraded first. If a spine switch is not in the rack, then upgrade the switch that is running the Subnet Manager first. If the Subnet Manager is not running on the switches, then perform the upgrade in any order.

Create a file that lists the InfiniBand switches to be updated, one switch per line. The following is an example of the file:

# cat ibswitches.lst
myibswitch-01
myibswitch-02

To upgrade the InfiniBand switches, the switch firmware must be at release 1.3.3-2 or later. If the switch firmware is at an earlier release, then it is necessary to upgrade the firmware to release 1.3.3-2 using the instructions in My Oracle Support note 888828.1.

  1. Log in as the root user to a database server on Oracle Exadata Database Machine that has root user SSH access to the switches. The database server must be on the same InfiniBand network as the switches.

  2. Download the appropriate patch file to the database server. Refer to My Oracle Support note 888828.1 for patch information.

  3. Uncompress the patch files. The files are uncompressed to the patch_release.date directory.

  4. Create a file listing the InfiniBand switches that need to be updated, with one switch per line. The following is an example of the file:

    # cat ibswitches.lst
    myibswitch-01
    myibswitch-02
    
  5. Change to the patch_release.date directory.

  6. Run the prerequisite checks using the following command:

    # ./patchmgr -ibswitches ibswitches.lst -upgrade -ibswitch_precheck [-force] [-unkey]
    

    Notes:

    • The -unkey option removes passwordless SSH access to the InfiniBand switches before exiting.

    • The -force option overrides failures in the InfiniBand topology and connectivity from the servers to the switches. This does not affect the upgrade of the switch.

    If the output from the command shows overall status is SUCCESS, then proceed with the upgrade. If the output from the command shows overall status is FAIL, then review the error summary in the output to determine which checks failed, and then correct the errors. After the errors have been corrected, rerun the prerequisite checks until they are successful.

  7. Upgrade the switches using the following command:

    # ./patchmgr -ibswitches ibswitches.lst -upgrade [-force] [-unkey]
    
  8. Check the output from the command, and verify the upgrade. The output should show SUCCESS. If there are errors, then correct the errors and run the upgrade command again.
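
The switch list file in step 4 can be created by hand or generated. The following is a minimal sketch with a hypothetical write_switch_list function. It writes one switch name per line, drops empty entries and repeated names, and keeps the names in the order given. The switch names in the usage example are the placeholder names from step 4.

```shell
# Write one switch name per line to the list file, dropping blank entries
# and duplicates while keeping the original order.
write_switch_list() {
    local list_file="$1"
    shift
    printf '%s\n' "$@" | awk 'NF && !seen[$0]++' > "$list_file"
}
```

For example, write_switch_list ibswitches.lst myibswitch-01 myibswitch-02 produces the file shown in step 4, ready to be passed to the -ibswitches option.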

Downgrading the Switch Software

The only included downgrade is to release 2.1.6-2. Use the following commands to downgrade the firmware:

# ./patchmgr -ibswitches ibswitches.lst -downgrade -ibswitch_precheck [-force] [-unkey]
# ./patchmgr -ibswitches ibswitches.lst -downgrade [-force] [-unkey]

Configuring InfiniBand Partitioning

This topic is described in "Implementing InfiniBand Partitioning across OVM RAC Clusters on Oracle Exadata". You can use InfiniBand partitioning with or without OVM.

Changing InfiniBand IP Addresses and Host Names

It may be necessary to change the InfiniBand network information on an existing Oracle Exadata Rack. The change may be needed to support a media server with multiple InfiniBand cards, or to keep InfiniBand traffic on a distinct InfiniBand network, such as when production, test and QA environments are in the same rack.

All InfiniBand addresses must be in the same subnet, with a minimum subnet mask of 255.255.240.0 (or /20). The subnet mask chosen should be wide enough to accommodate possible future expansion of the Oracle Exadata Rack and InfiniBand network.
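
The network and broadcast addresses that correspond to a given IP address and subnet mask can be derived with shell arithmetic, which is useful when planning new addresses and when checking that two addresses fall in the same subnet. The following is a minimal sketch. All function names are hypothetical, and the sketch assumes dotted-decimal input without leading zeros in any octet.

```shell
# Convert a dotted-decimal address to a 32-bit integer.
# Octets with leading zeros (such as 06) are not handled.
ip_to_int() {
    local old_ifs="$IFS"
    IFS=.
    set -- $1          # split the address into its four octets
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Convert a 32-bit integer back to dotted-decimal form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Network address: the IP address with the host bits cleared.
network_of() {
    local ip mask
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    int_to_ip $(( ip & mask ))
}

# Broadcast address: the IP address with the host bits set.
broadcast_of() {
    local ip mask
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    int_to_ip $(( (ip & mask) | (~mask & 4294967295) ))
}

# Succeeds when two addresses share a network under the given mask.
same_subnet() {
    [ "$(network_of "$1" "$3")" = "$(network_of "$2" "$3")" ]
}
```

For example, network_of 192.168.10.8 255.255.248.0 prints 192.168.8.0 and broadcast_of 192.168.10.8 255.255.248.0 prints 192.168.15.255, which are the NETWORK and BROADCAST values used in the updated ifcfg-bondib0 example later in this section.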

See Also:

Oracle Fusion Middleware Exalogic Enterprise Deployment Guide for information about configuring the SDP listener to connect to an Oracle Exalogic system

Changing InfiniBand Network Information

The procedure described in this section is based on the following assumptions:

  • All changes should be done as the ilom-admin user using the ILOM interface.

  • Channel bonding is used for the client access network, such that the NET1 and NET2 interfaces are bonded to create BONDETH0. If channel bonding is not used, then replace BONDETH0 with NET1 in the procedure.

  • On X4-2 and later hardware, as of release 11.2.3.3.0, the name used for InfiniBand bonding changed from BONDIB0 to IB0 and IB1. These interfaces are changed the same way as the ifcfg-bondib0 interface.

  • As of release 11.2.2.1.0, the names used for bonding changed. The names are BONDIB0 for the InfiniBand bonding and BONDETH0 for Ethernet bonding. In earlier releases, the names were BOND0 and BOND1, respectively.

  • The procedure uses the dcli utility and the root user. This significantly reduces the overall time to complete the procedure by running the commands in parallel on the database servers.

  • The dcli utility requires SSH user-equivalence. If SSH user-equivalence is not configured, then some commands must be run explicitly on each database server.

  • The database group file, dbs_group, must exist and be located in the /root directory.

  • Ensure recent backups of the Oracle Cluster Registry (OCR) exist before changing the InfiniBand network information. OCR backups are located in the $GRID_HOME/cdata/cluster-name directory.

  • Starting with Oracle Database 11g Release 2 (11.2) Grid Infrastructure, the private network configuration is stored in the Grid Plug and Play (gpnp) profile as well as the OCR. If the gpnp definition is not correct, then Oracle Clusterware CRS does not start. Take a backup of the gpnp profile on all nodes before changing the InfiniBand network information using the following commands:

    $ cd $GRID_HOME/gpnp/hostname/profiles/peer/
    $ cp -p profile.xml profile.xml.bk
    

The following procedure describes how to change the InfiniBand network information.

  1. Determine if the CLUSTER_INTERCONNECTS parameter is used in the Oracle Database and Oracle ASM instances using the following command:

    SELECT inst_id, name, value FROM gv$parameter
    WHERE name = 'cluster_interconnects';
    

    If the CLUSTER_INTERCONNECTS parameter is set in OCR, then no value is returned. If the CLUSTER_INTERCONNECTS parameter is defined in the server parameter file (SPFILE), then the query returns an IP address for each instance, and each address must be changed to the new IP address.

    The following is an example of the commands to change the IP addresses for the Oracle ASM instances. In the example, the IP address 192.168.10.1 is the new IP address assigned to BONDIB0 on the server where the +ASM1 instance runs, 192.168.10.2 is the IP address for BONDIB0 on the server where the +ASM2 instance runs, and so on.

    ALTER SYSTEM SET CLUSTER_INTERCONNECTS='192.168.10.1' SCOPE=SPFILE SID='+ASM1';
    ALTER SYSTEM SET CLUSTER_INTERCONNECTS='192.168.10.2' SCOPE=SPFILE SID='+ASM2';
    ALTER SYSTEM SET CLUSTER_INTERCONNECTS='192.168.10.3' SCOPE=SPFILE SID='+ASM3';
    ...
    

    Use a similar command to change the IP addresses for each Oracle Database instance that was returned.

  2. Verify the assignment of the new InfiniBand network information for all servers. Verification should include the InfiniBand IP addresses, netmask, broadcast, and network IP information.

  3. Shut down all cluster-managed services on each database server as the oracle user using the following command:

    $ srvctl stop home -o db_home -s state_filename -n node_name
    

    In the preceding command, db_home is the full directory name for the Oracle Database home directory, state_filename is the path name where you want the state file to be written, and node_name is the name of the database server. The following is an example of the command:

    $ srvctl stop home -o /u01/app/oracle/product/11.2.0.3/dbhome_1 -s \
    /tmp/dm02db01_dbhome -n dm02db01
    

    In the preceding example, /u01/app/oracle/product/11.2.0.3/dbhome_1 is the Oracle Database home directory, /tmp/dm02db01_dbhome is the state file name, and dm02db01 is the name of the database server.

    See Also:

    Oracle Real Application Clusters Administration and Deployment Guide for additional information about Server Control Utility (SRVCTL) commands

  4. Modify the cluster interconnect interface to use the BONDIB0 interface on the first database server as follows:

    Note:

    At this point, only Oracle Clusterware, Oracle Clusterware CRS, and Oracle ASM instances are up.
    1. Log in as the oracle user.

    2. Set ORACLE_HOME to the Grid Infrastructure home.

    3. Set the base for the ORACLE_SID environment variable using the following command. The ORACLE_HOME environment variable must be set to the Grid Infrastructure home.

      ORACLE_SID=+ASM1
      
    4. List the available cluster interfaces using the following command:

      $ oifcfg iflist
      

      The following is an example of the output:

      bondeth0 10.128.174.160
      bondeth1 10.128.176.0
      eth0 10.128.174.128
      ib0 192.168.160.0
      ib0 169.254.0.0
      ib1 192.168.160.0
      ib1 169.254.128.0
      
    5. List the currently-assigned cluster interfaces using the following command:

      $ oifcfg getif
      

      The following is an example of the output:

      bondeth0 10.204.76.0 global public
      bondib0 192.168.16.0 global cluster_interconnect
      
    6. Assign BONDIB0 and the new subnet as the global cluster interconnect interface using the following command:

      $ oifcfg setif -global c_interface/c_IP_subnet:cluster_interconnect
      

      In the preceding command, c_interface is the interface to be used for cluster interconnect, and c_IP_subnet is the subnet (network address) for the cluster interconnect. The following is an example of the command:

      $ oifcfg setif -global bondib0/192.168.8.0:cluster_interconnect
      
    7. List the current interfaces using the following command:

      $ oifcfg getif
      

      The following is an example of the output:

      ib0 192.168.160.0 global cluster_interconnect
      ib1 192.168.160.0 global cluster_interconnect
      bondeth1 10.128.176.0 global public
      

      The old private interface is removed at a later time.

  5. Shut down Oracle Clusterware and Oracle Clusterware CRS on each database server as follows:

    1. Log in as the root user.

    2. Shut down Oracle Clusterware CRS on each database server using the following command:

      # GRID_HOME/grid/bin/crsctl stop crs -f
      
    3. Disable automatic Oracle Clusterware CRS restart on each database server using the following command:

      # GRID_HOME/grid/bin/crsctl disable crs
      
  6. Change the InfiniBand IP addresses on each Exadata Storage Server as follows:

    1. Log in as the root user.

    2. Run the following commands:

      # cellcli -e alter cell shutdown services all
        Stopping the RS, CELLSRV, and MS services...  The SHUTDOWN of services was successful.
      
      # service ocrvottargetd stop
      
      # ipconf
      

      For the ipconf command, respond to the prompts to change the BONDIB0 information. The following is an example of the prompts and responses for the ipconf command. Changes are applied after the prompt for basic ILOM settings.

      Logging started to /var/log/cellos/ipconf.log
      Interface ib0 is Linked.  hca: mlx4_0
      Interface ib1 is Linked.  hca: mlx4_0
      Interface eth0 is Linked.  driver/mac: ixgbe/00:00:00:00:cd:01
      Interface eth1 is ... Unlinked.  driver/mac: ixgbe/00:00:00:00:cd:02
      Interface eth2 is ... Unlinked.  driver/mac: ixgbe/00:00:00:00:cd:03
      Interface eth3 is ... Unlinked.  driver/mac: ixgbe/00:00:00:00:cd:04
       
      Network interfaces
      Name     State      IP address      Netmask         Gateway         Net type     Hostname
      ib0      Linked
      ib1      Linked
      eth0     Linked
      eth1     Unlinked
      eth2     Unlinked
      eth3     Unlinked
      Warning. Some network interface(s) are disconnected. Check cables and swicthes and retry
      Do you want to retry (y/n) [y]: n
       
      The current nameserver(s): 192.0.2.10 192.0.2.12 192.0.2.13
      Do you want to change it (y/n) [n]:
      The current timezone: America/Los_Angeles
      Do you want to change it (y/n) [n]:
      The current NTP server(s): 192.0.2.06 192.0.2.12 192.0.2.13
      Do you want to change it (y/n) [n]:
       
      Network interfaces
      Name     State           IP address    Netmask        Gateway       Net type            Hostname
      eth0     Linked       192.0.2.151  255.255.252.0 192.0.2.15    Management   myg.example.com
      eth1     Unlinked
      eth2     Unlinked
      eth3     Unlinked
      bondib0  ib0,ib1      192.168.13.101 255.255.252.0  Private             myg-priv.example.com
      Select interface name to configure or press Enter to continue: bondib0
      Selected interface. bondib0
      IP address or none [192.168.13.101]: 192.168.10.3
      Netmask [255.255.252.0]:255.255.248.0
      Fully qualified hostname or none [myg-priv.example.com]:
      Continue configuring or re-configuring interfaces? (y/n) [y]: n
       
      Select canonical hostname from the list below
      1: myg.example.com
      2: myg-priv.example.com 
      Canonical fully qualified domain name [1]:
       
      Select default gateway interface from the list below
      1: eth0
      Default gateway interface [1]:
       
      Canonical hostname: myg.example.com
      Nameservers: 192.0.2.10 192.0.2.12 192.0.2.13
      Timezone: America/Los_Angeles
      NTP servers: 192.0.2.06 192.0.2.12 192.0.2.13
      Default gateway device: eth0
      Network interfaces
      Name     State      IP address      Netmask         Gateway         Net type     Hostname
      eth0     Linked     192.0.2.151   255.255.252.0 192.0.2.15     Management   myg.example.com
      eth1     Unlinked
      eth2     Unlinked
      eth3     Unlinked
      bondib0  ib0,ib1    192.168.10.3    255.255.248.0                   Private      myg-priv.example.com
      Is this correct (y/n) [y]:
       
      Do you want to configure basic ILOM settings (y/n) [y]: n
      
      Starting the RS services...
      Getting the state of RS services...  running
       
      Starting MS services...
      The STARTUP of MS services was successful.
      A restart of all services is required to put new network configuration into
      effect. MS-CELLSRV communication may be hampered until restart.
      Cell myg successfully altered
       
      Stopping the RS, CELLSRV, and MS services...
      The SHUTDOWN of services was successful.
      ipaddress1=192.168.10.3/21
      
    3. Restart the Exadata Storage Server using the following command:

      # reboot
      

    See Also:

    Oracle Exadata Storage Server Software User's Guide for additional information about the ipconf command

  7. Restart the cell services using the following command:

    # cellcli -e alter cell restart services all
    
  8. Verify the newly-assigned InfiniBand address on Exadata Storage Server using the following command:

    # cellcli -e list cell detail | grep ipaddress1
    

    The following is an example of the output:

    ipaddress1: 192.168.10.3/21
    
  9. Change the InfiniBand IP addresses on each database server as follows:

    1. Log in as the root user.

    2. Change to the /etc/sysconfig/network-scripts directory.

    3. Copy the ifcfg-bondib0 file. The copied file name must not start with ifcfg. The following is an example of the copy command:

      # cp ifcfg-bondib0 orig_ifcfg-bondib0
      
    4. Edit the ifcfg-bondib0 file to update the IPADDR, NETMASK, NETWORK and BROADCAST fields. The following is an example of the original file, and an updated file:

      Example of original ifcfg-bondib0 file:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=bondib0
      USERCTL=no
      BOOTPROTO=none
      ONBOOT=yes
      IPADDR=192.168.20.8
      NETMASK=255.255.248.0
      NETWORK=192.168.16.0
      BROADCAST=192.168.23.255
      BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000"
      IPV6INIT=no
      MTU=65520
      

      Example of updated ifcfg-bondib0 file:

      #### DO NOT REMOVE THESE LINES ####
      #### %GENERATED BY CELL% ####
      DEVICE=bondib0
      USERCTL=no
      BOOTPROTO=none
      ONBOOT=yes
      IPADDR=192.168.10.8
      NETMASK=255.255.248.0
      NETWORK=192.168.8.0
      BROADCAST=192.168.15.255
      BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000"
      IPV6INIT=no
      MTU=65520
      

      Note:

      The MTU size for the InfiniBand interfaces on the database servers should be set as follows:
      • For Oracle Exadata Storage Server Software release 11.2.3.3 and later, set the MTU size to 7000.

      • For Oracle Exadata Storage Server Software releases earlier than release 11.2.3.3, set the MTU size to 65520 to ensure a high transfer rate to external devices using TCP/IP over InfiniBand such as media servers or NFS servers.

    5. Restart the database server using the following command:

      # reboot
      
    6. Verify the InfiniBand IP address information using the following command:

      # ifconfig -a
      

      The following is an example of the BONDIB0 information. It shows the updated InfiniBand network information:

      inet addr:192.168.10.8 Bcast:192.168.15.255 Mask:255.255.248.0
      
  10. Update the cellinit.ora and cellip.ora files on each database server as follows:

    Note:

    Do not edit the cellinit.ora or cellip.ora files while the database or Oracle ASM instance is running. To make changes to the files, perform a procedure similar to the following:
    1. Create a copy of the file, such as the following:

      cp cellinit.ora cellinit.new
      
    2. Edit the cellinit.new file with a text editor.

    3. Replace the old cellinit.ora file with the updated cellinit.new file as follows:

      mv cellinit.new cellinit.ora
      
    1. Log in as the root user.

    2. Change to the /etc/oracle/cell/network-config directory.

    3. Make a backup copy of the cellip.ora file. The following is an example of the command:

      # cp cellip.ora orig_cellip.ora
      

      Note:

      If using SSH user-equivalence, then the dcli utility can be used. The following is an example of the dcli command:
      # dcli -l root -g /root/dbs_group "cp cellip.ora orig_cellip.ora"
      
    4. Make a backup copy of the cellinit.ora file. The following is an example of the command:

      # cp cellinit.ora orig_cellinit.ora
      

      Note:

      If using SSH user-equivalence, then the dcli utility can be used. The following is an example of the dcli command:
      # dcli -l root -g /root/dbs_group "cp cellinit.ora \
      orig_cellinit.ora"
      
    5. Change the InfiniBand IP addresses in the cellip.ora file. The following is an example of the original file, and an updated file:

      Example of original file:

      cell="192.168.20.1"
      cell="192.168.20.2"
      cell="192.168.20.3"
      cell="192.168.20.4"
      cell="192.168.20.5"
      cell="192.168.20.6"
      cell="192.168.20.7"
      

      Example of updated file:

      cell="192.168.10.1"
      cell="192.168.10.2"
      cell="192.168.10.3"
      cell="192.168.10.4"
      cell="192.168.10.5"
      cell="192.168.10.6"
      cell="192.168.10.7"
      

      Note:

      If using SSH user-equivalence, then the dcli utility can be used to copy the updated file from the first database server to the other database servers. The following is an example of the dcli command:
      # dcli -l root -g /root/dbs_group -f \
      /etc/oracle/cell/network-config/cellip.ora 
      
      # dcli -l root -g /root/dbs_group "mv /root/cellip.ora \
      /etc/oracle/cell/network-config/"
      
    6. Change the InfiniBand IP addresses in the cellinit.ora file.

      The file is updated with the subnet ID and its subnet mask.

      The following is an example of the original file, and an updated file:

      Example of original file:

      ipaddress="192.168.20.8/21"
      

      Example of updated file:

      ipaddress="192.168.10.8/21"
      

      Update the cellinit.ora file on each database server. The contents of the file are specific to the database server, so the dcli utility cannot be used for this step.

    7. Run the ALTER DBSERVER command on each database server to update the /etc/oracle/cell/network-config/cellinit.ora file. For example:

      # dbmcli -e alter dbserver interconnect1 = "ib0"
      # dbmcli -e alter dbserver interconnect2 = "ib1"
      # dbmcli -e alter dbserver interconnect3 = "ib2"
      # dbmcli -e alter dbserver interconnect4 = "ib3"
      
  11. Update the /etc/hosts file on each database server and Exadata Storage Servers to use the new InfiniBand IP addresses as follows:

    1. Log in as the root user.

    2. Make a backup copy of the /etc/hosts file. The following is an example of the command:

      # cp /etc/hosts /etc/orig_hosts
      
    3. Change the InfiniBand IP addresses for the database servers and Exadata Storage Servers in the /etc/hosts file.

  12. Start Oracle Clusterware on each server using the following command:

    # /u01/app/11.2.0.3/grid/bin/crsctl start crs
    
  13. Verify the cluster interconnect is using the RDS protocol on each database server by examining the Oracle ASM alert.log. The log is in the /u01/app/oracle/diag/asm/+asm/+ASM1/trace directory. An entry similar to the following should be listed for the most-recent Oracle ASM restart:

    CELL interconnect IPC version: Oracle RDS/IP (generic)
    

    Note:

    For releases 11.2.0.2 and later, the following command can be used to verify cluster interconnect. The command is run as the oracle user on each database server.
    $ORACLE_HOME/bin/skgxpinfo
    

    The output from the command should be rds.

    If the instance is not using the RDS protocol over InfiniBand, then relink the Oracle binary as follows:

    Note:

    Do not use the relink all command to relink the Oracle binary.
    1. As the oracle user, shut down any processes using the Oracle binary.

    2. As the root user, run the following command if relinking the Grid Infrastructure home. Do not perform this step if you are not relinking the Grid Infrastructure home.

      # GRID_HOME/crs/install/rootcrs.pl -unlock
      
    3. As the oracle user, change to the ORACLE_HOME/rdbms/lib directory.

    4. As the oracle user, run the following command:

      $ make -f ins_rdbms.mk ipc_rds ioracle
      
    5. As the root user, run the following command if relinking the Grid Infrastructure home. Do not perform this step if you are not relinking the Grid Infrastructure home.

      # GRID_HOME/crs/install/rootcrs.pl -patch
      
  14. Start all cluster-managed services using the SRVCTL utility as follows:

    1. Log in as the oracle user.

    2. Start the database using the following command:

      $ srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 \
      -s /tmp/dm02db01_dbhome -n dm02db01
      
    3. Verify the database instances are running using the following command:

      $ srvctl status database -d dbm
      
  15. Verify the Oracle ASM and database instances are using the new network settings as follows:

    1. Log in to an Oracle ASM and database instance.

    2. Run the following query:

      SQL> SELECT inst_id, name, value FROM gv$parameter
           WHERE name = 'cluster_interconnects';
      
  16. Delete the old private network using the following command:

    $ oifcfg delif -global bondib0/192.168.16.0
    
  17. Verify that the old interface is not present using the following command:

    $ oifcfg getif
    

    The following is an example of the output:

    bondeth0  10.204.76.0  global public
    bondib0   192.168.8.0  global cluster_interconnect
    
  18. Enable Oracle Clusterware CRS automatic restart on each database server as follows:

    1. Log in as the root user.

    2. Enable Oracle Clusterware CRS using the following command:

      # GRID_HOME/grid/bin/crsctl enable crs
      

      Note:

      To use the dcli utility to enable Oracle Clusterware CRS, use the following command:
      # dcli -l root -g dbs_group "GRID_HOME/grid/bin/crsctl \
      enable crs"
      
  19. Perform a full restart of Oracle Clusterware on all nodes.

  20. Perform a health check of Oracle Exadata Rack using the steps described in My Oracle Support note 1070954.1.

    Note:

    Oracle Exadata Rack HealthCheck utility collects data for key software, hardware, and firmware releases, and configuration best practices for Oracle Exadata Rack.

    Oracle recommends you periodically review the current data for key components of Oracle Exadata Rack, and compare them to the supported release levels, and recommended best practices.

    Oracle Exadata Rack HealthCheck is not a database, network, or SQL performance analysis tool. It is not a continuous monitoring utility, and does not duplicate other monitoring or alerting tools, such as Integrated Lights Out Manager (ILOM), or Oracle Enterprise Manager Grid Control.

  21. Verify the private network configuration using the clusterware verification utility cluvfy.

    See Also:

    My Oracle Support note 316817.1 for information about cluvfy.
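
The NETWORK and BROADCAST values edited in the ifcfg-bondib0 file, and the address/prefix form used in the cellinit.ora file, all follow from the IP address and netmask. The following is a minimal sketch that derives them with the Python standard library so the edited values can be cross-checked before restarting a server. The function name infiniband_fields and the cellinit_ipaddress key are illustrative assumptions and are not part of any Oracle utility.

```python
import ipaddress


def infiniband_fields(ipaddr: str, netmask: str) -> dict:
    """Derive the ifcfg-bondib0 and cellinit.ora values for one server."""
    # ip_interface accepts "address/netmask" as well as "address/prefix".
    iface = ipaddress.ip_interface(f"{ipaddr}/{netmask}")
    network = iface.network
    return {
        "IPADDR": str(iface.ip),
        "NETMASK": str(network.netmask),
        "NETWORK": str(network.network_address),
        "BROADCAST": str(network.broadcast_address),
        # Hypothetical key: the address/prefix form shown in cellinit.ora.
        "cellinit_ipaddress": f"{iface.ip}/{network.prefixlen}",
    }


if __name__ == "__main__":
    # Sample values taken from the updated ifcfg-bondib0 example.
    for key, value in infiniband_fields("192.168.10.8", "255.255.248.0").items():
        print(f"{key}={value}")
```

For the sample address 192.168.10.8 with netmask 255.255.248.0, the derived network is 192.168.8.0, the broadcast address is 192.168.15.255, and the cellinit.ora form is 192.168.10.8/21, which agrees with the updated file examples in the procedure.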

Configuring Network Routing on Database Servers

There are three logical network interfaces configured on the database servers. The interfaces are the management network (eth0), the client access network (BOND1 or BONDETH0), and the private InfiniBand network (BOND0, BONDIB0, or IB0 and IB1).

Note:

The tasks in this section are for database servers that were configured prior to release 11.2.3.2.1.

Starting with release 11.2.2.3.0, connections that come in on the management network have their responses sent out on the management network interface, and connections on the client access network have their responses sent out on the client access network interface. The private InfiniBand network traffic is direct communication between the two endpoints, and no routers are involved in the communication.

For Oracle Exadata Storage Server Software releases earlier than release 11.2.2.3.0, the default route for outbound traffic not destined for an IP address on the management or private InfiniBand network is sent out using the client access network. The tasks in this section modify the routing such that traffic that comes in on the management network has the responses sent out on the management network. Similarly, traffic coming in on the client network has the responses sent out on the client network.

The tasks for network routing are for boot-time routing or real-time routing. The following apply to both types of routing:

  • These tasks are for database servers running a release earlier than Oracle Exadata Storage Server Software release 11.2.2.3.0.

  • The following sample IP addresses, netmasks, and gateways are used in the tasks:

    • Management network has IP address 10.149.49.12, netmask 255.255.252.0 (network 10.149.48.0/22), and gateway 10.149.48.1.

    • Client access network has IP address 10.204.78.15, netmask 255.255.255.0 (network 10.204.78.0/24), and gateway 10.204.78.1.

Note:

If the database server has additional networks configured, then files should be set up for the additional networks.

Task 1: Configure for Boot-Time Routing

To configure network routing for boot-time routing, rule and routing files must be created for each database server. The rule and routing files must be located in the /etc/sysconfig/network-scripts directory on each database server. For each Ethernet interface on the management network that has a configured IP address, the database server must have route-ethn and rule-ethn files. For each bonded Ethernet interface, the database server must have route-bondethn and rule-bondethn files. The following are examples of the content in the files:

  • /etc/sysconfig/network-scripts/rule-eth0

    from 10.149.49.12 table 220
    to 10.149.49.12 table 220

  • /etc/sysconfig/network-scripts/route-eth0

    10.149.48.0/22 dev eth0 table 220
    default via 10.149.48.1 dev eth0 table 220

  • /etc/sysconfig/network-scripts/rule-bondeth0

    from 10.204.78.0/24 table 210
    to 10.204.78.0/24 table 210

  • /etc/sysconfig/network-scripts/route-bondeth0

    10.204.78.0/24 dev bondeth0 table 210
    default via 10.204.78.1 dev bondeth0 table 210
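
The rule and route file contents follow a fixed pattern based on the interface name, IP address, netmask, gateway, and routing table number. The following is a minimal sketch that builds the text for one interface. The function name routing_files and the match_network parameter are illustrative assumptions. The sketch only returns strings and does not write any files.

```python
import ipaddress


def routing_files(dev, ipaddr, netmask, gateway, table, match_network=False):
    """Return the rule-<dev> and route-<dev> file contents as strings."""
    iface = ipaddress.ip_interface(f"{ipaddr}/{netmask}")
    network = str(iface.network)
    # The eth0 example matches on the host address; the bondeth0 example
    # matches on the whole network. match_network selects between the two.
    selector = network if match_network else str(iface.ip)
    rule = (
        f"from {selector} table {table}\n"
        f"to {selector} table {table}\n"
    )
    route = (
        f"{network} dev {dev} table {table}\n"
        f"default via {gateway} dev {dev} table {table}\n"
    )
    return {f"rule-{dev}": rule, f"route-{dev}": route}


if __name__ == "__main__":
    files = routing_files("eth0", "10.149.49.12", "255.255.252.0",
                          "10.149.48.1", 220)
    for name, content in files.items():
        print(f"# {name}")
        print(content, end="")
```

Calling the function with the sample management network values reproduces the rule-eth0 and route-eth0 contents shown above. Passing match_network=True with the client access network values reproduces the bondeth0 files.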

Task 2: Configure for Real-Time Routing

To configure the rules on a running system, use the /sbin/ip command to create the same configuration that is performed at startup. The following commands result in the same configuration as the boot-time files:

/sbin/ip rule add from 10.149.49.12 table 220
/sbin/ip rule add to 10.149.49.12 table 220
/sbin/ip route add 10.149.48.0/22 dev eth0 table 220
/sbin/ip route add default via 10.149.48.1 dev eth0 table 220

/sbin/ip rule add from 10.204.78.0/24 table 210
/sbin/ip rule add to 10.204.78.0/24 table 210
/sbin/ip route add 10.204.78.0/24 dev bondeth0 table 210
/sbin/ip route add default via 10.204.78.1 dev bondeth0 table 210

Oracle recommends restarting the database server after running the commands to validate that the boot-time configuration is correct.

Task 3: Verify Network Routing Rules and Routes

Use the following command to verify the network routing rules. The command output shows all the rules on the system.

# /sbin/ip rule list
0:      from all lookup 255 
32762:  from all to 10.204.78.0/24 lookup 210 
32763:  from 10.204.78.0/24 lookup 210 
32764:  from all to 10.149.49.12 lookup 220 
32765:  from 10.149.49.12 lookup 220 
32766:  from all lookup main 
32767:  from all lookup default 
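
When checking several database servers, it can help to confirm programmatically that the rule list references the expected tables. The following is a minimal sketch that extracts the table named after each lookup keyword from saved command output. The function name lookup_tables is an illustrative assumption, and the sketch assumes output in the format shown above.

```python
def lookup_tables(rule_output: str) -> set:
    """Collect the routing table named after 'lookup' on each rule line."""
    tables = set()
    for line in rule_output.splitlines():
        parts = line.split()
        if "lookup" in parts:
            index = parts.index("lookup")
            # Skip a malformed line that ends with the keyword itself.
            if index + 1 < len(parts):
                tables.add(parts[index + 1])
    return tables


if __name__ == "__main__":
    sample = """\
0:      from all lookup 255
32762:  from all to 10.204.78.0/24 lookup 210
32765:  from 10.149.49.12 lookup 220
32766:  from all lookup main
"""
    found = lookup_tables(sample)
    # Tables 210 and 220 are the supplemental tables used in this section.
    print({"210", "220"} <= found)
```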

The default routing table is not changed because two new routing tables are created during the preceding tasks. The new routing tables are used when the rules dictate their use. The following commands show how to check the default and new routing tables:

  • To check the default routing table, run the command without a table name. The following is an example of the command and output.

    # /sbin/ip route list
    10.204.78.0/24 dev bondeth0  proto kernel  scope link  src 10.204.78.15
    192.168.10.0/24 dev bondib0  proto kernel  scope link  src 192.168.10.8 
    10.149.48.0/22 dev eth0  proto kernel  scope link  src 10.149.49.12 
    default via 10.204.78.1 dev bondeth0
    
  • To check the supplemental tables, include the table name with the command. The following is an example of the commands and output.

    # /sbin/ip route list table 220
    10.149.48.0/22 dev eth0  scope link 
    default via 10.149.48.1 dev eth0 
    # /sbin/ip route list table 210
    10.204.78.0/24 dev bondeth0  scope link 
    default via 10.204.78.1 dev bondeth0
    

Removing Network Routing Configuration for Troubleshooting

The network routing configuration can be removed to configure or troubleshoot Oracle Exadata Database Machine. Use the following commands to remove the rules and routes:

/sbin/ip route del default via 10.149.48.1 dev eth0 table 220
/sbin/ip route del 10.149.48.0/22 dev eth0 table 220
/sbin/ip rule del to 10.149.49.12 table 220
/sbin/ip rule del from 10.149.49.12 table 220

/sbin/ip route del default via 10.204.78.1 dev bondeth0 table 210
/sbin/ip route del 10.204.78.0/24 dev bondeth0 table 210
/sbin/ip rule del to 10.204.78.0/24 table 210
/sbin/ip rule del from 10.204.78.0/24 table 210

Returning to Default Routing

To return to the default network routing, delete the supplemental files from the /etc/sysconfig/network-scripts directory, and then restart the server. The following is an example of the commands to remove the files, and restart the server:

/bin/rm -f /etc/sysconfig/network-scripts/rule-eth0
/bin/rm -f /etc/sysconfig/network-scripts/route-eth0
/bin/rm -f /etc/sysconfig/network-scripts/rule-bondeth0
/bin/rm -f /etc/sysconfig/network-scripts/route-bondeth0
reboot

Changing the DNS Servers

The configuration settings for the Domain Name System (DNS) servers can be changed after initial setup. All servers and switches in Oracle Exadata Database Machine should reference the same DNS servers. All domains that Oracle Exadata Database Machine references should be resolvable through each individual DNS server. This section contains the tasks and procedures for setting the Oracle Exadata Database Machine servers and switches to the same DNS servers. Oracle recommends changing each server one at a time.

Task 1: Change the DNS Server Address on the Sun Datacenter InfiniBand Switch 36 Switch

All configuration procedures should be done as the ilom-admin user using the ILOM interface. Use one of the following procedures to change the DNS server, depending on firmware release:

  • Firmware earlier than 2.0.4:

    1. Log in to the Sun Datacenter InfiniBand Switch 36 switch as the root user.

    2. Edit the /etc/resolv.conf file to set the DNS server and domain name using an editor such as vi. There should be a line for each DNS server.

    3. Save the file.

  • Firmware 2.0.4 or later:

    1. Log in as the ilom-admin user.

    2. Set the DNS address using one of the following options:

      • Using the ILOM web interface:

        Select the Configuration tab and set the DNS server addresses.

      • Using the command line interface, set the DNS server using the following command:

        set /SP/clients/dns nameserver=dns_ip
        

        In the preceding command, dns_ip is the IP address of the DNS server. If there is more than one DNS server, then enter a comma-separated list such as set /SP/clients/dns nameserver=dns_ip1,dns_ip2,dns_ip3.

Task 2: Change the DNS Server Address on the Cisco Ethernet Switch

The following procedure describes how to change the DNS server address on the Cisco Ethernet switch:

  1. Access the switch using one of the following methods, based on the firmware release:

    • Firmware earlier than release 12.2:

      Access the switch using Telnet, and log in as the administrator using the administrative password.

    • Firmware release 12.2 or later:

      Access the switch using SSH, and log in as the admin user with the administrator password.

      Note:

      If SSH has not been configured, then use Telnet to access the switch as the admin user.
  2. Change to enable mode using the following command. When prompted for a password, use the administrator password.

    Switch> enable
    
  3. Review the current configuration using the following command:

    Switch# show running-config
    
  4. Erase the current DNS server information using the following commands:

    Switch# configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    Switch(config)# no ip name-server 10.7.7.2
    Switch(config)# no ip name-server 129.148.5.4
    Switch(config)# no ip name-server 10.8.160.2
    Switch(config)# end
    Switch# write memory
    Building configuration...
    Compressed configuration from 2603 bytes to 1158 bytes [OK]
    

    Note:

    Each current DNS IP address to be changed needs to be erased. Invalid IP addresses must also be erased.
  5. Configure up to three DNS servers. The following is an example:

    Switch# configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    Switch(config)# ip name-server 10.7.7.3
    Switch(config)# ip name-server 129.148.5.5
    Switch(config)# ip name-server 10.8.160.1
    Switch(config)# end
    Switch# write memory
    Building configuration...
    Compressed configuration from 2603 bytes to 1158 bytes [OK]
    
  6. Verify the changes by running the following command, and reviewing the output:

    Switch# show running-config
    

    The following is an example of the output from the command:

    Building configuration...
    ...
    ip domain-name example.com
    ip name-server 192.168.10.2
    ip name-server 192.168.10.3
    ip name-server 192.168.10.4
    ...
    
  7. Save the configuration using the following command:

    Switch# copy running-config startup-config
    Destination filename [startup-config]? 
    Building configuration...
    Compressed configuration from 14343 bytes to 3986 bytes[OK]
    
  8. Exit the session using the following command:

    Switch# exit
    

Task 3: Change the DNS Server Address on the Database Server

The following procedure describes how to change the DNS server address on the database servers:

  1. Log in as the root user.

  2. Edit the /etc/resolv.conf file to set the DNS server and domain name using an editor such as vi. There should be a name server line for each DNS server. The following is an example of the updated file:

    search        example.com
    nameserver    10.7.7.3
    
  3. Set the DNS server in the server ILOM using the following command on the database server:

    ipmitool sunoem cli 'set /SP/clients/dns nameserver=dns_ip'
    

    In the preceding command, dns_ip is the IP address of the DNS server. If there is more than one DNS server, then enter a comma-separated list such as set /SP/clients/dns nameserver=dns_ip1,dns_ip2,dns_ip3.

  4. Repeat this procedure for each database server.
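
Because the same DNS servers must be set in both the /etc/resolv.conf file and the server ILOM, generating both from one list helps keep them consistent across database servers. The following is a minimal sketch. The function names resolv_conf and ilom_dns_command are illustrative assumptions, and the sketch only builds text. It does not edit files or run ipmitool.

```python
def resolv_conf(domain: str, servers: list) -> str:
    """Build resolv.conf text with one nameserver line per DNS server."""
    lines = [f"search        {domain}"]
    lines += [f"nameserver    {server}" for server in servers]
    return "\n".join(lines) + "\n"


def ilom_dns_command(servers: list) -> str:
    """Build the ILOM command with a comma-separated server list."""
    return "set /SP/clients/dns nameserver=" + ",".join(servers)


if __name__ == "__main__":
    # Sample addresses reused from the examples in this section.
    dns_servers = ["10.7.7.3", "129.148.5.5"]
    print(resolv_conf("example.com", dns_servers), end="")
    print(ilom_dns_command(dns_servers))
```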

Task 4: Change the DNS Server on Exadata Storage Server

The following procedure describes how to change the DNS server on Exadata Storage Servers:

Note:

The NTP settings can also be set during this procedure.
  1. Log in as the root user.

  2. Specify a time interval to repair the disk and bring it back online as follows. The default DISK_REPAIR_TIME attribute value of 3.6 hours should be long enough for most environments.

    1. Check the repair time for all mounted disk groups by logging in to the Oracle ASM instance, and running the following query:

      SQL> SELECT dg.name, a.value FROM v$asm_diskgroup dg, v$asm_attribute a
           WHERE dg.group_number = a.group_number AND a.name = 'disk_repair_time';
      
    2. Adjust the parameter, as needed, using the following command:

      SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'DISK_REPAIR_TIME'='h.nH';
      

      In the preceding command, h.n is the amount of time in hours, such as 4.6.

  3. Check that putting the grid disks offline will not cause a problem for Oracle ASM using the following command:

    # cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
    

    The asmdeactivationoutcome attribute should have the value Yes for all grid disks. If one or more grid disks do not return Yes, then restore data redundancy for the disk group, and repeat the command until all grid disks return Yes.

  4. Inactivate all grid disks on the cell using the following command:

    # cellcli -e alter griddisk all inactive
    

    This command may take more than 10 minutes to complete. Inactivating the grid disks automatically takes the disks offline in the Oracle ASM instance.

  5. Confirm the grid disks are offline as follows:

    1. Check the status of the grid disks using the following command:

      # cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
      

      The output should show asmmodestatus=OFFLINE or asmmodestatus=UNUSED, and asmdeactivationoutcome=Yes for all grid disks.

    2. List the grid disks to confirm that they are offline using the following command:

      # cellcli -e list griddisk
      
  6. Shut down the cell services and ocrvottargetd service using the following commands:

    # cellcli -e alter cell shutdown services all
    # service ocrvottargetd stop
    

    Note:

    The ocrvottargetd service is not included in some releases.
  7. Use the ipconf utility to change the DNS settings using the following command:

    # /usr/local/bin/ipconf
    
  8. Restart the services using the following commands. The server does not need to be rebooted.

    # service ocrvottargetd start
    # cellcli -e alter cell startup services all
    
  9. Activate the grid disks when the cell comes online using the following command:

    # cellcli -e alter griddisk all active
    
  10. Verify the disks are active using the following command. The output should show active.

    # cellcli -e list griddisk
    
  11. Verify the grid disk status as follows:

    1. Check that all grid disks are online using the following command:

      # cellcli -e list griddisk attributes name, asmmodestatus
      
    2. Wait until Oracle ASM synchronization completes for all grid disks. Each disk goes to the SYNCING state first, and then to ONLINE. The following is an example of the output:

      DATA_CD_00_dm01cel01 ONLINE
      DATA_CD_01_dm01cel01 SYNCING
      DATA_CD_02_dm01cel01 OFFLINE
      DATA_CD_03_dm01cel01 OFFLINE
      DATA_CD_04_dm01cel01 OFFLINE
      DATA_CD_05_dm01cel01 OFFLINE
      DATA_CD_06_dm01cel01 OFFLINE
      DATA_CD_07_dm01cel01 OFFLINE
      DATA_CD_08_dm01cel01 OFFLINE
      DATA_CD_09_dm01cel01 OFFLINE
      DATA_CD_10_dm01cel01 OFFLINE
      DATA_CD_11_dm01cel01 OFFLINE
      

      Oracle ASM synchronization is complete when all grid disks show asmmodestatus=ONLINE.

  12. Repeat this procedure for each Exadata Storage Server.
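
Before moving on to the next Exadata Storage Server, every grid disk must show ONLINE. The following is a minimal sketch that scans saved output from the grid disk status command and reports the disks that are still synchronizing or offline. The function name pending_grid_disks is an illustrative assumption, and the sketch assumes two whitespace-separated columns (name and asmmodestatus) as in the example output.

```python
def pending_grid_disks(listing: str) -> list:
    """Return (name, status) pairs for grid disks that are not yet ONLINE."""
    pending = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # Skip blank or incomplete lines.
        name, status = fields[0], fields[1]
        if status != "ONLINE":
            pending.append((name, status))
    return pending


if __name__ == "__main__":
    sample = """\
DATA_CD_00_dm01cel01 ONLINE
DATA_CD_01_dm01cel01 SYNCING
DATA_CD_02_dm01cel01 OFFLINE
"""
    for name, status in pending_grid_disks(sample):
        print(f"{name} is still {status}")
```

An empty result indicates that synchronization is complete for the listed disks.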

Task 5: Change the DNS Server on the KVM Switch

The following procedure describes how to change the DNS server configuration using the KVM switch:

Notes:

  • The KVM switch is only available in Oracle Exadata Database Machine X2-2 racks and Oracle Exadata Storage Expansion Racks with Exadata Storage Server with Sun Fire X4270 M2 Servers.

  • The KVM switch does not support NTP.

  1. Log in to the KVM switch. You can log in directly on the KVM switch or access the switch using the host name or IP address over the Internet.

  2. Select Appliance from Unit View.

  3. Select DNS from Appliance Settings.

  4. Select DNS Configuration.

  5. Enter the DNS configuration. The following configuration options are available:

    • DNS Mode (Manual, DHCP, DHCPv6)

    • DNS Server Addresses (Primary, Secondary, Tertiary)

  6. Click Save.

Changing the NTP Servers

The configuration settings for the Network Time Protocol (NTP) servers can be changed after initial setup. All servers and switches in Oracle Exadata Database Machine should reference the same NTP servers so that the servers are synchronized to the same time. This section contains the tasks and procedures for setting the Oracle Exadata Database Machine servers and switches to the same NTP server addresses. Oracle recommends changing each server one at a time.

This section contains the following tasks:

Notes:

  • It is necessary to restart Oracle Exadata Database Machine after completing these tasks.

  • Up to two NTP servers can be configured for use with Oracle Exadata Database Machine.

Task 1: Set the NTP Server Address on the Database Servers

The following procedure describes how to set the NTP server address on the database server:

  1. Stop the NTP services on the database server using the following command:

    service ntpd stop
    
  2. Edit the /etc/ntp.conf file to set the IP address of the new NTP server.
    
  3. Start the NTP services on the database server using the following command:

    service ntpd start
    
  4. Repeat steps 1 through 3 for each database server.

Task 2: Set the NTP Server Address on the Sun Datacenter InfiniBand Switch 36 Switch

The following procedure describes how to set the NTP server address on the Sun Datacenter InfiniBand Switch 36 switch:

Note:

Do not manually edit the files on the InfiniBand switches.
  1. Log in as the ilom-admin user.

  2. Set the date, time zone, and Network Time Protocol (NTP) using one of the following methods:

    • Using the Configuration page on the ILOM graphical interface.

    • Using the following commands:

      set /SP/clock timezone=preferred_tz
      set /SP/clients/ntp/server/1 address=ntp_ip1
      set /SP/clients/ntp/server/2 address=ntp_ip2
      set /SP/clock usentpserver=enabled 
      

      In the preceding commands, preferred_tz is the preferred time zone, and ntp_ip1 and ntp_ip2 are the NTP server IP addresses. It is not necessary to configure both NTP servers, but at least one should be configured.

Task 3: Set the NTP Server Address on the Cisco Ethernet Switch

The following procedure describes how to set the NTP server on the Cisco Ethernet switch:

  1. Access the switch using one of the following methods, based on the firmware version:

    • Firmware version earlier than version 12.2: Access the switch using Telnet, and log in as the administrator using the administrative password.

    • Firmware version 12.2 or later: Access the switch using SSH, and log in as the admin user with the admin password.

      Note:

      If SSH has not been configured, then use Telnet to access the switch as the admin user.
  2. Change to enable mode using the following command. When prompted for a password, use the administrator password.

    Switch> enable
    
  3. Review the current configuration using the following command:

    Switch# show running-config
    
  4. Erase the current NTP server configuration using commands similar to the following. In the example, the current IP addresses are 10.10.10.1 and 10.8.8.1.

    Switch# configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    Switch(config)# no ntp server 10.10.10.1
    Switch(config)# no ntp server 10.8.8.1
    Switch(config)# end
    Switch# write memory
    Building configuration...
    Compressed configuration from 2603 bytes to 1158 bytes [OK]
    

    Note:

    Each current NTP IP address being changed needs to be erased. Invalid IP addresses must also be erased.
  5. Configure up to two NTP servers. The following is an example:

    Switch# configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    Switch(config)# ntp server 10.7.7.1 prefer
    Switch(config)# ntp server 10.9.9.1
    Switch(config)# end
    Switch# write memory
    Building configuration...
    Compressed configuration from 2603 bytes to 1158 bytes [OK]
    
  6. Verify the changes by running the following command and reviewing the output:

    Switch# show running-config
    

    The following is an example of the output from the command:

    Building configuration...
    ...
    ntp server 192.168.10.10 prefer
    ...
    
  7. Save the configuration using the following command:

    Switch# copy running-config startup-config
    Destination filename [startup-config]? 
    Building configuration...
    Compressed configuration from 14343 bytes to 3986 bytes[OK]
    
  8. Exit from the session using the following command:

    Switch# exit
    

Task 4: Set the NTP Server on Exadata Storage Servers

The following procedure describes how to set the NTP server on Exadata Storage Servers:

Note:

The DNS settings can also be set during this procedure.
  1. Log in as the root user.

  2. Follow steps 2 through 5 of "Task 4: Change the DNS Server on Exadata Storage Server".

  3. Shut down the cell services and ocrvottargetd service using the following commands:

    # cellcli -e alter cell shutdown services all
    # service ocrvottargetd stop
    

    Note:

    The ocrvottargetd service is not included in some releases.
  4. Use the ipconf utility to change the NTP settings using the following command:

    # /usr/local/bin/ipconf
    
  5. Restart the services using the following commands. The server does not need to be rebooted.

    # service ocrvottargetd start
    # cellcli -e alter cell startup services all
    
  6. Follow steps 9 through 11 of "Task 4: Change the DNS Server on Exadata Storage Server".

Task 5: Restart Oracle Exadata Database Machine

After changing the servers and switches, restart Oracle Exadata Database Machine.

Changing the Time Zone Settings

This section provides information about changing the time zones on Oracle Exadata Database Machine after initial configuration and deployment. The following components need to be modified when changing the time zone settings:

  • Exadata Storage Servers

  • Database servers

  • Sun Datacenter InfiniBand Switch 36 switches

  • Cisco switch

Note:

Cell services must be stopped before changing the time zone settings on the Exadata Storage Servers. Oracle Clusterware Services must be stopped before changing the time zone settings.

The following tasks describe how to change the time zone settings on the components:

Task 1: Change Time Zone Settings on Exadata Storage Servers

The following procedure describes how to change the time zone setting on Exadata Storage Servers. Complete the setting changes to all storage servers before changing the settings on the database servers.

  1. Log in as the root user on the storage server.

  2. Stop the processes on the cells using the following command:

    # cellcli -e alter cell shutdown services all
    
  3. Run the ipconf script using the following command:

    # /opt/oracle.cellos/ipconf
    
  4. Proceed through the script prompts until you reach the time zone prompts. Do not change any other settings. The following is an example of the time zone prompts for changing the time zone from Antarctica to the United States. The number for the United States is 230.

    The current timezone: Antarctica/McMurdo
    Do you want to change it (y/n) [n]: y
     
    Setting up local time...
     
    1) Andorra
    2) United Arab Emirates
    3) Afghanistan
    .
    .
    .
    15) Aruba
    16) Aaland Islands
    Select country by number, [n]ext, [l]ast: 230
    
    Selected country: United States (US). Now choose a zone
     
    1) America/New_York
    2) America/Detroit
    3) America/Kentucky/Louisville
    .
    .
    .
    15) America/North_Dakota/New_Salem
    16) America/Denver
    Select zone by number, [n]ext: 1
    
    Selected timezone: America/New_York
    Is this correct (y/n) [y]:
    
  5. Proceed through the rest of the script prompts, but do not change any other values.

  6. Ensure the time zone changes are in the following files. Examples of the changes are shown for the files.

    • /opt/oracle.cellos/cell.conf file

      $VAR1 = {
                'Hostname' => 'xdserver.us.example.com',
                'Ntp servers' => [
                                   '10.141.138.1'
                                 ],
                'Timezone' => 'America/New_York',
      
    • /etc/sysconfig/clock file

      ZONE="America/New_York"
      UTC=false
      ARC=false
      #ZONE="Antarctica/McMurdo"
      #ZONE="America/New_York"
      #ZONE="America/Los_Angeles"
      

      The uncommented value, not preceded by #, is the current setting.

    • /etc/localtime file

      Run the strings /etc/localtime command to verify the change. The last line shows the time zone.

      ~^Ip
      EST5EDT,M3.2.0,M11.1.0
      
  7. Restart the server.

  8. Use the date command to see the current time zone. The following is an example of the output from the command:

    # date
    Tue Jan 29 17:37:01 EST 2013
    
  9. Review the $ADR_BASE/diag/asm/cell/host_name/alert.log file. The time that processes were restarted should match the current and correct time.
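
The check in step 6 for the /etc/sysconfig/clock file can be sketched as a small shell helper. This is a minimal sketch, not part of the Oracle tooling: the function names active_zone and check_zone are hypothetical, and the sketch assumes the ZONE value is quoted as in the example above.

```shell
#!/bin/sh
# Sketch only: active_zone and check_zone are hypothetical helper names.

# Print the active ZONE value from a clock file such as /etc/sysconfig/clock.
# Commented-out entries (lines that start with #) are ignored.
active_zone() {
    clock_file="$1"
    sed -n 's/^ZONE="\([^"]*\)".*$/\1/p' "$clock_file" | head -n 1
}

# Compare the active zone with the expected zone. Returns 0 on a match.
check_zone() {
    clock_file="$1"
    expected="$2"
    actual="$(active_zone "$clock_file")"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $clock_file uses $actual"
    else
        echo "MISMATCH: $clock_file uses '$actual', expected '$expected'"
        return 1
    fi
}

# Example call on a storage server:
#   check_zone /etc/sysconfig/clock America/New_York
```

The file path is passed as an argument so the same helper can be pointed at a copy of the file, for example one retrieved from another server for comparison.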

Task 2: Change Time Zone Settings on the Database Servers

The following procedure describes how to change the time zone setting on the database servers:

  1. Log in as the root user on the database server.

  2. Shut down Oracle Clusterware Services using the following command:

    # GI/bin/crsctl stop crs
    
  3. Use the following command to copy the /etc/sysconfig/clock file from one of the Exadata Storage Servers.

    # scp root@storage_cell:/etc/sysconfig/clock /etc/sysconfig/clock
    
  4. Use the following command to change the Oracle Clusterware settings to prevent CRS from starting after restarting the server:

    # GI/bin/crsctl disable crs
    
  5. Restart the database server.

  6. Use the date command to verify the change for the time zone.

  7. Use the following command to change the Oracle Clusterware settings to start after restarting the server:

    # GI/bin/crsctl enable crs
    
  8. Use the following command to start CRS on the server:

    # GI/bin/crsctl start crs
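
The order of the commands in this task can be rehearsed with a dry-run wrapper before running them for real. The following is a sketch under stated assumptions: GI_HOME stands for the Grid Infrastructure home shown as GI above, its default value here is a hypothetical path, and the run, before_reboot, and after_reboot function names are illustrative only.

```shell
#!/bin/sh
# Sketch only. GI_HOME is an assumed variable for the Grid Infrastructure home
# (shown as GI in the steps above); the default path below is hypothetical.
GI_HOME="${GI_HOME:-/u01/app/12.1.0.2/grid}"
DRY_RUN="${DRY_RUN:-1}"   # 1 prints the commands, 0 runs them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 2 through 4: stop CRS, copy the clock file, keep CRS from starting at boot.
before_reboot() {
    storage_cell="$1"
    run "$GI_HOME/bin/crsctl" stop crs &&
    run scp "root@${storage_cell}:/etc/sysconfig/clock" /etc/sysconfig/clock &&
    run "$GI_HOME/bin/crsctl" disable crs
}

# Steps 6 through 8: check the date, then re-enable and start CRS.
after_reboot() {
    run date &&
    run "$GI_HOME/bin/crsctl" enable crs &&
    run "$GI_HOME/bin/crsctl" start crs
}
```

With DRY_RUN left at 1, calling before_reboot with a storage server name only prints the three commands. The restart in step 5 is left as a manual action between the two functions.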
    

Task 3: Change Time Zone Settings on the Sun Datacenter InfiniBand Switch 36 Switches

The following procedure describes how to change the time zone setting on the Sun Datacenter InfiniBand Switch 36 switches:

  1. Log in to the switch using SSH as the root user.

  2. Use the version command to check the version of the switch software. The following is an example of the output from the command:

    # version
    SUN DCS 36p version: 1.3.3-2
    Build time: Apr  4 2011 11:15:19
    SP board info:
    Manufacturing Date: 2013.02.19
    Serial Number: "ABCDE1234"
    Hardware Revision: 0x0007
    Firmware Revision: 0x0000
    BIOS version: SUN0R100
    BIOS date: 06/22/2010
    
  3. Administer the switch as follows, depending on the software version:

    • If the software version is 1.1.3-2 or later, then administration of the switch is done using the ILOM as follows:

      1. Log in to the ILOM using the web address http://switch_alias.

      2. Select the Configuration tab.

      3. Select the Clock tab.

      4. Ensure the Synchronize Time Using NTP field is enabled.

      5. Enter the correct IP addresses for the NTP servers.

      6. Click Save.

    • If the software version is earlier than 1.1.3-2, then log in to the switch using SSH as follows:

      1. Log in to the switch using the following command:

        # ssh -l root {switch_ip | switch_name}
        
      2. Stop the ntpd daemon using the following command:

        # service ntpd stop
        
      3. Save a copy of the /etc/localtime file using the following command:

        # cp /etc/localtime /etc/localtime.backup
        
      4. Identify the file in the /usr/share/zoneinfo directory for the time zone. The following is an example for the United States:

        # cd /usr/share/zoneinfo/US
        # ls
        Alaska  Aleutian  Arizona  Central  Eastern  East-Indiana  Hawaii 
        Indiana-Starke  Michigan  Mountain  Pacific  Samoa
        
      5. Copy the appropriate file to /etc/localtime. The following is an example of the command:

        # cp /usr/share/zoneinfo/US/Eastern /etc/localtime
        
      6. Manually set the current date and time to values near the current time in the new time zone. Use the date command with the MMddHHmmCCyy format for month, day, hour, minute, century, and year.

      7. Synchronize the time with the NTP server. The following is an example of setting the date manually and then synchronizing with the NTP server:

        # date 013110452013
        # ntpd -q -g
        
      8. Validate the date using the following command:

        # date
        
      9. Restart the ntpd daemon using the following command:

        # service ntpd start
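
Two details in this task lend themselves to a quick check from an administration host: whether the switch software is at or above 1.1.3-2, and how a timestamp maps to the MMddHHmmCCyy argument. The following is a minimal sketch that assumes GNU coreutils (date -d and sort -V) on the host where it runs. It is not intended to run on the switch itself, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes GNU date (-d) and GNU sort (-V) are available.

# Format a timestamp as MMddHHmmCCyy, the argument format used by the date command.
date_arg() {
    date -d "$1" +%m%d%H%M%Y
}

# Return 0 if version $1 is at or above version $2.
version_at_least() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example calls:
#   date_arg '2013-01-31 10:45'          prints 013110452013
#   version_at_least 1.3.3-2 1.1.3-2     succeeds, so the ILOM procedure applies
```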
        

Task 4: Change Time Zone Settings on the Cisco Switch

The following procedure describes how to change the time zone setting on the Cisco switch:

  1. Use Telnet to connect to the Cisco switch as the root user.

  2. Use the enable command to enter privileged mode.

  3. Use the configure terminal command to begin configuration.

  4. Set the clock using the following command:

    clock timezone zone hours_offset [minutes_offset]
    

    In the preceding command, zone is the name of the time zone to be displayed when standard time is in effect, hours_offset is the hours offset from UTC, and minutes_offset is the minutes offset from UTC. The default time zone is UTC.

    To set summer time (daylight saving time), use the following command:

    clock summer-time zone recurring [week day month hh:mm week day month   \
          hh:mm [offset]]
    

    In the preceding command, week is the week of the month, from 1 to 5, day is the day of the week, such as Sunday or Monday, month is the month, such as January or June, hh:mm is the time in 24-hour format, and offset is the number of minutes to add during summer time. The default for offset is 60. Summer time is disabled by default.

    The following is an example of setting the time zone to US Eastern time with summer time enabled:

    $ telnet dmcisco-ip
    Connected to switch name
    Escape character is '^]'.
    
    User Access Verification
    
    Password: 
    dmcisco-ip>enable
    Password: 
    dmcisco-ip#configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    dmcisco-ip(config)#clock timezone EST -5
    dmcisco-ip(config)#clock summer-time EDT recurring
    dmcisco-ip(config)#end
    dmcisco-ip#write memory
    Building configuration...
    Compressed configuration from 6421 bytes to 2041 bytes[OK]
    dmcisco-ip#show clock
    12:03:43.516 EDT Sat May 12 2012
    dmcisco-ip#
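
The optional minutes_offset argument is not used in the example above. The following sketch prints the configuration lines for a given zone so that they can be reviewed before being typed at the switch prompt. The ios_clock_lines function is a hypothetical helper, not a Cisco or Oracle tool, and the IST example (UTC plus 5 hours 30 minutes) is an illustration only.

```shell
#!/bin/sh
# Sketch only: ios_clock_lines is a hypothetical helper that formats the
# clock commands described above. It does not connect to the switch.
# Arguments: standard-time name, hours offset, minutes offset, summer-time name.
ios_clock_lines() {
    std_name="$1"
    hours="$2"
    minutes="${3:-0}"
    dst_name="$4"
    if [ "$minutes" -eq 0 ]; then
        echo "clock timezone ${std_name} ${hours}"
    else
        echo "clock timezone ${std_name} ${hours} ${minutes}"
    fi
    # Emit the summer-time line only when a summer-time name is given.
    if [ -n "$dst_name" ]; then
        echo "clock summer-time ${dst_name} recurring"
    fi
}

# Example calls:
#   ios_clock_lines EST -5 0 EDT    US Eastern time with summer time
#   ios_clock_lines IST 5 30 ""     a zone with a minutes offset and no summer time
```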
    

Configuring the KVM Switch

The following procedure describes how to configure the KVM (Keyboard, Video, Mouse) switch. The switch is configured with all the connected components powered off.

Note:

The KVM switch is only available in Oracle Exadata Database Machine X2-2 racks and Oracle Exadata Storage Expansion Racks with Exadata Storage Server with Sun Fire X4270 M2 Servers.

  1. Pull the KVM tray out from the front of the rack, and open it using the handle.

  2. Touch the touch pad.

  3. Toggle between the host and KVM interface by pressing the Ctrl key on the left side twice, similar to a double-click on a mouse.

  4. Select Target Devices from the Unit View of the user interface. The number of sessions shown should be 22 for Oracle Exadata Database Machine Full Rack, 11 for Oracle Exadata Database Machine Half Rack, and 5 for Oracle Exadata Database Machine Quarter Rack. The number of sessions should be 18 for Oracle Exadata Storage Expansion Full Rack, 9 for Oracle Exadata Storage Expansion Half Rack, and 4 for Oracle Exadata Storage Expansion Quarter Rack.

    Note:

    If not all sessions are shown, then select IQ Adaptors from the Ports heading. Click the table heading, and then Port, to sort the sessions by port number. Note any missing items. The sessions are numbered from the bottom of the rack to the top.

  5. Return to the Target Devices screen.

  6. Select Local from User Accounts.

  7. Click Admin under Users.

  8. Set a password for the Admin account. Do not modify any other parameters.

  9. Click Save.

  10. Select Network from Appliance Settings. The Network Information screen appears.

  11. Select IPv4 or IPv6.

  12. Enter the values for Address, Subnet, Gateway, and the IP addresses of the DNS servers.

  13. Click Save.

  14. Connect the KVM LAN1 Ethernet port to the management network.

  15. Verify the port has been configured correctly by checking the MAC address on the Network Information screen. The address should match the label next to the LAN1/LAN2 ports on the rear of the KVM switch.

  16. Select Overview from Appliance.

  17. Enter a name for the KVM switch.

  18. Click Save.

  19. Restart the KVM switch by selecting Reboot under Overview.

  20. Examine the firmware version of the switch by selecting Versions from Appliance Settings. There are two version numbers shown, Application and Boot, as shown in the following:

    Required version is:
    Application 1.2.10.15038
    Boot  1.6.15020
    

    Note:

    The recommended firmware version is 1.2.8 or later.

    If the firmware is 1.2.3 or earlier, then it can be upgraded from a network browser. If it is version 1.2.3 or later, then it can be upgraded from the local keyboard using a flash drive plugged in to the KVM USB port. To upgrade the firmware, do the following:

    1. Select Overview from Appliance.

    2. Select Upgrade Firmware from the Tools list.

    3. Select the method to upgrade.

    4. Click Upgrade.

    5. Confirm the firmware version.

    The firmware is available at

    http://www.avocent.com/Pages/GenericTwoColumn.aspx?id=12541

Configuring the KVM Switch to Access a Server

The following procedure describes how to configure the KVM switch to access the servers:

Note:

The KVM switch is only available in Oracle Exadata Database Machine X2-2 racks and Oracle Exadata Storage Expansion Racks with Exadata Storage Server with Sun Fire X4270 M2 Servers.

  1. Select Target Devices from Unit View.

  2. Power on the server. The power button is on the front panel. If the button seems stuck, then use a small tool to loosen the button.

  3. Click the system name in the Name column using the left mouse button.

  4. Click Overview, and overwrite the name with the Oracle standard naming format of customer prefix, node type, and number. For example, trnacel03 has the prefix trna, and is storage cell 3 from the bottom of the rack, and trnadb02 has the prefix trna, and is database server 2 from the bottom of the rack.

  5. Click Save.

  6. Repeat steps 2 through 5 for each server in the rack. Each server boots up through BIOS, and boots the operating system with the default factory IP configuration.
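
The naming format in step 4 can be sketched as a small helper for preparing the list of names before entering them in the KVM interface. This is an illustration only: the kvm_name function is hypothetical, and the two-digit, zero-padded position is inferred from the trnacel03 and trnadb02 examples.

```shell
#!/bin/sh
# Sketch only: kvm_name is a hypothetical helper. It builds a name from the
# customer prefix, the node type (cel or db), and the position from the
# bottom of the rack, zero-padded to two digits as in the examples above.
# Pass the position without a leading zero (3, not 03).
kvm_name() {
    prefix="$1"
    node_type="$2"
    number="$3"
    printf '%s%s%02d\n' "$prefix" "$node_type" "$number"
}

# Example calls:
#   kvm_name trna cel 3    prints trnacel03
#   kvm_name trna db 2     prints trnadb02
```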

Accessing a Server Using the KVM Switch

The following procedure describes how to access a server using the KVM switch:

Note:

The KVM switch is only available in Oracle Exadata Database Machine X2-2 racks and Oracle Exadata Storage Expansion Racks with Exadata Storage Server with Sun Fire X4270 M2 Servers.

  1. Select Target Devices from Unit View.

  2. Click the system name in the Name column using the left mouse button.

  3. Click the KVM session.