15 Using InfiniBand Partitions in Exalogic Physical Environments

This chapter describes how to use InfiniBand partitions for network isolation on Exalogic's InfiniBand fabric in the Exalogic physical environment.

Note:

If you are connecting your Exalogic machine to Oracle Exadata Database Machine on the same InfiniBand fabric, you must use the default partition for data traffic between the Exalogic machine and Oracle Exadata Database Machine. In this scenario, if you wish to implement network isolation, you can configure IP subnets on the default IPoIB network interface.

In addition, see Section 15.9, "Important Notes for Combined Exalogic-Exadata Fabric Users" for more information about using partitions in this scenario.


15.1 Overview of Partitioning

An InfiniBand partition defines a group of InfiniBand nodes that are allowed to communicate with one another. You can use InfiniBand partitions to increase security by implementing network isolation on the Exalogic machine's InfiniBand fabric. In addition, you can associate InfiniBand nodes with specific VLANs.

An InfiniBand node can be a member of multiple partitions. When a packet arrives at a compute node, the partition key (pkey) of the packet is matched with the Subnet Manager configuration. This validation prevents a compute node from communicating with another compute node outside its partition.

Based on your requirements, you can create additional partitions as follows:

  • Create a unique partition for Exalogic's private InfiniBand fabric by setting nondefault partition keys.

    This scenario applies to both single rack and multiple Exalogic racks.

  • Create Virtual LANs (VLANs) on the client access network for EoIB configuration by specifying nondefault partition keys.

    VLAN tagging for a virtual network interface (VNIC) on the EoIB network is optional.

15.2 Understanding Partition Keys

A partition key (pkey) is a unique ID assigned to an InfiniBand partition. The pkey of the default partition is 0x7fff. When a pkey is created, it is a 15-bit number. After the membership type is set, the pkey value becomes a 16-bit number. The Most Significant Bit (MSB) of the 16-bit pkey value denotes the membership type. A limited member has a value of 0, and a full member has a value of 1.

A full member can communicate with both full and limited members of the partition. However, a limited member can only communicate with a full member.

When assigning a pkey value for a unique, nondefault partition, select a 15-bit value in the range 0x0001 to 0x7ffe (for example, 0x1234). A total of 32767 pkeys (0x0001 to 0x7fff) are available, including 0x7fff, which is the pkey of the default partition. Do not assign pkeys that differ only in the MSB of their 16-bit numbers (for example, 0x8005 and 0x0005).
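
The relationship between the 15-bit pkey and the 16-bit value that carries the membership type can be illustrated with a short Python sketch. The function names are hypothetical and exist only for this illustration; they are not part of any Exalogic or Subnet Manager tool.

```python
FULL_MEMBER_BIT = 0x8000  # most significant bit of the 16-bit pkey value


def to_16bit_pkey(base_pkey, full_member):
    """Combine a 15-bit pkey with its membership type (MSB 1 = full, 0 = limited)."""
    if not 0x0001 <= base_pkey <= 0x7FFF:
        raise ValueError("base pkey must be in the range 0x0001-0x7fff")
    return (base_pkey | FULL_MEMBER_BIT) if full_member else base_pkey


def same_partition(pkey_a, pkey_b):
    """Return True if two 16-bit pkeys differ at most in the membership bit."""
    return (pkey_a & 0x7FFF) == (pkey_b & 0x7FFF)


print(hex(to_16bit_pkey(0x0005, full_member=True)))   # 0x8005
print(hex(to_16bit_pkey(0x0005, full_member=False)))  # 0x5
print(same_partition(0x8005, 0x0005))                 # True
```

The last call shows why 0x8005 and 0x0005 must not be assigned to different partitions: once the membership bit is masked off, both values name the same partition.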

Guidelines for Managing pkey Allocation in a Hybrid Rack

The term hybrid rack denotes an Exalogic machine on which half the compute nodes are in a physical configuration and the other half constitutes a virtualized data center. For more information about hybrid racks, see the Exalogic Elastic Cloud Release Notes.

On a hybrid Exalogic rack, Exalogic Control ensures that a unique pkey is assigned to each partition in the virtual environment. However, in the physical half of the rack, pkeys continue to be assigned manually, typically by the network administrator. The following guidelines will help ensure that the pkeys assigned manually to the partitions created in the physical half of a hybrid rack are different from those that Exalogic Control assigns automatically to partitions created in the virtual half of the rack.

  1. Make a list (say, L1) of all the pkeys assigned to partitions that were created before the rack was converted to a hybrid configuration.

    This set includes the pkey for the IPoIB-default partition (0x7fff) and pkeys for any nondefault partitions that were created in the physical configuration. Note that these pkeys are not guaranteed to be sequential, because they are assigned manually by administrators, who may be using different conventions for assigning pkeys to partitions. For example, for EoIB partitions, some administrators may follow the convention of assigning pkey values that match the VLAN IDs used for the EoIB networks.

  2. Identify a list (say, L2) of pkeys to be assigned to partitions created in the physical half of the hybrid rack.

    Select a list that is preferably near the upper end of the 0x0001–0x7ffe range. For example, if you identify 0x7000 to 0x7ffe as your range, you can create up to 4095 partitions. We recommend the upper end because Exalogic Control assigns pkeys starting from the lower end, that is, from 0x0001.

  3. In this list (L2), mark or remove the pkeys that were assigned before the rack was converted to a hybrid configuration—that is, the L1 list you created earlier.

  4. As you create partitions, select pkeys from only the predetermined list (L2) and keep track of the pkeys that you are assigning.

    Such an approach provides a reasonable guarantee that pkeys assigned in the physical environment are different from the pkeys that Exalogic Control assigns in the virtual half of the rack.

After the Exalogic machine is converted to a hybrid rack, for every network (either IPoIB or EoIB) that you create in the virtual half of the hybrid rack, Exalogic Control automatically assigns a unique pkey, starting from 0x0001. While selecting an unused pkey for a new partition, Exalogic Control will skip any pkeys (both L1 and L2) that are used for partitions in the physical half of the configuration. This way, every partition on the hybrid rack—regardless of whether it is on the physical or virtual part—will have a unique pkey.
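
The bookkeeping described in the guidelines above could be kept in a small script such as the following Python sketch. It is an illustration only: the function names, the sample L1 values, and the choice to hand out the highest unassigned pkey first are assumptions, and the script has no knowledge of the pkeys that Exalogic Control assigns.

```python
def build_physical_pool(first, last, already_used):
    """Return list L2: the pkeys in [first, last] that are not in L1 (already_used)."""
    if not (0x0001 <= first <= last <= 0x7FFE):
        raise ValueError("range must lie within 0x0001-0x7ffe")
    return [p for p in range(first, last + 1) if p not in already_used]


def next_pkey(pool, assigned):
    """Pick the highest unassigned pkey from the pool and record it as assigned."""
    for pkey in reversed(pool):
        if pkey not in assigned:
            assigned.add(pkey)
            return pkey
    raise RuntimeError("no unassigned pkeys left in the pool")


# Hypothetical L1: the default partition plus one manually created partition.
l1 = {0x7FFF, 0x7FF0}
l2 = build_physical_pool(0x7000, 0x7FFE, l1)
assigned = set()

print(hex(next_pkey(l2, assigned)))  # 0x7ffe
print(hex(next_pkey(l2, assigned)))  # 0x7ffd
```

Working downward from the top of the range keeps manually assigned pkeys as far as possible from the values that Exalogic Control assigns from 0x0001 upward.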

15.3 Before You Begin

Before you can start creating unique InfiniBand partitions, you must complete the following tasks:

  1. Verify the switch firmware version

  2. Gather the port GUIDs of compute nodes and BridgeX ports of gateway switches

  3. Identify the InfiniBand switches in your Exalogic machine's InfiniBand fabric and note down their IP addresses

  4. Determine which InfiniBand switch is running the master Subnet Manager (SM)

  5. Log in to the InfiniBand switch that is running the master Subnet Manager (SM)

15.3.1 Verifying InfiniBand Switch Firmware

Ensure that the InfiniBand switches in your Exalogic machine are running firmware version 2.0.4 or later. This requirement is mandatory.

15.3.2 Gathering Port GUIDs of Compute Nodes and BridgeX Ports of Gateway Switches

Before creating an InfiniBand partition, you must identify the port GUIDs of Exalogic compute nodes that will be added to the partition. In addition, you must identify the BridgeX ports of the gateway switches that are connected to those Exalogic compute nodes.

Identifying Port GUIDs on Compute Nodes

To identify the port GUIDs on an Exalogic compute node, run the following command on the command line:

# ibstat

This command displays output, as in the following example:

[Figure ibstat_eg.gif: example ibstat output with the Port GUID values highlighted]

In the example, the Port GUID values are highlighted in a rectangle for illustration only; the actual command output does not highlight them. Review the command output and note the Port GUID values for both InfiniBand ports on each compute node.

Alternatively, you can run the following command to display only GUIDs:

# ibstat | grep 'Port GUID:'
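
If you collect ibstat output from many compute nodes, a script can extract the GUIDs instead of grep. The following Python sketch shows one way to do it. The function name is hypothetical, and the sample text is an abridged illustration of the ibstat layout rather than complete command output.

```python
import re


def port_guids(ibstat_output):
    """Return the Port GUID values found in ibstat output, in order of appearance."""
    return re.findall(r"Port GUID:\s*(0x[0-9a-fA-F]+)", ibstat_output)


# Abridged sample in the ibstat layout (most fields omitted).
sample = """\
CA 'mlx4_0'
        Node GUID: 0x0021280001a0a364
        Port 1:
                Port GUID: 0x0021280001a0a365
        Port 2:
                Port GUID: 0x0021280001a0a366
"""

print(port_guids(sample))  # ['0x0021280001a0a365', '0x0021280001a0a366']
```

The pattern matches only lines labeled Port GUID, so the Node GUID and System image GUID values are ignored.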

Identifying BridgeX Ports on Gateway Switches

To identify the BridgeX ports on the gateway switches that are connected to your compute nodes, run the following command at the command prompt on each gateway switch that your compute node is connected to:

# showgwports

This command displays the BridgeX ports. Note down the values in the INTERNAL PORTS section of the output, as in the following example:

INTERNAL PORTS:
---------------
Device Port Portname PeerPort PortGUID LID IBState GWState
--------------------------------------------------------------------
Bridge-0 1 Bridge-0-1 4 0x002128548062c001 0x0015 Active Up
Bridge-0 2 Bridge-0-2 3 0x002128548062c002 0x000d Active Up
Bridge-1 1 Bridge-1-1 2 0x002128548062c041 0x000f Active Up
Bridge-1 2 Bridge-1-2 1 0x002128548062c042 0x0010 Active Up
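
The PortGUID column can also be extracted with a script. The following Python sketch parses the INTERNAL PORTS table shown above. The function name is hypothetical, and the rule that any other line ending in a colon starts a new section is an assumption about the rest of the showgwports output.

```python
def bridgex_port_guids(showgwports_output):
    """Map each BridgeX port name to its PortGUID from the INTERNAL PORTS table."""
    guids = {}
    in_internal = False
    for line in showgwports_output.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        if stripped == "INTERNAL PORTS:":
            in_internal = True
        elif in_internal and len(fields) == 8 and fields[0].startswith("Bridge-"):
            # Columns: Device Port Portname PeerPort PortGUID LID IBState GWState
            guids[fields[2]] = fields[4]
        elif in_internal and stripped.endswith(":"):
            in_internal = False  # assumed: another section header begins
    return guids


sample = """\
INTERNAL PORTS:
---------------
Device Port Portname PeerPort PortGUID LID IBState GWState
--------------------------------------------------------------------
Bridge-0 1 Bridge-0-1 4 0x002128548062c001 0x0015 Active Up
Bridge-0 2 Bridge-0-2 3 0x002128548062c002 0x000d Active Up
Bridge-1 1 Bridge-1-1 2 0x002128548062c041 0x000f Active Up
Bridge-1 2 Bridge-1-2 1 0x002128548062c042 0x0010 Active Up
"""

for name, guid in bridgex_port_guids(sample).items():
    print(name, guid)
```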

Tip:

In an Exalogic machine full rack, compute nodes 1 to 15 (counting from the bottom of the rack) connect their InfiniBand port 1 to gateway switch 1 and their InfiniBand port 2 to gateway switch 2. Similarly, compute nodes 16 to 30 are connected to gateway switches 3 and 4.

15.3.3 Identifying All InfiniBand Switches in the Fabric

To identify all InfiniBand switches (Sun Network InfiniBand Gateway Switch or Sun Datacenter InfiniBand Switch 36) running master or standby instances of Subnet Manager (SM) on the fabric, run the following command on any of the InfiniBand switches:

# ibswitches

This command displays the GUID, name, LID, and LMC for each switch. The output of the command is a mapping of GUID to LID for switches in the fabric.
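
Because the output maps GUIDs to LIDs, a script can turn it into a lookup table. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, and the regular expression assumes each switch line has the form shown in the sample, which uses made-up example values.

```python
import re

# Assumed line format:
# Switch  : <GUID> ports <n> "<name>" enhanced port 0 lid <LID> lmc <LMC>
SWITCH_LINE = re.compile(
    r'^Switch\s*:\s*(0x[0-9a-fA-F]+) ports \d+ "([^"]*)".*\blid (\d+) lmc (\d+)'
)


def switches_by_lid(ibswitches_output):
    """Return a dict that maps each switch LID to a (GUID, name) tuple."""
    result = {}
    for line in ibswitches_output.splitlines():
        match = SWITCH_LINE.match(line.strip())
        if match:
            guid, name, lid, _lmc = match.groups()
            result[int(lid)] = (guid, name)
    return result


sample = """\
Switch  : 0x002128548042c0a0 ports 36 "SUN IB QDR GW switch el01gw03" enhanced port 0 lid 63 lmc 0
Switch  : 0x00212856d0a2c0a0 ports 36 "SUN IB QDR GW switch el01gw04" enhanced port 0 lid 65 lmc 0
"""

print(switches_by_lid(sample)[65])
# ('0x00212856d0a2c0a0', 'SUN IB QDR GW switch el01gw04')
```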

15.3.4 Determining the SM Priority on an InfiniBand Switch

After identifying the InfiniBand switches and their IP addresses, you must log in to each of the switches and run the following command to identify the InfiniBand switch where the master Subnet Manager (SM) is running:

# getmaster

This command displays output, as shown in the following example:

Local SM enabled and running
20111122 08:45:02 Master SubnetManager on sm lid 11 sm guid 0x21283bad45c0a0 : SUN IB QDR GW switch el01gw04 10.10.10.10
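
When you check several switches, a script can pull the master SM details out of this output. The following Python sketch assumes the line format shown in the example above; the function name is hypothetical.

```python
import re


def master_sm(getmaster_output):
    """Return (lid, guid, description) of the master SM, or None if not reported."""
    match = re.search(
        r"Master SubnetManager on sm lid (\d+) sm guid (0x[0-9a-fA-F]+) : (.+)",
        getmaster_output,
    )
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3).strip()


sample = """\
Local SM enabled and running
20111122 08:45:02 Master SubnetManager on sm lid 11 sm guid 0x21283bad45c0a0 : SUN IB QDR GW switch el01gw04 10.10.10.10
"""

print(master_sm(sample))
# (11, '0x21283bad45c0a0', 'SUN IB QDR GW switch el01gw04 10.10.10.10')
```

The description ends with the host name and IP address of the switch that runs the master SM, which is the switch to log in to in the next section.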

15.3.5 Logging In to the InfiniBand Switch That Runs Master SM

After identifying the InfiniBand switch where master SM is running, log in to the ILOM shell for the InfiniBand switch as the ILOM administrator (ilom-admin). After logging in, run the show /SYS/Fabric_Mgmt command to log in to the restricted Linux shell. To view a list of available commands, you can run the help all command.

15.4 Moving from a Default Partition to a Custom Partition

Moving from a configuration that does not use InfiniBand partitions (that is, uses the default partition only) to a configuration with partitions involves the following steps:

  • Making all Exalogic compute nodes limited members of the default partition

    Note:

    By default, all Exalogic compute nodes are full members of the default partition.

  • Disabling IPoIB on the default partition

    Note:

    Do not complete this step if your Exalogic machine is connected to Oracle Exadata Database Machine on the same InfiniBand fabric.

See the following example:

  1. Run the following command to start the process:

    # smpartition start

  2. Run the following command to view the modified partition configuration:

    # smpartition list modified

  3. Run the following command to make Exalogic compute nodes limited members of the default partition and to disable IPoIB on the default partition:

    # smpartition modify -n Default -port ALL_CAS -m limited -flag

15.5 Creating an IPoIB Partition and Adding Ports

In this example procedure, you are creating a unique, non-default partition named myIPoIB for network isolation on Exalogic's private InfiniBand fabric by configuring a non-default partition key (pkey) value 0x005.

To do so, complete the following steps:

  1. Log in to the InfiniBand switch where master SM is running. For more information, see Section 15.3, "Before You Begin".

  2. To start the configuration process, run the following command:

    # smpartition start
    
  3. Create the myIPoIB partition with the pkey 0x005 with full membership by running the following command:

    # smpartition create -n myIPoIB -pkey 0x8005 -m full -flag ipoib
    
  4. Run the following command to add the compute node port GUIDs, which you noted down in Section 15.3.2, "Gathering Port GUIDs of Compute Nodes and BridgeX Ports of Gateway Switches", to the myIPoIB partition.

    # smpartition add -n myIPoIB -port portGUID1 portGUID2
    

    In this example, portGUID1 and portGUID2 are the ports that you want to add to the partition. This command example shows a few port entries only. You can add as many ports as necessary. An example port value is 0021280001cef8e3.

  5. If you intend to use the partition for creating vNICs, run the following command to add the gateway switch's BridgeX ports to the myIPoIB partition; otherwise, proceed to the next step. The gateway switch's BridgeX ports were noted down in Section 15.3.2, "Gathering Port GUIDs of Compute Nodes and BridgeX Ports of Gateway Switches".

    # smpartition add -n myIPoIB -port BridgeXPort1 BridgeXPort2 BridgeXPort3 BridgeXPort4
    

    In this example, BridgeXPort1, BridgeXPort2, BridgeXPort3, and BridgeXPort4 are the BridgeX ports that you want to add to the partition.

  6. Follow these steps to add the ibp0 and the ibp1 network device port GUIDs to the partition:

    1. SSH to the storage appliance and run the following commands to determine the port GUID of the ibp0 network device:

      :> configuration
      :configuration> net
      :configuration net> devices
      :configuration net devices> select ibp0
      :configuration net devices ibp0> show
      Properties:
                               speed = 32000 Mbit/s
                                  up = true
                              active = false
                               media = Infiniband
                         factory_mac = not available
                                port = 1
                                guid = 0x212800013e8fbf
      configuration net devices ibp0> done
      
    2. Repeat the previous step with select ibp1 to determine the port GUID of the ibp1 device.

    3. Run the following command to add storage appliance GUIDs to the myIPoIB partition:

      # smpartition add -n myIPoIB -port ibp0GUID ibp1GUID
      
  7. Run the following command to view the changed partition configuration:

    # smpartition list modified
    

    This command displays the new partition with its pkey, ports added to the partition, and membership type.

  8. Run the following command to confirm the partition configuration:

    # smpartition commit
    
  9. Create interfaces for the ibp0 and the ibp1 network devices and bond them by running these steps:

    1. Log in to the Browser User Interface (BUI) of the storage appliance in your Exalogic machine.

    2. Under the Configuration tab, select Network.

    3. Create a new datalink by dragging ibp0 from the Devices column to the Datalinks column, and set the following properties:

      i. In the Name field, enter ibp0.8005, where 8005 is the partition key.

      ii. In the Partition Key field, enter the partition key specified for the partition.

      iii. Set the Link Mode as Connected Mode.

    4. Create a datalink for ibp1 called ibp1.8005 by repeating step 3.

    5. Create an interface for the ibp0 network device by dragging the ibp0.8005 datalink to the Interfaces column, and set the following properties:

      i. In the Name field, enter ib0.8005, where 8005 is the partition key.

      ii. In the Configure with field, select Static Address List.

      iii. Enter the IPv4 Address/Mask as 0.0.0.0/8.

    6. Create an interface for ibp1 called ib1.8005 by repeating step 5.

    7. Bond the interfaces by adding another interface. Click the plus button next to the Interfaces column and enter the following information:

      i. In the Name field, enter a name that denotes the IB partition, such as IB_IF_8005.

      ii. In the Configure with field, select Static Address List.

      iii. Enter the IPv4 Address/Mask to configure for the InfiniBand device. You can use any unused IP address. In this example, we will use 192.168.33.15/24.

      iv. Check the IP MultiPathing Group box.

  10. Configure the network on the compute node and mount the share by following these steps:

    1. SSH to a compute node that is a member of the partition as the root user.

    2. Determine the active InfiniBand device by running the following commands on the compute node:

      # ifconfig ib0
      # ifconfig ib1
      

      The active device will have non-zero Rx/Tx bytes.

    3. Configure the network for the active device by running the following commands:

      # echo 0x8005 > /sys/class/net/ib0/create_child
      # ifconfig -a | grep -A6 8005
      ib0.8005  Link encap:InfiniBand  HWaddr 80:50:05:4C:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
                BROADCAST MULTICAST  MTU:2044  Metric:1
                RX packets:0 errors:0 dropped:0 overruns:0 frame:0
                TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:256
                RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
       
      # ifconfig ib0.8005 192.168.33.15
      
    4. Unmount any mounted shares using the umount command.

    5. Mount the share from the storage appliance. Use the IP address that you specified for the bonded interface on the storage appliance in step 9. In our example, we used 192.168.33.15.

      # mount 192.168.33.15:/export/share_name /mnt/share_name
      
    6. Repeat the previous steps for all compute nodes that are members of the partition.

15.6 Deleting a Partition

You can delete a non-default partition by running the following command:

# smpartition delete -n myIPoIB

This command deletes the myIPoIB partition.

Note:

Do not attempt to delete the default partition.

15.7 Creating a Partition for EoIB and Associating the pkey with a VNIC and VLAN

You can create a partition for EoIB (both inbound and outbound) and associate the partition's pkey with a VLAN and VNIC on the edge network.

Note:

The port GUID values, MAC addresses, VLAN IDs, compute node names, gateway switch names, and Ethernet connector names used in this procedure are examples only.

  1. At the command prompt on one of the gateway switches, run the following command:

    el01gw04# listlinkup | grep Bridge
    

    The following is an example of the output of the listlinkup command:

    Connector 0A-ETH Present
      Bridge-0 Port 0A-ETH-1 (Bridge-0-2) up (Enabled)
      Bridge-0 Port 0A-ETH-2 (Bridge-0-2) up (Enabled)
      Bridge-0 Port 0A-ETH-3 (Bridge-0-1) up (Enabled)
      Bridge-0 Port 0A-ETH-4 (Bridge-0-1) up (Enabled)
      Bridge-0 Port 1A-ETH-1 (Bridge-1-2) down (Enabled)
      Bridge-0 Port 1A-ETH-2 (Bridge-1-2) down (Enabled)
      Bridge-0 Port 1A-ETH-3 (Bridge-1-1) up (Enabled)
      Bridge-0 Port 1A-ETH-4 (Bridge-1-1) up (Enabled)
    

    From this example, identify the uplinks. You can determine that you can use any of the following Ethernet connectors for creating a VNIC:

    • 0A-ETH-1

    • 0A-ETH-2

    • 0A-ETH-3

    • 0A-ETH-4

    • 1A-ETH-3

    • 1A-ETH-4

      Note:

      This procedure uses 1A-ETH-3 as an example.

  2. Determine the GUIDs of the Exalogic compute node that requires the VNIC as follows:

    1. On the compute node that requires the VNIC, log in as root, and run the ibstat command on the command line. For example, log in to el01cn01 as root.

      Example:

      el01cn01# ibstat
      CA 'mlx4_0'
              CA type: MT26428
              Number of ports: 2
              Firmware version: 2.7.8100
              Hardware version: b0
              Node GUID: 0x0021280001a0a364
              System image GUID: 0x0021280001a0a367
              Port 1:
                      State: Active
                      Physical state: LinkUp
                      Rate: 40
                      Base lid: 120
                      LMC: 0
                      SM lid: 6
                      Capability mask: 0x02510868
                      Port GUID: 0x0021280001a0a365
                      Link layer: IB
              Port 2:
                      State: Active
                      Physical state: LinkUp
                      Rate: 40
                      Base lid: 121
                      LMC: 0
                      SM lid: 6
                      Capability mask: 0x02510868
                      Port GUID: 0x0021280001a0a366
                      Link layer: IB
      

      In the output, information about two ports is displayed. Identify the GUID and Base lid of the port that you want to use for creating the VNIC.

      For the example illustrated in this procedure, we will use the port with GUID 0x0021280001a0a366 and Base lid 121.

    2. On the same compute node, run the following command to view information about all the active links in the InfiniBand fabric:

      hostname# iblinkinfo.pl -R | grep hostname
      

      hostname is the name of the compute node. You can also specify the bonded IPoIB address of the compute node.

      Example:

      el01cn01# iblinkinfo.pl -R | grep el01cn01
      65   15[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>    121   2[  ] "el01cn01 EL-C 192.168.10.29 HCA-1" (Could be 5.0 Gbps)
      64   15[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>    120   1[  ] "el01cn01 EL-C 192.168.10.29 HCA-1" (Could be 5.0 Gbps)
      

      From the output of the iblinkinfo command, note the switch lid value (65, in the first column) associated with the Base lid of the compute node port that you noted earlier (121, in the first line).

  3. Determine the gateway switch that corresponds to the switch lid 65 by running the ibswitches command, as in the following example:

    Example:

    el01cn01# ibswitches
    Switch  : 0x002128548042c0a0 ports 36 "SUN IB QDR GW switch el01gw03" enhanced port 0 lid 63 lmc 0
    Switch  : 0x002128547f22c0a0 ports 36 "SUN IB QDR GW switch el01gw02" enhanced port 0 lid 6 lmc 0
    Switch  : 0x00212856d0a2c0a0 ports 36 "SUN IB QDR GW switch el01gw04" enhanced port 0 lid 65 lmc 0
    Switch  : 0x00212856d162c0a0 ports 36 "SUN IB QDR GW switch el01gw05" enhanced port 0 lid 64 lmc 0
    

    lid 65 corresponds to gateway switch el01gw04 with GUID 0x00212856d0a2c0a0.

  4. Define a dummy MAC address in the following format:

    last3_octets_of_switchGUID : last3_octets_of_computenode_adminIP_in_hex_format
    

    Example:

    GUID of switch: 00:21:28:56:d0:a2:c0:a0

    Last three octets: a2:c0:a0

    Administrative IP of the compute node that requires the VNIC: 192.168.1.1

    Last three octets: 168.1.1 (in hexadecimal notation: a8:01:01)

    MAC address: a2:c0:a0:a8:01:01

    Note:

    The dummy MAC address should be unique to the Exalogic network. Only even numbers are supported for the most significant byte of the MAC address (unicast). The above address is an example only.

  5. Ensure that you have noted down all port GUIDs and BridgeX ports.

  6. Log in to the InfiniBand switch where master SM is running. For more information, see Section 15.3, "Before You Begin".

  7. Run the following command to start the configuration process:

    # smpartition start
    
  8. Run the following command to create a myEoIB partition with the pkey 0x005 and full membership:

    # smpartition create -n myEoIB -pkey 0x005 -m full
    
  9. Run the following command to add port GUIDs and BridgeX ports, which you noted down in Section 15.3.2, "Gathering Port GUIDs of Compute Nodes and BridgeX Ports of Gateway Switches", to the myEoIB partition:

    # smpartition add -n myEoIB -port port_guid1 port_guid2 bridgex_port1 bridgex_port2
    

    Where port_guid1, port_guid2, bridgex_port1, and bridgex_port2 are the ports that you want to add to the partition. This command example shows a few port entries only. You can add as many ports as necessary. An example port value is 0021280001cef8e3.

  10. Run the following command to view the changed partition configuration:

    # smpartition list modified
    

    This command displays the new partition with its pkey, ports added to the partition, and membership type.

  11. Run the following command to confirm the partition configuration:

    # smpartition commit
    

    The myEoIB partition with 0x005 pkey is created.

  12. Log in to the gateway switch interface as root, and run the following command:

    # createvlan 1A-ETH-3 -vlan 10 -pkey 0x005
    

    Where 1A-ETH-3 is the Ethernet connector on the gateway switch, 10 is the VLAN identifier, and 0x005 is the partition key that you created earlier.

  13. To verify, run the following command:

    # showvlan
    

    The following information is displayed:

    Connector/LAG    VLN    PKEY
    --------------   ---    -----
    1A-ETH-3          10    0x005
    0A-ETH-1          11     ffff
    
  14. As root, log in to el01gw04, the gateway switch that you identified in Step 3. Use its IP address or host name to log in.

  15. Upon login, run the following command to create a VNIC:

    # createvnic 1A-ETH-3 -GUID 00212856d0a2c0a0 -mac a2:c0:a0:a8:01:01 -vlan 10 -pkey 0x005
    

    Where 1A-ETH-3 is the Ethernet connector, 00:21:28:56:d0:a2:c0:a0 is the GUID, a2:c0:a0:a8:01:01 is the dummy MAC address defined in Step 4, 10 is the VLAN identifier, and 0x005 is the partition key that you created earlier.

    This example creates a VNIC, such as eth4 (on Oracle Linux) or eoib0 (on Oracle Solaris), that is associated with VLAN 10 and with the partition whose pkey is 0x005.

  16. Run the following command to verify the VNICs:

    # showvnics
    

    The command displays output similar to the following example:

    [Figure vnicinfo3.png: example output of the showvnics command]

    Tip:

    After creating the interfaces, you can run the ifconfig command with the -a option to verify the MAC address on the compute node. For example, to verify the new interface and its MAC address, run the following command on the Oracle Linux compute node for which the VNIC was created:

    # ifconfig -a eth4

    The output of this command shows the HWADDR, which is the MAC address you defined for the VNIC in Step 4.

  17. On the compute node, run the following command to display the list of VNICs available on the compute node:

    el01cn01# mlx4_vnic_info -l
    

    This command displays the name of the new interface, as seen on the compute node, such as eth4. Note this ID.

  18. Create another VNIC for the same compute node, but using a connector on a different gateway switch. Note the ethX ID of this VNIC too.

    It is recommended that you configure the two EoIB interfaces as a bonded interface, such as bond1.

  19. Create interface files for the VNICs on the compute node.

    To ensure correct failover behavior, the name of the VNIC interface file and the value of the DEVICE directive in the interface file must not be based on the kernel-assigned ethX interface name (eth4, eth5, and so on). Instead, Oracle recommends that the interface file name and value of the DEVICE directive in the interface file be derived from the EPORT_ID and IOA_PORT values, as follows:

    Note:

    Any other unique naming scheme is also acceptable.

    1. Run the following command to find the EPORT_ID:

      # mlx4_vnic_info -i ethX | grep EPORT_ID
      

      Example:

      el01cn01# mlx4_vnic_info -i eth4 | grep EPORT_ID
      EPORT_ID     331
      

      Note the EPORT_ID that is displayed, 331 in this example.

    2. Run the following command to find the IOA_PORT:

      # mlx4_vnic_info -i ethX | grep IOA_PORT
      

      Example:

      el01cn01# mlx4_vnic_info -i eth4 | grep IOA_PORT
      IOA_PORT     mlx4_0:1
      

      Note the number after the colon (:) in the IOA_PORT value that is displayed, in this case 1.

    3. Build the interface file name and device name by using the following convention:

      Interface file name: ifcfg-ethA_B

      Device name: ethA_B

      A is the EPORT_ID, and B is the number after the colon (:) in the IOA_PORT value.

      Example:

      Interface file name: ifcfg-eth331_1

      Device name: eth331_1

      In this example, 331 is the EPORT_ID, and 1 is the value derived from the IOA_PORT.

  20. Create the interface file for the first VNIC, eth4 in the example, by using a text editor such as vi.

    Save the file in the /etc/sysconfig/network-scripts directory.

    Example:

    # more /etc/sysconfig/network-scripts/ifcfg-eth331_1
    DEVICE=eth331_1
    BOOTPROTO=none 
    ONBOOT=yes 
    HWADDR=a2:c0:a0:a8:01:01
    MASTER=bond1 
    SLAVE=yes 
    
    • Make sure that the name of the interface file (ifcfg-eth331_1 in the example) is the name derived in step 19.

    • For the DEVICE directive, specify the device name (eth331_1 in the example) derived in step 19.

    • For the HWADDR directive, specify the dummy MAC address created in step 4.

  21. Create an interface file for the second VNIC, say eth5. Be sure to name the interface file and specify the DEVICE directive by using a derived interface name and not the kernel-assigned name, as described earlier. In addition, be sure to specify the relevant dummy MAC address for the HWADDR directive.

  22. After creating the interface files, create the ifcfg-bond1 file. If the file already exists, verify its contents.

    Example:

    # more /etc/sysconfig/network-scripts/ifcfg-bond1
    DEVICE=bond1 
    IPADDR=192.168.48.128 
    NETMASK=255.255.255.0 
    BOOTPROTO=none 
    USERCTL=no 
    TYPE=Ethernet 
    ONBOOT=yes 
    IPV6INIT=no 
    BONDING_OPTS="mode=active-backup miimon=100 downdelay=5000 updelay=5000" 
    GATEWAY=192.168.48.1
    
  23. Restart the network services by running the following command:

    # service network restart
    
  24. Bring up the new bond1 interface using the ifup command.

    You must also reboot the compute node for the changes to take effect.
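
Two values in the preceding procedure are derived by rule: the dummy MAC address (step 4) and the interface file and device names (step 19). The following Python sketch shows both derivations. The function names are hypothetical and are not part of any Exalogic tool; the sketch formats values only and does not check that a MAC address is unique on the network.

```python
import ipaddress


def dummy_mac(switch_guid, admin_ip):
    """Join the last three octets of the switch GUID and of the admin IP (in hex)."""
    guid_hex = switch_guid.lower().replace("0x", "").replace(":", "")
    if len(guid_hex) != 16:
        raise ValueError("expected a 64-bit GUID (16 hex digits)")
    int(guid_hex, 16)  # raises ValueError if the GUID is not hexadecimal
    guid_octets = [guid_hex[i:i + 2] for i in range(0, 16, 2)]
    ip_octets = ipaddress.IPv4Address(admin_ip).packed  # four bytes
    mac = guid_octets[-3:] + ["%02x" % b for b in ip_octets[-3:]]
    if int(mac[0], 16) % 2:
        raise ValueError("most significant byte must be even (unicast)")
    return ":".join(mac)


def vnic_interface_names(eport_id, ioa_port):
    """Return (interface file name, device name) built from EPORT_ID and IOA_PORT."""
    port_number = ioa_port.rsplit(":", 1)[1]  # the number after the colon
    device = "eth%s_%s" % (eport_id, port_number)
    return "ifcfg-" + device, device


print(dummy_mac("00:21:28:56:d0:a2:c0:a0", "192.168.1.1"))  # a2:c0:a0:a8:01:01
print(vnic_interface_names(331, "mlx4_0:1"))  # ('ifcfg-eth331_1', 'eth331_1')
```

Both calls reproduce the example values used in the procedure.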

15.8 Post-Configuration Steps

After creating a partition on the InfiniBand switch, you must create a child interface for the IPoIB interface on your Exalogic compute node.

For example, if you defined a partition with pkey 0x33 and IPoIB enabled on the InfiniBand switch, you must complete the following steps on a compute node whose port 1 is either a full or a limited member of that partition:

Note:

Even though the example uses port 1, you can create child interfaces for both ib0 and ib1 and bond them together on the partitioned network.

  1. Log in as a root user.

  2. Run the following commands on the command line:

    # cd /sys/class/net/ib0

    # echo 0x8033 > create_child

  3. Run the following command to verify that the child interface was created:

    # ifconfig ib0.8033

  4. Specify your setup for the child interface in an ifcfg-ib0.8033 file in the /etc/sysconfig/network-scripts directory. Note that the interface name is ib0.8033 even if the port is a limited member of the partition.
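
The value written to create_child and the resulting child interface name both follow from the partition's pkey with the most significant bit set. The following Python sketch illustrates the naming used in the steps above; the function name is hypothetical.

```python
def child_interface(parent, pkey):
    """Return (value for create_child, child interface name) for a pkey."""
    full_pkey = (pkey & 0x7FFF) | 0x8000  # set the most significant bit
    return "0x%04x" % full_pkey, "%s.%04x" % (parent, full_pkey)


print(child_interface("ib0", 0x33))    # ('0x8033', 'ib0.8033')
print(child_interface("ib1", 0x8033))  # ('0x8033', 'ib1.8033')
```

The second call shows that the result is the same whether you start from the 15-bit pkey or from the 16-bit value.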

15.9 Important Notes for Combined Exalogic-Exadata Fabric Users

Read the following notes if you are using partitions in a scenario where your Exalogic machine is connected to an Oracle Exadata Database Machine on the same InfiniBand fabric:

  • Oracle Exadata Database Machine currently uses the default InfiniBand partition only. Therefore, Oracle Exadata Database Machine nodes are full members of the default partition.

  • If your Exalogic machine is connected to the Oracle Exadata Database Machine on the same InfiniBand fabric, ensure that all Exalogic compute nodes are limited members of the default partition. By default, all Exalogic compute nodes are full members of the default partition. To make an Exalogic compute node a limited member of the default partition, add the port GUIDs of the compute node as limited members of the default partition. In addition, ensure that IPoIB is enabled on the default partition.

  • As limited members of the default partition, Exalogic nodes cannot communicate with other Exalogic nodes in the default partition. However, client access to Oracle Exadata Database Machine is provided via IPoIB in the default partition.

  • You must disable Subnet Manager (SM) on all InfiniBand switches that are not using firmware 2.0.4 or above. Exalogic's InfiniBand switches use firmware versions 2.0.4 or above for partitioning support, and SM should run on one of Exalogic's InfiniBand switches.

15.10 Partitioning Limitations

Consider the following limitations when creating non-default partitions:

  • Once a new partition configuration is successfully committed using the smpartition command on the current master SM, the configuration is kept highly available among the defined set of SM instances. However, all Sun Network QDR InfiniBand Gateway Switches defined to have SM enabled (that is, defined by the smnodes command on each gateway switch) must be operational and able to communicate with the other smnodes gateway switches in order for any change in the partition configuration to take place.

  • The limit on the number of partitions per end-port is a constraint defined by the various end-port implementations. For ConnectX2 and BridgeX, this limit is 128, which includes the default partition. Hence, the maximum number of other partitions is 127. The CLI of the gateway switch does not verify this explicitly. However, if you specify more partitions than the maximum limit for any port (GUID), the SM handles only the maximum number of partitions and then logs a message.
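
Because the gateway switch CLI does not check the per-port limit, you might verify a planned configuration before committing it. The following Python sketch counts partitions per port GUID. The function name and the input format (a dictionary that maps partition names to port GUIDs, with the default partition included) are assumptions made for this illustration.

```python
from collections import Counter

MAX_PARTITIONS_PER_PORT = 128  # ConnectX2 and BridgeX, including the default partition


def ports_over_limit(partitions, limit=MAX_PARTITIONS_PER_PORT):
    """Return {port GUID: partition count} for ports that exceed the limit."""
    counts = Counter()
    for ports in partitions.values():
        counts.update(set(ports))  # count each port once per partition
    return {guid: n for guid, n in counts.items() if n > limit}


# Hypothetical plan: one port in the default partition plus 128 other partitions.
plan = {"Default": ["0021280001cef8e3"]}
plan.update({"part%03d" % i: ["0021280001cef8e3"] for i in range(128)})

print(ports_over_limit(plan))  # {'0021280001cef8e3': 129}
```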