5 Configuring Oracle Exadata Database Machine for Use with Oracle Big Data Appliance

This chapter provides information about optimizing communications between Oracle Exadata Database Machine and Oracle Big Data Appliance. It describes how you can configure Oracle Exadata Database Machine to use InfiniBand alone, or SDP over InfiniBand, to communicate with Oracle Big Data Appliance.

5.1 About Optimizing Communications

Oracle Exadata Database Machine and Oracle Big Data Appliance use Ethernet by default, although typically they are also connected by an InfiniBand network. Ethernet communications are much slower than InfiniBand. After you configure Oracle Exadata Database Machine to communicate using InfiniBand, it can obtain data from Oracle Big Data Appliance many times faster than before.

Moreover, client applications that run on Oracle Big Data Appliance and push the data to Oracle Database can use Sockets Direct Protocol (SDP) for an additional performance boost. SDP is a standard communication protocol for clustered server environments, providing an interface between the network interface card and the application. By using SDP, applications place most of the messaging burden upon the network interface card, which frees the CPU for other tasks. As a result, SDP decreases network latency and CPU utilization, and thereby improves performance.

5.1.1 About Applications that Pull Data Into Oracle Exadata Database Machine

Oracle SQL Connector for Hadoop Distributed File System (HDFS) is an example of an application that pulls data into Oracle Exadata Database Machine. The connector enables an Oracle external table to access data stored in either HDFS files or a Hive table.

The external table provides access to the HDFS data. You can use the external table for querying HDFS data or for loading it into an Oracle database table.

Oracle SQL Connector for HDFS functions as a Hadoop client running on the database servers in Oracle Exadata Database Machine.

If you use Oracle SQL Connector for HDFS or another tool that pulls the data into Oracle Exadata Database Machine, then for the best performance, you should configure the system to use InfiniBand. See "Specifying the InfiniBand Connections to Oracle Big Data Appliance."

See Also:

Oracle Big Data Connectors User's Guide for information about Oracle SQL Connector for HDFS

5.1.2 About Applications that Push Data Into Oracle Exadata Database Machine

Oracle Loader for Hadoop is an example of an application that pushes data into Oracle Exadata Database Machine. The connector is an efficient and high-performance loader for fast movement of data from a Hadoop cluster into a table in an Oracle database. You can use it to load data from Oracle Big Data Appliance to Oracle Exadata Database Machine.

Oracle Loader for Hadoop functions as a database client running on the Oracle Big Data Appliance. It must make database connections from Oracle Big Data Appliance to Oracle Exadata Database Machine over the InfiniBand network. Use of Sockets Direct Protocol (SDP) for these database connections further improves performance.

If you use Oracle Loader for Hadoop or another tool that pushes the data into Oracle Exadata Database Machine, then for the best performance, you should configure the system to use SDP over InfiniBand as described in this chapter.

See Also:

Oracle Big Data Connectors User's Guide for information about Oracle Loader for Hadoop

5.2 Prerequisites

Oracle Big Data Appliance and Oracle Exadata Database Machine racks must be cabled together using InfiniBand cables. The IP addresses must be unique across all racks and use the same subnet for the InfiniBand network.

5.3 Specifying the InfiniBand Connections to Oracle Big Data Appliance

You can configure Oracle Exadata Database Machine to use the InfiniBand IP addresses of the Oracle Big Data Appliance servers. Otherwise, the default network is Ethernet. Use of the InfiniBand network improves the performance of all data transfers between Oracle Big Data Appliance and Oracle Exadata Database Machine.

To identify the Oracle Big Data Appliance InfiniBand IP addresses: 

  1. If you have not done so already, install a CDH client on Oracle Exadata Database Machine. See "Providing Remote Client Access to CDH."

  2. Obtain a list of host names and InfiniBand IP addresses for all Oracle Big Data Appliance servers.

    An Oracle Big Data Appliance rack can have 6, 12, or 18 servers.

  3. Log in to Oracle Exadata Database Machine with root privileges.

  4. Edit /etc/hosts on Oracle Exadata Database Machine and add the Oracle Big Data Appliance host names and InfiniBand IP addresses. The following example shows the sequential IP numbering:

    192.168.8.1       bda1node01.example.com    bda1node01
    192.168.8.2       bda1node02.example.com    bda1node02
    192.168.8.3       bda1node03.example.com    bda1node03
    192.168.8.4       bda1node04.example.com    bda1node04
    192.168.8.5       bda1node05.example.com    bda1node05
    192.168.8.6       bda1node06.example.com    bda1node06
    
  5. Check /etc/nsswitch.conf for a line like the following:

    hosts:      files dns 
    

    Ensure that the line does not reverse the order (dns files); if it does, your additions to /etc/hosts will not be used. Edit the file if necessary.

  6. Ping all Oracle Big Data Appliance servers. Ensure that ping completes and shows the InfiniBand IP addresses.

    # ping bda1node01.example.com
    PING bda1node01.example.com (192.168.8.1) 56(84) bytes of data.
    64 bytes from bda1node01.example.com (192.168.8.1): icmp_seq=1 ttl=50 time=20.2 ms
         .
         .
         .
    
  7. Run CDH locally on Oracle Exadata Database Machine and test HDFS functionality by uploading a large file to an Oracle Big Data Appliance server. Check that your network monitoring tools (such as sar) show I/O activity on the InfiniBand devices.

    To upload the file, use syntax like the following, which copies localfile.dat to the HDFS testdir directory on node05 of Oracle Big Data Appliance:

    hadoop fs -put localfile.dat hdfs://bda1node05.example.com/testdir/
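
Steps 4 and 5 above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the Oracle tooling: the function names hosts_entries and files_before_dns are hypothetical, and the sketch assumes that the rack uses sequential IP numbering and the bda1nodeNN naming pattern shown in the example. It generates the /etc/hosts lines and checks whether a hosts line in nsswitch.conf consults files before dns.

```python
import ipaddress


def hosts_entries(first_ip, rack_name, domain, server_count):
    """Build /etc/hosts lines with sequential IPs, one line per server.

    Assumes the <rack>nodeNN naming pattern used in the example above.
    """
    start = ipaddress.IPv4Address(first_ip)
    lines = []
    for i in range(server_count):
        short = f"{rack_name}node{i + 1:02d}"
        lines.append(f"{start + i}\t{short}.{domain}\t{short}")
    return lines


def files_before_dns(nsswitch_text):
    """Return True if the hosts line consults 'files' ahead of 'dns'."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if line.startswith("hosts:"):
            sources = line[len("hosts:"):].split()
            if "files" not in sources:
                return False
            if "dns" not in sources:
                return True
            return sources.index("files") < sources.index("dns")
    return False  # no hosts line found


if __name__ == "__main__":
    # Sample values taken from the example in this section.
    for entry in hosts_entries("192.168.8.1", "bda1", "example.com", 6):
        print(entry)
    print(files_before_dns("hosts:      files dns"))
```

To check a real system, you could pass the contents of /etc/nsswitch.conf to files_before_dns instead of the sample string.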
    

5.4 Enabling SDP on Exadata Database Nodes

The following procedure describes how to enable SDP on database nodes in an Oracle Exadata Database Machine running Oracle Linux. SDP improves the performance of client applications that run on Oracle Big Data Appliance and push data to Oracle Exadata Database Machine.

To enable SDP on Oracle Exadata Database Machine: 

  1. Open the /etc/infiniband/openib.conf file in a text editor, and add the following line:

    SDP_LOAD=yes
    
  2. Save these changes and close the file.

  3. To enable both SDP and TCP, open /etc/ofed/libsdp.conf in a text editor, and add the use both rule:

    use both server * *:*
    use both client * *:*
    
  4. Save these changes and close the file.

  5. Open the /etc/modprobe.conf file in a text editor, and add this setting:

    options ib_sdp sdp_zcopy_thresh=0 recv_poll=0
    
  6. Save these changes and close the file.

  7. Replicate these changes across all database nodes in the Oracle Exadata Database Machine rack.

  8. Restart all database nodes for the changes to take effect.

  9. If you have multiple Oracle Exadata Database Machine racks, then repeat these steps on all of them.
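
Because the same three files must be edited on every database node, a small check can help confirm that no node was missed. The following Python sketch is an assumption-laden illustration, not an Oracle utility: the names REQUIRED_SETTINGS, active_lines, and missing_settings are hypothetical, and the check is a plain substring match against non-comment lines, so it expects the options in /etc/modprobe.conf in the order shown in step 5.

```python
# Settings from the procedure above, keyed by the file that should hold them.
REQUIRED_SETTINGS = {
    "/etc/infiniband/openib.conf": ["SDP_LOAD=yes"],
    "/etc/ofed/libsdp.conf": ["use both server", "use both client"],
    "/etc/modprobe.conf": ["options ib_sdp sdp_zcopy_thresh=0 recv_poll=0"],
}


def active_lines(text):
    """Return non-comment lines with runs of whitespace collapsed."""
    lines = []
    for raw in text.splitlines():
        line = " ".join(raw.split("#", 1)[0].split())
        if line:
            lines.append(line)
    return lines


def missing_settings(file_texts):
    """Return (path, setting) pairs absent from the supplied file contents.

    file_texts maps a file path to the text of that file on one node.
    """
    missing = []
    for path, settings in REQUIRED_SETTINGS.items():
        lines = active_lines(file_texts.get(path, ""))
        for setting in settings:
            if not any(setting in line for line in lines):
                missing.append((path, setting))
    return missing


if __name__ == "__main__":
    # Sample contents for one node; a real check would read the files.
    sample = {
        "/etc/infiniband/openib.conf": "SDP_LOAD=yes\n",
        "/etc/ofed/libsdp.conf": "use both server * *:*\n",
        "/etc/modprobe.conf": "",
    }
    for path, setting in missing_settings(sample):
        print(f"{path}: missing '{setting}'")
```

Running the check on each node before the restart in step 8 avoids restarting a node whose configuration is incomplete.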

5.5 Configuring a JDBC Client for SDP

The following procedure explains how to configure a JDBC client to use SDP.

To enable SDP support for JDBC: 

  1. Configure the database to support InfiniBand, as described in the Oracle Database Net Services Administrator's Guide. Ensure that you set the protocol to SDP.

  2. Set the LD_PRELOAD environment variable to libsdp.so before starting the Java virtual machine. This example uses the Bash shell:

    export LD_PRELOAD="libsdp.so"
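
For illustration, the following Python sketch assembles a connect descriptor that names SDP as the protocol and wraps it in a JDBC thin-driver URL. It is a sketch under stated assumptions rather than a prescribed configuration: the helper names sdp_connect_descriptor and jdbc_thin_url are hypothetical, and the host, port, and service name are placeholders that must match the SDP endpoint of your own listener.

```python
def sdp_connect_descriptor(host, port, service_name):
    """Build an Oracle Net connect descriptor that uses PROTOCOL=SDP."""
    return (
        "(DESCRIPTION="
        f"(ADDRESS=(PROTOCOL=SDP)(HOST={host})(PORT={port}))"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


def jdbc_thin_url(descriptor):
    """Prefix a connect descriptor with the JDBC thin-driver scheme."""
    return "jdbc:oracle:thin:@" + descriptor


if __name__ == "__main__":
    # Placeholder values; substitute your InfiniBand VIP, SDP port, and service.
    descriptor = sdp_connect_descriptor("dm01db01-ibvip", 1523, "dbm")
    print(jdbc_thin_url(descriptor))
```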
    

5.6 Creating an SDP Listener on the InfiniBand Network

To add a listener for the Oracle Big Data Appliance connections coming in on the InfiniBand network, first add a network resource for the InfiniBand network with virtual IP addresses.

Note:

This example lists two nodes for an Oracle Exadata Database Machine quarter rack. If you have an Oracle Exadata Database Machine half or full rack, you must repeat node-specific lines for each node in the cluster.

  1. Edit /etc/hosts on each node in the Exadata rack to add the virtual IP addresses for the InfiniBand network. Make sure that these IP addresses are not in use. For example:

    # Added for Listener over IB
    192.168.10.21 dm01db01-ibvip.example.com dm01db01-ibvip
    192.168.10.22 dm01db02-ibvip.example.com dm01db02-ibvip 
    
  2. As the root user, create a network resource on one database node for the InfiniBand network. For example:

    # /u01/app/grid/product/11.2.0.2/bin/srvctl add network -k 2 -S 192.168.10.0/255.255.255.0/bondib0
    
  3. Verify that the network was added correctly with a command like one of the following examples:

    # /u01/app/grid/product/11.2.0.2/bin/crsctl stat res -t | grep net
    ora.net1.network
    ora.net2.network -- Output indicating new Network resource 
    

    or

    # /u01/app/grid/product/11.2.0.2/bin/srvctl config network -k 2
    Network exists: 2/192.168.10.0/255.255.255.0/bondib0, type static -- Output indicating Network resource on the 192.168.10.0 subnet 
    
  4. Add the virtual IP addresses on the network created in Step 2, for each node in the cluster. For example:

    # srvctl add vip -n dm01db01 -A dm01db01-ibvip/255.255.255.0/bondib0 -k 2
    #
    # srvctl add vip -n dm01db02 -A dm01db02-ibvip/255.255.255.0/bondib0 -k 2 
    
  5. As the oracle user who owns Grid Infrastructure Home, add a listener for the virtual IP addresses created in Step 4.

    # srvctl add listener -l LISTENER_IB -k 2 -p TCP:1522,/SDP:1522
    
  6. For each database that will accept connections from the middle tier, modify the listener_networks init parameter to allow load balancing and failover across multiple networks (Ethernet and InfiniBand). You can either enter the full TNSNAMES syntax in the initialization parameter or create entries in tnsnames.ora in the $ORACLE_HOME/network/admin directory. The TNSNAMES.ORA entries must exist in GRID_HOME. The following example first updates tnsnames.ora.

    Complete this step on each node in the cluster with the correct IP addresses for that node. LISTENER_IBREMOTE should list all other nodes that are in the cluster. DBM_IB should list all nodes in the cluster.

    Note:

    The database instance reads a TNSNAMES entry only at startup. If you modify an entry that is referenced by an init.ora parameter (such as LISTENER_NETWORKS), then you must restart the instance or issue an ALTER SYSTEM SET LISTENER_NETWORKS command for the instance to apply the modification.

    DBM =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = dm01-scan)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = dbm)
        )
      )

    DBM_IB =
      (DESCRIPTION =
        (LOAD_BALANCE=on)
        (ADDRESS = (PROTOCOL = TCP)(HOST = dm01db01-ibvip)(PORT = 1522))
        (ADDRESS = (PROTOCOL = TCP)(HOST = dm01db02-ibvip)(PORT = 1522))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = dbm)
        )
      )

    LISTENER_IBREMOTE =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = dm01db02-ibvip.example.com)(PORT = 1522))
        )
      )

    LISTENER_IBLOCAL =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = dm01db01-ibvip.example.com)(PORT = 1522))
          (ADDRESS = (PROTOCOL = SDP)(HOST = dm01db01-ibvip.example.com)(PORT = 1523))
        )
      )

    LISTENER_IPLOCAL =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = dm0101-vip.example.com)(PORT = 1521))
        )
      )

    LISTENER_IPREMOTE =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = dm01-scan.example.com)(PORT = 1521))
        )
      )
    
  7. Connect to the database instance as sysdba.

  8. Modify the listener_networks init parameter by using the SQL ALTER SYSTEM command:

    SQL> alter system set listener_networks=
         '((NAME=network2) (LOCAL_LISTENER=LISTENER_IBLOCAL)
            (REMOTE_LISTENER=LISTENER_IBREMOTE))', 
         '((NAME=network1)(LOCAL_LISTENER=LISTENER_IPLOCAL)
            (REMOTE_LISTENER=LISTENER_IPREMOTE))' scope=both; 
    
  9. On the Linux command line, use the srvctl command to restart LISTENER_IB to implement the modification in Step 8:

    # srvctl stop listener -l LISTENER_IB 
    # srvctl start listener -l LISTENER_IB
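
Step 6 requires a different LISTENER_IBLOCAL and LISTENER_IBREMOTE entry on each node: the local alias names that node's own InfiniBand virtual IP address, and the remote alias lists every other node in the cluster. The following Python sketch shows one way to generate those two entries for a given node. It is illustrative only: the function names address, alias, and ib_aliases are hypothetical, and the sketch assumes the <node>-ibvip naming pattern and the port numbers used in the example tnsnames.ora entries.

```python
def address(protocol, host, port):
    """Format one ADDRESS clause."""
    return f"(ADDRESS = (PROTOCOL = {protocol})(HOST = {host})(PORT = {port}))"


def alias(name, addresses):
    """Format a tnsnames.ora alias that wraps the addresses in an ADDRESS_LIST."""
    body = "\n".join(f"      {a}" for a in addresses)
    return (
        f"{name} =\n"
        "  (DESCRIPTION =\n"
        "    (ADDRESS_LIST =\n"
        f"{body}\n"
        "    )\n"
        "  )"
    )


def ib_aliases(nodes, local_node, domain, tcp_port, sdp_port):
    """Return (LISTENER_IBLOCAL, LISTENER_IBREMOTE) text for one node.

    Assumes each InfiniBand VIP host is named <node>-ibvip.<domain>.
    """
    if local_node not in nodes:
        raise ValueError(f"{local_node} is not in the node list")

    def vip(node):
        return f"{node}-ibvip.{domain}"

    local = [
        address("TCP", vip(local_node), tcp_port),
        address("SDP", vip(local_node), sdp_port),
    ]
    remote = [address("TCP", vip(n), tcp_port) for n in nodes if n != local_node]
    return alias("LISTENER_IBLOCAL", local), alias("LISTENER_IBREMOTE", remote)


if __name__ == "__main__":
    # Quarter-rack example: two database nodes, generated for dm01db01.
    local_text, remote_text = ib_aliases(
        ["dm01db01", "dm01db02"], "dm01db01", "example.com", 1522, 1523
    )
    print(local_text)
    print()
    print(remote_text)
```

For a half or full rack, pass the complete node list; the remote alias then grows to include every node except the one that the entries are generated for.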