1.1.1 About UEK Release 1

Release 1 of the UEK is based on a stable 2.6.32 Linux kernel and provides additional performance improvements, including:

  • Improved IRQ (interrupt request) balancing.

  • Reduced lock contention across the kernel.

  • Improved network I/O through receive packet steering (RPS) and enhancements to Reliable Datagram Sockets (RDS).

  • Improved virtual memory performance.

UEK Release 1 includes optimizations developed in collaboration with Oracle’s Database, Middleware, and Hardware engineering teams to ensure stability and optimal performance for demanding enterprise workloads. In addition to performance improvements for large systems, the following UEK features are relevant to using Linux in the data center:

  • The InfiniBand OpenFabrics Enterprise Distribution (OFED) 1.5.1 implements Remote Direct Memory Access (RDMA) and kernel bypass mechanisms to deliver high-efficiency computing, wire-speed messaging, microsecond-level latencies, and fast I/O for servers, block storage, and file systems. It also includes an improved Reliable Datagram Sockets (RDS) stack for high-speed, low-latency networking. As an InfiniBand Upper Layer Protocol (ULP), RDS allows the reliable transmission of IPC datagrams up to 1 MB in size, and is currently used in Oracle Real Application Clusters (RAC) and in the Exadata and Exalogic products.

  • A number of additional patches significantly improve the performance of Non-Uniform Memory Access (NUMA) systems with many CPUs, CPU cores, and memory nodes.

  • Receive Packet Steering (RPS) is a software implementation of Receive Side Scaling (RSS) that improves overall networking performance, especially for high loads. RPS distributes the load of received network packet processing across multiple CPUs and ensures that the same CPU handles all packets for a specific combination of IP address and port.

    To configure the list of CPUs to which RPS can forward traffic, use /sys/class/net/interface/queues/rx-N/rps_cpus, where interface is the name of the network interface and N is the number of the receive queue. This file implements a CPU bitmap for the specified interface and receive queue. The default value is zero, which disables RPS, so the CPU that handles the network interrupt also processes the incoming packet. To enable RPS and allow a particular set of CPUs to process packets for the receive queue on an interface, set the bits at their positions in the bitmap to 1. For example, to enable RPS to use CPUs 0, 1, 2, and 3 for the rx-0 queue on eth0, set the value of rps_cpus to f (that is, 1+2+4+8 = 15, which is f in hexadecimal):

    # echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus

    There is usually no benefit in configuring RPS on a system with a multiqueue network device, because RSS is usually automatically configured to map a CPU to each receive queue.

    For an interface with a single receive queue, you should typically set rps_cpus to the CPUs that are in the same memory domain as the CPU that handles the queue's interrupts. On a non-NUMA system, this means that you would set all the available CPUs in rps_cpus.

    Tip

    To verify which CPUs are handling receive interrupts, use the command watch -n1 cat /proc/softirqs and monitor the value of NET_RX for each CPU.

  • Receive Flow Steering (RFS) extends RPS to coordinate how the system processes network packets in parallel. RFS performs application matching to direct network traffic to the CPU on which the application is running.

    To configure RFS, use /proc/sys/net/core/rps_sock_flow_entries, which sets the number of entries in the global flow table, and /sys/class/net/interface/queues/rx-N/rps_flow_cnt, which sets the number of entries in the per-queue flow table for a network interface. The default values are both zero, which disables RFS. To enable RFS, set the value of rps_sock_flow_entries to the maximum expected number of concurrently active connections, and the value of rps_flow_cnt to rps_sock_flow_entries/Nq, where Nq is the number of receive queues on a device. Any value that you enter is rounded up to the nearest power of 2. The suggested value of rps_sock_flow_entries is 32768 for a moderately loaded server.

  • The kernel can detect solid state disks (SSDs), and tune itself for their use by bypassing the optimization code for spinning media and by dispatching I/O without delay to the SSD.

  • The data integrity features verify data from the database all the way down to the individual storage spindle or device. The Linux data integrity framework (DIF) allows applications or kernel subsystems to attach metadata to I/O operations, so that devices that support DIF can verify the integrity of the data before passing it further down the stack and physically committing it to disk. The Data Integrity Extensions (DIX) feature enables the exchange of protection metadata between the operating system and the host bus adapter (HBA), and helps to prevent silent data corruption. The data-integrity-enabled Automatic Storage Management (ASM) feature that is available as an add-on with Oracle Database also protects against data corruption from application to disk platter.

    For more information about the data integrity features, including programming with the block layer integrity API, see http://www.kernel.org/doc/Documentation/block/data-integrity.txt.

  • Oracle Cluster File System 2 (OCFS2) version 1.6 includes a large number of features. For more information, see Chapter 7, Oracle Cluster File System Version 2.
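The rps_cpus bitmap described for RPS earlier in this section can be derived from a list of CPU numbers instead of being added up by hand. The following is a minimal sketch rather than an Oracle-supplied tool: cpus_to_mask is a hypothetical helper name, and the sketch assumes CPU numbers 0 through 31 only, so that the result fits in a single hexadecimal word.

```shell
#!/bin/sh
# cpus_to_mask: hypothetical helper that prints the hexadecimal bitmap
# for the CPU numbers given as arguments (sketch limited to CPUs 0-31).
cpus_to_mask() {
    local mask=0
    local cpu
    for cpu in "$@"; do
        # Reject anything that is not a plain non-negative integer.
        case $cpu in
            ''|*[!0-9]*) echo "not a CPU number: $cpu" >&2; return 1 ;;
        esac
        if [ "$cpu" -gt 31 ]; then
            echo "CPU $cpu is outside the range this sketch handles (0-31)" >&2
            return 1
        fi
        # Set the bit at the CPU's position in the bitmap.
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# CPUs 0, 1, 2, and 3 give f, matching the rps_cpus example in the text.
cpus_to_mask 0 1 2 3

# Writing the value requires root and an existing interface, so the
# command is shown commented out (eth0 and rx-0 are taken from the text):
# echo "$(cpus_to_mask 0 1 2 3)" > /sys/class/net/eth0/queues/rx-0/rps_cpus
```

Keeping the calculation in a function makes it easy to check the mask before writing it to sysfs.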
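The RFS sizing rule described earlier in this section (rps_flow_cnt = rps_sock_flow_entries/Nq, with values rounded up to a power of 2) can be worked through as follows. This is an illustrative sketch only: round_up_pow2 and rfs_flow_cnt are hypothetical helper names, the receive queue count of 4 is an assumed value, and the rounding function mirrors the rounding behavior that the text describes rather than calling any kernel interface.

```shell
#!/bin/sh
# round_up_pow2: hypothetical helper that rounds a positive integer up
# to the next power of 2, as the text says entered values are rounded.
round_up_pow2() {
    local n="$1"
    local p=1
    while [ "$p" -lt "$n" ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}

# rfs_flow_cnt: hypothetical helper that computes the per-queue flow
# table size as rps_sock_flow_entries / Nq.
rfs_flow_cnt() {
    local entries="$1"
    local queues="$2"
    echo $(( entries / queues ))
}

entries=32768   # suggested value in the text for a moderately loaded server
queues=4        # assumed number of receive queues (Nq), for illustration

echo "rps_sock_flow_entries = $(round_up_pow2 "$entries")"
echo "rps_flow_cnt per queue = $(rfs_flow_cnt "$entries" "$queues")"

# Applying the values requires root; eth0 is an example interface name:
# echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
# for q in /sys/class/net/eth0/queues/rx-*; do echo 8192 > "$q/rps_flow_cnt"; done
```

With these inputs the sketch reports 32768 global entries and 8192 entries per queue.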