To achieve maximum performance with Coherence it is suggested that you test and tune your operating environment. Tuning recommendations are available for:
To help minimization of packet loss, the OS socket buffers need to be large enough to handle the incoming network traffic while your Java application is paused during garbage collection. By default Coherence will attempt to allocate a socket buffer of 2MB. If your OS is not configured to allow for large buffers Coherence will utilize smaller buffers. Most versions of Unix have a very low default buffer limit, which should be increased to at least 2MB.
Starting with Coherence 3.1 you will receive the following warning if the OS failed to allocate the full size buffer.
Though it is safe to operate with the smaller buffers it is recommended that you configure your OS to allow for larger buffers.
On Linux execute (as root):
On Solaris execute (as root):
On AIX execute (as root):
|Note that AIX only supports specifying receive buffer sizes of 1MB, 4MB, and 8MB. Additionally there is an issue with IBM's 1.4.2, and 1.5 JVMs which may prevent them from allocating socket buffers larger then 64K. This issue has been addressed in IBM's 1.4.2 SR7 SDK and 1.5 SR3 SDK.|
Windows does not impose a buffer size restriction by default.
For information on increasing the buffer sizes for other OSs please refer to your OS's documentation.
You may configure Coherence to request alternate sized buffers via the coherence/cluster-config/packet-publisher/packet-buffer/maximum-packets and coherence/cluster-config/unicast-listener/packet-buffer/maximum-packets elements.
Linux has a number of high resolution timesources to choose from, the fastest TSC (Time Stamp Counter) unfortunately is not always reliable. Linux chooses TSC by default, and during boot checks for inconsistencies, if found it switches to a slower safe timesource. The slower time sources can be 10 to 30 times more expensive to query then the TSC timesource, and may have a measurable impact on Coherence performance. Note that Coherence and the underlying JVM are not aware of the timesource which the OS is utilizing. It is suggested that you check your system logs (/var/log/dmesg) to verify that the following is not present.
As the log messages suggest, this can be caused by a variable rate CPU (SpeedStep), having DMA disabled, or incorrect TSC synchronization on multi CPU machines. If present it is suggested that you work with your system administrator to identify the cause and allow the TSC timesource to be utilized.
Microsoft Windows supports a fast I/O path which is utilized when sending "small" datagrams. The default setting for what is considered a small datagram is 1024 bytes; increasing this value to match your network MTU (normally 1500) can significantly improve network performance.
To adjust this parameter:
|Note: Included in Coherence 3.1 and above is an optimize.reg script which will perform this change for you, it can be found in the coherence/bin directory of your installation. After running the script you must reboot your computer for the changes to take effect.|
For more details on this parameter see Appendix C of http://technet.microsoft.com/en-us/library/bb726981.aspx
Windows (including NT, 2000 and XP) is optimized for desktop application usage. If you run two console (" DOS box ") windows, the one that has the focus can use almost 100% of the CPU, even if other processes have high-priority threads in a running state. To correct this imbalance, you must configure the Windows thread scheduling to less-heavily favor foreground applications.
|Note: Coherence includes an optimize.reg script which will perform this change for you, it can be found in the coherence/bin directory of your installation.|
Ensure that you have sufficient memory such that you are not making active use of swap space on your machines. You may monitor the swap rate using tools such as vmstat and top. If you find that you are actively moving through swap space this will likely have a significant impact on Coherence's performance. Often this will manifest itself as Coherence nodes being removed from the cluster due to long periods of unresponsiveness caused by them having been "swapped out".
Verify that your Network card (NIC) is configured to operate at it's maximum link speed and at full duplex. The process for doing this varies between OSs.
On Linux execute (as root):
See the man page on ethtool for further details and for information on adjust the interface settings.
On Solaris execute (as root):
This will display the link settings for interface 0. Items of interest are link_duplex (2 = full), and link_speed which is reported in Mbps. For further details on Solaris network tuning see this article from Prentice Hall.
|If running on Solaris 10, please review Sun issues 102712 and 102741 which relate to packet corruption and multicast disconnections. These will most often manifest as either EOFExceptions, "Large gap" warnings while reading packet data, or frequent packet timeouts. It is highly recommend that the patches for both issues be applied when using Coherence on Solaris 10 systems.|
For 1Gb and faster PCI network cards the system's bus speed may be the limiting factor for network performance. PCI and PCI-X busses are half-duplex, and all devices will run at the speed of the slowest device on the bus. Standard PCI buses have a maximum throughput of approximately 1Gb/sec and thus are not capable of fully utilizing a full-duplex 1Gb NIC. PCI-X has a much higher maximum throughput (1GB/sec), but can be hobbled by a single slow device on the bus. If you find that you are not able to achieve satisfactory bidirectional data rates it is suggested that you evaluate your machine's bus configuration. For instance simply relocating the NIC to a private bus may improve performance.
If you experience frequent multi-second communication pauses across multiple cluster nodes you may need to increase your switch's buffer space. These communication pauses can be identified by a series of Coherence log messages identifying communication delays with multiple nodes which are not attributable to local or remote GCs.
Some switches such as the Cisco 6500 series support configuration the amount of buffer space available to each ethernet port or ASIC. In high load applications it may be necessary to increase the default buffer space. On Cisco this can be accomplished by executing:
See Cisco's documentation for additional details on this setting.
Full duplex Ethernet includes a flow-control feature which allows the receiving end of a point to point link to slow down the transmitting end. This is implemented by the receiving end sending an Ethernet PAUSE frame to the transmitting end, the transmitting end will then halt transmissions for the interval specified by the PAUSE frame. Note that this pause blocks all traffic from the transmitting side, even traffic destined for machines which are not overloaded. This can induce a head of line blocking condition, where one overloaded machine on a switch effectively slows down all other machines. Most switch vendors will recommend that Ethernet flow-control be disabled for inter switch links, and at most be utilized on ports which are directly connected to machines. Even in this setup head of line blocking can still occur, and thus it is advisable to disable Ethernet flow-control all together. Higher level protocols such as TCP/IP and Coherence TCMP include their own flow-control mechanisms which are not subject to head of line blocking, and also negate the need for the lower level flow-control.
For more details on this subject see http://www.networkworld.com/netresources/0913flow2.html.
By default Coherence assumes a 1500 byte network MTU, and uses a default packet size of 1468 based on this assumption. Having a packet size which does not fill the MTU will result is an under utilized network. If your equipment uses a different MTU, you should configure Coherence accordingly, by specifying a packet size which is 32 bytes smaller then the network path's minimal MTU. The packet size may be specified in coherence/cluster-config/packet-publisher/packet-size/maximum-length and preferred-length configuration elements.
If you are unsure of your equpiment's MTU along the full path between nodes you can use either the standard ping or traceroute utility to determine it. To do this, execute a series of ping or traceroute operations between the two machines. With each attempt you will specify a different packet size, starting from a high value and progressively moving downward until the packets start to make it through without fragmentation. You will need to specify a particular packet size, and to not fragment the packets.
On Linux execute:
On Solaris execute:
On Windows execute:
On other OSs, consult the documentation for the ping or traceroute command to see how to disable fragmentation, and specify the packet size.
If you receive a message stating that packets must be fragmented then the specified size is larger then the path's MTU. Decrease the packet size until you find the point at which packets can be transmitted without fragmentation. If you find that you need to use packets smaller then 1468 you may wish to contact your network administrator to get the MTU increased to at least 1500.
It is recommended that you run all your Coherence JVMs in server mode, by specifying the "-server" on the JVM command line. This allows for a number of performance optimizations for long running applications.
It is generally recommended that heap sizes be kept at 1GB or below as larger heaps will have a more significant impact on garbage collection times. On 1.5 and higher JVMs larger heaps are reasonable, but will likely require additional GC tuning.
Running with a fixed sized heap will save your JVM from having to grow the heap on demand and will result in improved performance. To specify a fixed size heap use the -Xms and -Xmx JVM options, setting them to the same value. For example:
Note that the JVM process will consume more system memory then the specified heap size, for instance a 1GB JVM will consume 1.3GB of memory. This should be taken into consideration when determining the maximum number of JVMs which you will run on a machine. The actual allocated size can be monitored with tools such as top. See Best Practices for additional details on heap size considerations.
Frequent garbage collection pauses which are in the range of 100ms or more are likely to have a noticeable impact on network performance. During these pauses a Java application is unable to send or receive packets, and in the case of receiving, the OS buffered packets may be discarded and need to be retransmitted.
Specify "-verbose:gc" or "-Xloggc:" on the JVM command line to monitor the frequency and duration of garbage collection pauses.
See http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html for details on GC tuning.
Starting with Coherence 3.2 log messages will be generated when one cluster node detects that another cluster node has been unresponsive for a period of time, generally indicating that a target cluster node was in a GC cycle.
The PauseRate indicates the percentage of time for which the node has been considered unresponsive since the stats were last reset. Nodes reported as unresponsive for more then a few percent of their lifetime may be worth investigating for GC tuning.
Coherence includes configuration elements for throttling the amount of traffic it will place on the network; see the documentation for traffic-jam, flow-control and burst-mode, these settings are used to control the rate of packet flow within and between cluster nodes.
To determine how these settings are affecting performance you need to check if you're cluster nodes are experiencing packet loss and/or packet duplication. This can be obtained by looking at the following JMX stats on various cluster nodes:
For information on using JMX to monitor Coherence see Managing Coherence using JMX.