Datagram Test

Datagram Test Utility

Included with Coherence is a Datagram Test utility which can be used to test and tune network performance between two or more machines.  The Datagram Test operates in one of three modes: as a packet publisher, a packet listener, or both.  When run, a publisher transmits UDP packets to the listener, which measures the throughput, success rate, and other statistics.

Syntax

The Datagram test supports a large number of configuration options, though only a few are required for basic operation. To run the Datagram Test utility, use the following syntax from the command line:

java com.tangosol.net.DatagramTest <command value ...> <addr:port ...>

Command Options

Command | Optional | Applicability | Description | Default
-local | True | Both | The local address to bind to, specified as addr:port. | localhost:9999
-packetSize | True | Both | The size of packet to work with, specified in bytes. | 1468
-processBytes | True | Both | The number of bytes (in multiples of 4) of each packet to process. | 4
-rxBufferSize | True | Listener | The size of the receive buffer, specified in packets. | 1428
-txBufferSize | True | Publisher | The size of the transmit buffer, specified in packets. | 16
-txRate | True | Publisher | The rate at which to transmit data, specified in megabytes per second. | unlimited
-txIterations | True | Publisher | Specifies the number of packets to publish before exiting. | unlimited
-txDurationMs | True | Publisher | Specifies how long to publish before exiting. | unlimited
-reportInterval | True | Both | The interval at which to output a report, specified in packets. | 100000
-tickInterval | True | Both | The interval at which to output tick marks. | 1000
-log | True | Listener | The name of a file to save a tabular report of measured performance. | none
-logInterval | True | Listener | The interval at which to output a measurement to the log. | 100000
-polite | True | Publisher | Switch indicating if the publisher should wait for the listener to be contacted before publishing. | off
arguments | True | Publisher | Space-separated list of addresses to publish to, specified as addr:port. | none
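
These options may be combined as needed. For example, a listener bound to a specific address with a larger receive buffer and a performance log might be started as follows (the host name, buffer size, and log file name here are hypothetical):

java com.tangosol.net.DatagramTest -local box1:9999 -rxBufferSize 2856 -log datagram.log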

Usage Examples

Listener

java com.tangosol.net.DatagramTest -local box1:9999 -packetSize 1468

Publisher

java com.tangosol.net.DatagramTest -local box2:9999 -packetSize 1468 box1:9999

For ease of use, datagram-test.sh and datagram-test.cmd scripts are provided in the Coherence bin directory, and can be used to execute this test.

Example

Let's say that we want to test network performance between two servers: Server A with IP address 195.0.0.1 and Server B with IP address 195.0.0.2. One server will act as a packet publisher and the other as a packet listener; the publisher will transmit packets as fast as possible and the listener will measure and report performance statistics. First, start the listener on Server A.

datagram-test.sh

After pressing ENTER, you should see the Datagram Test utility showing you that it is ready to receive packets.

starting listener: at /195.0.0.1:9999
packet size: 1468 bytes
buffer size: 1428 packets
  report on: 100000 packets, 139 MBs
    process: 4 bytes/packet
        log: null
     log on: 139 MBs

As you can see, by default the test will try to allocate a network receive buffer large enough to hold 1428 packets, or about 2 MB. If it is unable to allocate this buffer it will report an error and exit. You can either decrease the requested buffer size using the -rxBufferSize parameter or increase your OS network buffer settings. For best performance it is recommended that you increase the OS buffers. See the following forum post for details on tuning your OS for Coherence.
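
As an illustration, on Linux the per-socket maximum buffer sizes are governed by the net.core.rmem_max and net.core.wmem_max kernel settings. A minimal sketch (run as root; the value of roughly 2 MB is chosen here only to match the default 1428-packet buffer) would be:

sysctl -w net.core.rmem_max=2097152
sysctl -w net.core.wmem_max=2097152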

Once the listener process is running you may start the publisher on Server B, directing it to publish to Server A.

datagram-test.sh servera

After pressing ENTER, you should see the new Datagram test instance on Server B start both a listener and a publisher. Note in this configuration Server B's listener will not be used. The following output should appear in the Server B command window.

starting listener: at /195.0.0.2:9999
packet size: 1468 bytes
buffer size: 1428 packets
  report on: 100000 packets, 139 MBs
    process: 4 bytes/packet
        log: null
     log on: 139 MBs

starting publisher: at /195.0.0.2:9999 sending to servera/195.0.0.1:9999
packet size: 1468 bytes
buffer size: 16 packets
  report on: 100000 packets, 139 MBs
    process: 4 bytes/packet
      peers: 1
       rate: no limit

no packet burst limit
oooooooooOoooooooooOoooooooooOoooooooooOoooooooooOoooooooooOoooooooooOoooooooooO

The series of "o" and "O" tick marks appear as data is (O)utput on the network. Each "o" represents 1000 packets, with "O" indicators at every 10,000 packets.

On Server A you should see a corresponding set of "i" and "I" tick marks, representing network (I)nput. This indicates that the two test instances are communicating.

Reporting

Periodically each side of the test will report performance statistics.

Publisher Statistics

The publisher simply reports the rate at which it is publishing data on the network. A typical report is as follows:

Tx summary 1 peers:
   life: 97 MB/sec, 69642 packets/sec
    now: 98 MB/sec, 69735 packets/sec

The report includes both the current transmit rate (since last report) and the lifetime transmit rate.

Listener Statistics

The listener reports more detailed statistics including:

Element | Description
Elapsed | The time interval that the report covers.
Packet size | The received packet size.
Throughput | The rate at which packets are being received.
Received | The number of packets received.
Missing | The number of packets which were detected as lost.
Success rate | The percentage of received packets out of the total packets sent.
Out of order | The number of packets which arrived out of order.
Average offset | An indicator of how out of order packets are.

As with the publisher, both current and lifetime statistics are reported. A typical report is as follows:

Lifetime:
Rx from publisher: /195.0.0.2:9999
             elapsed: 8770ms
         packet size: 1468
          throughput: 96 MB/sec
                      68415 packets/sec
            received: 600000 of 611400
             missing: 11400
        success rate: 0.9813543
        out of order: 2
          avg offset: 1


Now:
Rx from publisher: /195.0.0.2:9999
             elapsed: 1431ms
         packet size: 1468
          throughput: 98 MB/sec
                      69881 packets/sec
            received: 100000 of 100000
             missing: 0
        success rate: 1.0
        out of order: 0
          avg offset: 0

The primary items of interest are the throughput and success rate. The goal is to find the highest throughput while maintaining a success rate as close to 1.0 as possible. On a 100 Mb network setup you should be able to achieve rates of around 10 MB/sec. On a 1 Gb network you should be able to achieve rates of around 100 MB/sec. Achieving these rates will likely require some tuning (see below).

Bidirectional Testing

You may also run the test in a bidirectional mode where both servers act as publishers and listeners. To do this, simply restart the test instances, supplying the instance on Server A with Server B's address by running the following on Server A.

datagram-test.sh -polite serverb

Then run the same command as before on Server B. The -polite parameter instructs this test instance not to start publishing until it starts to receive data.

You may also use more than two machines in testing; for instance, you can set up two publishers to target a single listener, as shown below.
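
For example (host names are hypothetical), you could start a plain listener on servera with datagram-test.sh and then run the following on each of serverb and serverc:

datagram-test.sh servera

The listener's reports identify each publisher by its address, as in the "Rx from publisher" lines shown above.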

Tuning

Once you have the test functioning you may begin to tune your machine, network infrastructure, and the test parameters to achieve a maximum transfer rate.

OS Tuning

If you have not already done so, start by following the Coherence OS tuning suggestions; see the above-mentioned forum post for details on tuning your OS for Coherence.

Network Tuning

Once you've tuned your OS, if you are still not achieving acceptable rates, you can tune your network. Most important is to make sure that you have a consistent MTU across both machines and any switches or routers which sit between them.

The test assumes a 1500 byte network MTU and uses a default packet size of 1468 based on this assumption. Using a packet size which does not fill the MTU will result in an underutilized network. If you are using a different MTU, you should configure the test accordingly by specifying a packet size which is 32 bytes smaller than the network path's minimal MTU. This can be specified with the -packetSize parameter on both the publisher and listener.
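
For example, on a network path with a 9000 byte (jumbo frame) MTU, the corresponding packet size would be 8968 bytes. Assuming the wrapper script passes options through, as it does for -polite above, the listener and publisher (hypothetical host name) would be started as:

datagram-test.sh -packetSize 8968

datagram-test.sh -packetSize 8968 servera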

If you are unsure of the minimal MTU along the full path between servers you can use the standard ping utility to determine it. To do this, execute a series of ping operations between the two servers. With each attempt you will specify a different packet size, starting from a high value and progressively moving downward until the pings start to make it through without fragmentation. You will need to instruct ping to use a particular packet size, and to not fragment the packets.

On Windows execute:

ping -n 3 -f -l 1500 serverb

On Linux execute:

ping -c 3 -M do -s 1500 serverb

On other OSs, consult the documentation for the ping command to see how to disable fragmentation, and specify the packet size.

If you receive a message stating that packets must be fragmented, then the specified size is larger than the path's MTU. Decrease the packet size until you find the point at which packets can be transmitted without fragmentation. If you find that you need to use packets smaller than 1468, you may wish to contact your network administrator to get the MTU increased to at least 1500.

Test Tuning

If after having performed the above optimizations you still cannot achieve a success rate of 1.0, you may throttle the test's publisher to determine what data rate does allow a 1.0 success rate. On the publisher you may use the -txRate parameter to specify the target transmit rate in megabytes per second; for example, -txRate 50 would limit the publisher to 50 MB/sec. A simple way to find the data rate which will give a 1.0 success rate is to specify a transmit rate close to the receive rate previously reported by the test listener. For instance, if the test publisher reports a transmit rate of 90 MB/sec and the listener reports a receive rate of 80 MB/sec, specifying -txRate 80 on the publisher should allow you to achieve a 1.0 success rate.
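
Continuing the earlier example, the listener on Server A is started as before, and the throttled publisher on Server B would be started as:

datagram-test.sh -txRate 80 servera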

Coherence Tuning

Once you have achieved the desired rates with the Datagram test you may utilize these tuning parameters with Coherence. The OS and network tuning would already be in effect, but any tuned test parameter can be mirrored in your Coherence override configuration file. Packet size may be specified in coherence/cluster-config/packet-publisher/packet-size/maximum-length and preferred-length configuration elements. If you've adjusted either the transmit or receive buffers, these can be specified in coherence/cluster-config/packet-publisher/packet-buffer/maximum-packets and coherence/cluster-config/unicast-listener/packet-buffer/maximum-packets elements, respectively.
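
As a sketch, carrying these values over into an operational override file might look like the following (the element paths are those listed above; the values shown are simply the test defaults and should be replaced with the values you actually validated):

<coherence>
  <cluster-config>
    <packet-publisher>
      <packet-size>
        <!-- matches the Datagram test -packetSize value -->
        <maximum-length>1468</maximum-length>
        <preferred-length>1468</preferred-length>
      </packet-size>
      <packet-buffer>
        <!-- transmit buffer in packets, matches -txBufferSize -->
        <maximum-packets>16</maximum-packets>
      </packet-buffer>
    </packet-publisher>
    <unicast-listener>
      <packet-buffer>
        <!-- receive buffer in packets, matches -rxBufferSize -->
        <maximum-packets>1428</maximum-packets>
      </packet-buffer>
    </unicast-listener>
  </cluster-config>
</coherence>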

Additionally, see the documentation for the traffic-jam and burst-mode settings; these are used to control the rate of packet flow within and between cluster nodes.

Validation

To determine how these settings are affecting performance, you need to check whether your cluster nodes are experiencing packet loss and/or packet duplication. This can be determined by looking at the publisher and receiver success rate statistics exposed by each cluster node over JMX.

For information on using JMX to monitor Coherence, see Managing Coherence using JMX. If you see that either rate is less than 1.0, additional tuning may be required.