Oracle® Solaris 11.2 Dynamic Tracing Guide

Updated: July 2014
 
 

tcp Provider

The tcp provider provides probes for tracing the TCP protocol.

Probes

The tcp probes are described in the table below.

Table 11-40  tcp Probes
Probe
Description
state-change
Probe that fires when a TCP session changes its TCP state. The previous state is noted in the tcplsinfo_t * probe argument. The tcpinfo_t * and ipinfo_t * arguments are NULL.
send
Probe that fires whenever TCP sends a segment (either control or data).
receive
Probe that fires whenever TCP receives a segment (either control or data).
connect-request
Probe that fires when a TCP active open is initiated by sending an initial SYN segment. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the initial SYN segment sent.
connect-established
Probe that fires when an active-OPEN connection is established, in either of two cases. In the first case, a TCP active OPEN succeeds: the initial SYN has been sent and a valid SYN,ACK segment has been received in response. TCP enters the ESTABLISHED state, and the tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers of the SYN,ACK segment received. In the second case, a simultaneous active OPEN succeeds and a final ACK is received from the peer TCP. TCP enters the ESTABLISHED state, and the tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers of the final ACK received. In both cases, the segment presented through the tcpinfo_t * argument is the one that triggers the transition to ESTABLISHED: the received SYN,ACK in the first case and the final ACK in the second. This contrasts with tcp:::accept-established, which fires on passive connection establishment.
connect-refused
A TCP active OPEN connection attempt was refused by the peer - a RST segment was received in acknowledgment of the initial SYN. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the RST,ACK segment received.
accept-established
A passive open has succeeded - an initial active OPEN initiation SYN has been received, TCP responded with a SYN,ACK and a final ACK has been received. TCP has entered the ESTABLISHED state. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the final ACK segment received.
accept-refused
An incoming SYN has arrived for a destination port with no listening connection, so the connection initiation request is rejected by sending a RST segment ACKing the SYN. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the RST segment sent.
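
The connect probes can be paired to time an active open. The following is a minimal sketch rather than a script from this guide. It assumes that cs_cid identifies the same connection in all three probes, and the names start and @latency are illustrative only.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Record when the initial SYN of an active open is sent. */
tcp:::connect-request
{
        start[args[1]->cs_cid] = timestamp;
}

/* On establishment, aggregate the elapsed time (us) by remote address. */
tcp:::connect-established
/start[args[1]->cs_cid]/
{
        @latency[args[3]->tcps_raddr] =
            quantize((timestamp - start[args[1]->cs_cid]) / 1000);
        start[args[1]->cs_cid] = 0;
}

/* Refused attempts never reach ESTABLISHED, so release the saved timestamp. */
tcp:::connect-refused
/start[args[1]->cs_cid]/
{
        start[args[1]->cs_cid] = 0;
}
```

Assigning 0 to an associative array element frees its storage, which keeps the array from growing as connections come and go.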

The send and receive probes trace packets on physical interfaces as well as packets on loopback interfaces that are processed by tcp. On Solaris, loopback TCP connections can bypass the TCP layer when transferring data packets, a performance feature called TCP fusion. These packets are also traced by the tcp provider.
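
To see how much traffic is local, the tcps_local member of tcpsinfo_t (described later in this section) can serve as an aggregation key. This is a sketch only, and the "local" and "network" labels are arbitrary.

```d
#!/usr/sbin/dtrace -s

/* Count segments by probe name and by whether they were delivered locally. */
tcp:::send,
tcp:::receive
{
        @[probename, args[3]->tcps_local ? "local" : "network"] = count();
}
```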

Arguments

The argument types for the tcp probes are listed in the table below. The arguments are described in the following section. All probes except state-change have 5 arguments; state-change has 6.

Probe                args[0]      args[1]     args[2]     args[3]       args[4]      args[5]
state-change         null         csinfo_t *  null        tcpsinfo_t *  null         tcplsinfo_t *
send                 pktinfo_t *  csinfo_t *  ipinfo_t *  tcpsinfo_t *  tcpinfo_t *
receive              pktinfo_t *  csinfo_t *  ipinfo_t *  tcpsinfo_t *  tcpinfo_t *
connect-request      pktinfo_t *  csinfo_t *  ipinfo_t *  tcpsinfo_t *  tcpinfo_t *
connect-established  pktinfo_t *  csinfo_t *  ipinfo_t *  tcpsinfo_t *  tcpinfo_t *
connect-refused      pktinfo_t *  csinfo_t *  ipinfo_t *  tcpsinfo_t *  tcpinfo_t *
accept-established   pktinfo_t *  csinfo_t *  ipinfo_t *  tcpsinfo_t *  tcpinfo_t *
accept-refused       pktinfo_t *  csinfo_t *  ipinfo_t *  tcpsinfo_t *  tcpinfo_t *

pktinfo_t Structure

The pktinfo_t structure is where packet ID info can be made available for deeper analysis if packet IDs become supported by the kernel in the future.

The pkt_addr member is currently always NULL.

typedef struct pktinfo {
        uintptr_t pkt_addr;             /* currently always NULL */
} pktinfo_t;
csinfo_t Structure

The csinfo_t structure is where connection state info is made available. It contains a unique (system-wide) connection ID, and the process ID and zone ID associated with the connection.

typedef struct csinfo {
        uintptr_t cs_addr;
        uint64_t cs_cid;
        pid_t cs_pid;
        zoneid_t cs_zoneid;
} csinfo_t;
cs_addr
Address of translated ip_xmit_attr_t *.
cs_cid
Connection id. A unique per-connection identifier which identifies the connection during its lifetime.
cs_pid
Process ID associated with the connection.
cs_zoneid
Zone ID associated with the connection.
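
As an illustration of the csinfo_t members, the following sketch counts sent segments by the process ID and zone ID associated with each connection. It is an example only, not a script from this guide.

```d
#!/usr/sbin/dtrace -s

/* Count TCP sends by the process ID and zone ID tied to the connection. */
tcp:::send
{
        @[args[1]->cs_pid, args[1]->cs_zoneid] = count();
}
```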
ipinfo_t Structure

The ipinfo_t structure contains common IP info for both IPv4 and IPv6.

typedef struct ipinfo {
        uint8_t ip_ver;                 /* IP version (4, 6) */
        uint16_t ip_plength;            /* payload length */
        string ip_saddr;                /* source address */
        string ip_daddr;                /* destination address */
} ipinfo_t;

These values are read at the time the probe fires in TCP, so ip_plength is the expected IP payload length. The IP layer may later add headers (such as AH and ESP), which increase the actual payload length. To examine this, also trace packets using the ip provider.

Table 11-41  ipinfo_t Members
ip_ver
IP version number. Currently either 4 or 6.
ip_plength
Payload length in bytes. This is the length of the packet at the time of tracing, excluding the IP header.
ip_saddr
Source IP address, as a string. For IPv4 this is a dotted decimal quad; for IPv6 it follows RFC-1884 convention 2 with lowercase hexadecimal digits.
ip_daddr
Destination IP address, as a string. For IPv4 this is a dotted decimal quad; for IPv6 it follows RFC-1884 convention 2 with lowercase hexadecimal digits.
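
A short sketch of the ipinfo_t members in use: it sums the expected IP payload bytes of received segments, keyed by IP version and source address. The choice of keys is only one possibility.

```d
#!/usr/sbin/dtrace -s

/* Sum IP payload bytes received, keyed by IP version and source address. */
tcp:::receive
{
        @[args[2]->ip_ver, args[2]->ip_saddr] = sum(args[2]->ip_plength);
}
```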
tcpsinfo_t Structure

The tcpsinfo_t structure contains tcp state info.

typedef struct tcpsinfo {
        uintptr_t tcps_addr;
        int tcps_local;                 /* is delivered locally, boolean */
        int tcps_active;                /* active open (from here), boolean */
        uint16_t tcps_lport;            /* local port */
        uint16_t tcps_rport;            /* remote port */
        string tcps_laddr;              /* local address, as a string */
        string tcps_raddr;              /* remote address, as a string */
        int32_t tcps_state;             /* TCP state; use inline tcp_state_string[] to convert to string */
        uint32_t tcps_iss;              /* initial sequence # sent */
        uint32_t tcps_suna;             /* sequence # sent but unacked */
        uint32_t tcps_snxt;             /* next sequence # to send */
        uint32_t tcps_rack;             /* sequence # we have acked */
        uint32_t tcps_rnxt;             /* next sequence # expected */
        uint32_t tcps_swnd;             /* send window size */
        uint32_t tcps_snd_ws;           /* send window scaling */
        uint32_t tcps_rwnd;             /* receive window size */
        uint32_t tcps_rcv_ws;           /* receive window scaling */
        uint32_t tcps_cwnd;             /* congestion window */
        uint32_t tcps_cwnd_ssthresh;    /* threshold for congestion avoidance */
        uint32_t tcps_sack_fack;        /* SACK sequence # we have acked */
        uint32_t tcps_sack_snxt;        /* next SACK seq # for retransmission */
        uint32_t tcps_rto;              /* round-trip timeout, msec */
        uint32_t tcps_mss;              /* max segment size */
        int tcps_retransmit;            /* retransmit send event, boolean */
} tcpsinfo_t;

It may seem redundant to supply the local and remote ports and addresses here as well as in the tcpinfo_t below, but the tcp:::state-change probes do not have associated tcpinfo_t data, so in order to map the state change to a specific port, we need this data here.

Table 11-42  tcpsinfo_t Members
tcps_addr
Address of translated tcp_t *.
tcps_local
is local, boolean. 0: is not delivered locally (uses a physical network interface), 1: is delivered locally (including loopback interfaces, for example lo0).
tcps_active
is an active open, boolean. 0: TCP connection was created from a remote host, 1: TCP connection was created from this host.
tcps_lport
local port associated with the TCP connection.
tcps_rport
remote port associated with the TCP connection.
tcps_laddr
local address associated with the TCP connection, as a string.
tcps_raddr
remote address associated with the TCP connection, as a string.
tcps_state
TCP state. Inline definitions are provided for the various TCP states: TCP_STATE_CLOSED, TCP_STATE_SYN_SENT, etc. Use inline tcp_state_string[] to convert state to a string.
tcps_iss
Initial sequence number sent.
tcps_suna
Lowest sequence number for which we have sent data but not received acknowledgement.
tcps_snxt
Next sequence number to send. tcps_snxt - tcps_suna gives the number of bytes pending acknowledgement for the TCP connection.
tcps_rack
Highest sequence number for which we have received and sent acknowledgement.
tcps_rnxt
Next sequence number expected on receive side. tcps_rnxt - tcps_rack gives the number of bytes we have received but not yet acknowledged for the TCP connection.
tcps_swnd
TCP send window size.
tcps_snd_ws
TCP send window scale. tcps_swnd << tcps_snd_ws gives the scaled window size if window scaling options are in use.
tcps_rwnd
TCP receive window size.
tcps_rcv_ws
TCP receive window scale. tcps_rwnd << tcps_rcv_ws gives the scaled window size if window scaling options are in use.
tcps_cwnd
TCP congestion window size.
tcps_cwnd_ssthresh
TCP congestion window threshold. When the congestion window is greater than ssthresh, congestion avoidance begins.
tcps_sack_fack
Highest SACK-acked sequence number.
tcps_sack_snxt
Next sequence number to be retransmitted using SACK.
tcps_rto
Round-trip timeout. If we do not receive acknowledgement of data sent tcps_rto msec ago, retransmit is required.
tcps_mss
Maximum segment size.
tcps_retransmit
send is a retransmit, boolean. 1: a tcp:::send event that is a retransmission, 0: a send event that is not a retransmission, or any tcp event that is not a send event.
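
The sequence-number members lend themselves to simple arithmetic. The sketch below applies the tcps_snxt - tcps_suna calculation described above to show the distribution of unacknowledged bytes at each send, and counts retransmissions using tcps_retransmit. The aggregation names @unacked and @retransmits are illustrative.

```d
#!/usr/sbin/dtrace -s

tcp:::send
{
        /* Bytes sent but not yet acknowledged on this connection. */
        @unacked[args[3]->tcps_raddr, args[3]->tcps_rport] =
            quantize(args[3]->tcps_snxt - args[3]->tcps_suna);
}

/* Count only the sends that are flagged as retransmissions. */
tcp:::send
/args[3]->tcps_retransmit/
{
        @retransmits[args[3]->tcps_raddr, args[3]->tcps_rport] = count();
}
```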
tcplsinfo_t Structure

The tcplsinfo_t structure contains the previous tcp state during a state change.

typedef struct tcplsinfo {
        int32_t tcps_state;              /* TCP state */
} tcplsinfo_t;
Table 11-43  tcplsinfo_t Members
tcps_state
previous TCP state. Inline definitions are provided for the various TCP states: TCP_STATE_CLOSED, TCP_STATE_SYN_SENT, etc. Use inline tcp_state_string[] to convert state to a string.
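
Because state-change supplies both the previous state (args[5]) and the new state (args[3]), transitions can be counted directly. The following is a sketch that uses tcp_state_string[] to produce readable keys.

```d
#!/usr/sbin/dtrace -s

/* Count each (old state, new state) transition seen on the system. */
tcp:::state-change
{
        @[tcp_state_string[args[5]->tcps_state],
            tcp_state_string[args[3]->tcps_state]] = count();
}
```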
tcpinfo_t Structure

The tcpinfo_t structure is a DTrace translated version of the TCP header.

typedef struct tcpinfo {
        uint16_t tcp_sport;             /* source port */
        uint16_t tcp_dport;             /* destination port */
        uint32_t tcp_seq;               /* sequence number */
        uint32_t tcp_ack;               /* acknowledgment number */
        uint8_t tcp_offset;             /* data offset, in bytes */
        uint8_t tcp_flags;              /* flags */
        uint16_t tcp_window;            /* window size */
        uint16_t tcp_checksum;          /* checksum */
        uint16_t tcp_urgent;            /* urgent data pointer */
        tcph_t *tcp_hdr;                /* raw TCP header */
} tcpinfo_t;
Table 11-44  tcpinfo_t Members
tcp_sport
TCP source port.
tcp_dport
TCP destination port.
tcp_seq
TCP sequence number.
tcp_ack
TCP acknowledgment number.
tcp_offset
Payload data offset, in bytes (not 32-bit words).
tcp_flags
TCP flags. See the tcp_flags table below for available macros.
tcp_window
TCP window size, bytes.
tcp_checksum
Checksum of TCP header and payload.
tcp_urgent
TCP urgent data pointer, bytes.
tcp_hdr
Pointer to raw TCP header at time of tracing.
Table 11-45  tcp_flags Values
TH_FIN
No more data from sender (finish).
TH_SYN
Synchronize sequence numbers (connect).
TH_RST
Reset the connection.
TH_PUSH
TCP push function.
TH_ACK
Acknowledgment field is set.
TH_URG
Urgent pointer field is set.
TH_ECE
Explicit congestion notification echo (see RFC-3168).
TH_CWR
Congestion window reduction.

See RFC-793 for a detailed explanation of the standard TCP header fields and flags.
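
The flag macros can be combined in a predicate to select particular segments. As a sketch, the following counts received connection requests (SYN set, ACK clear) by source address and destination port. The mask-and-compare form is one of several ways to write the test.

```d
#!/usr/sbin/dtrace -s

/* Match segments that have SYN set and ACK clear: initial SYNs only. */
tcp:::receive
/(args[4]->tcp_flags & (TH_SYN | TH_ACK)) == TH_SYN/
{
        @[args[2]->ip_saddr, args[4]->tcp_dport] = count();
}
```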

Examples

Some simple examples of tcp provider usage follow.

Connections by Host Address

This DTrace one-liner counts inbound TCP connections by source IP address:

# dtrace -n 'tcp:::accept-established { @[args[3]->tcps_raddr] = count(); }'
dtrace: description 'tcp:::accept-established ' matched 1 probe
^C

  127.0.0.1                                                         1
  192.168.2.88                                                      1
  fe80::214:4fff:fe8d:59aa                                          1
  192.168.1.109                                                     3

The output above shows there were 3 TCP connections from 192.168.1.109, a single TCP connection from the IPv6 host fe80::214:4fff:fe8d:59aa, etc.

Connections by TCP Port

This DTrace one-liner counts inbound TCP connections by local TCP port:

# dtrace -n 'tcp:::accept-established { @[args[3]->tcps_lport] = count(); }'
dtrace: description 'tcp:::accept-established ' matched 1 probe
^C

 40648                1
    22                3

The output above shows there were 3 TCP connections for port 22 (ssh), a single TCP connection for port 40648 (an RPC port).

Who is Connecting to What

Combining the previous two examples produces a useful one-liner that quickly identifies who is connecting to what:

# dtrace -n 'tcp:::accept-established { @[args[3]->tcps_raddr, args[3]->tcps_lport] = count(); }'
dtrace: description 'tcp:::accept-established ' matched 1 probe
^C

  192.168.2.88                                       40648                1
  fe80::214:4fff:fe8d:59aa                              22                1
  192.168.1.109                                         22                3

The output above shows there were 3 TCP connections from 192.168.1.109 to port 22 (ssh), etc.

Who Isn't Connecting to What

When troubleshooting connection issues, it may be useful to see who is failing to connect to their requested ports. This is equivalent to seeing where incoming SYNs arrive when no listener is present, as per RFC-793. Because the probe arguments describe the RST segment sent, ip_daddr is the remote host and tcp_sport is the local port that was requested:

# dtrace -n 'tcp:::accept-refused { @[args[2]->ip_daddr, args[4]->tcp_sport] = count(); }'
dtrace: description 'tcp:::accept-refused ' matched 1 probe
^C

  192.168.1.109                                         23                2

Here we traced two failed attempts by host 192.168.1.109 to connect to port 23 (telnet).

Packets by Host Address

This DTrace one-liner counts TCP received packets by host address:

# dtrace -n 'tcp:::receive { @[args[2]->ip_saddr] = count(); }'
dtrace: description 'tcp:::receive ' matched 5 probes
^C

  127.0.0.1                                                         7
  fe80::214:4fff:fe8d:59aa                                         14
  192.168.2.30                                                     43
  192.168.1.109                                                    44
  192.168.2.88                                                   3722

The output above shows that 7 TCP packets were received from 127.0.0.1, 14 TCP packets from the IPv6 host fe80::214:4fff:fe8d:59aa, etc.

Packets by Local Port

This DTrace one-liner counts TCP received packets by the local TCP port:

# dtrace -n 'tcp:::receive { @[args[4]->tcp_dport] = count(); }'
dtrace: description 'tcp:::receive ' matched 5 probes
^C

 42303                3
 42634                3
  2049               27
 40648               36
    22              162

The output above shows that 162 packets were received for port 22 (ssh), 36 packets were received for port 40648 (an RPC port), 27 packets for 2049 (NFS), and a few packets to high numbered client ports.

Sent Size Distribution

This DTrace one-liner prints distribution plots of IP payload size by destination, for TCP sends:

# dtrace -n 'tcp:::send { @[args[2]->ip_daddr] = quantize(args[2]->ip_plength); }'
dtrace: description 'tcp:::send ' matched 3 probes
^C

  192.168.1.109                                     
           value  ------------- Distribution ------------- count    
              32 |                                         0        
              64 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    14       
             128 |@@@                                      1        
             256 |                                         0        

  192.168.2.30                                      
           value  ------------- Distribution ------------- count    
              16 |                                         0        
              32 |@@@@@@@@@@@@@@@@@@@@                     7        
              64 |@@@@@@@@@                                3        
             128 |@@@                                      1        
             256 |@@@@@@                                   2        
             512 |@@@                                      1        
            1024 |                                         0        
tcpstate.d

This DTrace script demonstrates the capability to trace TCP state changes:

#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=10hz

/* Last state-change timestamp (ns) per connection ID; 64-bit to hold timestamp. */
uint64_t last[uint64_t];

dtrace:::BEGIN
{
        printf(" %3s %12s  %-20s    %-20s\n", "CPU", "DELTA(us)", "OLD", "NEW");
}

tcp:::state-change
/ last[args[1]->cs_cid] /
{
        this->elapsed = (timestamp - last[args[1]->cs_cid]) / 1000;
        printf(" %3d %12d  %-20s -> %-20s\n", cpu, this->elapsed,
            tcp_state_string[args[5]->tcps_state],
            tcp_state_string[args[3]->tcps_state]);
        last[args[1]->cs_cid] = timestamp;
}

tcp:::state-change
/ last[args[1]->cs_cid] == 0 /
{
        printf(" %3d %12s  %-20s -> %-20s\n", cpu, "-",
            tcp_state_string[args[5]->tcps_state],
            tcp_state_string[args[3]->tcps_state]);
        last[args[1]->cs_cid] = timestamp;
}

This script was run on a system for a couple of minutes:

# ./tcpstate.d 

 CPU    DELTA(us)  OLD                     NEW                 
   0            -  state-listen         -> state-syn-received  
   0          613  state-syn-received   -> state-established   
   0            -  state-idle           -> state-bound         
   0           63  state-bound          -> state-syn-sent      
   0          685  state-syn-sent       -> state-bound         
   0           22  state-bound          -> state-idle          
   0          114  state-idle           -> state-closed  

In the above example output, an inbound connection is traced. It takes 613 us to go from syn-received to established. An outbound connection attempt is also made to a closed port. It takes 63 us to go from bound to syn-sent, 685 us to go from syn-sent to bound, and so on.

The fields printed are:

Field
Description
CPU
CPU id for the event
DELTA(us)
time since previous event for that connection, microseconds
OLD
old TCP state
NEW
new TCP state
tcpio.d

The following DTrace script traces TCP packets and prints various details:

#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=10hz

dtrace:::BEGIN
{
        printf(" %3s %15s:%-5s      %15s:%-5s %6s  %s\n", "CPU",
            "LADDR", "LPORT", "RADDR", "RPORT", "BYTES", "FLAGS");
}

tcp:::send
{
        this->length = args[2]->ip_plength - args[4]->tcp_offset;
        printf(" %3d %16s:%-5d -> %16s:%-5d %6d  (", cpu,
            args[2]->ip_saddr, args[4]->tcp_sport,
            args[2]->ip_daddr, args[4]->tcp_dport, this->length);
}

tcp:::receive
{
        this->length = args[2]->ip_plength - args[4]->tcp_offset;
        printf(" %3d %16s:%-5d <- %16s:%-5d %6d  (", cpu,
            args[2]->ip_daddr, args[4]->tcp_dport,
            args[2]->ip_saddr, args[4]->tcp_sport, this->length);
}

tcp:::send,
tcp:::receive
{
        printf("%s", args[4]->tcp_flags & TH_FIN ? "FIN|" : "");
        printf("%s", args[4]->tcp_flags & TH_SYN ? "SYN|" : "");
        printf("%s", args[4]->tcp_flags & TH_RST ? "RST|" : "");
        printf("%s", args[4]->tcp_flags & TH_PUSH ? "PUSH|" : "");
        printf("%s", args[4]->tcp_flags & TH_ACK ? "ACK|" : "");
        printf("%s", args[4]->tcp_flags & TH_URG ? "URG|" : "");
        printf("%s", args[4]->tcp_flags & TH_ECE ? "ECE|" : "");
        printf("%s", args[4]->tcp_flags & TH_CWR ? "CWR|" : "");
        printf("%s", args[4]->tcp_flags == 0 ? "null " : "");
        printf("\b)\n");
}

This example output has captured a TCP handshake:

# ./tcpio.d
 CPU           LADDR:LPORT                RADDR:RPORT  BYTES  FLAGS
   1     192.168.2.80:22    ->    192.168.1.109:60337    464  (PUSH|ACK)
   1     192.168.2.80:22    ->    192.168.1.109:60337     48  (PUSH|ACK)
   2     192.168.2.80:22    ->    192.168.1.109:60337     20  (PUSH|ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0  (SYN)
   3     192.168.2.80:22    ->    192.168.1.109:60337      0  (SYN|ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0  (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0  (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337     20  (PUSH|ACK)
   3     192.168.2.80:22    ->    192.168.1.109:60337      0  (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0  (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337    376  (PUSH|ACK)
   3     192.168.2.80:22    ->    192.168.1.109:60337      0  (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337     24  (PUSH|ACK)
   2     192.168.2.80:22    ->    192.168.1.109:60337    736  (PUSH|ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0  (ACK)

The fields printed are:

Field
Description
CPU
CPU id that event occurred on
LADDR
local IP address
LPORT
local TCP port
RADDR
remote IP address
RPORT
remote TCP port
BYTES
TCP payload bytes
FLAGS
TCP flags

Note - The output may be shuffled slightly on multi-CPU servers due to DTrace per-CPU buffering, and events such as the TCP handshake can be printed out of order. Watch for changes in the CPU column, or add a timestamp column to this script and sort the output on that column afterward.
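
One way to add such a timestamp column is sketched below. This is a reduced illustration rather than a modified tcpio.d: it prints the timestamp variable (nanoseconds) as the first field so the output can be sorted numerically afterward. The field layout is an assumption chosen for this example.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=10hz

/* Print a nanosecond timestamp first so lines can be re-sorted later. */
tcp:::send,
tcp:::receive
{
        printf("%d %-8s %s:%d -> %s:%d\n", timestamp, probename,
            args[2]->ip_saddr, args[4]->tcp_sport,
            args[2]->ip_daddr, args[4]->tcp_dport);
}
```

Saving the output to a file and sorting it numerically on the first field (for example with sort -n) restores event order across CPUs.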

Stability

The tcp provider uses DTrace's stability mechanism to describe its stabilities, as shown in the following table. For more information about the stability mechanism, see Chapter 18, Stability.

Element     Name Stability   Data Stability   Dependency Class
Provider    Evolving         Evolving         ISA
Module      Private          Private          Unknown
Function    Private          Private          Unknown
Name        Evolving         Evolving         ISA
Arguments   Evolving         Evolving         ISA