colls - collisions. On a network that is not switched, a low level of collisions is expected. As the network becomes increasingly saturated, collisions will increase and eventually become a bottleneck. The best solution for collisions is a switched network.
errs - errors. The presence of errors could indicate faulty devices. If your network is switched, errors indicate that you are nearly consuming the bandwidth capacity of your network. The solution to this problem is to give the system more bandwidth, either through additional network interfaces or a network bandwidth upgrade; the right choice is highly dependent on your particular network architecture.
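As a quick way to judge how significant collisions are, the collision rate can be computed as Collis divided by Opkts from a netstat -i line. The sketch below is a minimal example; the interface name, the sample numbers, and the exact column positions are assumptions (the real netstat -i column layout varies by release, so check the header on your system).

```shell
# Hypothetical sample line from `netstat -i`; verify field positions
# against the header on your own system before relying on them.
# Assumed fields: Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
sample="hme0 1500 myhost myhost 1234567 12 987654 3 4567 0"

# Collision rate = Collis / Opkts; a rate of more than a few percent on a
# shared (non-switched) segment suggests the network is saturating.
echo "$sample" | awk '{ printf "collision rate: %.2f%%\n", ($9 / $7) * 100 }'
```

On a live system the same awk program can be fed directly from netstat -i instead of the sample variable.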
If network saturation is occurring quickly (for example, saturation at fewer than 8 CPUs for an application server running on 100 Mbit Ethernet), then an investigation to ensure conservative network usage is a good first step.
Increase network bandwidth. Possible steps include upgrading to a switched network, adding network interfaces, or upgrading to a higher bandwidth network to accommodate your network traffic demand.
The netstat -sP tcp option is used to analyze the TCP kernel module. Many of the fields reported represent counters in the kernel module that can indicate bottlenecks. These bottlenecks can be addressed by adjusting tuning parameters with the ndd command.
# netstat -sP tcp
TCP     tcpRtoAlgorithm     =     4     tcpRtoMin           =   400
<snip>
        tcpInDupSegs        =  1144     tcpInDupBytes       =132520
        tcpInPartDupSegs    =     1     tcpInPartDupBytes   =   416
        tcpInPastWinSegs    =     0     tcpInPastWinBytes   =     0
        tcpInWinProbe       =    46     tcpInWinUpdate      =    48
        tcpInClosed         =   251     tcpRttNoUpdate      =   344
        tcpRttUpdate        =1105386    tcpTimRetrans       =   989
        tcpTimRetransDrop   =     5     tcpTimKeepalive     =   818
        tcpTimKeepaliveProbe=   183     tcpTimKeepaliveDrop =     0
        tcpListenDrop       =     0     tcpListenDropQ0     =     0
        tcpHalfOpenDrop     =     0     tcpOutSackRetrans   =    56
tcpListenDrop - If tcpListenDrop continues to increase across several samples of the command output, it could indicate a problem with the connection request queue size.
A possible cause of an increasing tcpListenDrop is application throughput being bottlenecked by the number of executing threads. In that case, increasing the number of application threads is a good thing to try.
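To watch the tcpListenDrop counter over time, it helps to pull a single counter value out of the netstat -sP tcp output so two samples can be compared. The helper below, parse_counter, is a hypothetical sketch, not a standard tool; it assumes the counter-pair layout shown above, handling both the "name = 123" form and the glued "name =123" form that appears in the output.

```shell
# parse_counter NAME: read `netstat -sP tcp` text on stdin and print
# the value of counter NAME. Hypothetical helper; assumes the
# "name = value  name = value" layout shown in the sample output above.
parse_counter() {
    awk -v name="$1" '{
        for (i = 1; i <= NF; i++)
            if ($i == name) {
                v = $(i + 1)
                if (v == "=") v = $(i + 2)   # "name = 123"
                else sub(/^=/, "", v)        # "name =123"
                print v
            }
    }'
}

# On a live Solaris system, two samples a minute apart show the growth:
#   before=$(netstat -sP tcp | parse_counter tcpListenDrop)
#   sleep 60
#   after=$(netstat -sP tcp | parse_counter tcpListenDrop)
#   echo "tcpListenDrop grew by $((after - before)) in 60 s"
```

A steadily growing delta, rather than the absolute value, is the signal that drops are occurring now.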
Increase queue size. Increase the connection request queue sizes using ndd. More information on the ndd commands referenced here is available in the Solaris Administration Guide.
ndd -set /dev/tcp tcp_conn_req_max_q value
ndd -set /dev/tcp tcp_conn_req_max_q0 value
netstat -a | grep your_hostname | wc -l
Running this command gives a rough count of the socket connections on the system. The number of connections that can be open at one time is limited, so you can use this count to look for bottlenecks.
netstat -a | grep <your_hostname> | wc -l Output
# netstat -a | wc -l
34567
socket count - If the number returned is greater than 20,000, the number of socket connections could be a bottleneck.
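The rule of thumb above can be wrapped in a small check so the count is evaluated automatically. The helper below is a hypothetical sketch around the 20,000 threshold from the text; the function name is an assumption, not a standard utility.

```shell
# check_socket_count COUNT: apply the rough 20,000-connection rule of
# thumb to a line count taken from `netstat -a`. Hypothetical helper.
check_socket_count() {
    if [ "$1" -gt 20000 ]; then
        echo "possible socket bottleneck: $1 connections"
    else
        echo "socket count OK: $1 connections"
    fi
}

# On a live system:
#   check_socket_count "$(netstat -a | grep your_hostname | wc -l)"
check_socket_count 34567
```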
Decrease the port number at which anonymous (ephemeral) socket connections start, making more ports available.
ndd -set /dev/tcp tcp_smallest_anon_port value
Decrease the time a TCP connection stays in TIME_WAIT.
ndd -set /dev/tcp tcp_time_wait_interval value