NFS Server Performance and Tuning Guide for Sun Hardware

Checking Each Client

The overall tuning process must include client tuning. Sometimes, tuning the client yields more improvement than fixing the server. For example, adding 4 Mbytes of memory to each of 100 clients dramatically decreases the load on an NFS server.

To Check Each Client

Check the client statistics for NFS problems by typing nfsstat -c at the % prompt.

Look for errors and retransmits.

client % nfsstat -c
Client rpc:
calls      badcalls   retrans    badxids    timeouts   waits      newcreds
384687     1          52         7          52         0          0
badverfs   timers     toobig     nomem      cantsend   bufulocks
0          384        0          0          0          0
Client nfs:
calls      badcalls   clgets     cltoomany
379496     0          379558     0
Version 2: (379599 calls)
null       getattr    setattr    root       lookup     readlink   read
0 0%       178150 46% 614 0%     0 0%       39852 10%  28 0%      89617 23%
wrcache    write      create     remove     rename     link       symlink
0 0%       56078 14%  1183 0%    1175 0%    71 0%      51 0%      0 0%
mkdir      rmdir      readdir    statfs
49 0%      0 0%       987 0%     11744 3%

The output of the nfsstat -c command shows that there were only 52 retransmits (retrans) and 52 time-outs (timeouts) out of 384687 calls.
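To put those counts in proportion, the retransmission rate can be worked out as a percentage of all calls and compared with the 5% threshold listed in Table 3-10. The following is a minimal sketch of that arithmetic; the function name retrans_percent is hypothetical and is not part of nfsstat.

```python
def retrans_percent(calls, retrans):
    """Return retransmissions as a percentage of total RPC calls."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    return 100.0 * retrans / calls


# Counters taken from the sample nfsstat -c output in this section.
pct = retrans_percent(calls=384687, retrans=52)
print(f"retrans = {pct:.4f}% of calls")  # prints: retrans = 0.0135% of calls
```

At roughly 0.01%, the sample client is far below the 5% level at which retransmissions indicate a problem.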

The nfsstat -c display contains the following fields:

Table 3-9 Output of the nfsstat -c Command


`calls`	Total number of calls sent
`badcalls`	Total number of calls rejected by RPC
`retrans`	Total number of retransmissions
`badxid`	Number of times that a duplicate acknowledgment was received for a single NFS request
`timeout`	Number of calls that timed out
`wait`	Number of times a call had to wait because no client handle was available
`newcred`	Number of times the authentication information had to be refreshed
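If you collect these counters from many clients, it can help to extract them programmatically. The sketch below parses the "Client rpc:" section of saved nfsstat -c output into a dictionary. It assumes the layout shown in the sample in this section (rows of field names followed by rows of numbers) and tolerates rows that wrap onto a second line; other releases may format the output differently. The names parse_client_rpc and RPC_SAMPLE are hypothetical.

```python
def parse_client_rpc(text):
    """Parse the 'Client rpc:' section of nfsstat -c output into a dict."""
    counters = {}
    names, values = [], []
    in_rpc = False

    def flush():
        # Pair the collected field names with the collected numbers.
        counters.update(zip(names, values))
        names.clear()
        values.clear()

    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if line.rstrip().endswith(":"):
            # Section header such as "Client rpc:" or "Client nfs:".
            flush()
            in_rpc = line.strip().lower() == "client rpc:"
            continue
        if not in_rpc:
            continue
        if tokens[0].isdigit():
            values.extend(int(t) for t in tokens)
        else:
            if values:
                flush()  # a new group of field names starts here
            names.extend(tokens)
    flush()
    return counters


# Excerpt in the same layout as the sample output, with wrapped rows.
RPC_SAMPLE = """\
Client rpc:
calls      badcalls   retrans    badxids    timeouts
waits      newcreds
384687     1          52         7          52
0          0
Client nfs:
calls      badcalls   clgets     cltoomany
379496     0          379558     0
"""

rpc = parse_client_rpc(RPC_SAMPLE)
print(rpc["calls"], rpc["retrans"], rpc["timeouts"])
```

Only the RPC counters are returned; the "Client nfs:" section is skipped, so its own calls and badcalls fields do not overwrite the RPC values.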

Table 3-9, shown earlier in this chapter, describes the NFS operations. Table 3-10 explains the output of the nfsstat -c command and what action to take.

Table 3-10 Description of the nfsstat -c Command Output


If	Then
`retrans` > 5% of the calls	The requests are not reaching the server.
`badxid` is approximately equal to `badcalls`	The network is slow. Consider installing a faster network or installing subnets.
`badxid` is approximately equal to `timeouts`	Most requests are reaching the server but the server is slower than expected. Watch expected times using `nfsstat -m`.
`badxid` is close to 0	The network is dropping requests. Reduce `rsize` and `wsize` in the `mount` options.
`null` > 0	A large number of `null` calls suggests that the automounter is retrying the mount frequently, which means that the timeout values for the mount are too short. Increase the mount timeout parameter, `timeo`, on the automounter command line.
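The rules in Table 3-10 can be expressed as a small checking function. The sketch below is one possible reading, not a definitive implementation: the table does not define "approximately equal" or "close to 0", so the tolerance parameters rel_tol and near_zero_frac are assumptions chosen for illustration, as are the function name and the decision to evaluate each row independently.

```python
def diagnose_client_rpc(c, null_calls=0, rel_tol=0.2, near_zero_frac=0.1):
    """Return the Table 3-10 findings whose conditions match.

    c is a dict with calls, badcalls, retrans, badxids and timeouts.
    rel_tol and near_zero_frac are illustrative assumptions.
    """
    def approx(a, b):
        # Two nonzero counters count as "approximately equal" within rel_tol.
        return a > 0 and b > 0 and abs(a - b) <= rel_tol * max(a, b)

    findings = []
    if c["calls"] > 0 and c["retrans"] > 0.05 * c["calls"]:
        findings.append("retrans > 5% of calls: requests are not reaching the server")
    if approx(c["badxids"], c["badcalls"]):
        findings.append("badxid ~ badcalls: the network is slow")
    if approx(c["badxids"], c["timeouts"]):
        findings.append("badxid ~ timeouts: the server is slower than expected")
    # Assumption: "close to 0" means small relative to the number of time-outs.
    if c["timeouts"] > 0 and c["badxids"] <= near_zero_frac * c["timeouts"]:
        findings.append("badxid close to 0: the network is dropping requests")
    if null_calls > 0:
        findings.append("null > 0: the automounter may be retrying the mount")
    return findings


# Counters from the sample nfsstat -c output in this section.
sample = {"calls": 384687, "badcalls": 1, "retrans": 52,
          "badxids": 7, "timeouts": 52}
print(diagnose_client_rpc(sample, null_calls=0))
```

With these assumed tolerances, the sample counters match none of the rows, which agrees with the low retransmission count noted for that client.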

The third-party tools you can use for NFS and networks include:

NetMetrix (Hewlett-Packard)
SharpShooter (Network General)

Display statistics for each NFS mounted file system by typing nfsstat -m.

The statistics include the server name and address, mount flags, current read and write sizes, transmission count, and the timers used for dynamic transmission.

client % nfsstat -m
/export/home from server:/export/home
 Flags:   vers=2,hard,intr,dynamic,rsize=8192,wsize=8192,retrans=5
 Lookups: srtt=10 (25ms), dev=4 (20ms), cur=3 (60ms)
 Reads:   srtt=9 (22ms), dev=7 (35ms), cur=4 (80ms)
 Writes:  srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
 All:     srtt=11 (27ms), dev=4 (20ms), cur=3 (60ms)

The following terms are used in the output of the nfsstat -m command:

Table 3-11 Description of the Output of the nfsstat -m Command


`srtt`	Smoothed round-trip time
`dev`	Estimated deviation
`cur`	Current backed-off timeout value

The numbers in parentheses in the previous code example are the actual times in milliseconds. The other values are unscaled values kept by the operating system kernel. You can ignore the unscaled values. Response times are shown for lookups, reads, writes, and a combination of all of these operations (All). Table 3-12 shows the appropriate action to take based on the output of the nfsstat -m command.
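Because only the millisecond values matter, a script that reads nfsstat -m output can discard the unscaled numbers. The sketch below shows one way to do that with a regular expression. It assumes the line layout of the sample output in this section and handles the timer lines of a single mount point; the names TIMER_RE, parse_mount_timers, and MOUNT_SAMPLE are hypothetical.

```python
import re

# Matches lines such as " Reads:   srtt=9 (22ms), dev=7 (35ms), cur=4 (80ms)"
# and captures only the millisecond values in parentheses.
TIMER_RE = re.compile(
    r"^\s*(\w+):\s+srtt=\d+ \((\d+)ms\), dev=\d+ \((\d+)ms\), cur=\d+ \((\d+)ms\)"
)


def parse_mount_timers(text):
    """Return {operation: {"srtt": ms, "dev": ms, "cur": ms}} for one mount."""
    timers = {}
    for line in text.splitlines():
        m = TIMER_RE.match(line)
        if m:
            op, srtt, dev, cur = m.groups()
            timers[op] = {"srtt": int(srtt), "dev": int(dev), "cur": int(cur)}
    return timers


# Copy of the sample nfsstat -m output in this section.
MOUNT_SAMPLE = """\
/export/home from server:/export/home
 Flags:   vers=2,hard,intr,dynamic,rsize=8192,wsize=8192,retrans=5
 Lookups: srtt=10 (25ms), dev=4 (20ms), cur=3 (60ms)
 Reads:   srtt=9 (22ms), dev=7 (35ms), cur=4 (80ms)
 Writes:  srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
 All:     srtt=11 (27ms), dev=4 (20ms), cur=3 (60ms)
"""

timers = parse_mount_timers(MOUNT_SAMPLE)
print(timers["Reads"])
```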

Table 3-12 Results of the nfsstat -m Command


If	Then
`srtt` > 50 ms	That mount point is slow. Check the network and the server for the disk(s) that provide that mount point. See "To Check the Network" and "To Check the NFS Server" earlier in this chapter.
The message "`NFS server not responding`" is displayed	Try increasing the `timeo` parameter in the `/etc/vfstab` file to eliminate the messages and improve performance. Doubling the initial `timeo` parameter value is a good baseline. After changing the `timeo` value in the `vfstab` file, invoke the `nfsstat -c` command and observe the `badxid` value returned by the command. Follow the recommendations for the `nfsstat -c` command earlier in this section.
`Lookups: cur` > 80 ms	The requests are taking too long to process. This indicates a slow network or a slow server.
`Reads: cur` > 150 ms	The requests are taking too long to process. This indicates a slow network or a slow server.
`Writes: cur` > 250 ms	The requests are taking too long to process. This indicates a slow network or a slow server.
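The numeric thresholds in Table 3-12 lend themselves to an automated check. The sketch below applies them to a dictionary of millisecond timings per operation. The function and constant names are hypothetical, and applying the 50 ms srtt limit to every operation line is an assumption, since the table does not say which srtt value it refers to.

```python
# Thresholds from Table 3-12, in milliseconds.
SRTT_LIMIT_MS = 50
CUR_LIMITS_MS = {"Lookups": 80, "Reads": 150, "Writes": 250}


def check_mount_timers(timers):
    """Return a list of warnings for timings that exceed Table 3-12 limits."""
    warnings = []
    for op, t in timers.items():
        # Assumption: the srtt limit is applied to each operation line.
        if t["srtt"] > SRTT_LIMIT_MS:
            warnings.append(f"{op}: srtt {t['srtt']} ms exceeds {SRTT_LIMIT_MS} ms")
        limit = CUR_LIMITS_MS.get(op)
        if limit is not None and t["cur"] > limit:
            warnings.append(f"{op}: cur {t['cur']} ms exceeds {limit} ms")
    return warnings


# Millisecond values from the sample nfsstat -m output in this section.
sample_timers = {
    "Lookups": {"srtt": 25, "dev": 20, "cur": 60},
    "Reads": {"srtt": 22, "dev": 35, "cur": 80},
    "Writes": {"srtt": 17, "dev": 15, "cur": 40},
    "All": {"srtt": 27, "dev": 20, "cur": 60},
}
print(check_mount_timers(sample_timers))
```

The sample mount point stays under every limit, so the function returns an empty list for it.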