NFS Server Performance and Tuning Guide for Sun Hardware

Checking Each Client

The overall tuning process must include client tuning. Sometimes, tuning the client yields more improvement than fixing the server. For example, adding 4 Mbytes of memory to each of 100 clients dramatically decreases the load on an NFS server.

To Check Each Client
  1. Check the client statistics for NFS problems by typing nfsstat -c at the % prompt.

    Look for errors and retransmits.


    client % nfsstat -c
    Client rpc:
    calls      badcalls   retrans    badxids    timeouts   waits      newcreds
    384687     1          52         7          52         0          0
    badverfs   timers     toobig     nomem      cantsend   bufulocks
    0          384        0          0          0          0
    Client nfs:
    calls      badcalls   clgets     cltoomany
    379496     0          379558     0
    Version 2: (379599 calls)
    null       getattr    setattr    root       lookup     readlink   read
    0 0%       178150 46% 614 0%     0 0%       39852 10%  28 0%      89617 23%
    wrcache    write      create     remove     rename     link       symlink
    0 0%       56078 14%  1183 0%    1175 0%    71 0%      51 0%      0 0%
    mkdir      rmdir      readdir    statfs
    49 0%      0 0%       987 0%     11744 3%
    

    The output of the nfsstat -c command shows that there were only 52 retransmits (retrans) and 52 time-outs (timeouts) out of 384687 calls, or about 0.01 percent.
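
    As a rough illustration of that check, the following Python sketch parses the Client rpc: section of nfsstat -c output and computes the retransmission rate. The function names parse_client_rpc and retrans_percent are hypothetical, and the parser assumes the counter layout shown in the example above (rows of counter names followed by rows of values); other releases may print different fields.

```python
SAMPLE = """\
Client rpc:
calls      badcalls   retrans    badxids    timeouts   waits      newcreds
384687     1          52         7          52         0          0
badverfs   timers     toobig     nomem      cantsend   bufulocks
0          384        0          0          0          0
Client nfs:
calls      badcalls   clgets     cltoomany
379496     0          379558     0
"""


def parse_client_rpc(text):
    """Return the 'Client rpc:' counters from nfsstat -c output as a dict.

    Assumption: inside the section, every run of counter names is followed
    by the same number of integer values, in the same order.
    """
    names, values = [], []
    in_rpc = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith(":"):
            # Section headers such as "Client rpc:" or "Client nfs:".
            in_rpc = stripped.lower() == "client rpc:"
            continue
        if not in_rpc:
            continue
        for token in stripped.split():
            if token.isdigit():
                values.append(int(token))
            else:
                names.append(token)
    return dict(zip(names, values))


def retrans_percent(counters):
    """Retransmissions as a percentage of all RPC calls sent."""
    return 100.0 * counters["retrans"] / counters["calls"]


counters = parse_client_rpc(SAMPLE)
print(f"retrans rate: {retrans_percent(counters):.4f}%")  # retrans rate: 0.0135%
```

    Because names and values are paired in order, the same sketch should also cope with output in which each header row and each value row wraps onto a second line.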

    The nfsstat -c display contains the following fields:

    Table 3-9 Output of the nfsstat -c Command

    Field       Description
    calls       Total number of calls sent
    badcalls    Total number of calls rejected by RPC
    retrans     Total number of retransmissions
    badxid      Number of times that a duplicate acknowledgment was received
                for a single NFS request
    timeout     Number of calls that timed out
    wait        Number of times a call had to wait because no client handle
                was available
    newcred     Number of times the authentication information had to be
                refreshed

    Table 3-9, shown earlier in this chapter, describes the NFS operations. Table 3-10 explains the output of the nfsstat -c command and what action to take.

    Table 3-10 Description of the nfsstat -c Command Output

    If                          Then

    retrans > 5% of the calls   The requests are not reaching the server.

    badxid is approximately     The network is slow. Consider installing a
    equal to badcalls           faster network or installing subnets.

    badxid is approximately     Most requests are reaching the server but the
    equal to timeouts           server is slower than expected. Watch expected
                                times using nfsstat -m.

    badxid is close to 0        The network is dropping requests. Reduce rsize
                                and wsize in the mount options.

    null > 0                    A large number of null calls suggests that the
                                automounter is retrying the mount frequently.
                                The timeout values for the mount are too short.
                                Increase the mount timeout parameter, timeo, on
                                the automounter command line.
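
    The rows of Table 3-10 can be read as a set of independent checks. The Python sketch below is one possible encoding, not a definitive one: the function names are hypothetical, and the table does not define "approximately equal" or "close to 0", so the tolerances used here (within 25 percent of the larger value, and less than 10 percent of the timeouts) are assumptions to adjust for your site.

```python
def approx_equal(a, b, rel_tol=0.25):
    """True when the larger value is positive and the two values differ
    by at most rel_tol of it. The 25% default is an assumption."""
    larger = max(a, b)
    return larger > 0 and abs(a - b) <= rel_tol * larger


def diagnose_client_rpc(calls, retrans, badxid, timeouts, badcalls,
                        null_calls=0, rel_tol=0.25, near_zero=0.1):
    """Apply each row of Table 3-10 and return the matching advice.

    rel_tol and near_zero are assumed thresholds; the table gives none.
    """
    findings = []
    if calls > 0 and retrans > 0.05 * calls:
        findings.append("requests are not reaching the server")
    if approx_equal(badxid, badcalls, rel_tol):
        findings.append("network is slow: consider a faster network or subnets")
    if approx_equal(badxid, timeouts, rel_tol):
        findings.append("server is slower than expected: watch times with nfsstat -m")
    if timeouts > 0 and badxid < near_zero * timeouts:
        findings.append("network is dropping requests: reduce rsize and wsize")
    if null_calls > 0:
        findings.append("automounter may be retrying mounts: increase timeo")
    return findings


# Counters taken from the nfsstat -c example in this section.
healthy = diagnose_client_rpc(calls=384687, retrans=52, badxid=7,
                              timeouts=52, badcalls=1, null_calls=0)
print(healthy)  # []
```

    With the counters from the example in this section, none of the conditions match under these assumed tolerances, which agrees with the low retransmission count reported there.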

    The third-party tools you can use for NFS and networks include:

    • NetMetrix (Hewlett-Packard)

    • SharpShooter (Network General)

  2. Display statistics for each NFS mounted file system by typing nfsstat -m.

    The statistics include the server name and address, mount flags, current read and write sizes, transmission count, and the timers used for dynamic transmission.


    client % nfsstat -m
    /export/home from server:/export/home
     Flags:   vers=2,hard,intr,dynamic,rsize=8192,wsize=8192,retrans=5
     Lookups: srtt=10 (25ms), dev=4 (20ms), cur=3 (60ms)
     Reads:   srtt=9 (22ms), dev=7 (35ms), cur=4 (80ms)
     Writes:  srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
     All:     srtt=11 (27ms), dev=4 (20ms), cur=3 (60ms)
    

    The following terms are used in the output of the nfsstat -m command:

    Table 3-11 Description of the Output of the nfsstat -m Command

    Field       Description
    srtt        Smoothed round-trip time
    dev         Estimated deviation
    cur         Current backed-off timeout value

    The numbers in parentheses in the previous code example are the actual times in milliseconds. The other values are unscaled values kept by the operating system kernel. You can ignore the unscaled values. Response times are shown for lookups, reads, writes, and a combination of all of these operations (all). Table 3-12 shows the appropriate action for the nfsstat -m command.
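
    To work with those millisecond values programmatically, a script can pull them out of the nfsstat -m text. The following Python sketch is a minimal example under the assumption that each timer line has the form shown in the previous example (a label, a colon, then name=unscaled (Nms) fields); parse_mount_timers is a hypothetical helper name, and the unscaled values are discarded as the text suggests.

```python
import re

SAMPLE = """\
/export/home from server:/export/home
 Flags:   vers=2,hard,intr,dynamic,rsize=8192,wsize=8192,retrans=5
 Lookups: srtt=10 (25ms), dev=4 (20ms), cur=3 (60ms)
 Reads:   srtt=9 (22ms), dev=7 (35ms), cur=4 (80ms)
 Writes:  srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
 All:     srtt=11 (27ms), dev=4 (20ms), cur=3 (60ms)
"""

# Matches e.g. "srtt=10 (25ms)" and captures the field name and the
# millisecond value; the unscaled kernel value is skipped.
TIMER_RE = re.compile(r"(srtt|dev|cur)=\d+\s*\((\d+)ms\)")


def parse_mount_timers(text):
    """Return {operation: {"srtt": ms, "dev": ms, "cur": ms}} for one mount."""
    timers = {}
    for line in text.splitlines():
        label, sep, rest = line.strip().partition(":")
        fields = {name: int(ms) for name, ms in TIMER_RE.findall(rest)}
        if sep and fields:
            # Lines without timer fields (mount point, Flags) are ignored.
            timers[label] = fields
    return timers


timers = parse_mount_timers(SAMPLE)
print(timers["Reads"])  # {'srtt': 22, 'dev': 35, 'cur': 80}
```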

    Table 3-12 Results of the nfsstat -m Command

    If                          Then

    srtt > 50 ms                That mount point is slow. Check the network and
                                the server for the disk(s) that provide that
                                mount point. See "To Check the Network" and
                                "To Check the NFS Server" earlier in this
                                chapter.

    The message "NFS server     Try increasing the timeo parameter in the
    not responding" is          /etc/vfstab file to eliminate the messages and
    displayed                   improve performance. Doubling the initial timeo
                                parameter value is a good baseline. After
                                changing the timeo value in the vfstab file,
                                invoke the nfsstat -c command and observe the
                                badxid value returned by the command. Follow
                                the recommendations for the nfsstat -c command
                                earlier in this section.

    Lookups: cur > 80 ms        The requests are taking too long to process.
                                This indicates a slow network or a slow server.

    Reads: cur > 150 ms         The requests are taking too long to process.
                                This indicates a slow network or a slow server.

    Writes: cur > 250 ms        The requests are taking too long to process.
                                This indicates a slow network or a slow server.
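
    The numeric thresholds in Table 3-12 lend themselves to a simple automated check. The Python sketch below is illustrative only: check_mount_timers is a hypothetical name, the input is assumed to be a dictionary of millisecond values keyed by operation, and applying the srtt limit to every operation row is an assumption, since the table does not say which row it refers to.

```python
# Thresholds from Table 3-12, in milliseconds.
SRTT_LIMIT_MS = 50
CUR_LIMITS_MS = {"Lookups": 80, "Reads": 150, "Writes": 250}


def check_mount_timers(timers):
    """Return a list of warnings for one mount point.

    timers maps an operation name to its millisecond values, for example
    {"Reads": {"srtt": 22, "dev": 35, "cur": 80}}.
    """
    findings = []
    for op, fields in timers.items():
        srtt = fields.get("srtt", 0)
        if srtt > SRTT_LIMIT_MS:
            findings.append(
                f"{op}: srtt {srtt} ms > {SRTT_LIMIT_MS} ms, mount point is slow")
        limit = CUR_LIMITS_MS.get(op)
        cur = fields.get("cur", 0)
        if limit is not None and cur > limit:
            findings.append(
                f"{op}: cur {cur} ms > {limit} ms, slow network or slow server")
    return findings


# Millisecond values from the nfsstat -m example in this section.
example = {
    "Lookups": {"srtt": 25, "dev": 20, "cur": 60},
    "Reads": {"srtt": 22, "dev": 35, "cur": 80},
    "Writes": {"srtt": 17, "dev": 15, "cur": 40},
    "All": {"srtt": 27, "dev": 20, "cur": 60},
}
print(check_mount_timers(example))  # []
```

    The example mount point stays under every limit, so the list is empty. The "NFS server not responding" row is not covered here because it depends on a console message rather than on a timer value.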