These commands can be useful when troubleshooting NFS problems.
You can use the nfsstat command to gather statistical information about NFS and RPC connections. The syntax of the command is as follows:
nfsstat [ -cmnrsz ]
-c    Displays client-side information
-m    Displays statistics for each NFS mounted file system
-n    Specifies that NFS information is to be displayed (both client and server side)
-r    Displays RPC statistics
-s    Displays the server-side information
-z    Specifies that the statistics should be set to zero
If no options are supplied on the command line, the -cnrs options are used.
Gathering server-side statistics can be important for debugging problems when new software or hardware is added to the computing environment. Running nfsstat at least once a week and storing the numbers provides a good history of previous performance.
# nfsstat -s
Server rpc:
Connection oriented:
calls       badcalls    nullrecv    badlen      xdrcall     dupchecks   dupreqs
11420263    0           0           0           0           1428274     19
Connectionless:
calls       badcalls    nullrecv    badlen      xdrcall     dupchecks   dupreqs
14569706    0           0           0           0           953332      1601
Server nfs:
calls       badcalls
24234967    226
Version 2: (13073528 calls)
null        getattr     setattr     root        lookup      readlink    read
138612 1%   1192059 9%  45676 0%    0 0%        9300029 71% 9872 0%     1319897 10%
wrcache     write       create      remove      rename      link        symlink
0 0%        805444 6%   43417 0%    44951 0%    3831 0%     4758 0%     1490 0%
mkdir       rmdir       readdir     statfs
2235 0%     1518 0%     51897 0%    107842 0%
Version 3: (11114810 calls)
null        getattr     setattr     lookup      access      readlink    read
141059 1%   3911728 35% 181185 1%   3395029 30% 1097018 9%  4777 0%     960503 8%
write       create      mkdir       symlink     mknod       remove      rmdir
763996 6%   159257 1%   3997 0%     10532 0%    26 0%       164698 1%   2251 0%
rename      link        readdir     readdirplus fsstat      fsinfo      pathconf
53303 0%    9500 0%     62022 0%    79512 0%    3442 0%     34275 0%    3023 0%
commit
73677 0%
Server nfs_acl:
Version 2: (1579 calls)
null        getacl      setacl      getattr     access
0 0%        3 0%        0 0%        1000 63%    576 36%
Version 3: (45318 calls)
null        getacl      setacl
0 0%        45318 100%  0 0%
The previous listing is an example of NFS server statistics. The Server rpc section relates to RPC, and the remaining sections report NFS activities. In both sets of statistics, knowing the typical number of calls and badcalls per week can help identify a problem. The badcalls value reports the number of bad messages from a client. This value can point out network hardware problems.
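As a rough illustration of tracking these counters over time, the following Python sketch parses a header line and its value line as they appear in the Server rpc and Server nfs sections, computes the fraction of bad calls, and subtracts one stored snapshot from the next. The function names are hypothetical and are not part of nfsstat. The sketch assumes that the counters were not reset with -z between snapshots.

```python
def parse_counters(header_line, value_line):
    """Pair counter names with integer values from two nfsstat lines."""
    names = header_line.split()
    values = [int(v) for v in value_line.split()]
    if len(names) != len(values):
        raise ValueError("header and value lines have different lengths")
    return dict(zip(names, values))


def bad_call_ratio(counters):
    """Return badcalls as a fraction of calls (0.0 if no calls were made)."""
    calls = counters["calls"]
    return counters["badcalls"] / calls if calls else 0.0


def counter_delta(previous, current):
    """Return the growth of each counter between two stored snapshots."""
    return {name: current[name] - previous[name]
            for name in current if name in previous}


# Values copied from the Server nfs lines of the listing above.
server_nfs = parse_counters("calls badcalls", "24234967 226")
print(f"bad call ratio: {bad_call_ratio(server_nfs):.6%}")
```

Comparing the result of counter_delta for successive weeks, rather than the raw totals, makes a sudden rise in badcalls easier to spot.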
Some of the operations generate write activity on the disks. A sudden increase in these statistics could indicate trouble and should be investigated. For NFS version 2 statistics, the operations to note are setattr, write, create, remove, rename, link, symlink, mkdir, and rmdir. For NFS version 3 statistics, the value to watch is commit. If the commit level is high on one NFS server compared to another almost identical server, check that the NFS clients have enough memory. The number of commit operations on the server grows when clients do not have available resources.
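One way to watch the write-generating operations is to compute their share of all version 2 calls. The following Python sketch does this for the version 2 lines of the listing above. The helper names are hypothetical, and the parser assumes that each value line alternates a count with a percentage, as in that listing.

```python
# Version 2 operations that the text identifies as generating disk writes.
WRITE_OPS_V2 = ("setattr", "write", "create", "remove", "rename",
                "link", "symlink", "mkdir", "rmdir")


def parse_op_counts(header_line, value_line):
    """Parse an operation-name line and its 'count percent%' value line."""
    names = header_line.split()
    tokens = value_line.split()
    counts = [int(tok) for tok in tokens[0::2]]  # skip the percentage tokens
    if len(names) != len(counts):
        raise ValueError("operation names and counts do not line up")
    return dict(zip(names, counts))


def write_share(op_counts, write_ops=WRITE_OPS_V2):
    """Return the fraction of all calls that are write-generating."""
    total = sum(op_counts.values())
    writes = sum(op_counts.get(op, 0) for op in write_ops)
    return writes / total if total else 0.0


# Version 2 lines copied from the nfsstat -s listing above.
V2_LINES = [
    ("null getattr setattr root lookup readlink read",
     "138612 1% 1192059 9% 45676 0% 0 0% 9300029 71% 9872 0% 1319897 10%"),
    ("wrcache write create remove rename link symlink",
     "0 0% 805444 6% 43417 0% 44951 0% 3831 0% 4758 0% 1490 0%"),
    ("mkdir rmdir readdir statfs",
     "2235 0% 1518 0% 51897 0% 107842 0%"),
]

v2_counts = {}
for header, values in V2_LINES:
    v2_counts.update(parse_op_counts(header, values))

print(f"write-generating share of v2 calls: {write_share(v2_counts):.1%}")
```

The same approach could be applied to the version 3 commit count by passing write_ops=("commit",) with the version 3 lines.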
The pstack command displays a stack trace for each process. The pstack command must be run by the owner of the process or by root. You can use pstack to determine where a process is hung. The only option that is allowed with this command is the PID of the process that you want to check (see the proc(1) man page).
The following example checks the nfsd process that is running.
# /usr/proc/bin/pstack 243
243:    /usr/lib/nfs/nfsd -a 16
 ef675c04 poll     (24d50, 2, ffffffff)
 000115dc ???????? (24000, 132c4, 276d8, 1329c, 276d8, 0)
 00011390 main     (3, efffff14, 0, 0, ffffffff, 400) + 3c8
 00010fb0 _start   (0, 0, 0, 0, 0, 0) + 5c
The example shows that the process is waiting for a new connection request. This is a normal response. If the stack shows that the process is still in poll after a request is made, the process might be hung. Follow the instructions in How to Restart NFS Services to fix this problem. Review the instructions in NFS Troubleshooting Procedures to fully verify that your problem is a hung program.
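The check described above, whether the process is still in poll, can be sketched in Python by extracting the first frame from saved pstack output. This is a minimal sketch that assumes the single-stack layout shown in the example (a hexadecimal address followed by a function name). The function name top_frame is hypothetical.

```python
import re

# A frame line starts with a hexadecimal address followed by a function name.
FRAME_RE = re.compile(r"^\s*([0-9a-fA-F]{8,16})\s+(\S+)")


def top_frame(pstack_output):
    """Return the function name of the first stack frame, or None."""
    for line in pstack_output.splitlines():
        match = FRAME_RE.match(line)
        if match:
            return match.group(2)
    return None


# Output copied from the pstack example above.
SAMPLE = """\
243:    /usr/lib/nfs/nfsd -a 16
 ef675c04 poll     (24d50, 2, ffffffff)
 000115dc ???????? (24000, 132c4, 276d8, 1329c, 276d8, 0)
 00011390 main     (3, efffff14, 0, 0, ffffffff, 400) + 3c8
 00010fb0 _start   (0, 0, 0, 0, 0, 0) + 5c
"""

print(top_frame(SAMPLE))
```

Capturing the output once before and once after a client request, and comparing the two top frames, gives a simple indication of whether the process responded.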
The rpcinfo command generates information about the RPC service that is running on a system. You can also use it to change the RPC service. Many options are available with this command (see the rpcinfo(1M) man page). This is a shortened synopsis for some of the options that you can use with the command:
rpcinfo [ -m | -s ] [ hostname ]
rpcinfo -T transport hostname [ progname ]
rpcinfo [ -t | -u ] [ hostname ] [ progname ]
-m           Displays a table of statistics of the rpcbind operations
-s           Displays a concise list of all registered RPC programs
-T           Displays information about services that use specific transports or protocols
-t           Probes the RPC programs that use TCP
-u           Probes the RPC programs that use UDP
transport    Selects the transport or protocol for the services
hostname     Selects the host name of the server you need information from
progname     Selects the RPC program to gather information about
If no value is given for hostname, the local host name is used. You can substitute the RPC program number for progname, but many users can remember the name and not the number. You can use the -p option in place of the -s option on those systems that do not run the NFS version 3 software.
The data that is generated by this command can include the following:
The RPC program number
The version number for a specific program
The transport protocol that is being used
The name of the RPC service
The owner of the RPC service
The following example gathers information on the RPC services that are running on a server. The text that is generated by the command is filtered by the sort command to make it more readable. Several lines that list RPC services have been deleted from the example.
% rpcinfo -s bee | sort -n
program    version(s) netid(s)                                   service     owner
100000     2,3,4      udp6,tcp6,udp,tcp,ticlts,ticotsord,ticots  rpcbind     superuser
100001     4,3,2      ticlts,udp,udp6                            rstatd      superuser
100002     3,2        ticots,ticotsord,tcp,tcp6,ticlts,udp,udp6  rusersd     superuser
100003     3,2        tcp,udp,tcp6,udp6                          nfs         superuser
100005     3,2,1      ticots,ticotsord,tcp,tcp6,ticlts,udp,udp6  mountd      superuser
100007     1,2,3      ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6  ypbind      superuser
100008     1          ticlts,udp,udp6                            walld       superuser
100011     1          ticlts,udp,udp6                            rquotad     superuser
100012     1          ticlts,udp,udp6                            sprayd      superuser
100021     4,3,2,1    tcp,udp,tcp6,udp6                          nlockmgr    superuser
100024     1          ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6  status      superuser
100029     3,2,1      ticots,ticotsord,ticlts                    keyserv     superuser
100068     5          tcp,udp                                    cmsd        superuser
100083     1          tcp,tcp6                                   ttdbserverd superuser
100099     3          ticotsord                                  autofs      superuser
100133     1          ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6  -           superuser
100134     1          ticotsord                                  tokenring   superuser
100155     1          ticots,ticotsord,tcp,tcp6                  smserverd   superuser
100221     1          tcp,tcp6                                   -           superuser
100227     3,2        tcp,udp,tcp6,udp6                          nfs_acl     superuser
100229     1          tcp,tcp6                                   metad       superuser
100230     1          tcp,tcp6                                   metamhd     superuser
100231     1          ticots,ticotsord,ticlts                    -           superuser
100232     10         udp,udp6                                   sadmind     superuser
100234     1          ticotsord                                  gssd        superuser
100235     1          tcp,tcp6                                   -           superuser
100242     1          tcp,tcp6                                   metamedd    superuser
100249     1          ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6  -           superuser
300326     4          tcp,tcp6                                   -           superuser
300598     1          ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6  -           superuser
390113     1          tcp                                        -           unknown
805306368  1          ticots,ticotsord,ticlts,tcp,udp,tcp6,udp6  -           superuser
1289637086 1,5        tcp                                        -           26069
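If the output of rpcinfo -s is saved to a file, the five columns (program number, versions, transports, service name, and owner) can be split into records for further checks. The following Python sketch shows one possible way to do this. The RpcService type and the parse_rpcinfo_s function are hypothetical names, and the parser assumes the whitespace-separated five-column layout of the listing above.

```python
from typing import List, NamedTuple


class RpcService(NamedTuple):
    program: int
    versions: List[int]
    netids: List[str]
    service: str
    owner: str


def parse_rpcinfo_s(text):
    """Parse 'rpcinfo -s' output into a list of RpcService records."""
    services = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a five-column row.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        program, versions, netids, service, owner = fields
        services.append(RpcService(
            program=int(program),
            versions=[int(v) for v in versions.split(",")],
            netids=netids.split(","),
            service=service,
            owner=owner,
        ))
    return services


# A few rows copied from the listing above.
SAMPLE = """\
program version(s) netid(s) service owner
100003 3,2 tcp,udp,tcp6,udp6 nfs superuser
100005 3,2,1 ticots,ticotsord,tcp,tcp6,ticlts,udp,udp6 mountd superuser
100021 4,3,2,1 tcp,udp,tcp6,udp6 nlockmgr superuser
"""

by_name = {svc.service: svc for svc in parse_rpcinfo_s(SAMPLE)}
print(by_name["nfs"].versions, by_name["nfs"].netids)
```

A lookup such as by_name.get("mountd") returning None would suggest that the service is not registered on the server.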
The following example shows how to gather information about a particular RPC service by selecting a particular transport on a server.
% rpcinfo -t bee mountd
program 100005 version 1 ready and waiting
program 100005 version 2 ready and waiting
program 100005 version 3 ready and waiting
% rpcinfo -u bee nfs
program 100003 version 2 ready and waiting
program 100003 version 3 ready and waiting
The first example checks the mountd service that is running over TCP. The second example checks the NFS service that is running over UDP.
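The probe output can also be checked programmatically. The following Python sketch collects the versions that reported "ready and waiting" for each program number. The function name ready_versions is hypothetical, and the pattern assumes the message wording shown in the example.

```python
import re

READY_RE = re.compile(r"program (\d+) version (\d+) ready and waiting")


def ready_versions(probe_output):
    """Map each program number to the list of versions that responded."""
    ready = {}
    for program, version in READY_RE.findall(probe_output):
        ready.setdefault(int(program), []).append(int(version))
    return ready


# Output copied from the mountd probe above.
MOUNTD_PROBE = """\
program 100005 version 1 ready and waiting
program 100005 version 2 ready and waiting
program 100005 version 3 ready and waiting
"""

print(ready_versions(MOUNTD_PROBE))
```

An empty result for a program that should be running is a hint to look at that service or at the chosen transport.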
The snoop command is often used to watch for packets on the network. The snoop command must be run as root. The use of this command is a good way to ensure that the network hardware is functioning on both the client and the server. Many options are available (see the snoop(1M) man page). A shortened synopsis of the command follows:
snoop [ -d device ] [ -o filename ] [ host hostname ]
-d device      Specifies the local network interface
-o filename    Stores all the captured packets into the named file
hostname       Displays packets going to and from a specific host only
The -d device option is useful on those servers that have multiple network interfaces. You can use many other expressions besides setting the host. A combination of command expressions with grep can often generate data that is specific enough to be useful.
When troubleshooting, make sure that packets are going to and from the proper host. Also, look for error messages. Saving the packets to a file can simplify the review of the data.
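For scripted captures, the synopsis above can be assembled into an argument list. The following Python sketch only builds the list and does not run snoop. The function name is hypothetical, and the interface name hme0 and the file path in the usage line are illustrative values that do not come from this document.

```python
def build_snoop_command(device=None, capture_file=None, host=None):
    """Build a snoop argument list from the options in the synopsis."""
    argv = ["snoop"]
    if device:
        argv += ["-d", device]        # local network interface
    if capture_file:
        argv += ["-o", capture_file]  # store captured packets in this file
    if host:
        argv += ["host", host]        # packets to and from this host only
    return argv


# Illustrative values: the interface and the file path are assumptions.
print(build_snoop_command(device="hme0", capture_file="/tmp/nfs.cap", host="bee"))
```

The resulting list could be passed to subprocess.run by a script that is running as root.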
You can use the truss command to check whether a process is hung. The truss command must be run by the owner of the process or by root. You can use many options with this command (see the truss(1) man page). A shortened syntax of the command follows:
truss [ -t syscall ] -p pid
-t syscall    Selects system calls to trace
-p pid        Indicates the PID of the process to be traced
The syscall can be a comma-separated list of system calls to be traced. Starting syscall with a ! excludes the listed system calls from the trace.
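The -t argument rules can be illustrated with a small Python helper that joins the system call names with commas and adds the leading ! when the list is meant as an exclusion. The function name is hypothetical, and the sketch only builds the argument list without running truss.

```python
def build_truss_command(pid, syscalls=None, exclude=False):
    """Build a truss argument list for tracing an existing process."""
    argv = ["truss"]
    if syscalls:
        spec = ",".join(syscalls)     # comma-separated list of system calls
        if exclude:
            spec = "!" + spec         # leading ! excludes the listed calls
        argv += ["-t", spec]
    argv += ["-p", str(pid)]
    return argv


print(build_truss_command(243, ["poll", "read"], exclude=True))
```

Because the result is an argument list and not a shell string, the ! character does not need to be quoted when the list is passed directly to a process-launching call.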
The following example shows that the process is waiting for another connection request from a new client.
# /usr/bin/truss -p 243
poll(0x00024D50, 2, -1) (sleeping...)
This is a normal response. If the response does not change after a new connection request has been made, the process could be hung. Follow the instructions in How to Restart NFS Services to fix the hung program. Review the instructions in NFS Troubleshooting Procedures to fully verify that your problem is a hung program.