Sun Cluster 3.0 Data Services Installation and Configuration Guide

NFS System Fault Monitor Process

The system fault monitor probes rpcbind, statd, lockd, nfsd, and mountd by checking that each process is present and that it responds to a null RPC call. This monitor uses the following NFS extension properties:

  Rpcbind_nullrpc_timeout    Lockd_nullrpc_timeout
  Nfsd_nullrpc_timeout       Rpcbind_nullrpc_reboot
  Mountd_nullrpc_timeout     Nfsd_nullrpc_restart
  Statd_nullrpc_timeout      Mountd_nullrpc_restart

For a description of these properties, see Chapter 7, Installing and Configuring Sun Cluster HA for Network File System (NFS).

Each system fault monitor probe cycle does the following:

  1. Sleeps for Cheap_probe_interval.

  2. Probes rpcbind.

    If the rpcbind process dies and Failover_mode=HARD, reboots the system.

    If a null RPC call fails and both Rpcbind_nullrpc_reboot=True and Failover_mode=HARD, reboots the system.

  3. Probes statd and lockd.

    If either of these daemons dies, restarts both.

    If a null RPC call fails, logs a message to syslog but does not restart the daemons.

  4. Probes nfsd and mountd.

    If either process dies, restarts it.

    If a null RPC call fails, restarts mountd if the PXFS device is available and the extension property Mountd_nullrpc_restart=True.
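The decisions made in one probe cycle can be sketched as a small function. This is an illustrative model only, not the monitor's actual code: the names ProbeResult and decide_actions are hypothetical, the actions are returned as plain strings instead of being carried out, and the sleep in step 1 is omitted. The sketch also assumes that step 4 covers nfsd and mountd, that a reboot ends the cycle, and that a null RPC result is only considered when the process is present.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProbeResult:
    alive: bool        # the daemon process is present
    null_rpc_ok: bool  # the null RPC call was answered within its timeout


def decide_actions(results: Dict[str, ProbeResult],
                   failover_mode: str,
                   rpcbind_nullrpc_reboot: bool,
                   mountd_nullrpc_restart: bool,
                   pxfs_available: bool) -> List[str]:
    """Return the actions one probe cycle would take (hypothetical helper)."""
    actions: List[str] = []

    # Step 2: rpcbind. A reboot is assumed to end the cycle.
    rpcbind = results["rpcbind"]
    if not rpcbind.alive:
        if failover_mode == "HARD":
            return ["reboot"]
    elif not rpcbind.null_rpc_ok:
        if rpcbind_nullrpc_reboot and failover_mode == "HARD":
            return ["reboot"]

    # Step 3: statd and lockd. If either has died, both are restarted;
    # a failed null RPC call is only logged.
    pair = ("statd", "lockd")
    if any(not results[d].alive for d in pair):
        actions += [f"restart {d}" for d in pair]
    else:
        actions += [f"log {d} null RPC failure"
                    for d in pair if not results[d].null_rpc_ok]

    # Step 4: nfsd and mountd. A dead process is restarted.
    for d in ("nfsd", "mountd"):
        if not results[d].alive:
            actions.append(f"restart {d}")

    # A failed mountd null RPC call triggers a restart only when the PXFS
    # device is available and Mountd_nullrpc_restart=True.
    mountd = results["mountd"]
    if (mountd.alive and not mountd.null_rpc_ok
            and pxfs_available and mountd_nullrpc_restart):
        actions.append("restart mountd")

    return actions
```

For example, with every daemon healthy the function returns an empty list, and with statd missing it returns restart actions for both statd and lockd.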

If any of the NFS daemons fails to restart, the status of all online NFS resources is set to FAULTED. When all NFS daemons are restarted and healthy, the resource status is set to ONLINE again.
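The status rule above can be modeled in a few lines. This is a minimal sketch under stated assumptions: the function name nfs_resource_status and the dictionary inputs are hypothetical and are not part of the Sun Cluster interfaces; the sketch only shows that a single failed restart marks every online NFS resource as FAULTED.

```python
from typing import Dict, List


def nfs_resource_status(online_resources: List[str],
                        daemon_healthy: Dict[str, bool]) -> Dict[str, str]:
    """Map each online NFS resource to its status (hypothetical helper).

    daemon_healthy maps a daemon name to True when the daemon is running
    and healthy, and to False when it failed to restart.
    """
    # One unhealthy daemon is enough to fault all online NFS resources.
    status = "ONLINE" if all(daemon_healthy.values()) else "FAULTED"
    return {name: status for name in online_resources}
```

Calling the function again after all daemons report healthy returns ONLINE for every resource, which mirrors the recovery described above.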