
Tuning the HA for NFS Fault Monitor

The HA for NFS fault monitor is contained in a resource whose resource type is SUNW.nfs.
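To confirm which resources on the cluster use this resource type, you can list them with the clresource command. The resource name nfs-rs that appears in the examples in this section is a placeholder for your own HA for NFS resource.

    # clresource list -t SUNW.nfs
    nfs-rs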

For general information about the operation of fault monitors, see Tuning Fault Monitors for Oracle Solaris Cluster Data Services in Oracle Solaris Cluster Data Services Planning and Administration Guide.

Fault Monitor Startup

The NFS resource MONITOR_START method starts the NFS system fault monitor. This start method first checks whether the NFS system fault monitor, nfs_daemons_probe, is already running under the process monitor daemon rpc.pmfd. If the NFS system fault monitor is not running, the start method starts the nfs_daemons_probe process under the control of the process monitor. The start method then starts the resource fault monitor, nfs_probe, also under the control of the process monitor.
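A quick way to verify that both monitor processes are running on the local node is to search the process list for the process names described above.

    # pgrep -fl nfs_daemons_probe
    # pgrep -fl nfs_probe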

Fault Monitor Stop

The NFS resource MONITOR_STOP method stops the resource fault monitor. If no other NFS resource fault monitor is running on the local node, the stop method stops the NFS system fault monitor.
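To stop and restart resource monitoring manually, for example before planned maintenance, you can use the unmonitor and monitor subcommands of clresource. The resource name nfs-rs is a placeholder.

    # clresource unmonitor nfs-rs
    # clresource monitor nfs-rs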

Operations of HA for NFS Fault Monitor During a Probe

This section describes the operations of two fault monitoring processes: the NFS system fault monitoring process and the NFS resource fault monitoring process.

NFS System Fault Monitoring Process

The NFS system fault monitor probe monitors the NFS daemons nfsd, mountd, statd, and lockd, and the RPC portmapper service daemon rpcbind on the local node. The probe checks for the presence of each process and its response to a null rpc call. This monitor uses the NFS extension properties that control null rpc behavior, such as Rpcbind_nullrpc_reboot, Mountd_nullrpc_restart, and Nfsd_nullrpc_restart. See Setting HA for NFS Extension Properties.
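For example, a minimal sketch of setting two of these extension properties on a hypothetical resource named nfs-rs:

    # clresource set -p Rpcbind_nullrpc_reboot=True nfs-rs
    # clresource set -p Mountd_nullrpc_restart=True nfs-rs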

Each NFS system fault monitor probe cycle performs the following steps in a loop. The system property Cheap_probe_interval specifies the interval between probes.

  1. The fault monitor probes rpcbind.

    If the process terminates unexpectedly but a warm restart of the daemon is in progress, the fault monitor continues to probe the other daemons.

    If the process terminates unexpectedly and no warm restart is in progress, the fault monitor reboots the node.

    If a null rpc call to the daemon fails, Rpcbind_nullrpc_reboot=True, and Failover_mode=HARD, the fault monitor reboots the node.

  2. The fault monitor probes statd first, and then lockd.

    If statd or lockd terminates unexpectedly, the system fault monitor attempts to restart both daemons.

    If a null rpc call to these daemons fails, the fault monitor logs a message to syslog but does not restart statd or lockd.

  3. The fault monitor probes mountd.

    If mountd terminates unexpectedly, the fault monitor attempts to restart the daemon.

    If the null rpc call to the daemon fails and Mountd_nullrpc_restart=True, the fault monitor attempts to restart mountd if the cluster file system is available.

  4. The fault monitor probes nfsd.

    If nfsd terminates unexpectedly, the fault monitor attempts to restart the daemon.

    If the null rpc call to the daemon fails and Nfsd_nullrpc_restart=True, the fault monitor attempts to restart nfsd if the cluster file system is available.

  5. If any one of the above NFS daemons (except rpcbind) fails to restart during a probe cycle, the NFS system fault monitor retries the restart in the next cycle. When all the NFS daemons are restarted and healthy, the resource status is set to ONLINE. The monitor tracks unexpected terminations of NFS daemons within the interval that the Retry_interval property specifies. When the total number of unexpected daemon terminations reaches the value of Retry_count, the system fault monitor issues a scha_control giveover. If the giveover call fails, the monitor attempts to restart the failed NFS daemon. A tuning example for these properties follows this list.

  6. At the end of each probe cycle, if all daemons are healthy, the monitor clears the history of failures.
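The following sketch shows how the probe interval and retry behavior described above might be tuned with the standard system properties Cheap_probe_interval, Retry_count, and Retry_interval. The values shown and the resource name nfs-rs are illustrative assumptions, not recommended settings.

    # clresource set -p Cheap_probe_interval=30 nfs-rs
    # clresource set -p Retry_count=4 -p Retry_interval=600 nfs-rs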

NFS Resource Fault Monitoring Process

NFS resource fault monitoring is specific to each NFS resource. The fault monitor of each resource checks the status of each shared path to monitor the file systems that the resource exports.
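Each shared path corresponds to a share command entry in the resource's dfstab.resource file. A minimal illustrative entry, assuming a hypothetical exported file system /global/nfs/export, might look like this:

    share -F nfs -o rw -d "HA-NFS export" /global/nfs/export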

Before the NFS resource fault monitor probes start, all the shared paths are read from the dfstab.resource file and stored in memory. In each probe cycle, the probe performs the following steps.

  1. If the dfstab.resource file has changed since the last read, the probe refreshes the in-memory list of shared paths.

    If an error occurs while reading the dfstab file, the resource status is set to FAULTED, and the monitor skips the remainder of the checks in the current probe cycle.

  2. In each iteration, the fault monitor probes all the shared paths by performing stat() on each path.

    If any path is not functional, the resource status is set to FAULTED.

  3. The probe checks for the presence of NFS daemons (nfsd, mountd, lockd, statd) and rpcbind.

  4. If any of these daemons are down, the resource status is set to FAULTED.

  5. If all shared paths are valid and NFS daemons are present, the resource status is reset to ONLINE.
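You can observe the resulting resource status transitions with the status subcommand of clresource. The resource name nfs-rs is a placeholder.

    # clresource status nfs-rs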

Monitoring of File Sharing

The Oracle Solaris Cluster HA for NFS fault monitor probe monitors the success or failure of file sharing by monitoring the files that define the exported paths, including the dfstab.resource file.

If the probe detects any modification to any of these files, it shares the paths in dfstab.resource again.
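To verify that the paths were shared again after such a modification, you can list the currently shared NFS file systems on the node that currently masters the resource.

    # share -F nfs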