Sun Cluster 2.2 Software Installation Guide

Fault Monitoring Behavior

Netscape Messaging Server includes five separate daemons: smtpd, popd, imapd, mshttpd, and stored. These daemons can be stopped and started individually, and can fail individually.

Sun Cluster HA for Netscape fault monitoring checks that daemon processes exist and that protocol services are available. During process existence checking, the fault probe periodically verifies that each daemon process exists. The fault probe interprets the absence of any daemon as an application failure, and takes action based on the current configuration parameters. During protocol probing, the fault probe periodically checks the daemon's protocol service and takes action only in response to error codes indicating a timeout. The default timeout value set by Sun Cluster HA for Netscape is 660 seconds, which prevents inadvertent failovers in situations where a server is simply slow to respond.
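The protocol-probing rule described above (act only on a timeout) can be illustrated with a short Python sketch. This is not the Sun Cluster HA for Netscape probe itself. The function names probe_banner and should_take_action, the banner read, and the three result labels are assumptions made for this example; only the 660-second default comes from the text.

```python
import socket

# Default timeout stated in this section; everything else here is illustrative.
DEFAULT_TIMEOUT_SECONDS = 660


def probe_banner(host, port, timeout=DEFAULT_TIMEOUT_SECONDS):
    """Run one hypothetical protocol probe and classify the outcome.

    Returns "ok" if the server sent any data, "timeout" if the connection
    or the read timed out, and "error" for any other failure.
    """
    try:
        # create_connection applies the timeout to the connect and to later reads.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            banner = conn.recv(512)
            return "ok" if banner else "error"
    except socket.timeout:
        return "timeout"
    except OSError:
        # For example, connection refused or host unreachable.
        return "error"


def should_take_action(result):
    """Mirror the rule in the text: only a timeout triggers action."""
    return result == "timeout"
```

For example, should_take_action(probe_banner("localhost", 25)) would be true only if the SMTP port accepted the connection attempt or read without answering before the timeout expired. The long default makes a slow but working server unlikely to be classified as failed.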

Because this fault monitoring model relies on a fully active mail server, you must always turn off the Sun Cluster HA for Netscape data service (using hareg -n) before you perform any administrative task that requires a daemon to stop. Otherwise, the fault probe interprets the stopped daemon as an application failure and takes action. Turn on the data service (using hareg -y) only after you complete the administrative task.
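The off, administer, on sequence above can be sketched as a small Python wrapper. The helper name maintenance_window and the injectable run parameter are hypothetical. The sketch also assumes that hareg takes the data service name as the argument after -n or -y, and that the name registered for this data service is nsmail; confirm both against your own configuration before use.

```python
import subprocess
from contextlib import contextmanager


def _run_command(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


@contextmanager
def maintenance_window(service, run=_run_command):
    """Turn the data service off, yield for the admin task, then turn it on.

    The service is turned back on only if the task completes without an
    exception, matching "only after completing the administrative task".
    """
    run(["hareg", "-n", service])  # turn off fault monitoring's data service
    yield
    # Reached only when the with-block finished normally.
    run(["hareg", "-y", service])  # turn the data service back on
```

A hypothetical use would be to wrap the daemon stop and restart in `with maintenance_window("nsmail"):`. If the task fails partway, the data service stays off, so an administrator can inspect the mail server before re-enabling monitoring.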

Sun Cluster HA for Netscape monitors the smtpd, popd, and imapd daemons with both process existence checking (using the local probe) and protocol probing (using both local and remote probes). Sun Cluster HA for Netscape monitors the mshttpd and stored daemons with only process existence checking (using the local probe). The mshttpd and stored daemons are never checked by a remote probe. Therefore, if an mshttpd process exists but is stalled, Sun Cluster HA for Netscape will take no action; once you notice that web mail clients are unable to connect, you must restart the mshttpd process manually.
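The monitoring coverage described above can be summarized as a small lookup table. The following Python sketch is only a model of the behavior in this section: the names CHECKS and fault_detected are assumptions for illustration, and the real probes are not implemented this way as far as this guide states.

```python
# Which checks apply to each daemon, as described in this section.
CHECKS = {
    "smtpd": ("process", "protocol"),
    "popd": ("process", "protocol"),
    "imapd": ("process", "protocol"),
    "mshttpd": ("process",),
    "stored": ("process",),
}


def fault_detected(daemon, process_exists, responds):
    """Return True if the modeled fault monitoring would notice a problem.

    process_exists: whether the daemon process is present.
    responds: whether a protocol probe would get an answer before the timeout.
    """
    checks = CHECKS[daemon]
    if "process" in checks and not process_exists:
        return True
    if "protocol" in checks and process_exists and not responds:
        return True
    return False
```

In this model, fault_detected("mshttpd", True, False) is False, which corresponds to the stalled mshttpd case that requires a manual restart, while fault_detected("imapd", True, False) is True.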

Sun Cluster HA for Netscape does not monitor any SNMP subagents.