CHAPTER 6
For debugging purposes, configure remote IP access to all nodes in the cluster. For more information, see “Cluster Addressing and Networking” in the Netra High Availability Suite 3.0 1/08 Foundation Services Overview.
You can use standard Solaris Operating System commands in the Foundation Services environment. To debug applications that interact with the Foundation Services nodes, use the debugging software provided with the Sun Studio 10 software.
Configure applications to report errors and their causes. This information can be used during troubleshooting to reduce the risk that similar errors recur. To facilitate recovery from an error, you can provide the following information:
In the Foundation Services, standard error and alert messages are sent to system log files. In error scenarios, you can refer to the system log files to determine the history of a process. Critical errors are written to the console in addition to being logged in the system log files.
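The following is a minimal sketch of how an application might report an error together with its cause, so that the log alone is enough to start troubleshooting. It is an illustration only, not part of the Foundation Services API. The logger name `example_app`, the message layout, and the helper names `make_error_logger` and `report_error` are assumptions made for this example.

```python
import io
import logging


def make_error_logger(stream):
    """Return a logger that writes 'name: LEVEL: message' lines to stream."""
    logger = logging.getLogger("example_app")  # hypothetical application name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output confined to our handler
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger


def report_error(logger, what, cause, critical=False):
    """Log what failed and why; critical errors get a higher severity."""
    level = logging.CRITICAL if critical else logging.ERROR
    logger.log(level, "%s (cause: %s)", what, cause)


if __name__ == "__main__":
    buffer = io.StringIO()
    log = make_error_logger(buffer)
    try:
        open("/nonexistent/config")  # hypothetical path, expected to fail
    except OSError as exc:
        report_error(log, "cannot read configuration", exc.strerror)
    print(buffer.getvalue(), end="")
```

In a deployed application, the stream handler could be replaced by a handler that forwards to the system logger, for example `logging.handlers.SysLogHandler` where it is available, so that the messages reach the system log files.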
Errors can cause notifications to be sent, but notifications are events, not errors in themselves. For information on notifications, see Introduction to Change Notifications.
The NMA enables you to receive information on notifications. Statistics are available to diagnose the cause of errors received. See the Netra High Availability Suite 3.0 1/08 Foundation Services NMA Programming Guide.
You cannot debug critical services, such as the CMM or Reliable NFS, on a running cluster. Debugging would interrupt the regular messages that these services send between nodes. Debugging tools, such as the truss command, cannot be used on daemons while they are being monitored by the Daemon Monitor.
Before debugging a Foundation Services daemon or a monitored Solaris daemon, stop the Daemon Monitor from monitoring the daemon that you want to debug. When you have finished debugging, restart the Daemon Monitor.
For information about how to stop and restart the Daemon Monitor, see the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide. For a list of monitored daemons, see the nhpmd(1M) (Solaris) or nhpmd(8) (Linux) man page.
If one of the applications running on your cluster terminates suddenly, the CMM notification pipes that the application opened remain open on the nhcmmd side. You can be left with a broken pipe from the CMM to the dead application. If the CMM later sends a notification to the dead application, the CMM detects that the application is dead and closes the broken pipe. Alternatively, the CMM frequently checks whether a client application is dead and, if necessary, closes the associated pipes.
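The broken-pipe condition described above can be reproduced in general terms on any POSIX system. The sketch below is not CMM code. It uses an anonymous pipe as a stand-in for a notification pipe and a hypothetical helper named `write_to_dead_reader`. It shows that a writer only learns that the reader is gone when it next tries to write, at which point the write fails with `EPIPE` and the writer can close its end.

```python
import errno
import os


def write_to_dead_reader():
    """Write to a pipe whose read end is closed; return the errno observed."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the reading application terminating
    try:
        os.write(write_fd, b"notification")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the failure surfaces
        # as an exception carrying errno EPIPE.
        return exc.errno
    finally:
        os.close(write_fd)  # the writer releases its end of the pipe
    return None


if __name__ == "__main__":
    observed = write_to_dead_reader()
    print("write failed with EPIPE:", observed == errno.EPIPE)
```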
In this situation, a message similar to the following is written to the system log files:
# Dec 23 09:56:07 machine_name CMM: S-CMM notif to /var/run/CMM_884_00000005 fails: Broken pipe
The CMM detects the problem and closes the notification pipe. For further information on accessing system log files, see “Accessing and Maintaining System Log Messages” in the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide and the syslog.conf(4) man page.
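When reviewing system log files after an application has terminated unexpectedly, it can help to extract the broken-pipe messages automatically. The following sketch assumes that such messages follow the layout of the single sample shown above. The actual format may differ between releases, and the pattern and the function name `find_broken_pipes` are illustrative only.

```python
import re

# Pattern modeled on the one sample message in this chapter.
# An optional leading "#" is tolerated, as in the sample.
BROKEN_PIPE_RE = re.compile(
    r"^(?:#\s*)?(?P<timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\sCMM:\s.*notif to (?P<pipe>\S+) fails: Broken pipe\s*$"
)


def find_broken_pipes(lines):
    """Return (timestamp, host, pipe_path) for each broken-pipe message."""
    hits = []
    for line in lines:
        match = BROKEN_PIPE_RE.match(line.strip())
        if match:
            hits.append((match["timestamp"], match["host"], match["pipe"]))
    return hits


if __name__ == "__main__":
    sample = [
        "Dec 23 09:56:07 machine_name CMM: S-CMM notif to "
        "/var/run/CMM_884_00000005 fails: Broken pipe",
    ]
    for timestamp, host, pipe in find_broken_pipes(sample):
        print(timestamp, host, pipe)
```

Each reported pipe path identifies a notification pipe that the CMM found broken, which can point to the time at which a client application died.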