CHAPTER 6

Debugging Applications in the Foundation Services

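This chapter describes how to report and check errors caused by applications and how to debug applications. It contains the following sections: Reporting Application Errors, Reading Error Information for Debugging, Stopping the Daemon Monitor for Debugging, and Broken Pipe Error Messages.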

For debugging purposes, configure remote IP access to all nodes in the cluster. For more information, see “Cluster Addressing and Networking” in the Netra High Availability Suite 3.0 1/08 Foundation Services Overview.

You can use standard Solaris Operating System commands in the Foundation Services environment. To debug applications that interact with the Foundation Services nodes, use the debugging software provided with the Sun Studio 10 software.

Reporting Application Errors

Configure applications to report errors and their causes. This information can be used during troubleshooting to reduce the risk that similar errors recur. To facilitate recovery from an error, you can provide the following information:

For a list of the SA Forum/CLM API error codes, refer to the SA Forum documentation.
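As an illustration of reporting an error together with its cause, the following minimal Python sketch builds a one-line report and sends it to the system log. The helper names, the field names (component, code, cause), and the sample values are hypothetical and are not part of the Foundation Services or SA Forum APIs. An application would substitute whatever information it has chosen to report.

```python
import syslog


def format_error_report(component, code, cause):
    """Build a one-line error report. Field names are illustrative assumptions."""
    # Collapse newlines and repeated spaces so each report stays on one log line.
    cause = " ".join(str(cause).split())
    return f"{component}: error={code} cause={cause}"


def report_error(component, code, cause, priority=syslog.LOG_ERR):
    """Send the report to the system log and return the text that was logged."""
    message = format_error_report(component, code, cause)
    syslog.syslog(priority, message)
    return message
```

For example, `report_error("myapp", "E_TIMEOUT", "no reply from peer")` logs a single line that names the failing component, a hypothetical error code, and the cause.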

Reading Error Information for Debugging

In the Foundation Services, standard error and alert messages are sent to system log files. In error scenarios, you can refer to the system log files to determine the history of a process. Critical errors are written on the console in addition to being logged in the system log files.
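To show how the history of a process might be recovered from a system log file, the following Python sketch filters log lines by the tag that a process logs under. It assumes the traditional syslog layout of timestamp, host name, and tag[pid]; the function name and the sample tag are hypothetical, and the actual layout and file location depend on how system logging is configured on your nodes.

```python
import re

# Matches the traditional syslog layout "Mon DD HH:MM:SS host tag[pid]: text".
# Assumption: log lines follow this layout; adjust the pattern if they do not.
LINE_RE = re.compile(
    r"^(?P<time>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<tag>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s*(?P<text>.*)$"
)


def process_history(lines, tag):
    """Return (time, pid, text) tuples for every log line written under `tag`."""
    history = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("tag") == tag:
            history.append(
                (match.group("time"), match.group("pid"), match.group("text"))
            )
    return history
```

Passing the lines of a log file and a tag such as `"myapp"` returns that process's messages in the order they were logged.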

Errors can cause notifications to be sent, but notifications are events and are not errors in themselves. For information on notifications, see Introduction to Change Notifications.

The NMA enables you to receive information on notifications. Statistics are available to diagnose the cause of errors received. See the Netra High Availability Suite 3.0 1/08 Foundation Services NMA Programming Guide.

Note - The NMA is supported only on the Solaris OS. It is not available on the Linux platform.

For information about using and configuring system log files, see the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide.

Stopping the Daemon Monitor for Debugging

You cannot debug critical services, such as the CMM or Reliable NFS, on a running cluster. Debugging would interrupt the regular messages that these services send between nodes. Debugging tools, such as the truss command, cannot be used on daemons while they are being monitored by the Daemon Monitor.

Before debugging a Foundation Services daemon or a monitored Solaris daemon, stop the Daemon Monitor from monitoring the daemon that you want to debug. When you have finished debugging, restart the Daemon Monitor.

For information about how to stop and restart the Daemon Monitor, see the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide. For a list of monitored daemons, see the nhpmd(1M) (Solaris) or nhpmd(8) (Linux) man page.

Broken Pipe Error Messages

If one of the applications you are running on your cluster terminates suddenly, the CMM notification pipes that this application opened are kept on the nhcmmd side. You can be left with a broken pipe from the CMM to the dead application. If the CMM later sends a notification to this dead application, the CMM realizes that the application is dead and closes the broken pipe. Alternatively, the CMM frequently checks whether a client application is dead and, if necessary, closes the associated pipes.
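The general mechanism by which a writer discovers that its reader has died can be sketched in Python as follows. This is a generic POSIX illustration that uses an anonymous pipe and hypothetical function names. It is not the CMM implementation and makes no claim about how nhcmmd manages its pipes internally.

```python
import errno
import os


def notify(write_fd, payload):
    """Try to write a notification; close the pipe if the reader has gone away.

    Returns True if the write succeeded, False if the pipe was broken.
    """
    try:
        os.write(write_fd, payload)
        return True
    except OSError as exc:
        if exc.errno != errno.EPIPE:
            raise
        # The reading side no longer exists, so release the descriptor.
        os.close(write_fd)
        return False


def demo():
    # Python ignores SIGPIPE by default, so a broken pipe surfaces as an
    # OSError with errno EPIPE instead of terminating the process.
    read_fd, write_fd = os.pipe()
    first = notify(write_fd, b"membership change\n")   # reader still open
    os.close(read_fd)                                   # simulate the reader dying
    second = notify(write_fd, b"membership change\n")  # write now fails with EPIPE
    return first, second
```

In this sketch the writer learns about the dead reader only when it next attempts to write, which corresponds to the first of the two detection paths described above.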

If many of your applications die suddenly, without notifying the CMM, the following can happen:

If one of your applications has died suddenly, you receive a system log message such as this:

Dec 23 09:56:07 machine_name CMM[839]: S-CMM notif to /var/run/CMM_884_00000005 fails: Broken pipe

The CMM detects the problem and closes the notification pipe. For further information on accessing system log files, see “Accessing and Maintaining System Log Messages” in the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide and the syslog.conf(4) man page.
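If you need to find out which notification pipes were reported as broken, a short script can scan the system log for this message. The following Python sketch derives its pattern solely from the sample message in this section, so the exact wording is an assumption that might not hold for other releases, and the function name is hypothetical.

```python
import re

# Pattern derived from the sample message in this section only.
BROKEN_PIPE_RE = re.compile(
    r"CMM\[(?P<pid>\d+)\]:\s*S-CMM\s+notif to\s+(?P<pipe>\S+)\s+fails: Broken pipe"
)


def find_broken_pipes(log_text):
    """Return (pid_in_log_tag, pipe_path) pairs for each broken-pipe message."""
    # Normalize whitespace so a message wrapped across two lines still matches.
    flattened = " ".join(log_text.split())
    return [
        (match.group("pid"), match.group("pipe"))
        for match in BROKEN_PIPE_RE.finditer(flattened)
    ]
```

Each returned pair gives the process ID shown in the log tag and the path of the pipe that was closed.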