Chapter 5. Troubleshooting

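This chapter describes how to identify and correct problems while running your MessageQ client applications. It covers determining the client version, identifying run-time and network errors, tracing PAMS API and Client Library activity, and recovering from client crashes.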

Determining the Version Number of the Client

To obtain technical support, you must know the version number of the MessageQ Client software that you are running. To determine the version of the MessageQ Client library, enable tracing of Client Library activity, run your application, and check the trace file dmqcldll.log for the version number.
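The check of the trace file can be scripted. The following is a minimal sketch in C, not part of the MessageQ Client. It assumes that the version line in dmqcldll.log contains the word "version" in some letter case, which this chapter does not confirm. The function names line_mentions_version and print_version_lines are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Returns 1 if the line contains "version" in any letter case.
 * ASSUMPTION: the trace file labels its version line with that word. */
static int line_mentions_version(const char *line)
{
    static const char word[] = "version";
    const size_t n = sizeof word - 1;

    assert(line != NULL);
    for (; *line != '\0'; line++) {
        size_t i = 0;
        while (i < n && line[i] != '\0' &&
               tolower((unsigned char)line[i]) == word[i])
            i++;
        if (i == n)
            return 1;
    }
    return 0;
}

/* Copies every matching line from in to out; returns the match count.
 * Lines longer than the buffer are examined in pieces. */
static int print_version_lines(FILE *in, FILE *out)
{
    char buf[1024];
    int matches = 0;

    assert(in != NULL && out != NULL);
    while (fgets(buf, sizeof buf, in) != NULL) {
        if (line_mentions_version(buf)) {
            fputs(buf, out);
            matches++;
        }
    }
    return matches;
}
```

An application or support script would open dmqcldll.log with fopen and pass the stream to print_version_lines. If no line matches, inspect the file manually, because the assumed wording may not apply to your release.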

Identifying Run-Time Errors

Problems at run time can arise from a variety of error conditions. To identify and solve problems with the MessageQ Client, you can use a variety of tools to track down the source of the problem. The topics that follow describe those tools and summarize how to find and resolve application problems.

Logging an Error Event

Run-time errors detected by the Client library are written to the dmqerror.log file in the default directory for the application. The errors indicate a run-time problem caused by a configuration error, an application error, a network problem, or an unexpected server response.

Error event logging can be either enabled or disabled by changing the MessageQ Client Configuration Logging option. When error event logging is disabled, the dmqerror.log file is not used and no information on error conditions is available. Refer to Configuring Tracing in Chapter 3 for more information about trace file settings.

Failing to Connect to the CLS

The MessageQ Client for UNIX attempts to establish a connection to the CLS in response to a call to pams_attach_q.

When the connection attempt fails, pams_attach_q returns the following error status:

PAMS__NETNOLINK -278

Check the file dmqerror.log for the full path of the configuration file (dmq.ini) used, the host name, and the endpoint of the server system with which the MessageQ Client attempted to connect. Refer to Identifying Run-Time Errors for additional information about the PAMS__NETNOLINK error.

Identifying Network Errors

Network errors result from the Client Library receiving an error when attempting to read or write on the network link. Occasional network connection problems can occur due to the state of the TCP/IP protocol stack or the network connection to the host system. Network errors are identified by the return status from the pams_attach_q function, such as the following:

PAMS__NETNOLINK -278

Network connection errors might also occur when executing any of the MessageQ API functions. For example, the pams_put_msg and pams_get_msg functions return the following status when the connection to the server is broken and MRS is not enabled:

PAMS__NETERROR -276

The specific steps for clearing the network error depend on how the problem developed. The following actions will generally clear the problem:

  1. Check the error event log file, dmqerror.log, for a description of the error event.

  2. Stop and restart the application. In some cases, restarting the application or simply retrying the attach operation succeeds.

  3. Stop and restart the CLS.
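The retry suggested in step 2 can be sketched in application code. The example below is illustrative only. It uses the two status values quoted in this chapter, while the callback type attach_fn, the stub fake_attach, and the placeholder DEMO_OK are hypothetical names introduced for the sketch. A real application would wrap its own pams_attach_q call, which takes more arguments, inside the callback.

```c
#include <assert.h>
#include <stdio.h>

/* Status values quoted in this chapter. */
#define PAMS__NETERROR  (-276)
#define PAMS__NETNOLINK (-278)

/* Placeholder success value used only by the stub below (assumption). */
#define DEMO_OK 1

/* Hypothetical callback: wraps the application's real attach call
 * and returns its status. */
typedef int (*attach_fn)(void *ctx);

/* Returns nonzero when status is one of the network errors above. */
static int is_network_error(int status)
{
    return status == PAMS__NETNOLINK || status == PAMS__NETERROR;
}

/* Calls attach up to max_attempts times while it fails with a network
 * error. Returns the last status seen. */
static int attach_with_retry(attach_fn attach, void *ctx, int max_attempts)
{
    int status = PAMS__NETNOLINK;

    assert(attach != NULL && max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        status = attach(ctx);
        if (!is_network_error(status))
            break;  /* success, or a failure that a retry will not fix */
        fprintf(stderr, "attach attempt %d failed with status %d; "
                        "check dmqerror.log\n", attempt, status);
        /* A real application would pause here before retrying. */
    }
    return status;
}

/* Stand-in for a real attach: fails *remaining times, then succeeds. */
static int fake_attach(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return PAMS__NETNOLINK;
    }
    return DEMO_OK;
}
```

Bounding the number of attempts keeps the application from looping forever when the CLS is down. If every attempt fails, continue with step 3.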

Decoding TCP/IP Error Codes

Refer to Table 5-1 for the common TCP/IP error status codes logged to the file dmqerror.log when using the TCP/IP transport.

Table 5-1 Common TCP/IP Connect Errors

| TCP/IP Error | Meaning | Recovery |
| --- | --- | --- |
| EINVAL | A socket initialization error. | Verify that the TCP/IP stack was installed correctly. |
| ECONNREFUSED | The connection request to the server is refused because a CLS is not listening on the endpoint. | Verify that the endpoint defined in the Client for UNIX server configuration matches the endpoint used by the CLS. Verify that the CLS is running. |
| ENOBUFS | The TCP/IP stack has run out of available network buffers. | Frequent attach/detach operations may result in connections that have not cleared out. Usually, they clear after a 60-90 second time-out. Increase the maximum number of TCP/IP sockets if needed. |
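The lookup in Table 5-1 can also be expressed as a small helper, for example in a diagnostic tool that reports errors read from dmqerror.log. The following sketch assumes a POSIX system where errno.h defines the three codes. The function names tcp_recovery_hint and format_tcp_error are hypothetical, and the hint strings paraphrase the Recovery column.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Maps a TCP/IP error code to a recovery hint paraphrased from Table 5-1.
 * Returns NULL for codes that the table does not cover. */
static const char *tcp_recovery_hint(int err)
{
    switch (err) {
    case EINVAL:
        return "Verify that the TCP/IP stack was installed correctly.";
    case ECONNREFUSED:
        return "Verify that the configured endpoint matches the CLS "
               "endpoint and that the CLS is running.";
    case ENOBUFS:
        return "Wait 60-90 seconds for old connections to clear, or "
               "increase the maximum number of TCP/IP sockets.";
    default:
        return NULL;
    }
}

/* Writes "<system error text>: <hint>" into buf.
 * Returns 1 if Table 5-1 has an entry for err, otherwise 0. */
static int format_tcp_error(int err, char *buf, size_t size)
{
    const char *hint = tcp_recovery_hint(err);

    assert(buf != NULL && size > 0);
    snprintf(buf, size, "%s: %s", strerror(err),
             hint != NULL ? hint : "no entry in Table 5-1");
    return hint != NULL;
}
```

The system error text comes from strerror and varies by platform; only the hint portion is tied to the table.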

Tracing PAMS API Activity

To obtain a time-stamped output file showing the sequence of MessageQ function calls and return status codes, follow these steps:

  1. Invoke the Configuration Utility.

  2. Choose Configure from the Main menu, then choose Tracing from the Configure menu.

  3. Set the Trace PAMS API Calls option to yes.

The information from the pams_ function call trace is written to the dmqcldll.log file in the default directory for the application. The PAMS tracing option can be used to observe the sequence of message function calls to determine the run-time behavior of the application.

Tracing Client Library Activity

To obtain detailed, time-stamped traces of the Client Library activity, follow these steps:

  1. Invoke the Configuration Utility.

  2. Choose Configure from the Main menu, then choose Tracing from the Configure menu.

  3. Set the Trace Client Library Activity option to yes.

The information from the library trace might be useful to debug connection problems between client library applications and the CLS. The library trace output is written to the log file, dmqcldll.log, in the default directory for the application. Be aware that the output from the tracing option can become very large over a long period of time.

A CLS server trace might be useful to get a detailed time-stamped trace of the client requests and MessageQ message operations performed by the CLS. For more information about trace output from the CLS, refer to the Installation and Configuration Guide for your MessageQ server platform.

Recovering from Client Crashes

Occasionally, applications crash (particularly during development) and do not have an opportunity to close or return resources in use before terminating. An application using the MessageQ Client that is attached to the message queuing bus and then crashes (or terminates) without calling pams_exit or pams_detach_q leaves many resources allocated but not available for reuse.

Resources that remain in use after a client application crash include the server system's open connection to the client and the CLS attachment to the primary queue used by the client. The network protocol keep-alive mechanism may take a lengthy time period to notify the server that the client has gone away. Typically, you can reboot the client system and the server still functions as though it has a connection open to the client.

Restarting the client application usually establishes a new connection to the CLS. If network connect errors occur, follow the troubleshooting procedure described in the Identifying Network Errors topic. The procedure releases and frees all resources used by the client.

If the client application calls pams_attach_q using either ATTACH_BY_NAME or ATTACH_BY_NUMBER to attach to a specific primary queue, the CLS detects a client reconnect attempt and automatically terminates the CLS instance (server process or thread) attached to the same message queue. Reconnecting to the same queue is only accepted if the client application is attempting to reconnect from the same host as the previous connection.

If the client application calls pams_attach_q using the ATTACH_TEMPORARY attach mode, a new instance of the CLS is started to support the client reconnect. The previous instance of the CLS remains active. For information about terminating CLS servers, see the CLS topic in the Installation and Configuration Guide for your MessageQ server platform.
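The reconnect behavior described in the last two paragraphs can be summarized as a decision function. The sketch below only models the text and is not CLS code. The enumerators are hypothetical stand-ins for the ATTACH_BY_NAME, ATTACH_BY_NUMBER, and ATTACH_TEMPORARY attach modes, whose real definitions come from the MessageQ header files.

```c
#include <assert.h>

/* Hypothetical stand-ins for the attach modes named in the text. */
enum attach_mode { MODE_BY_NAME, MODE_BY_NUMBER, MODE_TEMPORARY };

/* What the text says happens when a crashed client reconnects. */
enum reconnect_outcome {
    OLD_INSTANCE_TERMINATED,   /* CLS ends the instance on the same queue */
    RECONNECT_REJECTED,        /* same queue requested from another host */
    NEW_INSTANCE_OLD_REMAINS   /* temporary queue: old instance stays up */
};

/* same_host is nonzero when the reconnect comes from the host that held
 * the previous connection. */
static enum reconnect_outcome
reconnect_outcome_for(enum attach_mode mode, int same_host)
{
    assert(mode == MODE_BY_NAME || mode == MODE_BY_NUMBER ||
           mode == MODE_TEMPORARY);
    if (mode == MODE_TEMPORARY)
        return NEW_INSTANCE_OLD_REMAINS;
    return same_host ? OLD_INSTANCE_TERMINATED : RECONNECT_REJECTED;
}
```

The temporary case is the one that can accumulate stale CLS instances after repeated crashes, which is why the text points to the procedure for terminating CLS servers.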