Chapter 5. Troubleshooting

This chapter describes how to identify and correct problems while running your application. It includes the following topics:


Identifying Run-time Errors

Problems at runtime can arise from a variety of error conditions. To identify and solve problems with the MessageQ Client, you can use several tools to track down the source of the problem. The following list provides some ideas to help you troubleshoot problems:


Logging an Error Event

Run-time errors detected by the Client Dynamic Link Library (DLL) are written to the file dmqerror.log in the default directory for the application. Each logged error indicates a run-time problem caused by a configuration error, an application error, a network problem, or an unexpected server response. Additional information can be found using the DMQCL_TRACE environment variable described in the Tracing Client DLL Activity topic.

You can enable or disable error event logging by changing the MessageQ Client Configuration Logging option. When error event logging is disabled, the dmqerror.log file is not used.


Failing to Connect to the CLS

The MessageQ Client attempts to establish a connection to the CLS in response to a call to pams_attach_q. The attempt to establish a connection fails immediately when:

When the connection attempt fails, pams_attach_q returns the PAMS__NETNOLINK -278 error status.

Check the file dmqerror.log for the host name and the endpoint of the server system with which the MessageQ Client attempted to connect. Either start the CLS on the server system or reconfigure the MessageQ Client default server.


Identifying Network Errors

Network errors result when the DLL receives an error while attempting to read from or write to the network link. Occasional network connection problems can occur due to the state of the Windows TCP/IP protocol stack or the network connection to the host system. Network errors are identified by the return status from the pams_attach_q function, such as the following status value:

PAMS__NETNOLINK -278

Network connection errors might also occur when attempting to execute any of the MessageQ API functions. For example, the pams_put_msg and pams_get_msg functions return the following status value when the connection to the server is broken and MRS is not enabled:

PAMS__NETERROR -276

The specific steps for clearing the network error depend on how the problem developed. The following actions will generally clear the problem:

  1. Check the error event log file, dmqerror.log, for a description of the error event.

  2. Stop and restart the Windows application. In some cases, restarting the application or simply retrying the attach operation succeeds.

  3. Exit the Windows Program Manager to shut down Windows and restart Windows (not a complete reboot) to unload and reload any DLLs that need to be reset.

  4. Reboot the PC to completely reset the network drivers.

  5. Stop and restart the CLS.
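
Step 2 notes that simply retrying the attach operation sometimes succeeds. The following Python sketch shows one way such a retry loop might look. It is illustrative only: `attach` is a hypothetical zero-argument callable that wraps pams_attach_q and returns its integer status, and the attempt count and delay are arbitrary assumptions, not documented values. Only the two status values quoted in this chapter are treated as retryable.

```python
import time

# Status values quoted in this chapter; the names mirror the PAMS symbols.
PAMS__NETNOLINK = -278
PAMS__NETERROR = -276
RETRYABLE = {PAMS__NETNOLINK, PAMS__NETERROR}


def attach_with_retry(attach, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Call attach() until it returns a non-network status or attempts run out.

    attach is a hypothetical callable wrapping pams_attach_q; it must return
    the integer status. Returns a tuple (last_status, attempts_used).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    status = None
    for attempt in range(1, attempts + 1):
        status = attach()
        if status not in RETRYABLE:
            # Any status other than a network error ends the loop, whether
            # it reports success or a different kind of failure.
            return status, attempt
        if attempt < attempts:
            sleep(delay_seconds)  # give the network stack time to recover
    return status, attempts
```

If the loop exhausts its attempts, the remaining steps in the list above (restarting Windows, rebooting the PC, restarting the CLS) still apply.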


Decoding TCP/IP Error Codes

Refer to Table 5-1 for the common TCP/IP error status codes logged to the file dmqerror.log when using the TCP/IP transport.
Table 5-1 Common TCP/IP Connect Errors

TCP/IP Error | Meaning | Recovery
EINVAL | A Windows Sockets initialization error when the Winsock DLL cannot initialize | Verify that the installation of the TCP/IP stack is correct.
ECONNREFUSED | Connection request to the server is refused because a CLS is not listening on the endpoint. | Verify that the endpoint defined in the Client for Windows Server configuration matches the endpoint used by the CLS. Verify that the CLS is running.
ENOBUFS | The TCP/IP stack has run out of available network buffers. | Frequent attach/detach operations may result in connections that have not cleared out. Usually, they clear after a 60-90 second timeout. Increase the maximum number of TCP/IP sockets if needed.

Most TCP/IP stacks for Windows provide a utility program called NETSTAT to view the status of the network connections on the PC. Check your TCP/IP documentation for more information.
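
As a rough aid for reading dmqerror.log, the Python sketch below looks for the three error names from Table 5-1 and pairs each hit with a condensed version of the recovery advice. The layout of the log lines is not described in this chapter, so the sketch assumes only that the error name appears somewhere in the line; the function names are invented for this example.

```python
# Recovery advice condensed from Table 5-1.
RECOVERY_HINTS = {
    "EINVAL": "Verify that the installation of the TCP/IP stack is correct.",
    "ECONNREFUSED": "Verify that the configured endpoint matches the CLS "
                    "endpoint and that the CLS is running.",
    "ENOBUFS": "Wait for stale connections to clear, or increase the "
               "maximum number of TCP/IP sockets.",
}


def find_tcpip_errors(log_lines):
    """Return (line_number, error_name, hint) for each known name found.

    Assumption: the error name appears as plain text in the log line.
    """
    findings = []
    for number, line in enumerate(log_lines, start=1):
        for name, hint in RECOVERY_HINTS.items():
            if name in line:
                findings.append((number, name, hint))
    return findings


def scan_error_log(path="dmqerror.log"):
    """Read the error log at path and report any Table 5-1 errors in it."""
    with open(path, "r", errors="replace") as handle:
        return find_tcpip_errors(handle)
```

Calling `scan_error_log()` from the application's default directory returns an empty list when none of the three names is present.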


Tracing PAMS API Activity

To obtain a time-stamped output file showing the sequence of MessageQ function calls and return status codes, follow these steps:

  1. Invoke the Configuration Editor.

  2. Click on the Tracing button.

  3. Check the Trace PAMS API calls option.

The information from the MessageQ API function call trace is written to the log file dmqcldll.log in the default directory for the application. The PAMS tracing option can be useful for observing the sequence of message function calls to determine the run-time behavior of the application.


Tracing Client DLL Activity

To obtain detailed, time-stamped traces of the MessageQ Client DLL execution events, follow these steps:

  1. Invoke the Configuration Editor.

  2. Click on the Tracing button.

  3. Check the Trace client DLL activity option.

The information from the DLL trace might be useful to debug connection problems between client library applications and the CLS. The DLL trace output is written to the log file, dmqcldll.log, in the default directory for the application. Be aware that the tracing output can become very large over a long period of time.
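
Because the trace file can grow large, it may be convenient to check its size and read only the most recent entries. The following Python sketch does that for a file such as dmqcldll.log. The default of 20 lines is an arbitrary choice, and nothing is assumed about the format of the trace lines themselves.

```python
import os
from collections import deque


def trace_summary(path, tail_lines=20):
    """Return (size_in_bytes, last_lines) for the trace file at path.

    Only the last tail_lines lines are kept in memory, so the function
    stays cheap even when the trace file is very large.
    """
    size = os.path.getsize(path)
    with open(path, "r", errors="replace") as handle:
        # A bounded deque discards older lines as newer ones are read.
        last = deque(handle, maxlen=tail_lines)
    return size, [line.rstrip("\n") for line in last]
```

For example, `trace_summary("dmqcldll.log", tail_lines=50)` returns the file size together with the final 50 lines of the trace.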

A CLS server trace might be useful to get a detailed time-stamped activity of the client requests and MessageQ message operations performed by the CLS. For more information about trace output from the CLS, see the Installation and Configuration Guide for your MessageQ server platform.


Tracing Message Activity

Messages transmitted or received by the DLL can be logged to a file or the Event Watcher. The information from a message trace can be used to verify the quantity and content of messaging traffic. For more information about enabling message tracing, see the Configuring Logging topic in Chapter 3.


Recovering from Client Crashes

Applications occasionally crash (particularly during development) and do not have an opportunity to close or return resources in use prior to terminating. Applications using the MessageQ Client that are attached to the message queuing bus and then crash (or terminate) without calling pams_exit or pams_detach_q leave many resources allocated but not available for reuse. Resources that are in use after a client application crash include:

After the client crashes, the server system still has an open connection to the client and the CLS remains attached to the primary queue used by the client. The network protocol keepalive mechanism can take a lengthy period to notify the server that the client has gone away. Typically, you can reboot the client PC and the server still functions as though it has a connection open to the client.

If the Client for Windows application attaches to temporary queues, the CLS processes must be stopped manually. If the MessageQ Client application uses permanent queues, it will be able to reconnect and the previous CLS process will terminate.

Restarting the client application usually establishes a new connection to the CLS. If network connect errors occur, follow the troubleshooting procedure described in the Identifying Network Errors topic. The procedure releases all resources used by the client.

If the client application calls pams_attach_q using either ATTACH_BY_NAME or ATTACH_BY_NUMBER to attach to a specific primary queue, the CLS detects a client reconnect attempt and automatically terminates the CLS instance (server process or thread) attached to the same message queue. Reconnecting to the same queue is only accepted if the client application is attempting to reconnect from the same PC node as the previous connection.

If the client application calls pams_attach_q using the ATTACH_TEMPORARY attach mode, a new instance of the CLS is started to support the client reconnect. The previous instance of the CLS remains active. For information about terminating CLS servers, see the CLS topic in the Installation and Configuration Guide for your server platform.
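
One way to reduce the number of orphaned attachments during development is to make sure the detach call runs whenever the application fails with an ordinary error. The Python sketch below shows the general pattern with a context manager. It is a sketch under stated assumptions: `attach` and `detach` are hypothetical callables wrapping pams_attach_q and pams_detach_q (or pams_exit), not part of any documented binding. The pattern cannot help when the process is killed outright or the PC loses power, so the recovery behavior described above still applies in those cases.

```python
from contextlib import contextmanager


@contextmanager
def attached(attach, detach):
    """Attach on entry and always detach on exit, even if the body raises.

    attach: hypothetical callable wrapping pams_attach_q; returns a handle.
    detach: hypothetical callable wrapping pams_detach_q or pams_exit.
    """
    handle = attach()
    try:
        yield handle
    finally:
        # Runs for normal exit and for exceptions, but not for a hard
        # process termination.
        detach(handle)
```

Application code would then place its messaging calls inside a `with attached(...) as queue:` block.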


Loading Incorrect WinSock DLL at Runtime

The WinSock DLL is loaded at runtime by the Windows Operating System. Under some circumstances the "wrong" WinSock can be loaded. This usually results in an error when a WinSock function is used.

Use the following procedure to verify which WinSock DLL is being loaded:

  1. Using the MessageQ Client Config utility, enable the "Trace Client DLL activity" option. This can also be done by setting the DMQCL_TRACE environment variable.

  2. Run your application and attempt a pams_attach_q. This traces client DLL activity to dmqcldll.log.

  3. Open dmqcldll.log and search for the WSA Socket information. These lines contain a brief description of the WinSock DLL which is loaded at runtime.

When Windows loads the WinSock DLL, it searches the following directories using the following sequence:

  1. The directory from which the application loaded

  2. The current directory

  3. Windows 95: The Windows system directory
    Windows NT: The 32-bit Windows System directory

  4. The Windows directory

  5. The directories listed in the PATH environment
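
The search sequence above can be checked with a short script that walks the same directories in order and reports every copy of the DLL it finds. The Python sketch below is one possible approach. The directory arguments are supplied by the caller, and the file name to look for (for example wsock32.dll for the 32-bit WinSock DLL) is an assumption that depends on your TCP/IP vendor.

```python
import os


def locate_dll(dll_name, search_dirs):
    """Return every path where dll_name exists, in search order.

    The first entry is the copy that the search sequence reaches first;
    any later entries are potential conflicts.
    """
    matches = []
    seen = set()
    for directory in search_dirs:
        if not directory:
            continue  # skip empty PATH entries
        key = os.path.normcase(os.path.abspath(directory))
        if key in seen:
            continue  # the same directory can appear more than once
        seen.add(key)
        candidate = os.path.join(directory, dll_name)
        if os.path.isfile(candidate):
            matches.append(candidate)
    return matches


def default_search_dirs(app_dir, system_dir, windows_dir):
    """Build the directory list in the order given in the steps above."""
    path_dirs = os.environ.get("PATH", "").split(os.pathsep)
    return [app_dir, os.getcwd(), system_dir, windows_dir] + path_dirs
```

If `locate_dll` returns more than one path, the first is the copy most likely being loaded, and the others are worth comparing against the version your vendor recommends.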

If the incorrect WinSock DLL is being loaded, you need to check for a WinSock DLL in each of the previous locations. To resolve a conflict, it may be necessary to do one or more of the following steps:

Check with your WinSock DLL vendor for their recommendation on resolving these conflicts.


EAFNOSUPPORT Errors

An EAFNOSUPPORT error indicates that the WinSock DLL does not support the requested action. For example, this error occurs if DECnet is selected as the transport but the WinSock DLL does not support DECnet. It usually means that the wrong WinSock DLL is being loaded at runtime. Refer to the topic Loading Incorrect WinSock DLL at Runtime for instructions on correcting this condition.