Sun Java System Message Queue 3.7 UR1 Administration Guide

Dead Message Queue Contains Messages

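Symptom: When you list destinations with a command such as imqcmd list dst, the output shows messages in the dead message queue:
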
Listing all the destinations on the broker specified by:
---------------------------------
Host         Primary Port
---------------------------------
localhost    7676
------------------------------------------------------------------------
   Name     Type    State   Producers  Consumers         Msgs
                                              Total Count UnAck Avg Size
------------------------------------------------------------------------
MyDest      Queue  RUNNING       0          0        5      0    1177.0
mq.sys.dmq  Queue  RUNNING       0          0       35      0    1422.0
Successfully listed destinations.

In this example, the dead message queue, mq.sys.dmq, contains 35 messages.
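
This check can be scripted. The following is a minimal sketch, assuming the column layout of the listing above, in which the message count is the sixth whitespace-separated field. The function name dmq_count is hypothetical and is not part of Message Queue.

```shell
#!/bin/sh
# dmq_count: read `imqcmd list dst` output on standard input and print the
# message count reported for the dead message queue (mq.sys.dmq).
# Assumption: the count is field 6, as in the listing above.
dmq_count() {
    awk '$1 == "mq.sys.dmq" { print $6 }'
}

# Example using the two data rows from the listing above:
dmq_count <<'EOF'
MyDest      Queue  RUNNING       0          0        5      0    1177.0
mq.sys.dmq  Queue  RUNNING       0          0       35      0    1422.0
EOF
```

In practice the input would be the saved output of imqcmd list dst rather than a here-document.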

Possible causes:

Possible cause: The number of messages, or their sizes, exceed destination limits.

To confirm this cause of the problem: Use the QBrowser demo application to look at the contents of the dead message queue. For the QBrowser demo’s platform-specific location, see Appendix A, Platform-Specific Locations of Message Queue™ Data, and look in the tables for “Example Applications and Locations.”

Here is an example invocation on the Windows platform:

cd \MessageQueue3\demo\applications\qbrowser
java QBrowser

When the QBrowser main window appears, select the queue name mq.sys.dmq and then click Browse. A list like the one shown earlier under “Message timeout value is expiring” appears. Double-click any message to display details about that message, as shown under “Message timeout value is expiring.”

Note the values for the following message properties:

Under JMS Headers, note the value for JMSDestination to determine the destination whose messages are becoming dead.

To resolve the problem: Increase the destination limits. For example:

imqcmd update dst -t q -n MyDest -o maxNumMsgs=1000

Possible cause: The broker clock and producer clock are not synchronized.

To confirm this cause of the problem: Using the QBrowser application, view the message details for messages in the dead message queue. Check the value for JMS_SUN_DMQ_UNDELIVERED_REASON, looking for messages with the reason EXPIRED.

In the broker log file, look for any of the following messages: B2102, B2103, B2104. These messages all report that possible clock skew was detected.
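
The log search can be done with grep. This is a minimal sketch: the function name count_clock_skew is hypothetical, and the log file location depends on your installation, so the caller supplies it as an argument.

```shell
#!/bin/sh
# count_clock_skew LOGFILE: print the number of lines in LOGFILE that mention
# one of the clock-skew message codes B2102, B2103, or B2104.
# The path is supplied by the caller; no default location is assumed.
count_clock_skew() {
    grep -c -E 'B210[234]' "$1"
}

# Usage (the path below is a placeholder, not a real location):
#   count_clock_skew /path/to/broker/log.txt
```

A result greater than zero means the broker has reported possible clock skew at least once.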

To resolve the problem: Check that you are running a time synchronization program, as described in Preparing System Resources.
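
To see why clock skew produces EXPIRED messages, recall that a JMS message's expiration time is the producer's send time plus its time-to-live. The sketch below assumes the broker compares that value against its own clock. The function name and the sample numbers are illustrative only.

```shell
#!/bin/sh
# expired_on_arrival SEND_MS TTL_MS BROKER_NOW_MS
# Prints "yes" if a message stamped by the producer at SEND_MS with a
# time-to-live of TTL_MS would already look expired to a broker whose clock
# reads BROKER_NOW_MS, and "no" otherwise. A TTL of 0 means "never expires".
expired_on_arrival() {
    send_ms=$1
    ttl_ms=$2
    broker_now_ms=$3
    if [ "$ttl_ms" -eq 0 ]; then
        echo no
    elif [ "$broker_now_ms" -ge $((send_ms + ttl_ms)) ]; then
        echo yes
    else
        echo no
    fi
}

# Producer sends at t=100000 ms with a 30-second time-to-live.
expired_on_arrival 100000 30000 100050   # clocks agree: prints "no"
expired_on_arrival 100000 30000 145000   # broker 45 s ahead: prints "yes"
```

In the second call the message is treated as expired even though no real time has passed, which is the pattern that time synchronization removes.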

Possible cause: Consumers are not receiving messages before they time out.

To confirm this cause of the problem: Using the QBrowser application, view the message details for messages in the dead message queue. Check the value for JMS_SUN_DMQ_UNDELIVERED_REASON, looking for messages with the reason EXPIRED.

Check to see whether there are any consumers on the destination. For example:

imqcmd query dst -t q -n MyDest

Check the value listed for Current Number of Active Consumers. If there are active consumers, one of the following is true:

To resolve the problem: Request that application developers increase message time-to-live values.

Possible cause: There are too many producers for the number of consumers.

To confirm this cause of the problem: Using the QBrowser application, view the message details for messages in the dead message queue. Check the value for JMS_SUN_DMQ_UNDELIVERED_REASON. If the reason is REMOVE_OLDEST or REMOVE_LOW_PRIORITY, use the imqcmd query dst command to check the number of producers and consumers on the destination. If the number of producers exceeds the number of consumers, production rate may be overwhelming consumption rate.

To resolve the problem: Add more consumer clients or set the destination’s limit behavior to FLOW_CONTROL (which uses consumption rate to control production rate), using a command such as the following:

imqcmd update dst -n myDst -t q -o limitBehavior=FLOW_CONTROL

Possible cause: Producers are faster than consumers.

To confirm this cause of the problem: To determine whether slow consumers are causing producers to slow down, set the destination’s limit behavior to FLOW_CONTROL (which uses consumption rate to control production rate), using a command such as the following:

imqcmd update dst -n myDst -t q -o limitBehavior=FLOW_CONTROL

Use metrics to examine the destination’s input and output, using a command such as the following:

imqcmd metrics dst -n myDst -t q -m rts

In the metrics output, examine the following values:

Because flow control aligns production to consumption, note whether production slows or stops. If so, there is a discrepancy between the processing speeds of producers and consumers. You can also check the number of unacknowledged (UnAck) messages by using the imqcmd list dst command. If the number of unacknowledged messages is less than the size of the destination, the destination has additional capacity and is being held back by client flow control.

To resolve the problem: If production rate is consistently faster than consumption rate, consider using flow control regularly, to keep the system aligned. In addition, using the subsequent sections, consider and attempt to resolve each of the following possible factors:

Possible cause: A consumer is too slow.

To confirm this cause of the problem: Use metrics to determine the rate of production and consumption, as described above under “Producers are faster than consumers.”

To resolve the problem:

Possible cause: Clients are not committing messages.

To confirm this cause of the problem: Check with application developers to find out whether the application uses transactions. If so, list the active transactions as follows:

imqcmd list txn

Here is an example of the command output:


----------------------------------------------------------------------
Transaction ID       State    User name  # Msgs/# Acks   Creation time
----------------------------------------------------------------------
6800151593984248832  STARTED  guest           3/2     7/19/04 11:03:08 AM

Note the number of messages and the number of acknowledgments. If the number of messages is high, producers may be sending individual messages but failing to commit transactions. Until the broker receives a commit, it cannot route and deliver the messages for that transaction. If the number of acknowledgments is high, consumers may be sending acknowledgments for individual messages but failing to commit transactions. Until the broker receives a commit, it cannot remove the acknowledgments for that transaction.
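
The transaction listing can be filtered for transactions with many pending messages or acknowledgments. This is a minimal sketch, assuming the layout shown above, in which the fourth field has the form msgs/acks. The function name busy_txns and the threshold argument are hypothetical; what counts as "high" depends on the application.

```shell
#!/bin/sh
# busy_txns LIMIT: read `imqcmd list txn` output on standard input and print
# "ID msgs acks" for each transaction whose message count or acknowledgment
# count exceeds LIMIT. Assumption: field 4 is "msgs/acks", as shown above.
busy_txns() {
    awk -v limit="$1" '
        $4 ~ /^[0-9]+\/[0-9]+$/ {
            split($4, n, "/")
            if (n[1] + 0 > limit + 0 || n[2] + 0 > limit + 0)
                print $1, n[1], n[2]
        }'
}

# The sample row above has 3 messages and 2 acknowledgments, so a limit of 2
# reports it and a limit of 5 would not.
busy_txns 2 <<'EOF'
6800151593984248832  STARTED  guest           3/2     7/19/04 11:03:08 AM
EOF
```

Header and separator lines are skipped because their fourth field does not match the msgs/acks pattern.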

To resolve the problem: Contact application developers to fix the coding error.

Possible cause: Consumers are failing to acknowledge messages.

To confirm this cause of the problem: Contact application developers to determine whether the application uses system-based acknowledgment or client-based acknowledgment. If the application uses system-based acknowledgment, skip this section; if it uses client-based acknowledgment (CLIENT_ACKNOWLEDGE), first decrease the number of messages stored on the client, using a command like the following:

imqcmd update dst -n myDst -t q -o consumerFlowLimit=1

Next, you will determine whether the broker is buffering messages because a consumer is slow, or whether the consumer processes messages quickly but does not acknowledge them. List the destination, using the following command:

imqcmd list dst

After you supply a user name and password, output like the following appears:


Listing all the destinations on the broker specified by:
---------------------------------
Host         Primary Port
---------------------------------
localhost    7676
------------------------------------------------------------------------
   Name     Type    State   Producers  Consumers         Msgs
                                              Total Count UnAck Avg Size
------------------------------------------------------------------------
MyDest      Queue  RUNNING       0          0        5    200    1177.0
mq.sys.dmq  Queue  RUNNING       0          0       35      0    1422.0
Successfully listed destinations.

The UnAck number represents messages that the broker has sent and for which it is waiting for acknowledgment. If this number is high or increasing, you know that the broker is sending messages, so it is not waiting for a slow consumer. You also know that the consumer is not acknowledging the messages.
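
To judge whether the UnAck number is increasing, it helps to extract it for one destination and compare successive readings. The following is a minimal sketch, assuming the listing layout shown above, in which UnAck is the seventh field. The function name unack_count is hypothetical.

```shell
#!/bin/sh
# unack_count DEST: read `imqcmd list dst` output on standard input and print
# the UnAck value for the destination named DEST.
# Assumption: UnAck is field 7, as in the listing above.
unack_count() {
    awk -v dest="$1" '$1 == dest { print $7 }'
}

# Example using the data rows from the listing above:
unack_count MyDest <<'EOF'
MyDest      Queue  RUNNING       0          0        5    200    1177.0
mq.sys.dmq  Queue  RUNNING       0          0       35      0    1422.0
EOF
```

Running the same extraction on listings taken a few minutes apart shows whether the value is steady or climbing.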

To resolve the problem: Contact application developers to fix the coding error.

Possible cause: Durable consumers are inactive.

To confirm this cause of the problem: Look at the topic’s durable subscribers, using the following command format:

imqcmd list dur -d topicName

To resolve the problem:

Possible cause: An unexpected broker error has occurred.

To confirm this cause of the problem: Use QBrowser to examine a message, as described earlier under “The number of messages, or their sizes, exceed destination limits.” If the value for JMS_SUN_DMQ_UNDELIVERED_REASON is ERROR, a broker error occurred.

To resolve the problem: