Sun Java System Messaging Server 6.3 Administration Guide

27.4 Monitoring the MTA

This section consists of the following subsections:

27.4.1 Monitoring the Size of the Message Queues

Excessive message queue growth may indicate that messages are not being delivered, are being delayed in their delivery, or are coming in faster than the system can deliver them. Possible causes include a denial of service attack that floods your system with huge numbers of messages, or the Job Controller not running.

See 8.5.2 Channel Message Queues, 26.3.6 Messages are Not Dequeued, and 26.3.7 MTA Messages are Not Delivered for more information on message queues.

Symptoms of Message Queue Problems

To Monitor the Size of the Message Queues

Probably the best way to monitor the message queues is to use imsimta qm and imsimta summarize. Refer to 27.8.6 imsimta qm counters.

You can also monitor the number of files in the queue directories (msg-svr-base/data/queue/). The acceptable number of files is site-specific, so you’ll need to build a baseline history to find out what is “too many.” You can do this by recording the number of queue files over a two-week period to get an approximate average.
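The baseline described above can be collected with a small script. The following is a minimal sketch in Python, not a supported tool. It assumes that the queue directory contains one subdirectory per channel and that every file beneath a channel directory is a queued message; the function name count_queue_files is hypothetical.

```python
import os


def count_queue_files(queue_root):
    """Return {channel_directory_name: file_count} for queue_root.

    Assumption: each immediate subdirectory of queue_root belongs to one
    channel, and every file found beneath it is counted as queued.
    """
    counts = {}
    for entry in sorted(os.listdir(queue_root)):
        channel_dir = os.path.join(queue_root, entry)
        if not os.path.isdir(channel_dir):
            continue  # ignore stray files at the top level
        total = 0
        # Walk nested subdirectories so files at any depth are counted.
        for _dirpath, _dirnames, filenames in os.walk(channel_dir):
            total += len(filenames)
        counts[entry] = total
    return counts
```

Call count_queue_files with the full path of your queue directory (msg-svr-base/data/queue/ with your own installation base substituted) at a regular interval, and record the results to build the two-week history.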

27.4.2 Monitoring Rate of Delivery Failure

A delivery failure is a failed attempt to deliver a message to an external site. A large increase in the rate of delivery failure can be a sign of a network problem such as a dead DNS server or a remote server timing out on responding to connections.

Symptoms of Rate of Delivery Failure

There are no outward symptoms. Many Q records will appear in mail.log_current.

To Monitor the Rate of Delivery Failure

Delivery failures are recorded in the MTA logs with the logging entry code Q. Look for these records in the file msg-svr-base/data/log/mail.log_current. Example:

mail.log:06-Oct-2003 00:24:03.66 501d.0b.9 ims-ms Q 5 durai.balusamy@Sun.COM rfc822;durai.balusamy@Sun.COM durai@ims-ms-daemon <00ce01c38bda$c7e2b240$6501a8c0@guindy> Mailbox is busy
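To turn Q records into a rate, you can tally them per hour. The following Python sketch is illustrative only. It assumes that the date and time are the first two whitespace-separated fields, as in the example above, and that the entry code appears as a standalone token within the next four fields; the exact position depends on your logging options. The function name count_q_by_hour is hypothetical.

```python
from collections import Counter


def count_q_by_hour(lines):
    """Count Q (delivery failure) records per "date hour" bucket.

    Assumptions: fields 1 and 2 are the date and time, and the entry
    code is a standalone token somewhere in fields 3 through 6.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # too short to be a log record
        if "Q" in fields[2:6]:
            # Drop a grep-style "mail.log:" prefix if one is present.
            date = fields[0].split(":")[-1]
            hour = fields[1][:2]
            counts[f"{date} {hour}"] += 1
    return counts
```

Pass the function the lines of mail.log_current (for example, an open file object) and compare the hourly counts against what is normal for your site.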

27.4.3 Monitoring Inbound SMTP Connections

An unusual increase in the number of inbound SMTP connections from a given IP address may indicate:

Symptoms of Unauthorized SMTP Connections

To Monitor Inbound SMTP Connections

Local address       Remote address                                 State
32768   0  32768   0   CLOSE_WAIT
 8760   0  24820   0   ESTABLISHED
33580   0  24820   0   TIME_WAIT

Note that you will first need to determine the appropriate number of SMTP connections and their states (ESTABLISHED, CLOSE_WAIT, etc.) for your system to determine if a particular reading is out of the ordinary.
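One way to build that baseline is to tally connection states from saved connection listings. The Python sketch below assumes only that the state is the last whitespace-separated field on each connection line and is written in uppercase, as in the listing above. Header and blank lines are skipped on that basis. The function name tally_states is hypothetical.

```python
import re
from collections import Counter

# Assumption: a state name is an uppercase token such as ESTABLISHED
# or CLOSE_WAIT. Mixed-case header words such as "State" do not match.
STATE_TOKEN = re.compile(r"^[A-Z][A-Z0-9_]+$")


def tally_states(listing_lines):
    """Count how many connection lines end in each state name."""
    counts = Counter()
    for line in listing_lines:
        fields = line.split()
        if fields and STATE_TOKEN.match(fields[-1]):
            counts[fields[-1]] += 1
    return counts
```

Recording these tallies at regular intervals gives you the normal range for each state, against which an unusual reading can be judged.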

If you find many connections staying in the SYN_RECEIVED state, this might be caused by a broken network or a denial of service attack.

In addition, the lifetime of an SMTP server process is limited. This is controlled by the MTA configuration variable MAX_LIFE_TIME in the dispatcher.cnf file. The default is 86,400 seconds (one day). Similarly, MAX_LIFE_CONNS specifies the maximum number of connections a server process can handle in its lifetime. If you find a particular SMTP server process that has been running for a long time, you may wish to investigate.

27.4.4 Monitoring the Dispatcher and Job Controller Processes

The Dispatcher and Job Controller processes must be running for the MTA to work. You should have one process of each kind.

Symptoms of Dispatcher and Job Controller Processes Down

If the Dispatcher is down or does not have enough resources, SMTP connections are refused.

If the Job Controller is down, queue size will grow.

To Monitor Dispatcher and Job Controller Processes

Check to see that the processes called dispatcher and job_controller exist. See 26.2.4 Check that the Job Controller and Dispatcher are Running.
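This check can be automated. The Python sketch below is one possible approach, not a product feature. It assumes that the POSIX command ps -e -o comm= is available on your platform and prints one command name per line, possibly with a leading path. The function names are hypothetical.

```python
import os
import subprocess

REQUIRED = ("dispatcher", "job_controller")


def missing_processes(command_names, required=REQUIRED):
    """Return the required process names absent from command_names."""
    # Strip any leading path so "/some/dir/dispatcher" still matches.
    running = {os.path.basename(name.strip()) for name in command_names}
    return [name for name in required if name not in running]


def list_command_names():
    """Return the command name of every process reported by ps.

    Assumption: "ps -e -o comm=" is supported and prints no header.
    """
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Calling missing_processes(list_command_names()) returns an empty list when both processes are present; any name it returns identifies a process to investigate.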