Sun Java System Messaging Server 6.3 Administration Guide

26.2 Standard MTA Troubleshooting Procedures

This section outlines standard troubleshooting procedures for the MTA. Follow these procedures if a problem does not generate an error message, if an error message does not provide enough diagnostic information, or if you want to perform general wellness checks, testing, and standard maintenance of the MTA.

26.2.1 Check the MTA Configuration

Test your address configuration by using the imsimta test -rewrite utility. With this utility, you can test the MTA’s address rewriting and channel mapping without actually having to send a message. Refer to Chapter 2, Message Transfer Agent Command-line Utilities, in Sun Java System Messaging Server 6.3 Administration Reference for more information.

The utility will normally show address rewriting that will be applied as well as the channel to which messages will be queued. However, syntax errors in the MTA configuration will cause the utility to issue an error message. If the output is not what you expect, you may need to correct your configuration.
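To check several addresses in one pass, the utility can be wrapped in a small shell function. The following is a minimal sketch, not part of the product: the function name rewrite_check and the IMSIMTA variable are hypothetical, and the sketch assumes that imsimta test -rewrite accepts the address to test as its final argument.

```shell
# Command to invoke; override IMSIMTA if the utility is not on your PATH.
# (IMSIMTA and rewrite_check are illustrative names, not product features.)
IMSIMTA=${IMSIMTA:-imsimta}

# Run the rewrite test once for each address given as an argument.
# Stops at the first address for which the utility reports a failure.
rewrite_check() {
    for addr in "$@"; do
        printf '=== %s ===\n' "$addr"
        "$IMSIMTA" test -rewrite "$addr" || return 1
    done
}
```

For example, rewrite_check user@example.com postmaster@example.com prints the rewriting results for both addresses; the addresses are placeholders.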

26.2.2 Check the Message Queue Directories

Check whether messages are present in the MTA message queue directory, typically msg-svr-base/data/queue/. Use command-line utilities like imsimta qm to check for the presence of expected message files under the MTA message queue directory. For more information on imsimta qm, refer to imsimta qm in Sun Java System Messaging Server 6.3 Administration Reference and 27.8.6 imsimta qm counters.

If the imsimta test -rewrite output looks correct, check that messages are actually being placed in the MTA message queue subdirectories. To do so, enable message logging (for more information on MTA logging, see 25.3 Managing MTA Message and Connection Logs) and examine the mail.log_current file in the directory /msg-svr-base/log/. You can track a specific message by its message ID to ensure that it is being placed in the MTA message queue subdirectories. If you are unable to find the message, you may have a problem with disk space or directory permissions.

26.2.3 Check the Ownership of Critical Files

You should have selected a mail server user account (mailsrv by default) when you installed Messaging Server. The following directories, subdirectories, and files should be owned by this account:


msg-svr-base/data/queue/
msg-svr-base/data/log
msg-svr-base/data/tmp

Commands, like the ones in the following UNIX system example, may be used to check the protection and ownership of these directories:


ls -l -p -d /opt/SUNWmsgsr/data/queue
drwxr-x---   2 mailsrv  mail 512 Jan  4 16:09 /opt/SUNWmsgsr/data/queue/

ls -l -p -d /opt/SUNWmsgsr/data/log
drwxr-x---   2 mailsrv  mail  3072 Feb 16 12:07 /opt/SUNWmsgsr/data/log/

ls -l -p -d /opt/SUNWmsgsr/data/tmp
drwxr-x---   2 mailsrv  mail   512 Feb 16 12:55 /opt/SUNWmsgsr/data/tmp/

Check that the files in msg-svr-base/data/queue are owned by the MTA account by using a command like the one in the following UNIX system example:

ls -l -p -R /opt/SUNWmsgsr/data/queue

26.2.4 Check that the Job Controller and Dispatcher are Running

The MTA Job Controller handles the execution of the MTA processing jobs, including most outgoing (master) channel jobs.

Some MTA channels, such as the MTA’s multi-threaded SMTP channels, include resident server processes that process incoming messages. These servers handle the slave (incoming) direction for the channel. The MTA Dispatcher handles the creation of such MTA servers. Dispatcher configuration options control the availability of the servers, the number of created servers, and how many connections each server can handle.

To check that the Job Controller and Dispatcher are present, and to see if there are MTA servers and processing jobs running, use the command imsimta process. Under idle conditions, the output should list at least the job_controller and dispatcher processes. For example:


# imsimta process
USER      PID S VSZ    RSS   STIME    TIME     COMMAND
mailsrv 9567 S 18416 9368  02:00:02  0:00  /opt/SUNWmsgsr/lib/tcp_smtp_server
mailsrv 6573 S 18112 5720  Jul_13    0:00  /opt/SUNWmsgsr/lib/job_controller
mailsrv 9568 S 18416 9432  02:00:02  0:00  /opt/SUNWmsgsr/lib/tcp_smtp_server
mailsrv 6574 S 17848 5328  Jul_13    0:00  /opt/SUNWmsgsr/lib/dispatcher

If the Job Controller is not present, the files in the /msg-svr-base/data/queue directory will get backed up and messages will not be delivered. If you do not have a Dispatcher, then you will be unable to receive any SMTP connections.

For more information on imsimta process, refer to imsimta process in Sun Java System Messaging Server 6.3 Administration Reference.
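The check described above can be scripted for use in routine monitoring. The following is a minimal sketch that reads imsimta process output on standard input and reports how many job_controller and dispatcher processes are listed. The function name check_mta_procs is hypothetical, and the sketch assumes that the COMMAND column ends each line with the program path, as in the sample output above.

```shell
# Read `imsimta process` output on stdin and report whether the
# job_controller and dispatcher processes are listed.
# Returns 0 if both are present, 1 if either is missing.
check_mta_procs() {
    out=$(cat)
    status=0
    for name in job_controller dispatcher; do
        # Count lines whose last path component is the program name.
        count=$(printf '%s\n' "$out" | grep -c "/${name}[[:space:]]*\$" || true)
        if [ "$count" -eq 0 ]; then
            echo "$name: MISSING"
            status=1
        else
            echo "$name: $count running"
        fi
    done
    return "$status"
}
```

A typical invocation would be imsimta process | check_mta_procs. A count greater than 1 that persists across repeated runs would merit a closer look.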

You could also use imsimta qm jobs to list, channel by channel, all active and pending delivery processing jobs currently being managed by the Job Controller. Additional cumulative information is provided for each channel such as the number of message files successfully delivered and those requeued for subsequent delivery attempts. The command syntax is as follows:


jobs [-[no]hosts] [-[no]jobs] [-[no]messages] [channel-name]

If neither the Job Controller nor the Dispatcher is present, you should review the dispatcher.log-* or job_controller.log-* file in /msg-svr-base/data/log.

If the log files do not exist or do not indicate an error, start the processes by using the start-msg command. For more information, refer to start-msg in Sun Java System Messaging Server 6.3 Administration Reference.


Note –

You should not see multiple instances of the Dispatcher or Job Controller when you run imsimta process, unless the system is in the process of forking (fork()) child processes before it executes (exec()) the program that needs to run. However, the time frame during such duplication is very small.


26.2.5 Check the Log Files

If MTA processing jobs run properly but messages stay in the message queue directory, you can examine the log files to see what is happening. All MTA log files are created in the directory /msg-svr-base/log. Log file name formats for various MTA processing jobs are shown in Table 26–1.

Table 26–1 MTA Log Files

File Name
    Log File Contents

channel_master.log-uniqueid
    Output of master program (usually client) for channel.

channel_slave.log-uniqueid
    Output of slave program (usually server) for channel.

dispatcher.log-uniqueid
    Dispatcher debugging. This log is created regardless of whether the Dispatcher DEBUG option is set. However, to get detailed debugging information, you should set the DEBUG option to a non-zero value.

imta
    ims-ms channel error messages when there is a problem in delivery.

job_controller.log-uniqueid
    Job controller logging. This log is created regardless of whether the Job Controller DEBUG option is set. However, to get detailed debugging information, you should set the DEBUG option to a non-zero value.

tcp_smtp_server.log-uniqueid
    Debugging for the tcp_smtp_server. The information in this log is specific to the server, not to messages.

return.log-uniqueid
    Debug output for the periodic MTA message bouncer job. This log file is created if the return_debug option is used in the option.dat file.


Note –

Each log file is created with a unique ID (uniqueid) to avoid overwriting an earlier log created by the same channel. To find a specific log file, you can use the imsimta view utility. You can also purge older log files by using the imsimta purge command. Note, however, that by default this command is run on a regular basis (see 4.6.2 Pre-defined Automatic Tasks). For more information, see imsimta purge in Sun Java System Messaging Server 6.3 Administration Reference.


The channel_master.log-uniqueid and channel_slave.log-uniqueid log files will be created in any of the following situations:

For more information on debugging channel master and slave programs, see the Sun Java System Messaging Server Administration Reference.
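Because every run produces a uniquely named file, the log directory can accumulate many files for the same channel. One simple way to locate the newest one is to sort by modification time with standard shell tools. The sketch below is an illustration only: the function name latest_master_log is hypothetical, and it relies on ls and head rather than on any Messaging Server utility.

```shell
# Print the most recently modified master log file for a channel.
# Prints nothing if no matching log file exists.
latest_master_log() {
    log_dir=$1   # for example /msg-svr-base/log
    channel=$2   # for example tcp_intranet
    ls -t "$log_dir"/"$channel"_master.log-* 2>/dev/null | head -n 1
}
```

For example, latest_master_log /msg-svr-base/log tcp_intranet prints the path of the newest tcp_intranet_master.log-* file.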

26.2.6 Run a Channel Program Manually

When diagnosing an MTA delivery problem, it is helpful to manually run an MTA delivery job, particularly after you enable debugging for one or more channels.

The command imsimta submit will notify the MTA Job Controller to run the channel. If debugging is enabled for the channel in question, imsimta submit will create a log file in directory /msg-svr-base/log as shown in Table 26–1.

The command imsimta run will perform outbound delivery for the channel under the currently active process, with output directed to your terminal. This may be more convenient than submitting a job, particularly if you suspect problems with job submission itself.


Note –

In order to manually run channels, the Job Controller must be running.


For information on the syntax, options, and parameters of the imsimta submit and imsimta run commands, along with examples, refer to Command Descriptions in Sun Java System Messaging Server 6.3 Administration Reference.

26.2.7 Starting and Stopping Individual Channels

In some cases, stopping and starting individual channels may make message queue problems easier to diagnose and debug. Stopping a message queue allows you to examine queued messages to determine the existence of loops or spam attacks.

Procedure: To Stop Outbound Processing (Dequeuing) for a Specific Channel

  1. Use the imsimta qm stop command to stop a specific channel. Doing so prevents you from having to stop the Job Controller and having to recompile the configuration. In the following example, the conversion channel is stopped:

    imsimta qm stop conversion

  2. To resume processing, use the imsimta qm start command to restart the channel. In the following example, the conversion channel is started:

    imsimta qm start conversion

    For more information on the imsimta qm start and imsimta qm stop commands, see imsimta qm in Sun Java System Messaging Server 6.3 Administration Reference.


    Note –

    The command imsimta qm start/stop channel may fail if it is run for many channels at the same time. The tool might have trouble updating the hold_list and could report: QM-E-NOTSTOPPED, unable to stop the channel; cannot update the hold list. Run imsimta qm start/stop channel sequentially, with an interval of a few seconds between each run.

    If you only want the channel to run between certain hours, use the following options in the channel definition section in the job controller configuration file:


    urgent_delivery=08:00-20:00
    normal_delivery=08:00-20:00
    nonurgent_delivery=08:00-20:00
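When several channels must be stopped or started, the advice in the note above (run the commands one after another with a short pause) can be captured in a loop. The following is a minimal sketch: the function name qm_each and the IMSIMTA and QM_PAUSE variables are hypothetical, and the five-second default is only an assumed reading of "a few seconds".

```shell
# Command to invoke and pause (in seconds) between channels.
# Both variable names are illustrative; adjust the pause to suit your system.
IMSIMTA=${IMSIMTA:-imsimta}
QM_PAUSE=${QM_PAUSE:-5}

# Usage: qm_each stop|start channel [channel ...]
# Runs `imsimta qm <action> <channel>` for one channel at a time,
# pausing after each run, and stops at the first failure.
qm_each() {
    action=$1
    shift
    for channel in "$@"; do
        "$IMSIMTA" qm "$action" "$channel" || return 1
        sleep "$QM_PAUSE"
    done
}
```

For example, qm_each stop conversion tcp_intranet stops the two channels in turn, and qm_each start conversion tcp_intranet restarts them.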

26.2.7.1 To Stop Inbound Processing from a Specific Domain or IP Address (enqueuing to a channel)

To stop inbound message processing for a specific domain or IP address while returning temporary SMTP errors to client hosts, add one of the following entries to the corresponding mapping table. Because the messages are refused at the SMTP level, they are not held on your system. Refer to 18.1 PART 1. MAPPING TABLES.

ORIG_SEND_ACCESS 

  *|*@sesta.com|*|*        $X4.2.1|$NHost$ temporarily$ blocked

With this entry in place, the sender’s remote MTA holds the messages on its own system and continues to resend them periodically until you restart inbound processing.

PORT_ACCESS

    TCP|*|25|IP_address_to_block|*    $N500$ can't$ connect$ now

When you want to restart inbound processing from the domain or IP address, be sure to remove these rules from the mapping tables and recompile your configuration. In addition, you may want to create unique error messages for each mapping table. Doing so will enable you to determine which mapping table is being used.

26.2.8 An MTA Troubleshooting Example

This section explains how to troubleshoot a particular MTA problem step-by-step. In this example, a mail recipient did not receive an attachment to an email message. Note: In keeping with MIME protocol terminology, the “attachment” is referred to as a “message part” in this section. The aforementioned troubleshooting techniques are used to identify where and why the message part disappeared (See 26.2 Standard MTA Troubleshooting Procedures). By using the following steps, you can determine the path the message took through the MTA. In addition, you can determine if the message part disappeared before or after the message entered the message queue. To do so, you will need to manually stop and run channels, capturing the relevant files.


Note –

The Job Controller must be running when you manually run messages through the channels.


26.2.8.1 Identify the Channels in the Message Path

By identifying which channels are in the message path, you can apply the master_debug and slave_debug keywords to the appropriate channels. These keywords generate debugging output in the channels’ master and slave log files; in turn, the master and slave debugging information will assist in identifying the point where the message part disappeared.

  1. Add log_message_id=1 in the option.dat file in directory /msg-svr-base/config. With this parameter, you will see message ID: header lines in the mail.log_current file.

  2. Run imsimta cnbuild to recompile the configuration.

  3. Run imsimta restart dispatcher to restart the SMTP server.

  4. Have the end user resend the message with the message part.

  5. Determine the channels that the message passes through.

    While there are different approaches to identifying the channels, the following approach is recommended:

    1. On UNIX platforms, use the grep command to search for message ID: header lines in the mail.log_current file in directory /msg-svr-base/log.

    2. Once you find the message ID: header lines, look for the E (enqueue) and D (dequeue) records to determine the path of the message. Refer to 25.3.1 Understanding the MTA Log Entry Format for more information on logging entry codes. See the following E and D records for this example:


      29-Aug-2001 10:39:46.44  tcp_local conversion        E 2 ... 
      29-Aug-2001 10:39:46.44  conversion tcp_intranet     E 2 ... 
      29-Aug-2001 10:39:46.44  tcp_intranet                  D 2 ...

The channel on the left is the source channel, and the channel on the right is the destination channel. In this example, the E and D records indicate that the message’s path went from the tcp_local channel to the conversion channel and finally to the tcp_intranet channel.
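Reading the path out of the E and D records can be automated with awk. The sketch below is an illustration under a stated assumption: it expects the abbreviated record layout shown in the example above (date, time, source channel, then either a destination channel followed by E, or D directly). Real log entries can carry additional fields depending on the logging options in effect, in which case the field positions need adjusting. The function name message_path is hypothetical.

```shell
# Read MTA log records on stdin and print the hops they describe.
# Assumes: $1 = date, $2 = time, $3 = source channel, and then either
#   "$4 = destination channel, $5 = E"  (enqueue record) or
#   "$4 = D"                            (dequeue record).
message_path() {
    awk '
        $5 == "E" { print $3 " -> " $4 " (enqueue)" }
        $4 == "D" { print $3 " (dequeue)" }
    '
}
```

The records for one message can be selected with grep first, for example grep -F 'message-id-value' mail.log_current | message_path, where message-id-value stands for the message ID you are tracking.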

26.2.8.2 Manually Start and Stop Channels to Gather Data

This section describes how to manually start and stop channels (see 26.2.7 Starting and Stopping Individual Channels). By starting and stopping the channels in the message’s path, you can save the message and log files at different stages in the MTA process. These files are used later in To Identify the Point of Message Breakdown.

Procedure: To Manually Start and Stop Channels

  1. Set mm_debug=5 in the option.dat file in directory /msg-svr-base/config in order to provide substantial debugging information.

  2. Add the slave_debug and master_debug keywords to the appropriate channels in the imta.cnf file in directory /msg-svr-base/config.

    1. Use the slave_debug keyword on the inbound channel (or on any channel to which the message is switched during the initial dialog) that receives the message with the message part from the remote system. In this example, the slave_debug keyword is added to the tcp_local channel.

    2. Add the master_debug keyword to the other channels that the message passed through, as identified in 26.2.8.1 Identify the Channels in the Message Path. In this example, the master_debug keyword would be added to the conversion and tcp_intranet channels.

    3. Run the command imsimta restart dispatcher to restart the SMTP server.

  3. Use the imsimta qm stop and imsimta qm start commands to manually stop and start specific channels. For more information on using these commands, see 26.2.7 Starting and Stopping Individual Channels.

  4. To start the process of capturing the message files, have the end user resend the message with the message part.

  5. When the message enters a channel, the message will stop in the channel if it has been stopped with the imsimta qm stop command. For more information, see Step 3.

    1. Copy and rename the message file before you manually run the next channel in the message’s path. See the following UNIX platform example:

      # cp ZZ01K7LXW76T7O9TD0TB.00 ZZ01K7LXW76T7O9TD0TB.KEEP1

      The message file typically resides in a directory similar to /msg-svr-base/data/queue/destination_channel/001. The destination_channel is the next channel that the message passes through (such as tcp_intranet). If you want to create subdirectories (like 001, 002, and so on) in the destination_channel directory, add the subdirs keyword to the channels.

    2. It is recommended that you number the extensions of the message each time you trap and copy the message in order to identify the order in which the message is processed.

  6. Resume message processing in the channel and enqueue to the next destination channel in the message’s path. To do so, use the imsimta qm start command.

  7. Copy and save the corresponding channel log file (for example: tcp_intranet_master.log-*) located in directory /msg-svr-base/log. Choose the appropriate log file that has the data for the message you are tracking. Make sure that the file you copy matches the timestamp and the subject header for the message as it comes into the channel. In the example of the tcp_intranet_master.log-*, you might save the file as tcp_intranet_master.keep so the file is not deleted.

  8. Repeat steps 5 - 7 until the message has reached its final destination.

    The log files you copied in Step 7 should correlate to the message files that you copied in Step 5. If, for example, you stopped all of the channels in the missing message part scenario, you would save the conversion_master.log-* and the tcp_intranet_master.log-* files. You would also save the source channel log file tcp_local_slave.log-*. In addition, you would save a copy of the corresponding message file from each destination channel: ZZ01K7LXW76T7O9TD0TB.KEEP1 from the conversion channel and ZZ01K7LXW76T7O9TD0TB.KEEP2 from the tcp_intranet channel.

  9. Remove debugging options once the message and log files have been copied.

    1. Remove the slave_debug and the master_debug keywords from the appropriate channels in the imta.cnf file in directory /msg-svr-base/config.

    2. Reset mm_debug=0, and remove log_message_id=1 in the option.dat file in directory /msg-svr-base/config.

    3. Recompile the configuration by using imsimta cnbuild.

    4. Run the command imsimta restart dispatcher to restart the SMTP server.
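The copy-and-rename step in Step 5 of this procedure is easy to get wrong when it is repeated for several channels. The following is a minimal sketch of a helper that derives the numbered .KEEPn name from the original file name. The function name keep_copy is hypothetical; the naming scheme follows the ZZ01K7LXW76T7O9TD0TB.KEEP1 example used in this section.

```shell
# Usage: keep_copy message-file sequence-number
# Copies the message file to the same directory, replacing its extension
# with .KEEP<n>, and prints the name of the copy.
keep_copy() {
    src=$1
    seq=$2
    dir=$(dirname "$src")
    base=$(basename "$src")
    dest="$dir/${base%.*}.KEEP$seq"   # strip the last extension, e.g. .00
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}
```

For example, keep_copy ZZ01K7LXW76T7O9TD0TB.00 1 creates ZZ01K7LXW76T7O9TD0TB.KEEP1 next to the original; use 2, 3, and so on for later channels so that the processing order stays visible.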

Procedure: To Identify the Point of Message Breakdown

  1. By the time you have finished starting and stopping the channel programs, you should have the following files that you can use to troubleshoot the problem:

    1. All copies of the message file (for example: ZZ01K7LXW76T7O9TD0TB.KEEP1) from each channel program

    2. A tcp_local_slave.log-* file

    3. A set of channel_master.log-* files for each destination channel

    4. A set of mail.log_current records that show the path of the message

      All files should have timestamps and message ID values that match the message ID: header lines in the mail.log_current records. Note that the exception is when messages are bounced back to the sender; these bounced messages will have a different message ID value than the original message.

  2. Examine the tcp_local_slave.log-* file to determine if the message had the message part when it entered the message queue.

    Look at the SMTP dialog and data to see what was sent from the client machine.

    If the message part did not appear in the tcp_local_slave.log-* file, then the problem occurred before the message entered the MTA. As a result, the message was enqueued without the message part. If this is the case, the problem could have occurred on the sender’s remote SMTP server or in the sender’s client machine.

  3. Investigate the copies of the message files to see where the message part was altered or missing.

    If any message file showed that the message part was altered or missing, examine the previous channel’s log file. For example, you should look at the conversion_master.log-* file if the message part in the message entering the tcp_intranet channel was altered or missing.

  4. Look at the final destination of the message.

    If the message part looks unaltered in the tcp_local_slave.log, the message files (for example: ZZ01K7LXW76T7O9TD0TB.KEEP1), and the channel_master.log-* files, then the MTA did not alter the message and the message part is disappearing at the next step in the path to its final destination.

    If the final destination is the ims-ms channel (the Message Store), then you might download the message from the server to a client machine to determine if the message part is being dropped during or after this transfer. If the destination channel is a tcp_* channel, then you need to go to the next MTA in the message’s path. Assuming it is a Messaging Server MTA, you will need to repeat the entire troubleshooting process (See 26.2.8.1 Identify the Channels in the Message Path, 26.2.8.2 Manually Start and Stop Channels to Gather Data, and this section). If the other MTA is not under your administration, then the user who reported the problem should contact that particular site.
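As a quick first pass over the saved message copies from Step 3, it can help to count the MIME Content-Type header lines in each file and see where the count drops. The sketch below is a rough heuristic, not a MIME parser: it counts any line that begins with Content-Type:, so text inside a message body that happens to match is counted too. The function name count_parts is hypothetical.

```shell
# Print "<count> <file>" for each saved message copy, where <count> is the
# number of lines beginning with "Content-Type:" (case-insensitive).
# A lower count in a later copy suggests a part was removed in between.
count_parts() {
    for f in "$@"; do
        n=$(grep -c -i '^Content-Type:' "$f" || true)
        printf '%s %s\n' "$n" "$f"
    done
}
```

For example, count_parts ZZ01K7LXW76T7O9TD0TB.KEEP1 ZZ01K7LXW76T7O9TD0TB.KEEP2 prints one line per file. If the counts differ, examine the log file of the channel that processed the message between the two copies.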