Sun Java System Messaging Server 6.3 Administration Guide

26.2.8 An MTA Troubleshooting Example

This section walks through troubleshooting a particular MTA problem step by step. In this example, a mail recipient did not receive an attachment to an email message. Note: In keeping with MIME protocol terminology, the “attachment” is referred to as a “message part” in this section. The troubleshooting techniques described earlier (see 26.2 Standard MTA Troubleshooting Procedures) are used to identify where and why the message part disappeared. By following these steps, you can determine the path the message took through the MTA and whether the message part disappeared before or after the message entered the message queue. To do so, you manually stop and run channels, capturing the relevant files at each stage.


Note –

The Job Controller must be running when you manually run messages through the channels.


26.2.8.1 Identify the Channels in the Message Path

By identifying which channels are in the message path, you can apply the master_debug and slave_debug keywords to the appropriate channels. These keywords generate debugging output in the channels’ master and slave log files; in turn, the master and slave debugging information will assist in identifying the point where the message part disappeared.

  1. Add log_message_id=1 in the option.dat file in directory /msg-svr-base/config. With this parameter, you will see message ID: header lines in the mail.log_current file.

  2. Run imsimta cnbuild to recompile the configuration.

  3. Run imsimta restart dispatcher to restart the SMTP server.

  4. Have the end user resend the message with the message part.

  5. Determine the channels that the message passes through.

    While there are different approaches to identifying the channels, the following approach is recommended:

    1. On UNIX platforms, use the grep command to search for message ID: header lines in the mail.log_current file in directory /msg-svr-base/log.

    2. Once you find the message ID: header lines, look for the E (enqueue) and D (dequeue) records to determine the path of the message. Refer to 25.3.1 Understanding the MTA Log Entry Format for more information on logging entry codes. See the following E and D records for this example:


      29-Aug-2001 10:39:46.44  tcp_local conversion        E 2 ... 
      29-Aug-2001 10:39:46.44  conversion tcp_intranet     E 2 ... 
      29-Aug-2001 10:39:46.44  tcp_intranet                  D 2 ...

The channel on the left is the source channel, and the channel on the right is the destination channel. In this example, the E and D records indicate that the message’s path went from the tcp_local channel to the conversion channel and finally to the tcp_intranet channel.
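
The search in steps 1 and 2 can be sketched as a small shell function. This is only an illustration: it assumes that the message ID appears literally on each matching record and that the fields are laid out as in the records above (date, time, one or two channel names, then the entry code). The function name trace_path, the sample message IDs, and the sample records are hypothetical.

```shell
#!/bin/sh
# trace_path LOGFILE MESSAGE_ID  (hypothetical helper)
# Prints the E (enqueue) and D (dequeue) records for one message, reduced to
# the entry code and the channel names listed on each record.
trace_path() {
    grep -F -e "$2" "$1" | awk '
        # Two channels listed: the entry code is the fifth field.
        $5 == "E" || $5 == "D" { print $5 " " $3 " " $4; next }
        # One channel listed: the entry code is the fourth field.
        $4 == "E" || $4 == "D" { print $4 " " $3 }
        # Records with any other entry code are skipped.
    '
}

# Demonstration on made-up records; a real run would point at
# /msg-svr-base/log/mail.log_current instead.
sample=$(mktemp)
cat > "$sample" <<'EOF'
29-Aug-2001 10:39:46.44 tcp_local conversion E 2 <example-id@host.example>
29-Aug-2001 10:39:46.44 conversion tcp_intranet E 2 <example-id@host.example>
29-Aug-2001 10:39:46.44 tcp_intranet D 2 <example-id@host.example>
29-Aug-2001 10:40:02.10 tcp_local ims-ms E 1 <other-id@host.example>
EOF
trace_path "$sample" '<example-id@host.example>'
rm -f "$sample"
```

Reading the output from top to bottom gives the order in which the message was enqueued to and dequeued from each channel.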

26.2.8.2 Manually Start and Stop Channels to Gather Data

This section describes how to manually start and stop channels. For more information, see 26.2.7 Starting and Stopping Individual Channels. By manually starting and stopping the channels in the message’s path, you can save the message and log files at different stages in the MTA process. These files are used later in To Identify the Point of Message Breakdown.

Procedure: To Manually Start and Stop Channels

  1. Set mm_debug=5 in the option.dat file in directory /msg-svr-base/config to provide substantial debugging information.

  2. Add the slave_debug and master_debug keywords to the appropriate channels in the imta.cnf file in directory /msg-svr-base/config.

    1. Use the slave_debug keyword on the inbound channel (or on any channel that the message is switched to during the initial dialog) from the remote system that is sending the message with the message part. In this example, the slave_debug keyword is added to the tcp_local channel.

    2. Add the master_debug keyword to the other channels that the message passed through, as identified in 26.2.8.1 Identify the Channels in the Message Path. In this example, the master_debug keyword would be added to the conversion and tcp_intranet channels.

    3. Run the command imsimta restart dispatcher to restart the SMTP server.

  3. Use the imsimta qm stop and imsimta qm start commands to manually stop and start specific channels. For more information on using these commands, see 26.2.7 Starting and Stopping Individual Channels.

  4. To start the process of capturing the message files, have the end user resend the message with the message part.

  5. When the message enters a channel that has been stopped with the imsimta qm stop command, the message is held in that channel. For more information, see Step 3.

    1. Copy and rename the message file before you manually run the next channel in the message’s path. See the following UNIX platform example:

      # cp ZZ01K7LXW76T7O9TD0TB.00 ZZ01K7LXW76T7O9TD0TB.KEEP1

      The message file typically resides in a directory similar to /msg-svr-base/data/queue/destination_channel/001. The destination_channel is the next channel that the message passes through (such as tcp_intranet). If you want to create subdirectories (like 001, 002, and so on) in the destination_channel directory, add the subdirs keyword to the channels.

    2. It is recommended that you number the extensions of the message each time you trap and copy the message in order to identify the order in which the message is processed.

  6. Resume message processing in the channel and enqueue to the next destination channel in the message’s path. To do so, use the imsimta qm start command.

  7. Copy and save the corresponding channel log file (for example: tcp_intranet_master.log-*) located in directory /msg-svr-base/log. Choose the appropriate log file that has the data for the message you are tracking. Make sure that the file you copy matches the timestamp and the subject header for the message as it comes into the channel. In the example of the tcp_intranet_master.log-*, you might save the file as tcp_intranet_master.keep so the file is not deleted.

  8. Repeat steps 5 - 7 until the message has reached its final destination.

    The log files you copied in Step 7 should correlate to the message files that you copied in Step 5. If, for example, you stopped all of the channels in the missing message part scenario, you would save the conversion_master.log-* and the tcp_intranet_master.log-* files. You would also save the source channel log file tcp_local_slave.log-*. In addition, you would save a copy of the corresponding message file from each destination channel: ZZ01K7LXW76T7O9TD0TB.KEEP1 from the conversion channel and ZZ01K7LXW76T7O9TD0TB.KEEP2 from the tcp_intranet channel.

  9. Remove debugging options once the message and log files have been copied.

    1. Remove the slave_debug and the master_debug keywords from the appropriate channels in the imta.cnf file in directory /msg-svr-base/config.

    2. Reset mm_debug=0 and remove log_message_id=1 in the option.dat file in directory /msg-svr-base/config.

    3. Recompile the configuration by using imsimta cnbuild.

    4. Run the command imsimta restart dispatcher to restart the SMTP server.
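
The copy-and-number convention from Step 5 can be sketched as a small shell helper. This is a minimal illustration rather than a supplied tool: the function name keep_copy is hypothetical, and it assumes that the queued file name ends in a single extension such as .00, as in the cp example above.

```shell
#!/bin/sh
# keep_copy MESSAGE_FILE SEQUENCE  (hypothetical helper)
# Copies a queued message file to the same name with its extension replaced
# by .KEEP<SEQUENCE>, for example ZZ...0TB.00 -> ZZ...0TB.KEEP1.
# cp -p preserves the modification time, which helps when matching the copy
# against log file timestamps later.
keep_copy() {
    src=$1
    seq_no=$2
    dest="${src%.*}.KEEP${seq_no}"
    cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Demonstration in a scratch directory with an empty stand-in file.
work=$(mktemp -d)
: > "$work/ZZ01K7LXW76T7O9TD0TB.00"
keep_copy "$work/ZZ01K7LXW76T7O9TD0TB.00" 1
rm -rf "$work"
```

Incrementing the sequence argument each time the message is trapped in a new channel keeps the copies in processing order.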

Procedure: To Identify the Point of Message Breakdown

  1. By the time you have finished starting and stopping the channel programs, you should have the following files to use in troubleshooting the problem:

    1. All copies of the message file (for example: ZZ01K7LXW76T7O9TD0TB.KEEP1) from each channel program

    2. A tcp_local_slave.log-* file

    3. A set of channel_master.log-* files for each destination channel

    4. A set of mail.log_current records that show the path of the message

      All files should have timestamps and message ID values that match the message ID: header lines in the mail.log_current records. Note that the exception is when messages are bounced back to the sender; these bounced messages will have a different message ID value than the original message.

  2. Examine the tcp_local_slave.log-* file to determine if the message had the message part when it entered the message queue.

    Look at the SMTP dialog and data to see what was sent from the client machine.

    If the message part did not appear in the tcp_local_slave.log-* file, then the problem occurred before the message entered the MTA. As a result, the message was enqueued without the message part. If this is the case, the problem could have occurred on the sender’s remote SMTP server or in the sender’s client machine.

  3. Investigate the copies of the message files to see where the message part was altered or missing.

    If any message file showed that the message part was altered or missing, examine the previous channel’s log file. For example, you should look at the conversion_master.log-* file if the message part in the message entering the tcp_intranet channel was altered or missing.

  4. Look at the final destination of the message.

    If the message part looks unaltered in the tcp_local_slave.log-* file, the message files (for example: ZZ01K7LXW76T7O9TD0TB.KEEP1), and the channel_master.log-* files, then the MTA did not alter the message and the message part is disappearing at the next step in the path to its final destination.

    If the final destination is the ims-ms channel (the Message Store), then you might download the message from the server to a client machine to determine if the message part is being dropped during or after this transfer. If the destination channel is a tcp_* channel, then you need to go to the next MTA in the message’s path. Assuming it is a Messaging Server MTA, you will need to repeat the entire troubleshooting process (see 26.2.8.1 Identify the Channels in the Message Path, 26.2.8.2 Manually Start and Stop Channels to Gather Data, and this section). If the other MTA is not under your administration, then the user who reported the problem should contact that particular site.
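
Step 3 can be sketched as a loop over the saved copies, in order, that reports the first one in which the message part no longer appears. This is a rough heuristic, not a Messaging Server tool: the function name first_missing_part is hypothetical, and it assumes that the part is labeled with a Content-Disposition: attachment header line, which is not true of every message.

```shell
#!/bin/sh
# first_missing_part FILE...  (hypothetical helper)
# Checks the saved message copies in the order given and prints the first
# file that has no "Content-Disposition: attachment" header line.
# Prints nothing if every copy still has one.
first_missing_part() {
    for f in "$@"; do
        if ! grep -qi '^content-disposition:[[:space:]]*attachment' "$f"; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 0
}

# Demonstration with two made-up copies: the part is present in the first
# and absent from the second.
work=$(mktemp -d)
printf 'Subject: test\nContent-Disposition: attachment; filename="a.txt"\n' > "$work/ZZTEST.KEEP1"
printf 'Subject: test\n' > "$work/ZZTEST.KEEP2"
first_missing_part "$work/ZZTEST.KEEP1" "$work/ZZTEST.KEEP2"
rm -rf "$work"
```

If the reported file is the copy taken as the message entered the tcp_intranet channel, the conversion_master.log-* file is the one to examine, as described in Step 3.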