45 Troubleshooting the MTA

This chapter describes common tools, methods, and procedures for troubleshooting the Message Transfer Agent (MTA) in Unified Configuration.

Also, see the discussion about monitoring procedures in "Monitoring Messaging Server".

Note:

This information assumes that you are familiar with the MTA, both from a conceptual and administration perspective.

Troubleshooting Overview

One of the first steps in troubleshooting the MTA is to determine where to begin the diagnosis. Depending on the problem, you might look for error messages in log files. In other situations, you might check all the standard MTA processes, review the MTA configuration, or start and stop individual channels. Whatever approach you use, consider the following questions when troubleshooting the MTA:

  • Did configuration or environmental problems prevent messages from being accepted (for example, disk space or quota problems)?

  • Were MTA services such as the Dispatcher and the Job Controller present at the time the message entered the message queue?

  • Did network connectivity or routing problems cause messages to be stuck or misrouted on a remote system?

  • Did the problem occur before or after a message entered into the message queue?

The following sections address these questions.

Standard MTA Troubleshooting Procedures

This section outlines standard troubleshooting procedures for the MTA. Follow these procedures if a problem does not generate an error message, if an error message does not provide enough diagnostic information, or if you want to perform general wellness checks, testing, and standard maintenance of the MTA.

Check the MTA Configuration

Test your address configuration by using the imsimta test -rewrite utility. With this utility, you can test the MTA's address rewriting and channel mapping without actually having to send a message. Refer to "Message Transfer Agent Command-line Utilities" for more information.

The utility will normally show address rewriting that will be applied as well as the channel to which messages will be queued. However, syntax errors in the MTA configuration will cause the utility to issue an error message. If the output is not what you expect, you may need to correct your configuration.

Check the Message Queue Directories

Check if messages are present in the MTA message queue directory, typically DataRoot/queue/. Use command-line utilities like imsimta qm to check for the presence of expected message files under the MTA message queue directory. See "imsimta qm" and "imsimta qm counters" for more information.

If the imsimta test -rewrite output looks correct, check that messages are actually being placed in the MTA message queue subdirectories. To do so, enable message logging and check the log files in the directory DataRoot/log/. See "Managing MTA Message and Connection Logs" for more information on MTA logging. You can track a specific message by its message ID to ensure that it is being placed in the MTA message queue subdirectories. If you are unable to find the message, you may have a problem with disk space or directory permissions.

Check the Ownership of Critical Files

You should have selected a mail server user account (mailsrv by default) when you installed Oracle Communications Messaging Server. The following directories, subdirectories, and files should be owned by this account:

DataRoot/queue/
DataRoot/log
DataRoot/tmp

Commands, like the ones in the following UNIX system example, can be used to check the protection and ownership of these directories:

ls -l -p -d /opt/sun/comms/messaging64/data/queue
drwx------ 2 mailsrv mail 512 Sep 18 21:17 /opt/sun/comms/messaging64/data/queue

ls -l -p -d /opt/sun/comms/messaging64/data/log
drwx------ 2 mailsrv mail 2560 Oct 15 05:25 /opt/sun/comms/messaging64/data/log

ls -l -p -d /opt/sun/comms/messaging64/data/tmp
drwx------ 2 mailsrv mail 512 Sep 18 21:17 /opt/sun/comms/messaging64/data/tmp

Check that the files in DataRoot/queue are owned by the MTA account by using a command such as the following UNIX system example:

ls -l -p -R /opt/sun/comms/messaging64/data/queue
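
The same ownership check can be scripted when the queue is too large to review by eye. The following Python sketch is not part of Messaging Server. The function name find_wrong_owner is hypothetical, and the path and account in the usage comment are the example values from this section.

```python
import os


def find_wrong_owner(root, expected_uid):
    """Return paths under root (root included) not owned by expected_uid."""
    mismatches = []
    # Check the top-level directory itself, then everything beneath it.
    if os.lstat(root).st_uid != expected_uid:
        mismatches.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != expected_uid:
                mismatches.append(path)
    return mismatches


# Example usage (assumed values; substitute your DataRoot and MTA account):
#   import pwd
#   uid = pwd.getpwnam("mailsrv").pw_uid
#   for path in find_wrong_owner("/opt/sun/comms/messaging64/data/queue", uid):
#       print("not owned by mailsrv:", path)
```

An empty result means every file and directory under the queue directory is owned by the expected account.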

Check that the Job Controller and Dispatcher Are Running

The MTA Job Controller handles the execution of the MTA processing jobs, including most outgoing (master) channel jobs.

Some MTA channels, such as the MTA's multi-threaded SMTP channels, include resident server processes that process incoming messages. These servers handle the slave (incoming) direction for the channel. The MTA Dispatcher handles the creation of such MTA servers. Dispatcher configuration options control the availability of the servers, the number of created servers, and how many connections each server can handle.

To check that the Job Controller and Dispatcher are present, and to see if there are MTA servers and processing jobs running, use the command imsimta process. Under idle conditions, the output should list the job_controller and dispatcher processes. For example:

imsimta process
USER PID S VSZ RSS STIME TIME COMMAND
mailsrv2 308 S 50168 26896 23:05:00 00:01 /opt/sun/comms/messaging64/lib/tcp_smtp_server
mailsrv2 4187 S 46880 17704 Feb_17 01:51 /opt/sun/comms/messaging64/lib/dispatcher
mailsrv2 5887 S 50160 26976 23:25:00 00:00 /opt/sun/comms/messaging64/lib/tcp_smtp_server
mailsrv2 19018 S 47008 21640 Jun_20 01:21 /opt/sun/comms/messaging64/lib/job_controller

If the Job Controller is not present, the files in the DataRoot/queue directory get backed up and messages are not delivered. If you do not have a Dispatcher, then you are unable to receive any SMTP connections.

See "imsimta process" for more information.

You could also use the imsimta qm jobs command to list, channel by channel, all active and pending delivery processing jobs currently being managed by the Job Controller. Additional cumulative information is provided for each channel such as the number of message files successfully delivered and those requeued for subsequent delivery attempts. The command syntax is as follows:

jobs [-[no]hosts] [-[no]jobs] [-[no]messages] [channel-name]

If neither the Job Controller nor the Dispatcher is present, you should review the dispatcher.log-* or job_controller.log-* file in the DataRoot/log directory.

If the log files do not exist or do not indicate an error, start the processes by using the "start-msg" command.

Note:

You should not see multiple instances of the Dispatcher or Job Controller when you run imsimta process, unless the system is in the process of forking (fork()) child processes before it executes (exec()) the program that needs to run. However, the time frame during such duplication is very small.
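
If you check for these processes regularly, you can script the check. The following Python sketch is not part of Messaging Server. It only counts program names in saved imsimta process output, and it assumes the column layout shown in the example above. The function names count_mta_processes and missing_services are hypothetical.

```python
import os


def count_mta_processes(process_output):
    """Count processes per program name in saved `imsimta process` output."""
    counts = {}
    for line in process_output.splitlines():
        fields = line.split()
        # Skip blank lines and the USER/PID header line.
        if not fields or fields[0] == "USER":
            continue
        # Assumption: the last column is the full path of the program.
        name = os.path.basename(fields[-1])
        counts[name] = counts.get(name, 0) + 1
    return counts


def missing_services(counts):
    """Return the required MTA services absent from the counts, sorted."""
    return sorted({"dispatcher", "job_controller"} - set(counts))
```

A count greater than one for dispatcher or job_controller that persists across several runs is worth investigating, given the note above about duplicate instances.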

Check the Log Files

If MTA processing jobs run properly but messages stay in the message queue directory, you can examine the log files to see what is happening. All MTA log files are created in the DataRoot/log directory. Log file name formats for various MTA processing jobs are shown in Table 45-1.

Table 45-1 MTA Log Files

File Name | Log File Contents
channel_master.log-uniqueid | Output of master program (usually client) for channel.
channel_slave.log-uniqueid | Output of slave program (usually server) for channel.
dispatcher.log-uniqueid | Dispatcher debugging. This log is created regardless of whether the Dispatcher DEBUG option is set. However, to get detailed debugging information, you should set the DEBUG option to a non-zero value.
imta | ims-ms channel error messages when there is a problem in delivery.
job_controller.log-uniqueid | Job controller logging. This log is created regardless of whether the Job Controller DEBUG option is set. However, to get detailed debugging information, you should set the DEBUG option to a non-zero value.
tcp_smtp_server.log-uniqueid | Debugging for the tcp_smtp_server. The information in this log is specific to the server, not to messages.
return.log-uniqueid | Debug output for the periodic MTA message bouncer job; this log file is created if the return_debug MTA option is set.


Note:

Each log file is created with a unique ID (uniqueid) to avoid overwriting an earlier log created by the same channel. To find a specific log file, you can use the imsimta view utility. You can also purge older log files by using the imsimta purge command. However, by default this command is run on a regular basis (see "Pre-defined Automatic Tasks"). See "imsimta purge" for more information.

The channel_master.log-uniqueid and channel_slave.log-uniqueid log files are created in any of the following situations:

  • There are errors in your current configuration.

  • The master_debug or slave_debug options are set on the channel.

  • The mm_debug option is set to a non-zero value (mm_debug > 0).

For more information on debugging channel master and slave programs, see Messaging Server Reference.
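
Because each run produces a new file with its own unique ID, it helps to list a channel's log files newest first. The following Python sketch is an alternative to browsing the log directory by hand, not a replacement for imsimta view. It assumes only the file naming shown in Table 45-1. The function name newest_logs is hypothetical.

```python
import glob
import os


def newest_logs(log_dir, channel, direction="master"):
    """List <channel>_<direction>.log-* files in log_dir, newest first."""
    pattern = os.path.join(log_dir, "{}_{}.log-*".format(channel, direction))
    # Sort by modification time so the most recent log comes first.
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


# Example usage (assumed DataRoot location):
#   newest_logs("/opt/sun/comms/messaging64/data/log", "tcp_local", "slave")
```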

Running a Channel Program Manually

When diagnosing an MTA delivery problem, it is helpful to manually run an MTA delivery job, particularly after you enable debugging for one or more channels.

The command imsimta submit notifies the MTA Job Controller to run the channel. If debugging is enabled for the channel in question, imsimta submit creates a log file in the DataRoot/log directory as shown in Table 45-1, "MTA Log Files".

The command imsimta run performs outbound delivery for the channel under the currently active process, with output directed to your terminal. This might be more convenient than submitting a job, particularly if you suspect problems with job submission itself.

Note:

To manually run channels, the Job Controller must be running.

See "Command Descriptions" for information on syntax, options, and examples of imsimta submit and imsimta run commands.

Starting and Stopping Individual Channels

In some cases, stopping and starting individual channels may make message queue problems easier to diagnose and debug. Stopping a message queue allows you to examine queued messages to determine the existence of loops or spam attacks.

To Stop Outbound Processing (dequeueing) for a Specific Channel

  1. Use the imsimta qm stop command to stop a specific channel. Doing so prevents you from having to stop the Job Controller and having to recompile the configuration. In the following example, the conversion channel is stopped:

    imsimta qm stop conversion
    
  2. To resume processing, use the imsimta qm start command to restart the channel. In the following example, the conversion channel is started:

    imsimta qm start conversion
    

    See "imsimta qm" for more information on imsimta qm start and imsimta qm stop commands.

    Note:

    The imsimta qm start channel and imsimta qm stop channel commands might fail if run for many channels at the same time. The tool might have trouble updating the hold_list and could report: QM-E-NOTSTOPPED, unable to stop the channel; cannot update the hold list. Run these commands sequentially, with an interval of a few seconds between each run.

    If you only want the channel to run between certain hours, use the following commands:

    msconfig set job_controller.job_pool:DEFAULT.urgent_delivery 08:00-20:00
    msconfig set job_controller.job_pool:DEFAULT.normal_delivery 08:00-20:00
    msconfig set job_controller.job_pool:DEFAULT.nonurgent_delivery 08:00-20:00
    

To Stop Inbound Processing from a Specific Domain or IP Address (Enqueuing to a Channel)

You can use one of the following procedures if you want to stop inbound message processing for a specific domain or IP address, while returning temporary SMTP errors to client hosts. By doing so, messages are not held on your system. See "PART 1. MAPPING TABLES" for more information.

  • To stop inbound processing for a specific host or domain name, add the following access rule to the ORIG_SEND_ACCESS mapping table by running the msconfig edit mapping command:

    ORIG_SEND_ACCESS
    
    *|*@example.com|*|* $X4.2.1|$NHost$ temporarily$ blocked
    

By using this process, the sender's remote MTA holds messages on its own system and continues to resend them periodically until you restart inbound processing.

  • To stop inbound processing for a specific IP address, add the following access rule to the PORT_ACCESS mapping table by running the msconfig edit mapping command:

    PORT_ACCESS
    
    TCP|*|25|_IP_address_to_block_|* $N400$ can't$ connect$ now
    

When you want to restart inbound processing from the domain or IP address, be sure to remove these rules from the mapping tables and recompile your configuration. In addition, you may want to create unique error messages for each mapping table. Doing so enables you to determine which mapping table is being used.

An MTA Troubleshooting Example

This section explains how to troubleshoot a particular MTA problem step-by-step. In this example, a mail recipient did not receive an attachment to an email message. In keeping with MIME protocol terminology, the "attachment" is referred to as a "message part" in this section.

The troubleshooting techniques described in "Standard MTA Troubleshooting Procedures" are used to identify where and why the message part disappeared. By using the following steps, you can determine the path the message took through the MTA. In addition, you can determine if the message part disappeared before or after the message entered the message queue. To do so, you will need to manually stop and run channels, capturing the relevant files.

Note:

The Job Controller must be running when you manually run messages through the channels.

Identify the Channels in the Message Path

By identifying which channels are in the message path, you can apply the master_debug and slave_debug options to the appropriate channels. These options generate debugging output in the channels' master and slave log files. In turn, the master and slave debugging information will assist in identifying the point where the message part disappeared.

  1. Set the log_message_id MTA option to 1 by running the msconfig set log_message_id 1 command. With this option set, you will see message ID: header lines in the mail.log_current file.

  2. Run imsimta cnbuild to recompile the configuration.

  3. Run imsimta restart dispatcher to restart the SMTP server.

  4. Have the end user resend the message with the message part.

  5. Determine the channels that the message passes through. While there are different approaches to identifying the channels, the following approach is recommended:

    1. On UNIX platforms, use the grep command to search for message ID: header lines in the mail.log_current file in directory DataRoot/log.

    2. Once you find the message ID: header lines, look for the E (enqueue) and D (dequeue) records to determine the path of the message. Refer to "Understanding the MTA Log Entry Format" for more information on logging entry codes. See the following E and D records for this example:

      29-Aug-2001 10:39:46.44 tcp_local conversion E 2 ...
      29-Aug-2001 10:39:46.44 conversion tcp_intranet E 2 ...
      29-Aug-2001 10:39:46.44 tcp_intranet D 2 ...
      

The channel on the left is the source channel, and the channel on the right is the destination channel. In this example, the E and D records indicate that the message's path went from the tcp_local channel to the conversion channel and finally to the tcp_intranet channel.
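
Reading the channel path out of the E and D records can be automated. The following Python sketch handles only records shaped like the three example lines above: a date, a time, one or two channel names, and then a plain E or D code. Real log entries may carry other codes and more fields, so treat this as an illustration. The function names parse_record and message_path are hypothetical.

```python
def parse_record(line):
    """Split one log record into (timestamp, channels, action), or None."""
    fields = line.split()
    timestamp = " ".join(fields[:2])
    # Channel names follow the date and time; the first field that is
    # exactly E or D marks the end of the channel names.
    for i in range(2, len(fields)):
        if fields[i] in ("E", "D"):
            return timestamp, fields[2:i], fields[i]
    return None


def message_path(lines):
    """Return the channels named in E and D records, in order of appearance."""
    path = []
    for line in lines:
        record = parse_record(line)
        if record is None:
            continue
        for channel in record[1]:
            if channel not in path:
                path.append(channel)
    return path
```

For the three example records, message_path returns tcp_local, conversion, and tcp_intranet, in that order.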

Manually Start and Stop Channels to Gather Data

This section describes how to manually start and stop channels (see also "Starting and Stopping Individual Channels"). By starting and stopping channels in the message's path, you are able to save the message and log files at different stages in the MTA process. These files are used later in "To Identify the Point of Message Breakdown".

To Manually Start and Stop Channels

  1. Set the mm_debug MTA option to 5 by running the msconfig set mm_debug 5 command to provide substantial debugging information.

  2. Add the slave_debug and master_debug options to the appropriate channels by running the msconfig edit channels command and modifying the appropriate channel definitions.

    1. Use the slave_debug option on the inbound channel (or any channel where the message is switched to during the initial dialog) from the remote system that is sending the message with the message part. In this example, the slave_debug option is added to the tcp_local channel.

    2. Add the master_debug option to the other channels that the message passed through and were identified in "Identify the Channels in the Message Path". In this example, it would be added to the conversion and tcp_intranet channels.

    3. Recompile the configuration by using imsimta cnbuild if running a compiled configuration.

    4. Run the command imsimta restart dispatcher to restart the SMTP server.

  3. Use the imsimta qm stop and imsimta qm start commands to manually stop and start specific channels. See "Starting and Stopping Individual Channels" for more information on using these commands.

  4. To start the process of capturing the message files, have the end user resend the message with the message part.

  5. When the message enters a channel, the message will stop in the channel if it has been stopped with the imsimta qm stop command. For more information, see Step 3 in this section.

    1. Copy and rename the message file before you manually run the next channel in the message's path. See the following UNIX platform example:

      cp ZZ01K7LXW76T7O9TD0TB.00 ZZ01K7LXW76T7O9TD0TB.KEEP1

      The message file typically resides in a directory similar to DataRoot/queue/destination_channel/001. The destination_channel is the next channel that the message passes through (such as: tcp_intranet). If you want to create subdirectories (like 001, 002, and so on) in the destination_channel directory, add the subdirs option to the channels.

    2. It is recommended that you number the extensions of the message each time you trap and copy the message to identify the order in which the message is processed.

  6. Resume message processing in the channel and enqueue to the next destination channel in the message's path. To do so, use the imsimta qm start command.

  7. Copy and save the corresponding channel log file (for example: tcp_intranet_master.log-*) located in the DataRoot/log directory. Choose the appropriate log file that has the data for the message you are tracking. Make sure that the file you copy matches the timestamp and the subject header for the message as it comes into the channel. In the example of the tcp_intranet_master.log-*, you might save the file as tcp_intranet_master.keep so the file is not deleted.

  8. Repeat steps 5 - 7 until the message has reached its final destination.

    The log files you copied in Step 7 should correlate to the message files that you copied in Step 5. If, for example, you stopped all of the channels in the missing message part scenario, you would save the conversion_master.log-* and the tcp_intranet_master.log-* files. You would also save the source channel log file tcp_local_slave.log-*.

    In addition, you would save a copy of the corresponding message file from each destination channel: ZZ01K7LXW76T7O9TD0TB.KEEP1 from the conversion channel and ZZ01K7LXW76T7O9TD0TB.KEEP2 from the tcp_intranet channel.

  9. Remove debugging options once the message and log files have been copied.

    1. Remove the slave_debug and the master_debug options from the appropriate channels by running the msconfig edit channels command.

    2. Reset the mm_debug MTA option and remove the setting for the log_message_id MTA option by running the msconfig set mm_debug 0 and msconfig set log_message_id 0 commands or by running the msconfig unset mm_debug and msconfig unset log_message_id commands.

    3. Recompile the configuration by using imsimta cnbuild if running a compiled configuration.

    4. Run the command imsimta restart dispatcher to restart the SMTP server.
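
The copy-and-number convention for trapped message files can be wrapped in a small helper so that the sequence numbers stay consistent. The following Python sketch is not part of Messaging Server. The function name keep_copy is hypothetical, and the .KEEPn suffix simply follows the example file names used in this procedure.

```python
import os
import shutil


def keep_copy(message_file, sequence):
    """Copy a queued message file to <basename>.KEEP<sequence> beside it."""
    base, _ext = os.path.splitext(message_file)
    target = "{}.KEEP{}".format(base, sequence)
    # Refuse to overwrite an earlier capture with the same sequence number.
    if os.path.exists(target):
        raise FileExistsError(target)
    # copy2 preserves the timestamps, which helps when matching log entries.
    shutil.copy2(message_file, target)
    return target
```

For example, calling keep_copy on ZZ01K7LXW76T7O9TD0TB.00 with sequence 1 produces ZZ01K7LXW76T7O9TD0TB.KEEP1 in the same directory and leaves the original file in place.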

To Identify the Point of Message Breakdown

  1. By the time you have finished starting and stopping the channel programs, you should have the following files that you can use to troubleshoot the problem:

    1. All copies of the message file (for example: ZZ01K7LXW76T7O9TD0TB.KEEP1) from each channel program

    2. A tcp_local_slave.log-* file

    3. A set of channel_master.log-* files for each destination channel

    4. A set of mail.log_current records that show the path of the message

      All files should have timestamps and message ID values that match the message ID: header lines in the mail.log_current records.

      Note that the exception is when messages are bounced back to the sender; these bounced messages will have a different message ID value than the original message.

  2. Examine the tcp_local_slave.log-* file to determine if the message had the message part when it entered the message queue.

    Look at the SMTP dialog and data to see what was sent from the client machine.

    If the message part did not appear in the tcp_local_slave.log-* file, then the problem occurred before the message entered the MTA. As a result, the message was enqueued without the message part. If this is the case, the problem could have occurred on the sender's remote SMTP server or in the sender's client machine.

  3. Investigate the copies of the message files to see where the message part was altered or missing.

    If any message file showed that the message part was altered or missing, examine the previous channel's log file. For example, you should look at the conversion_master.log-* file if the message part in the message entering the tcp_intranet channel was altered or missing.

  4. Look at the final destination of the message.

    If the message part looks unaltered in the tcp_local_slave.log, the message files (for example: ZZ01K7LXW76T7O9TD0TB.KEEP1), and the channel_master.log-* files, then the MTA did not alter the message and the message part is disappearing at the next step in the path to its final destination. If the final destination is the ims-ms channel (the Message Store), then you might download the message from the server to a client machine to determine if the message part is being dropped during or after this transfer.

    If the destination channel is a tcp_* channel, then you must go to the next MTA in the message's path. Assuming it is a Messaging Server MTA, you will need to repeat the entire troubleshooting process (see "Identify the Channels in the Message Path", "Manually Start and Stop Channels to Gather Data", and this section). If the other MTA is not under your administration, then the user who reported the problem should contact that particular site.

Common MTA Problems and Solutions

This section lists common problems and solutions for MTA configuration and operation.

TLS Problems

If, during SMTP dialog, the STARTTLS command returns the following error:

454 4.7.1 TLS library initialization failure

and if you have certificates installed and working for POP and IMAP access, check the following:

  • Protections/ownerships of the certificates have to be set so that the mailsrv account can access the files.

  • The directory where the certificates are stored needs to have protections/ownerships set such that the mailsrv account can access the files within that directory.

After changing protections and installing certificates, you must run:

stop-msg dispatcher
start-msg dispatcher

Restarting should work, but it is better to shut it down completely, install the certificates, and then start things back up.

Changes to Configuration Files or MTA Databases Do Not Take Effect

If changes to your configuration are not taking effect, check to see if you have performed the following steps:

  1. Recompile the configuration (by running imsimta cnbuild).

  2. Restart the appropriate processes (like imsimta restart dispatcher).

  3. Re-establish any client connections.

The MTA Sends Outgoing Mail but Does Not Receive Incoming Mail

Most MTA channels depend upon a slave or channel program to receive incoming messages. For some transport protocols that are supported by the MTA (like TCP/IP and UUCP), you must make sure that the transport protocol activates the MTA slave program rather than its standard server. Replacing the native sendmail SMTP server with the MTA SMTP server is performed as a part of the Messaging Server installation.

For the multi-threaded SMTP server, the startup of the SMTP server is controlled by the Dispatcher. If the Dispatcher is configured to use a MIN_PROCS value greater than or equal to one for the SMTP service, then there should always be at least one SMTP server process running (and potentially more, according to the MAX_PROCS value for the SMTP service). The imsimta process command may be used to check for the presence of SMTP server processes. See "imsimta process" for more information.

Dispatcher (SMTP Server) Won't Start Up

If the dispatcher won't start up, first check the dispatcher.log-* for relevant error messages. If the log indicates problems creating or accessing the /tmp/.path.dispatcher.socket file, then verify that the /tmp protections are set to 1777. This would show up in the permissions as follows:

drwxrwxrwt 8 root sys 734 Sep 17 12:14 tmp/

Also do an ls -l of the path.version-specific-name.dispatcher.socket file and confirm the proper ownership. For example, if this is created by root, then it is inaccessible by mailsrv.

Do not remove the .path.dispatcher.socket file and do not create it if it's missing. The dispatcher will create the file. If protections are not set to 1777, the dispatcher will not start or restart because it won't be able to create/access the socket file. In addition, there may be other problems occurring not related to the Messaging Server.

Messaging Server: MessagingServer_home/config/.var_opt_sun_comms_messaging64_config.dispatcher.socket
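
The 1777 protection check can be scripted as well. The following Python sketch is not part of Messaging Server. The function name mode_is_1777 is hypothetical, and it tests only the permission bits of the directory, not the ownership of the socket file.

```python
import os
import stat


def mode_is_1777(directory):
    """Return True if directory has mode 1777 (rwxrwxrwt, sticky bit set)."""
    # S_IMODE keeps the permission bits, including the sticky bit.
    mode = stat.S_IMODE(os.stat(directory).st_mode)
    return mode == 0o1777


# Example usage:
#   if not mode_is_1777("/tmp"):
#       print("/tmp is not mode 1777; the dispatcher may fail to start")
```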

Timeouts on Incoming SMTP Connections

Timeouts on incoming SMTP connections are most often related to system resources and their allocation. The following techniques can be used to identify the causes of timeouts on incoming SMTP connections.

To Identify the Causes of Timeouts on Incoming SMTP Connections

  1. Check how many simultaneous incoming SMTP connections you allow. This is controlled by the MAX_PROCS and MAX_CONNS Dispatcher settings for the SMTP service. The number of simultaneous connections allowed is MAX_PROCS*MAX_CONNS. If you can afford the system resources, consider raising this number if it is too low for your usage.

  2. Another technique you can use is to open a TELNET session. In the following example, the user connects to 127.0.0.1 port 25. Once connected, a 220 banner is returned. For example:

    telnet 127.0.0.1 25
    Trying 127.0.0.1...
    Connected to 127.0.0.1.
    Escape character is '^]'.
    220 budgie.example.com --Server ESMTP (Sun Java System Messaging Server 6.1
    (built May 7 2001))
    

    If you are connected and receive a 220 banner, but additional commands (like ehlo and mail from) do not elicit a response, then you should run imsimta test -rewrite to ensure that the configuration is correct.

  3. If the response time of the 220 banner is slow, and if running the pstack command on the SMTP server shows the following iii_res* functions (these functions indicate that a name resolution lookup is being performed):

    febe2c04 iii_res_send (fb7f4564, 28, fb7f4de0, 400, fb7f458c, fb7f4564) + 42c
    febdfdcc iii_res_query (0, fb7f4564, c, fb7f4de0, 400, 7f) + 254
    

    then it is likely that the host has to do reverse name resolution lookups, even on a common pair like localhost/127.0.0.1. To prevent such a performance slowdown, you should reorder your host's lookups in the /etc/nsswitch.conf file. To do so, change the following line in the /etc/nsswitch.conf file from:

    hosts: dns nis [NOTFOUND=return] files
    

    to:

    hosts: files dns nis [NOTFOUND=return]
    

    Making this change in the /etc/nsswitch.conf file can improve performance because the SMTP servers no longer spend time on unnecessary lookups, so fewer servers are needed to handle the same message load.

  4. You can also put the slave_debug option on the channels handling incoming SMTP over TCP/IP mail, usually tcp_local and tcp_intranet. After doing so, review the most recent tcp_local_slave.log-uniqueid files to identify any particular characteristics of the messages that time out. For example, if incoming messages with large numbers of recipients are timing out, consider using the expandlimit option on the channel. Remember that if your system is overloaded and overextended, timeouts will be difficult to avoid entirely.
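
To put a number on how slow the 220 banner is, you can time it instead of watching a TELNET session. The following Python sketch is not part of Messaging Server. The function name time_smtp_banner is hypothetical, and the default timeout is an arbitrary choice. It measures only the time from opening the connection to receiving the first line of the greeting.

```python
import socket
import time


def time_smtp_banner(host, port=25, timeout=30.0):
    """Connect to an SMTP port; return (banner_text, seconds_until_banner)."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = b""
        # Read until the received data ends with a newline, that is,
        # until at least one complete line of the greeting has arrived.
        while not data.endswith(b"\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    elapsed = time.monotonic() - start
    return data.decode("ascii", errors="replace").strip(), elapsed


# Example usage:
#   banner, seconds = time_smtp_banner("127.0.0.1")
#   print("{:.2f}s  {}".format(seconds, banner))
```

A delay of several seconds before the banner appears on a loopback connection is consistent with the name resolution slowdown described in step 3.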

Messages Are Not Dequeued

Errors encountered during TCP/IP delivery are often transient; the MTA will generally retain messages when problems are encountered and retry them periodically. It is normal on large networks to experience periodic outages on certain hosts while other host connections work fine. To verify the problem, examine the log files for errors relating to delivery attempts. You may see error messages such as, "Fatal error from smtp_open." Such errors are not uncommon and are usually associated with a transient network problem. To debug TCP/IP network problems, use utilities like PING, TRACEROUTE, and NSLOOKUP.

The following example shows the steps you might use to see why a message is sitting in the queue awaiting delivery to example.com. To determine why the message is not being dequeued, you can recreate the steps the MTA uses to deliver SMTP mail on TCP/IP.

nslookup -query=mx example.com (Step 1)

Server: LOCALHOST
Address: 127.0.0.1

Non-authoritative answer:
example.com preference = 10, mail exchanger = mailhost.example.com (Step 2)

telnet mailhost.example.com 25 (Step 3)
Trying... [10.1.1.1]
telnet: Unable to connect to remote host: Connection refused

  1. Use the NSLOOKUP utility to see what MX records, if any, exist for this host. If no MX records exist, then you should try connecting directly to the host. If MX records do exist, then you must connect to the designated MX relays. The MTA honors MX information preferentially, unless explicitly configured not to do so. For more information, see the discussion on TCP/IP nameserver and MX record support in Messaging Server Reference.

  2. In this example, the DNS (Domain Name System) returned the name of the designated MX relay for example.com. This is the host to which the MTA will actually connect. If more than one MX relay is listed, the MTA will try each MX record in succession, with the lowest preference value tried first.

  3. If you do have connectivity to the remote host, you should check if it is accepting inbound SMTP connections by using TELNET to the SMTP server port 25.

    Note:

    If you use TELNET without specifying the port, you will discover that the remote host accepts normal TELNET connections. This does not indicate that it accepts SMTP connections; many systems accept regular TELNET connections but refuse SMTP connections and vice versa. Consequently, you should always do your testing against the SMTP port.

    In the previous example, the remote host is refusing connections to the SMTP port. This is why the MTA fails to deliver the message. The connection may be refused due to a misconfiguration of the remote host or some sort of resource exhaustion on the remote host. In this case, nothing can be done locally to resolve the problem. Typically, you should let the MTA continue to retry the message.

If you are running Messaging Server on a TCP/IP network that does not use DNS, you can skip the first two steps. Instead, you can use TELNET to directly access the host in question. Be careful to use the same host name that the MTA would use. Look at the relevant log file from the MTA's last attempt to determine the host name. If you are using host files, you should make sure that the host name information is correct. It is strongly recommended that you use DNS instead of host files.

If you test connectivity to a TCP/IP host and encounter no problems using interactive tests, it is quite likely that the problem has simply been resolved since the MTA last tried to deliver the message. You can rerun the imsimta submit tcp_channel command, where tcp_channel is the appropriate channel, to see if messages are being dequeued.
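
The MX ordering described in step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the MTA's own lookup code: the record list is hypothetical sample data standing in for NSLOOKUP output, and relays with equal preference simply keep their input order here, which may differ from what a real MTA does.

```python
from typing import List, Tuple


def order_mx(records: List[Tuple[int, str]]) -> List[str]:
    """Return MX relay host names in the order they would be tried:
    lowest preference value first."""
    # sorted() is stable, so relays with equal preference keep input order
    # (an assumption of this sketch, not documented MTA behavior).
    return [host for _pref, host in sorted(records, key=lambda rec: rec[0])]


if __name__ == "__main__":
    # Hypothetical lookup results: (preference, relay host).
    sample = [
        (20, "mx2.example.org"),
        (10, "mx1.example.org"),
        (30, "mx3.example.org"),
    ]
    for host in order_mx(sample):
        print(host)
```

When testing by hand, connect to the relays in this order rather than to the destination host itself.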

Creating a New Channel

In certain circumstances, a remote domain can break down and the volume of mail addressed to this server can be so great that the outgoing channel queue fills up with messages that cannot be delivered. The MTA tries to redeliver these messages periodically (the frequency and number of the retries are configurable using the backoff channel option) and under normal circumstances, no action is needed. However, if too many messages get stuck in the queue, other messages may not get delivered in a timely manner because all the channel jobs are working to process the backlog of messages that cannot be delivered.

In this situation, you can reroute these messages to a new channel running in its own job controller pool. This avoids contention for processing and allows the other channels to deliver their messages. The following procedure describes how to do this, assuming a domain called example.org.

To Create a New Channel

  1. Create a new channel called tcp_example, with the official host name tcp_example-daemon, and add a new value for the pool option.

    Channels are created by running the msconfig edit channels command. The channel should have the same channel options as your regular outgoing tcp_* channel. Typically, this is the tcp_local channel, which handles all outbound (internet) traffic. Since example.org is out on the Internet, this is the channel to emulate. The new channel might look something like this:

    tcp_example smtp nomx single_sys remotehost inner allowswitchchannel \
    identnonenumeric subdirs 20 maxjobs 7 pool SMTP_example maytlsserver \
    maysaslserver saslswitchchannel tcp_auth missingrecipientpolicy 0 \
    tcp_example-daemon
    

    Note the new option-value pair pool SMTP_example. This specifies that messages to this channel will only use computer resources from the SMTP_example pool. There is a blank line before and after the new channel.

  2. Add two rewrite rules by running the msconfig edit rewrite command to direct email destined for example.org to the new channel.

    The new rewrite rules look like this:

    example.org $U%$D@tcp_example-daemon
    .example.org $U%$H$D@tcp_example-daemon
    

    These rewrite rules direct messages to example.org (including addresses like host1.example.org or hostA.host1.example.org) to the new channel whose official host name is tcp_example-daemon. The rewriting parts of these rules, $U%$D and $U%$H$D, retain the original addresses of the messages. $U copies the user name from the original address. % is the separator, which becomes the @ between the user name and the domain. $H copies the unmatched portion of the host/domain specification to the left of the dot in the pattern. $D copies the portion of the domain specification that matched.

  3. Define a new job controller pool called SMTP_example.

    Run the msconfig set job_controller.job_pool:SMTP_example.job_limit 10 command to create a new pool called SMTP_example with a job_limit of 10. You can verify the addition of the new pool by running the msconfig show job_controller.job_pool command which will show output similar to the following:

    msconfig show job_controller.job_pool
    role.job_controller.job_pool:DEFAULT.job_limit = 10
    role.job_controller.job_pool:DEFAULT.urgent_delivery = help
    role.job_controller.job_pool:IMS_POOL.job_limit = 2
    role.job_controller.job_pool:SMTP_POOL.job_limit = 10
    role.job_controller.job_pool:SMTP_example.job_limit = 10
    

    This creates a message resource pool called SMTP_example that allows up to 10 jobs to be simultaneously run. See "The Job Controller" for details on jobs and pools.

  4. Restart the MTA.

    Issue the commands: imsimta cnbuild; imsimta restart

    This recompiles the configuration and restarts the job controller and dispatcher.

    In this example, a large quantity of email from your internal users is destined for a particular remote site called example.org. For some reason, example.org is temporarily unable to accept incoming SMTP connections, so email cannot be delivered to it. (This type of situation is not a rare occurrence.)

    As email destined for example.org comes in, the outgoing channel queue, typically tcp_local, fills up with messages that cannot be delivered. The MTA tries to redeliver these messages periodically (the frequency and number of the retries are configurable using the backoff channel option) and under normal circumstances, no action is needed.

    However, if too many messages get stuck in the queue, other messages may not get delivered in a timely manner because all the channel jobs are working to process the backlog of example.org messages. Rerouting example.org messages to a new channel running in its own job controller pool (see "The Job Controller"), as in the preceding steps, allows the other channels to deliver their messages without having to contend for processing resources used by example.org messages.
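
To make the substitutions in the two rewrite rules from step 2 concrete, the following Python sketch applies them to sample addresses. It is a deliberately simplified toy model that handles only the $U, $H, and $D substitutions described above. The function name apply_rule and its matching logic are assumptions made for illustration and are not the MTA's rewrite engine; use imsimta test -rewrite to see what the MTA actually does.

```python
from typing import Optional, Tuple


def apply_rule(address: str, pattern: str, template: str) -> Optional[Tuple[str, str]]:
    """Toy model of one rewrite rule.

    Returns (rewritten_address, channel_host), or None if the rule
    does not match the address's host part.
    """
    user, _, host = address.rpartition("@")
    if pattern.startswith("."):
        # ".example.org" matches any host that ends with the pattern.
        if not host.lower().endswith(pattern.lower()):
            return None
        split_at = len(host) - len(pattern)
        unmatched, matched = host[:split_at], host[split_at:]
    else:
        # "example.org" matches only that exact host.
        if host.lower() != pattern.lower():
            return None
        unmatched, matched = "", host
    expanded = (
        template.replace("$U", user)
        .replace("$H", unmatched)
        .replace("$D", matched)
    )
    # The text after the last "@" in the template is the channel's host name.
    rewritten, _, channel_host = expanded.rpartition("@")
    # "%" stands for the "@" between user name and domain.
    local, _, domain = rewritten.rpartition("%")
    return f"{local}@{domain}", channel_host


if __name__ == "__main__":
    print(apply_rule("joe@example.org", "example.org", "$U%$D@tcp_example-daemon"))
    print(apply_rule("joe@host1.example.org", ".example.org", "$U%$H$D@tcp_example-daemon"))
```

In both cases the original address is preserved and only the destination channel changes, which is the point of the two rules.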

MTA Messages Are Not Delivered

In addition to message transport problems, there are two common problems which can result in unprocessed messages in the message queues:

  1. The queue cache is not synchronized with the messages in the queue directories. Message files in the MTA queue subdirectories that are awaiting delivery are entered into an in-memory queue cache. When channel programs run, they consult this queue cache to determine which messages to deliver in their queues. There are circumstances where there are message files in the queue, but there is no corresponding queue cache entry.

    1. To check if a particular file is in the queue cache, you can use the imsimta cache -view utility. If the file is not in the queue cache, then the queue cache needs to be synchronized.

      The queue cache is normally synchronized every four hours. If required, you can manually resynchronize the cache by using the command imsimta cache -sync. Once synchronized, the channel programs will process the originally unprocessed messages after new messages are processed. If you want to change the default (4 hours), you should modify the job_controller configuration by running the msconfig set job_controller.synch_time timeperiod command, where timeperiod reflects how often the queue cache is synchronized. The timeperiod must be greater than 30 minutes. In the following example, the queue cache synchronization is modified to 2 hours by running the following command:

      msconfig set job_controller.synch_time 02:00
      

      You can run imsimta submit channel, where channel is the name of the affected channel, to clear out the backlog of messages after running imsimta cache -sync. Clearing out the channel may take a long time if the backlog of messages is large (greater than 1000).

      For summarized queue cache information, run imsimta qm -maint dir -database -total.

    2. If after synchronizing the queue cache, messages are still not being delivered, you should restart the Job Controller. To do so, use the imsimta restart job_controller command.

      Restarting the Job Controller causes the message data structure to be rebuilt from the message queues on disk.

      Caution:

      Restarting the Job Controller is a drastic step and should only be performed after all other avenues have been thoroughly exhausted.

      Refer to "The Job Controller" for more information on the Job Controller.

  2. Channel processing programs fail to run because they cannot create their processing log file. Check the access permissions, disk space and quotas.
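
The permission and disk space checks for a log directory can be scripted. The sketch below is a minimal example using only the Python standard library; the function name and the 10 MB default threshold are arbitrary assumptions, and the path you pass in is whatever directory your channel programs log to. It does not check file system quotas, and it tests access for the user running the script, so run it as the same user the MTA runs as.

```python
import os
import shutil
from typing import List


def check_log_dir(path: str, min_free_bytes: int = 10 * 1024 * 1024) -> List[str]:
    """Return a list of problems that would stop a process from
    creating a log file in `path` (empty list means no problem found)."""
    if not os.path.isdir(path):
        return [f"{path}: directory does not exist"]
    problems = []
    # Creating a file needs write and search (execute) permission on the directory.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path}: not writable by the current user")
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"{path}: only {free} bytes free")
    return problems


if __name__ == "__main__":
    import sys

    for directory in sys.argv[1:]:
        for problem in check_log_dir(directory) or [f"{directory}: ok"]:
            print(problem)
```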

Messages are Looping

If the MTA detects that a message is looping, that message will be sidelined as a .HELD file. See "Diagnosing and Cleaning up .HELD Messages" for more information. Certain cases can lead to message loops that the MTA cannot detect.

The first step is to determine why the messages are looping. You should look at a copy of the problem message file while it is in the MTA queue area, MTA mail log entries (if you have the logging channel option enabled in your MTA configuration for the channels in question) relating to the problem message, and MTA channel debug log files for the channels in question. Determining the From: and To: addresses for the problem message, seeing the Received: header lines, and seeing the message structure (type of encapsulation of the message contents), can all help pinpoint which sort of message loop case you are encountering.

Some of the more common cases include:

  1. A postmaster address is broken. The MTA requires that the postmaster address be a functioning address that can receive email. If a message to the postmaster is looping, check that your configuration has a proper postmaster address pointing to an account that can receive messages.

  2. Stripping of Received: header lines is preventing the MTA from detecting the message loop. Normal detection of message loops is based on Received: header lines. If Received: header lines are being stripped (either explicitly on the MTA system itself, or on another system like a firewall), it can interfere with proper detection of message loops. In these scenarios, check that no undesired stripping of Received: header lines is occurring. Also, check for the underlying reason why the messages are looping. Possible reasons include: a problem in the assignment of system names or a system not configured to recognize a variant of its own name, a DNS problem, a lack of authoritative addressing information on the system in question, or a user address forwarding error.

  3. Incorrect handling of notification messages by other messaging systems is generating reencapsulated messages in response to notification messages. Internet standards require that notification messages (reports of messages being delivered, or messages bouncing) have an empty envelope From: address to prevent message loops. However, some messaging systems do not correctly handle such notification messages. When forwarding or bouncing notification messages, these messaging systems may insert a new envelope From: address. This can then lead to message loops. The solution is to fix the messaging system that is incorrectly handling the notification messages.
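
When inspecting a copy of a suspect message, it can help to count the Received: header lines and see which hosts appear repeatedly. The following Python sketch does this with the standard email package. It is an illustration only: the function name is hypothetical, and the "by host" regular expression is a simplification that assumes each Received: line carries a conventional "by" clause.

```python
import re
from collections import Counter
from email import message_from_string
from typing import Tuple

# Simplified pattern for the "by <host>" clause of a Received: header.
BY_HOST = re.compile(r"\bby\s+([^\s;()]+)", re.IGNORECASE)


def received_summary(raw_message: str) -> Tuple[int, Counter]:
    """Return (number of Received: lines, count of hops per receiving host)."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    hosts: Counter = Counter()
    for value in received:
        match = BY_HOST.search(value)
        if match:
            hosts[match.group(1).lower()] += 1
    return len(received), hosts


if __name__ == "__main__":
    import sys

    total, per_host = received_summary(sys.stdin.read())
    print(f"{total} Received: lines")
    for host, hops in per_host.most_common():
        print(f"{hops:4d}  {host}")
```

A host that appears many times in the output is a likely participant in the loop.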

Diagnosing and Cleaning up .HELD Messages

If the MTA detects a serious problem having to do with delivery of a message, the message is stored in a file with the suffix .HELD in DataRoot/queue/channel. For example:

ls
ZZ0HXZ00G0EBRBCP.HELD
ZZ0HY200C0O6LGHU.HELD
ZZ0HYA006LP66O3H.HELD
ZZ0HZ7003EOQSE37.HELD

.HELD files can occur for three major reasons:

  • Looping messages. The MTA detected that the messages were looping through a build-up of one or another sort of Received: header lines.

  • User or domain status set to hold. These are messages that the MTA administrator is intentionally sidelining, typically while some maintenance procedure is being performed (for example, while moving user mailboxes).

  • Suspicious messages. Messages that met some suspicious threshold and were held for later manual inspection by the MTA administrator. Messages can be .HELD due to exceeding a configured maximum number of envelope recipients (see the holdlimit channel option in Messaging Server Reference), due to running the "imsimta qclean" or "clean" or "hold" commands based on some suspicion of the message(s) in question, or due to use of a hold action in a Sieve script.
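
To get an overview of which channels currently have .HELD files, you can walk the queue directory. The Python sketch below is a minimal example under stated assumptions: the function name is hypothetical, you supply the queue root yourself (DataRoot/queue on your system), and the first directory level under that root is taken to be the channel name, with any deeper subdirectories ignored for grouping. It only reads the directory tree and changes nothing.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def find_held(queue_root: str) -> Dict[str, List[str]]:
    """Map channel name -> list of .HELD file names found under it."""
    root = Path(queue_root)
    held = defaultdict(list)
    for path in sorted(root.rglob("*.HELD")):
        if path.is_file():
            # First path component below the queue root is treated as the channel.
            channel = path.relative_to(root).parts[0]
            held[channel].append(path.name)
    return dict(held)


if __name__ == "__main__":
    import sys

    for channel, names in sorted(find_held(sys.argv[1]).items()):
        print(f"{channel}: {len(names)} held")
        for name in names:
            print(f"  {name}")
```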

Messages .HELD Due to Looping

Messages bouncing between servers or channels are said to be looping. Typically, a message loop occurs because each server or channel thinks the other is responsible for delivery of the message. Looping messages usually have a great many Received: header lines. The Received: header lines illustrate the exact path of the message loop. Look carefully at the host names and any recipient address information (for example, for recipient clauses or ORCPT recipient comments) appearing in such header lines. One cause of such message loops is user error.

For example, an end user might set an option to forward messages on two separate mail hosts to one another. On the example.com account, the user enables mail forwarding to the example.edu account. Then, forgetting that this setting is enabled, the user sets mail forwarding on the example.edu account to the example.com account.

A loop can also occur with a faulty MTA configuration. For example, MTA Host X thinks that messages for mail.example.com go to Host Y. However, Host Y thinks that Host X should handle messages for mail.example.com. As a result, Host Y returns the mail to Host X.

In these cases, the message is ignored by the MTA and no further delivery is attempted. When such a problem occurs, look at the header lines in the message to determine which server or channel is bouncing the message. Fix the entry as needed.

Another common cause of message loops is the MTA receiving a message that was addressed to the MTA host using a network name that the MTA does not recognize (has not been configured to recognize) as one of its own names. The solution is to add the additional name to the list of names that your MTA recognizes as its own. The MTA's thresholds for determining that a message is looping are configurable; see the MAX_*RECEIVED_LINES MTA options (http://download.oracle.com/docs/cd/E19566-01/819-4429/index.html). Also note that the MTA may optionally be configured (see the HELD_SNDOPR MTA option) to generate a syslog notice whenever a message is forced into .HELD state due to exceeding such a threshold. If syslog messages of "Received count exceeded; message held." are present, then you know that this is occurring.

You can resend the .HELD message by using the imsimta qm "release" command or by following these steps:

Note:

The imsimta qm "release" command is the preferred method.
  1. Rename the .HELD extension to any 2 digit number other than 00. For example, .HELD to .06.

    Note:

    Before renaming the .HELD file, be sure that the message is not likely to continue looping.
  2. Run imsimta cache -sync.

    Running this command updates the cache.

  3. Run imsimta submit channel or imsimta run channel, where channel is the name of the affected channel.

You might need to perform these steps multiple times, since the message may again be marked as .HELD, because the Received: header lines accumulate. If the problem still exists, the .HELD file is recreated under the same channel as before. If the problem has been addressed, the messages are dequeued and delivered.

See "clean" if you determine that the messages can simply be deleted with no attempt to deliver them.

Messages .HELD Due to User or Domain hold Status

Messages that are .HELD due to a user or domain status of hold, and only messages .HELD for such a reason, are normally stored in the hold channel's queue area. That is, .HELD message files in the hold channel's queue area can be assumed to be .HELD due to user or domain status.

Messages .HELD Due to a Suspicious Characteristic

Messages .HELD due to some suspicious characteristic exhibit that characteristic. The characteristic could be anything that the site has chosen to characterize as suspicious. MTA Administrators should stay aware of these configuration choices and actions. However, if you are not the only or original administrator of this MTA, then check the MTA configuration for any configured use of the holdlimit channel option (see the discussion on expansion of multiple addresses in Messaging Server Reference), any use of the $H flag in address-based *_ACCESS mapping tables, or any use of the hold action in any system Sieve file, or any channel level Sieve filters configured and named by use of sourcefilter or destinationfilter channel options. See the discussion on the filter file location in Messaging Server Reference. Additionally, ask any fellow MTA administrators about any manual command-line message holds (through, for instance, an imsimta qm clean command) they might have recently performed. Application of a Sieve filter hold action, whether from a system Sieve filter or from users' personal Sieve filters, may optionally be logged. See the discussion on the LOG_FILTER global MTA option in Messaging Server Reference for more information.

Received Message is Encoded

Messages sent by the MTA are received in an encoded format. For example:

Date: Wed, 04 Jul 2001 11:59:56 -0700 (PDT)
From: "Desdemona Vilalobos" <Desdemona@example.com>
To: santosh@example.edu
Subject: test message with 8bit data
MIME-Version: 1.0
Content-type: TEXT/PLAIN; CHARSET=ISO-8859-1
Content-transfer-encoding: QUOTED-PRINTABLE

2=00So are the Bo=F6tes Void and the Coal Sack the same?=

These messages appear unencoded when read with the MTA decoder command imsimta decode. See Messaging Server Reference for more information.

The SMTP protocol only allows the transmission of ASCII characters (a seven-bit character set) as set forth by RFC 821. In fact, the unnegotiated transmission of eight-bit characters is illegal through SMTP, and it is known to cause a variety of problems with some SMTP servers. For example, SMTP servers can go into compute-bound loops, messages can be sent over and over again, and eight-bit characters can crash SMTP servers. Finally, eight-bit character sets can wreak havoc with browsers and mailboxes that cannot handle eight-bit data.

An SMTP client used to only have three options when handling a message containing eight-bit data: return the message to the sender as undeliverable, encode the message, or send it in direct violation of RFC 821. But with the advent of MIME and the SMTP extensions, standard encodings exist that can encode eight-bit data by using the ASCII character set.

In the previous example, the recipient received an encoded message with a MIME content type of TEXT/PLAIN. The remote SMTP server (to which the MTA SMTP client transferred the message) did not support the transfer of eight-bit data. Because the original message contained eight-bit characters, the MTA had to encode the message.
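
The encoding in the example can be reproduced outside the MTA. The following Python sketch uses the standard quopri module to decode a quoted-printable fragment and to test whether raw bytes contain eight-bit data. It is only an illustration of the encoding itself; the helper names are hypothetical, and imsimta decode remains the MTA's own tool for this.

```python
import quopri


def has_8bit(data: bytes) -> bool:
    """True if any byte falls outside the seven-bit ASCII range."""
    return any(byte > 127 for byte in data)


def decode_qp(body: bytes, charset: str = "iso-8859-1") -> str:
    """Decode a quoted-printable body and interpret it in `charset`."""
    return quopri.decodestring(body).decode(charset)


if __name__ == "__main__":
    encoded = b"Bo=F6tes Void"           # fragment from the example message
    print(decode_qp(encoded))            # =F6 is o-umlaut in ISO-8859-1
    print(has_8bit(encoded))             # the encoded form is pure ASCII
    print(has_8bit(quopri.decodestring(encoded)))  # the decoded form is not
```

The encoded form contains only ASCII characters, which is why it can pass through an SMTP server that does not support eight-bit transfer.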

Server-Side Rules (SSR) Are Not Working

A filter consists of one or more conditional actions to apply to a mail message. Since the filters are stored and evaluated on the server, they are often referred to as server-side rules (SSR).

See also "To Debug User-level Filters" for more information.

Testing Your SSR Rules

  • To check the MTA's user filters, run the following command:

    imsimta test -rewrite -debug -filter user@domain
    

    In the output, look for the following information:

    mmc_open_url called to open ssrf: user@ims-ms
    URL with quotes stripped: ssrd: user@ims-ms
    Determined to be a SSRD URL.
    Identifier: user@ims-ms-daemon
    Filter successfully obtained.
    
  • In addition, you can add the slave_debug option to the tcp_local channel to see how a filter is applied. The results are displayed in the tcp_local_slave.log file. Be sure to set mm_debug to 5 by running the msconfig set mm_debug 5 command to get sufficient debugging information.

Common Syntax Problems

If there is a syntax problem with the filter, look for the following message in the tcp_local_slave.log-* file:

Error parsing filter expression:...

  • If the filter is good, then filter information is at the end of the output.

  • If the filter is bad, then the following error is at the end of the output:

    Address list error - 4.7.1 Filter syntax error: desdaemona@example.com

    Also, if the filter is bad, then the SMTP RCPT TO command returns a temporary error response code:

    RCPT TO: <user>@<domain>
    452 4.7.1 Filter syntax error
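
The response above is temporary because its reply code begins with 4; in SMTP, 4yz codes are transient failures and 5yz codes are permanent ones. The Python sketch below splits a single-line reply into its parts and classifies it. The class and function names are hypothetical, and the parser handles only one reply line at a time.

```python
import re
from typing import NamedTuple, Optional

# Reply code, optional enhanced status code (class.subject.detail), then text.
REPLY = re.compile(r"^(\d{3})[ -](?:([245]\.\d{1,3}\.\d{1,3})\s+)?(.*)$")


class SmtpReply(NamedTuple):
    code: int
    enhanced: Optional[str]
    text: str

    @property
    def temporary(self) -> bool:
        """4yz replies are transient failures; the client should retry later."""
        return 400 <= self.code < 500


def parse_reply(line: str) -> SmtpReply:
    match = REPLY.match(line.strip())
    if match is None:
        raise ValueError(f"not an SMTP reply line: {line!r}")
    code, enhanced, text = match.groups()
    return SmtpReply(int(code), enhanced, text)


if __name__ == "__main__":
    reply = parse_reply("452 4.7.1 Filter syntax error")
    print(reply, "temporary" if reply.temporary else "not temporary")
```

Because the error is temporary, a sending system keeps the message queued and retries, which gives you time to correct the filter.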
    

Slow Response After Users Press Send Email Button

If users are experiencing delays when they send messages, undersized message queue disks could be responsible for reduced disk input/output. When users press the SEND button on their email client, the MTA will not fully accept receipt of the message until the message has been committed to the message queue. See the discussion on disk sizing for MTA message queues in Messaging Server Installation and Configuration Guide for more information.

Asterisks in the Local Parts of Addresses or Received Fields

The MTA now checks for 8-bit characters (instead of just ASCII characters) in the local parts of addresses as well as the received fields it constructs and replaces them with asterisks.

Abnormal Job Controller Terminations Seen in job_controller Logs

The Job Controller is essentially an in-memory database. Unlike other parts of the MTA, it doesn't have queues or transactions with which to contend. It listens for activity coming in on various network connections and updates its database accordingly.

Consequently, if the Job Controller fails, it is most likely a resource allocation failure (resource exhaustion). The only significant resource the Job Controller uses, especially when under stress, is memory. Therefore, allocate the right amount of memory for the machine that contains the Job Controller. See the discussion on planning a messaging server sizing strategy in Messaging Server Installation and Configuration Guide for details on memory utilization.

General Error Messages

When the MTA fails to start, general error messages appear at the command line. In this section, common general error messages will be described and diagnosed.

Note:

To diagnose your own MTA configuration, use the imsimta test -rewrite -debug utility to examine your MTA's address rewriting and channel mapping process. This utility enables you to check the configuration without actually sending a message. See "Check the MTA Configuration" for more information.

MTA subcomponents might also issue other error messages that are described in the MTA command-line utilities and configuration information. See "Configuring POP, IMAP, and HTTP Services" and "About MTA Services" and Messaging Server Reference for more information.

Errors in mm_init

An error in mm_init generally indicates an MTA configuration problem. If you run the imsimta test -rewrite utility, these errors are displayed. Other utilities such as imsimta cnbuild, or a channel, a server, or a browser might also return such an error.

Commonly encountered mm_init errors include:

bad equivalence for alias...

The right-hand side of an alias file entry is improperly formatted.

cannot open alias include file...

A file included into the alias file cannot be opened.

duplicate aliases found...

Two alias file entries have the same left hand side. You must find and eliminate the duplication. Look for an error message that says error line #XXX where XXX is a line number. You can fix the duplicated alias on the line.

duplicate host in channel table...

This error message indicates that you have two channel definitions in the MTA configuration that both have the same official host name.

Check your MTA configuration for any channel definitions with duplicate official host names.

duplicate mapping name found...

This message indicates that two mapping tables have the same name, and one of the duplicate mapping tables needs to be removed.

Note:

A blank line should precede and follow any line with a mapping table name. However, no blank lines should be interspersed among the entries of a mapping table.

mapping name is too long...

This error means that a mapping table name is too long and needs to be shortened.

error initializing ch_facility compiled character set version mismatch

If you see this message, you must recompile and reinstall your compiled character set tables through the command imsimta chbuild. See "imsimta chbuild" for more information.

error initializing ch_facility no room in...

This error message generally means that you need to resize your MTA character set internal tables and then rebuild the compiled character set tables with the following commands:

imsimta chbuild -noimage -maximum -option
imsimta chbuild

Verify that nothing else needs to be recompiled or restarted before making this change. See "imsimta chbuild" for more information.

local host alias or proper name too long for system...

This error indicates that a local host alias or proper name is too long (the optional right hand side in the second or subsequent names in a channel block).

no equivalence addresses for alias...

An entry in the alias file is missing a right hand side (translation value).

no official host name for channel...

This error indicates that a channel definition block is missing the required second line (the official host name line). See the MTA configuration and command-line utilities information in Messaging Server Reference for more information on channel definition blocks. A blank line is required before and after each channel definition block, but a blank line must not be present between the channel name and official host name lines of the channel definition.

official host name is too long

The official host name for a channel (second line of the channel definition block) is limited to 128 octets in length. If you are trying to use a longer official host name on a channel, shorten it to a place holder name, and then use a rewrite rule to match the longer name to the short official host name. You might see this scenario if you work with the l (local) channel host name. For example:

<Original l Channel:>
!delivery channel to local /var/mail store
l subdirs 20 viaaliasrequired maxjobs 7 pool LOCAL_POOL
walleroo.pocofronitas.thisnameismuchtoolongandreallymakesnosensebutitisan
example.monkey.gorilla.orangutan.antidiexampleblismentarianism.newt.salaman
der.lizard.gecko.komododragon.com

<Create Place Holder:>
!delivery channel to local /var/mail store
l subdirs 20 viaaliasrequired maxjobs 7 pool LOCAL_POOL
newt

<Create Rewrite Rule:>
newt.salamander.lizard.gecko.komododragon.com $U%$D@newt

When using the l (local) channel, you need to use a REVERSE mapping table. See the MTA configuration information in Messaging Server Reference for information on usage and syntax.
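
The 128-octet limit on official host names is measured in octets, not characters. A quick way to check a candidate name before putting it into a channel definition is sketched below in Python. The function name is hypothetical, and UTF-8 is assumed as the encoding; for plain ASCII host names the octet count equals the character count.

```python
MAX_OFFICIAL_HOST_OCTETS = 128  # limit stated in the text above


def official_host_name_fits(name: str) -> bool:
    """True if `name` is within the 128-octet limit (UTF-8 assumed)."""
    return len(name.encode("utf-8")) <= MAX_OFFICIAL_HOST_OCTETS


if __name__ == "__main__":
    import sys

    for candidate in sys.argv[1:]:
        octets = len(candidate.encode("utf-8"))
        verdict = "ok" if official_host_name_fits(candidate) else "too long"
        print(f"{octets:4d} octets  {verdict}  {candidate}")
```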

Compiled Configuration Version Mismatch

One of the functions of the imsimta cnbuild utility is to compile MTA configuration information into an image that can be quickly loaded. The compiled format is quite rigidly defined and often changes substantially between different versions of the MTA. Minor changes might occur as part of patch releases.

When such changes occur, an internal version field is also changed so that incompatible formats can be detected. The MTA components halt with the "Compiled Configuration Version Mismatch" error when an incompatible format is detected. The solution to this problem is to generate a new, compiled configuration with the command imsimta cnbuild.

Also, use the imsimta restart command to restart any resident MTA server processes, so they can obtain updated configuration information.

Swap Space Errors

To ensure proper operation, it is important to configure enough swap space on your messaging system. The amount of required swap space will vary depending on your configuration. A general tuning recommendation is that the amount of swap space should be at least three times the amount of main memory.

An error message such as the following indicates a lack of swap space:

jbc_channels: chan_execute [1]: fork failed: Not enough space

You might see this error in the Job Controller log file. Other swap space errors will vary depending on your configuration.

Use the following commands to determine how much swap space you have left and determine how much you have used:

swap -s (at the time MTA processes are busy), ps -elf, or tail /var/adm/messages

File Open or Create Errors

To send a message, the MTA reads configuration files and creates message files in the MTA message queue directories. Configuration files must be readable by the MTA or any program written against the MTA's SDKs. During installation, proper permissions are assigned to these files. The MTA utilities and procedures which create configuration files also assign permissions. If the files are protected by the system manager, other privileged user, or through some site-specific procedure, the MTA may not be able to read configuration information. This results in "File open" errors or unpredictable behavior. The imsimta test -rewrite utility reports additional information when it encounters problems reading configuration files. See "imsimta test" for more information.

If the MTA appears to function from privileged accounts but not from unprivileged accounts, then file permissions in the MTA table directory are likely the cause of the problem. Check the permissions on configuration files and their directories. See "Check the Ownership of Critical Files" for more information.

"File create" errors usually indicate a problem while creating a message file in an MTA message queue directory. See "Check the Message Queue Directories" to diagnose file creation problems.

Illegal Host/Domain Errors

You might see this error when an address is provided to the MTA through a browser. Or, the error may be deferred and returned as part of an error return mail message. In both cases, this error message indicates that the MTA is not able to deliver mail to the specified host. To determine why the mail is not being sent to the specified host, follow these troubleshooting procedures:

  • Verify that the address in question is not misspelled, is not transcribed incorrectly, or does not use the name of a host or domain that no longer exists.

  • Run the address in question through the imsimta test -rewrite utility. If this utility also returns an "illegal host/domain" error on the address, then the MTA has no rewrite rules or other configurations to handle the address. Verify that you have configured the MTA correctly, that you answered all configuration questions appropriately, and that you have kept your configuration information up to date.

  • If imsimta test -rewrite does not encounter an error on the address, then the MTA is able to determine how to handle the address, but the network transport will not accept it. You can examine the appropriate log files from the delivery attempt for additional details. Transient network routing or name service errors should not result in returned error messages, though it is possible for badly misconfigured domain name servers to cause these problems.

  • If you are on the Internet, check that you have properly configured your TCP/IP channel to support MX record lookups. Many domain addresses are not directly accessible on the Internet and require that your mail system correctly resolve MX entries. If you are on the Internet and your TCP/IP is configured to support MX records, you should have configured the MTA to enable MX support. See the discussion on TCP/IP connection and DNS lookup support in Messaging Server Reference for more information. If your TCP/IP package is not configured to support MX record lookups, then you will not be able to reach MX-only domains.

Errors in SMTP channels, os_smtp_* errors

The os_smtp_* errors, such as os_smtp_open, os_smtp_read, and os_smtp_write, are not necessarily MTA errors. These errors are generated when the MTA reports a problem encountered at the network layer. For example, an os_smtp_open error means that the network connection to the remote side could not be opened. The MTA may be configured to connect to an invalid system because of addressing errors or channel configuration errors. The os_smtp_* errors are commonly due to DNS or network connectivity problems, particularly if this was a previously working channel or address. os_smtp_read or os_smtp_write errors usually indicate that the connection was aborted by the other side or was lost due to network problems.

Network and DNS problems are often transient in nature. The occasional os_smtp_* error is usually nothing to be concerned about. However, if you are consistently seeing these errors, it could be an indication of an underlying network problem.

To obtain more information about a particular os_smtp_* error, enable debugging on the channel in question. Investigate the debug channel log file that will show details of the attempted SMTP dialogue. In particular, look at the timing of when a network problem occurred during the SMTP dialogue. The timing could suggest the type of network or remote side issue. In some cases, you might also want to perform network level debugging (for example, TCP/IP packet tracing) to determine what was sent or received.
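
Before enabling channel debugging, a quick independent check of the remote SMTP port can tell you whether the failure is in name resolution, connection setup, or the SMTP greeting. The Python sketch below performs the same test as a manual TELNET to port 25. The function name and the result strings are assumptions made for this example and do not correspond to MTA error codes; a result from your workstation also may not match what the MTA host sees, so run it on the MTA host.

```python
import socket


def probe_smtp(host: str, port: int = 25, timeout: float = 10.0) -> str:
    """Try to open a TCP connection and read the SMTP greeting.

    Returns a short description of what happened.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            greeting = conn.recv(512).decode("ascii", errors="replace").strip()
    except socket.gaierror as exc:
        return f"dns-failure: {exc}"        # host name did not resolve
    except ConnectionRefusedError:
        return "refused"                    # nothing is accepting on that port
    except socket.timeout:
        return "timeout"                    # no answer within `timeout` seconds
    except OSError as exc:
        return f"network-error: {exc}"      # unreachable network, reset, etc.
    if greeting.startswith("220"):
        return f"ok: {greeting}"            # 220 is the SMTP service-ready reply
    return f"unexpected-greeting: {greeting}"


if __name__ == "__main__":
    import sys

    print(probe_smtp(sys.argv[1]))
```

A "refused" or "timeout" result corresponds to the kind of condition that surfaces as an os_smtp_open error, while a connection that opens and then drops is more in line with os_smtp_read or os_smtp_write errors.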