Errors encountered during TCP/IP delivery are often transient; the MTA will generally retain messages when problems are encountered and retry them periodically. It is normal on large networks to experience periodic outages on certain hosts while other host connections work fine. To verify the problem, examine the log files for errors relating to delivery attempts. You may see error messages such as, “Fatal error from smtp_open.” Such errors are not uncommon and are usually associated with a transient network problem. To debug TCP/IP network problems, use utilities like PING, TRACEROUTE, and NSLOOKUP.
The following example shows the steps you might use to see why a message is sitting in the queue awaiting delivery to xtel.co.uk. To determine why the message is not being dequeued, you can recreate the steps the MTA uses to deliver SMTP mail on TCP/IP.
    % nslookup -query=mx xtel.co.uk                                      (Step 1)
    Server: LOCALHOST
    Address: 127.0.0.1
    Non-authoritative answer:
    XTEL.CO.UK preference = 10, mail exchanger = nsfnet-relay.ac.uk      (Step 2)
    % telnet nsfnet-relay.ac.uk 25                                       (Step 3)
    Trying... [22.214.171.124]
    telnet: Unable to connect to remote host: Connection refused
Use the NSLOOKUP utility to see what MX records, if any, exist for this host. If no MX records exist, then you should try connecting directly to the host. If MX records do exist, then you must connect to the designated MX relays. The MTA honors MX information preferentially, unless explicitly configured not to do so. See also 126.96.36.199 TCP/IP MX Record Support.
In this example, the DNS (Domain Name System) returned the name of the designated MX relay for xtel.co.uk. This is the host to which the MTA will actually connect. If more than one MX relay is listed, the MTA tries each MX record in succession, starting with the lowest preference value.
If you do have connectivity to the remote host, you should check if it is accepting inbound SMTP connections by using TELNET to the SMTP server port 25.
If you use TELNET without specifying the port, you will discover that the remote host accepts normal TELNET connections. This does not indicate that it accepts SMTP connections; many systems accept regular TELNET connections but refuse SMTP connections and vice versa. Consequently, you should always do your testing against the SMTP port.
In the previous example, the remote host is refusing connections to the SMTP port. This is why the MTA fails to deliver the message. The connection may be refused because the remote host is misconfigured or because some resource on the remote host is exhausted. In this case, nothing can be done locally to resolve the problem. Typically, you should let the MTA continue to retry the message.
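The manual steps above can be collected into a small script. The following is a minimal sketch rather than a supported tool: it assumes bash, GNU timeout, and the dig utility are available, and it uses dig +short in place of NSLOOKUP only because its output is easier to parse. The function names sort_mx, smtp_port_open, and check_domain are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: retrace the MTA's delivery steps for one domain.
# Assumes bash, GNU timeout, and dig (used here instead of nslookup).

# Read "preference host" pairs on stdin; print hosts, lowest preference first.
sort_mx() {
    sort -n -k1,1 | awk 'NF >= 2 { sub(/\.$/, "", $2); print $2 }'
}

# Return 0 if a TCP connection to host:port (default 25) succeeds within 10 s.
smtp_port_open() {
    local host=$1 port=${2:-25}
    timeout 10 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Look up the MX relays for a domain and test the SMTP port on each one.
check_domain() {
    local domain=$1 relays relay
    relays=$(dig +short MX "$domain" | sort_mx)
    # No MX records: try connecting directly to the host itself.
    [ -n "$relays" ] || relays=$domain
    for relay in $relays; do
        if smtp_port_open "$relay"; then
            echo "$relay: accepting connections on port 25"
        else
            echo "$relay: connection refused or timed out"
        fi
    done
}

# Run only when a domain is given on the command line.
if [ "$#" -gt 0 ]; then
    check_domain "$1"
fi
```

Saved under a name of your choosing and run with xtel.co.uk as its argument, the script reports the state of port 25 on each relay in preference order. A refused connection here points to the same remote-side problem as the TELNET test.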
If you are running Messaging Server on a TCP/IP network that does not use DNS, you can skip the first two steps. Instead, you can use TELNET to directly access the host in question. Be careful to use the same host name that the MTA would use. Look at the relevant log file from the MTA’s last attempt to determine the host name. If you are using host files, you should make sure that the host name information is correct. It is strongly recommended that you use DNS instead of host files.
Note that if you test connectivity to a TCP/IP host and encounter no problems using interactive tests, it is quite likely that the problem has simply been resolved since the MTA last tried to deliver the message. You can run imsimta submit tcp_channel, substituting the name of the appropriate channel, to see if messages are being dequeued.
In certain circumstances, a remote domain can break down and the volume of mail addressed to this server can be so great that the outgoing channel queue will fill up with messages that cannot be delivered. The MTA tries to redeliver these messages periodically (the frequency and number of the retries is configurable using the backoff keywords) and under normal circumstances, no action is needed. However, if too many messages get stuck in the queue, other messages may not get delivered in a timely manner because all the channel jobs are working to process the backlog of messages that cannot be delivered.
In this situation, you can reroute these messages to a new channel running in its own job controller pool. This avoids contention for processing and allows the other channels to deliver their messages. The procedure is described below, using a remote domain called siroe.com as the example.
Create a new channel called tcp_siroe, with the official host name tcp_siroe-daemon, and give it a new value for the pool keyword.
Channels are created in the channel block section of /msg-svr-base/config/imta.cnf. The channel should have the same channel keywords as your regular outgoing tcp_* channel. Typically, this is the tcp_local channel, which handles all outbound (internet) traffic. Since siroe.com is out on the internet, this is the channel to emulate. The new channel may look something like this:
    tcp_siroe smtp nomx single_sys remotehost inner allowswitchchannel \
    identnonenumeric subdirs 20 maxjobs 7 pool SMTP_SIROE maytlsserver \
    maysaslserver saslswitchchannel tcp_auth missingrecipientpolicy 0
    tcp_siroe-daemon
Note the new keyword-value pair pool SMTP_SIROE. This specifies that messages to this channel will only use computer resources from the SMTP_SIROE pool. Note also that a blank line is required before and after the new channel.
Add two rewrite rules to the rewrite rule section of the imta.cnf file to direct email destined for siroe.com to the new channel.
The new rewrite rules look like this:
    siroe.com     $U%$D@tcp_siroe-daemon
    .siroe.com    $U%$H$D@tcp_siroe-daemon
These rewrite rules direct messages for siroe.com (including addresses like host1.siroe.com or hostA.host1.siroe.com) to the new channel, whose official host name is tcp_siroe-daemon. The rewriting parts of these rules, $U%$D and $U%$H$D, retain the original addresses of the messages. $U copies the user name from the original address. % is the separator, standing in for the @ between the user name and the domain. $H copies the unmatched portion of the host/domain specification, to the left of the dot in the pattern. $D copies the portion of the domain specification that matched.
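To make the substitutions concrete, the following sketch expands the two templates for a given address. It is an illustration only, not how the MTA itself is implemented: the function name rewrite_siroe is hypothetical, matching is case-sensitive for simplicity, and the way the dot is split between $H and $D is an assumption that does not affect the combined result.

```shell
#!/usr/bin/env bash
# Illustration only: expand the two siroe.com rewrite templates for an address.
# rewrite_siroe is a hypothetical helper, not part of Messaging Server.

rewrite_siroe() {
    local addr=$1
    local user=${addr%@*}      # $U: the user name
    local host=${addr##*@}     # the host/domain specification
    local domain="siroe.com"
    local channel_host="tcp_siroe-daemon"
    local h d

    if [ "$host" = "$domain" ]; then
        # Rule 1: siroe.com  $U%$D@tcp_siroe-daemon
        d=$host
        printf '%s%%%s@%s\n' "$user" "$d" "$channel_host"
    elif [[ "$host" == *".$domain" ]]; then
        # Rule 2: .siroe.com  $U%$H$D@tcp_siroe-daemon
        d=".$domain"           # $D: the matched portion (dot placement assumed)
        h=${host%"$d"}         # $H: the unmatched portion to the left
        printf '%s%%%s%s@%s\n' "$user" "$h" "$d" "$channel_host"
    else
        # Neither rule matches; other rewrite rules would apply.
        return 1
    fi
}
```

For example, rewrite_siroe jdoe@host1.siroe.com prints jdoe%host1.siroe.com@tcp_siroe-daemon: the original address is preserved to the left of the @, and the official host name to the right selects the new channel.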
Define a new job controller pool called SMTP_SIROE.
In /msg-svr-base/config/job_controller.cnf add the following:
    [POOL=SMTP_SIROE]
    job_limit=10
This creates a message resource pool called SMTP_SIROE that allows up to 10 jobs to be simultaneously run. Be sure not to leave any blank lines between this pool definition and the others. See 8.7 The Job Controller for details on jobs and pools.
Restart the MTA.
Issue the commands:
    imsimta cnbuild; imsimta restart
This recompiles the configuration and restarts the job controller and dispatcher.
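After the restart, it can be worth confirming that addresses in siroe.com now route to the new channel. The sketch below wraps imsimta test -rewrite in a hypothetical helper, route_uses_channel, and simply searches the output for the channel name. The exact layout of that output varies by version, so treat the search as a heuristic and inspect the full output when in doubt.

```shell
#!/usr/bin/env bash
# Sketch: check that an address is routed to the expected channel.
# route_uses_channel is a hypothetical helper; it relies on the channel name
# appearing somewhere in the output of "imsimta test -rewrite".

route_uses_channel() {
    local addr=$1 channel=$2
    if ! command -v imsimta >/dev/null 2>&1; then
        echo "imsimta not found in PATH" >&2
        return 2
    fi
    # -w matches the channel name as a whole word; -q suppresses output.
    imsimta test -rewrite "$addr" | grep -qw -- "$channel"
}

# Run only when both an address and a channel name are given.
if [ "$#" -ge 2 ]; then
    if route_uses_channel "$1" "$2"; then
        echo "$1 is routed to $2"
    else
        echo "$1 does not appear to be routed to $2"
    fi
fi
```

Called with jdoe@siroe.com and tcp_siroe as arguments, the script reports whether the new rewrite rules are in effect.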
In this example, a large quantity of email from your internal users is destined for a particular remote site called siroe.com. For some reason, siroe.com is temporarily unable to accept incoming SMTP connections, so email addressed to it cannot be delivered. (This type of situation is not a rare occurrence.)
As email destined for siroe.com comes in, the outgoing channel queue, typically tcp_local, will fill up with messages that cannot be delivered. The MTA tries to redeliver these messages periodically (the frequency and number of the retries is configurable using the backoff keywords) and under normal circumstances, no action is needed.
However, if too many messages get stuck in the queue, other messages may not get delivered in a timely manner because all the channel jobs are working to process the backlog of siroe.com messages. In this situation, you may wish to reroute siroe.com messages to a new channel running in its own job controller pool (see 8.7 The Job Controller). This will allow the other channels to deliver their messages without having to contend for processing resources used by siroe.com messages. Creating a new channel to address this situation is described below.