Sun Java System Messaging Server 6.3 Administration Guide

12.8 Attachments and MIME Processing

This section describes keywords that deal with attachments and MIME processing. It consists of the following sections:

12.8.1 Ignoring the Encoding Header Line

Keywords: ignoreencoding, interpretencoding

The MTA can convert various nonstandard message formats to MIME using the Yes CHARSET-CONVERSION. In particular, the RFC 1154 format uses a nonstandard Encoding: header line. However, some gateways emit incorrect information on this header line, with the result that sometimes it is desirable to ignore this header line. The ignoreencoding keyword instructs the MTA to ignore any Encoding: header line.

Note –

Unless the MTA has a CHARSET-CONVERSION enabled, such headers are ignored in any case. The interpretencoding keyword instructs the MTA to pay attention to any Encoding: header line, if otherwise configured to do so, and is the default.

12.8.2 Automatic Defragmentation of Message/Partial Messages

Keywords: defragment, nodefragment

The MIME standard provides the message/partial content type for breaking up messages into smaller parts. This is useful when messages have to traverse networks with size limits, or traverse unreliable networks where message fragmentation can provide a form of “checkpointing,” allowing for less subsequent duplication of effort when network failures occur during message transfer. Information is included in each part so that the message can be automatically reassembled after it arrives at its destination.

The defragment channel keyword and the defragmentation channel provide the means to reassemble messages in the MTA. When a channel is marked defragment, any partial messages queued to the channel are placed in the defragmentation channel queue instead. After all the parts have arrived, the message is rebuilt and sent on its way. The nodefragment disables this special processing. The keyword nodefragment is the default.

12.8.2.1 Defragmentation Channel

Messages are routed to the defragment channel if the defragment keyword is on a destination channel. That is, when the defragment keyword is present on a destination channel to which the MTA would normally enqueue a message, the MTA "looks inside" (MIME parses) the message structure and if the MTA sees that the structure is a MIME message fragment, then the MTA automatically routes the message to the defragment channel rather than straight to the (normal) destination channel.

The defragment database contains information about message fragments coming into the MTA including information indicating the host on which each message fragment is received. Once an initial fragment has been received and noted in the defragment database, any other parts of the message that are received on any other systems using the same defragment database will get routed to the host that received the very first part. For instance:

message/partial; id=123; part=x arrives on host 1 and is routed to the defragment channel on host 1 due to the defragment keyword being present on what would otherwise be the destination/outbound channel.
Defragment channel on host 1 checks the defragment database for whether any other parts of this message have yet arrived. If none have, the defragment channel (on host 1) enters this part into the defragment database marking the part as being on host 1.
message/partial; id=123; part=y arrives on host 2 and is routed to the defragment channel on host 2 due to the defragment keyword being present on what would otherwise be the destination/outbound channel.
Defragment channel on host 2 checks the defragment database, sees that part x of this message is already present and stored on host 1. The defragment channel redirects the message fragment to host 1 (source routes the address with @host1).
message/partial' id=123; part=y arrives on host 1, is routed to the defragment channel, the defragment channel runs and enters it into the database, and so on.

All remaining parts of a fragmented message are redirected to the host that received the first (first receiving, not necessarily part=1) part of the message. They are reassembled by that host's defragment channel and eventually a reassembled, defragmented message (or if the defragmentation efforts times out, due to exceeding notices, the individual fragments are sent on as-is) is sent onwards to the real destination channel. You get some load-balancing of the defragmentation of messages depending upon which host happens to receive the "first" part of each message.

12.8.2.2 Defragmentation Channel Retention Time

Messages are retained in the defragment channel queue only for a limited time. When one half of the time before the first nondelivery notice is sent has elapsed, the various parts of a message will be sent on without being reassembled. This choice of time value eliminates the possibility of a nondelivery notification being sent about a message in the defragment channel queue.

The notices channel keyword controls the amount of time that can elapse before nondelivery notifications are sent, and hence also controls the amount of time messages are retained before being sent on in pieces. Set the notices keyword value to twice the amount of time you wish to retain messages for possible defragmentation. For example, a notices value of 4 would cause retention of message fragments for two days:

defragment notices 4 
DEFRAGMENT-DAEMON

12.8.2.3 Using NFS-based File Systems for Defragmentation and Vacation Caching

NFS-based file systems are often used for defragmentation and vacation caching. One application is to share the defragmentation database between the multiple MTA systems by having them all share the same defragment cache. This is done by making a link from the msg-svr-base/config/defragment_cache on each system to the file on which you wish to have be the shared defragment database, which will be on the shared NFS disk.

In any case, NFS servers that support proper NFS file semantics (specifically those that honor lock requests, like Solaris NFS) can be used for vacation and defragmentation caches. If NFS is used, use a soft mount option. (A hard mount is the default.) Setting a relatively short timeout value controlled by the mount timeo option (see the mount_nfs(1M) man page) is also a good idea.

With an NFS hard mount and NFS going down, what one would see is the defragment channels on the various systems hanging. With a soft mount, the defragment channels wouldn't hang, but since they would be failing to open the defragment cache, they wouldn't be able to coordinate with defragment channels on other hosts. In the unlikely case that all of a message's fragments happened to first arrive at the same host, then that host's defragment channel would presumably be able to reassemble the message and send it onwards, properly reassembled. More likely, fragments would be on different hosts, would never get reassembled, and would get sent onwards as separate fragments once a relevant defragment channel's retention time expired.

12.8.3 Automatic Fragmentation of Large Messages

Keywords: maxblocks, maxlines

Some email systems or network transfers cannot handle messages that exceed certain size limits. The MTA provides facilities to impose such limits on a channel-by-channel basis. Messages larger than the set limits are automatically split (fragmented) into multiple, smaller messages. The content type used for such fragments is message/partial, and a unique ID parameter is added so that parts of the same message can be associated with one another and, possibly, be automatically reassembled by the receiving mailer.

The maxblocks and maxlines keywords are used to impose size limits beyond which automatic fragmentation are activated. Both of these keywords must be followed by a single integer value. The keyword maxblocks specifies the maximum number of blocks allowed in a message. An MTA block is normally 1024 bytes; this can be changed with the BLOCK_SIZE option in the MTA option file. The keyword maxlines specifies the maximum number of lines allowed in a message. These two limits can be imposed simultaneously if necessary.

Message headers are, to a certain extent, included in the size of a message. Because message headers cannot be split into multiple messages, and yet they themselves can exceed the specified size limits, a rather complex mechanism is used to account for message header sizes. This logic is controlled by the MAX_HEADER_BLOCK_USE and MAX_HEADER_LINE_USE options in the MTA option file.

MAX_HEADER_BLOCK_USE is used to specify a real number between 0 and 1. The default value is 0.5. A message's header is allowed to occupy this much of the total number of blocks a message can consume (specified by the maxblocks keyword). If the message header is larger, the MTA takes the product of MAX_HEADER_BLOCK_USE and maxblocks as the size of the header (the header size is taken to be the smaller of the actual header size and maxblocks) * MAX_HEADER_BLOCK_USE.

For example, if maxblocks is 10 and MAX_HEADER_BLOCK_USE is the default, 0.5, any message header larger than 5 blocks is treated as a 5-block header, and if the message is 5 or fewer blocks in size it is not fragmented. A value of 0 causes headers to be effectively ignored insofar as message-size limits are concerned.

A value of 1 allows headers to use up all of the size that's available. Each fragment always contains at least one message line, regardless of whether or not the limits are exceeded by this. MAX_HEADER_LINE_USE operates in a similar fashion in conjunction with the maxlines keyword.

12.8.4 Imposing Message Line Length Restrictions

Keywords: linelength

The SMTP specification allows for lines of text containing up to 1000 bytes. However, some transfers may impose more severe restrictions on line length. The linelength keyword provides a mechanism for limiting the maximum permissible message line length on a channel-by-channel basis. Messages queued to a given channel with lines longer than the limit specified for that channel are automatically encoded.

The various encodings available in the MTA always result in a reduction of line length to fewer than 80 characters. The original message may be recovered after such encoding is done by applying an appropriating decoding filter.

Note –

Encoding can only reduce line lengths to fewer than 80 characters. Specification of line length values less than 80 may not actually produce lines with lengths that comply with the stated restriction.

The linelength keyword causes encoding of data to perform “soft” line wrapping for transport purposes. The encoding is normally decoded at the receiving side so that the original “long” lines are recovered. For “hard” line wrapping, see the “Record, text” in Table 13–7.

12.8.5 Interpreting Content-transfer-encoding Fields on Multiparts and Message/RFC822 Parts

Keywords: interpretmultipartencoding, ignoremultipartencoding, interpretmessageencoding, ignoremessageencoding

The MIME specification prohibits the use of a content-transfer-encoding other than 7–bit, 8–bit, and binary on multipart or message/rfc822 parts. It has long been the case that some agents violate the specification and encode multiparts and message/rfc822 objects. Accordingly, the MTA has code to accept such encodings and remove them. However, recently a different standards violation has shown up, one where a content-transfer-encoding field is present with a value of quoted-printable or base63, but the part isn't actually encoded. If the MTA tries to decode such a message the result is typically a blank messages, which is pretty much what you'd expect.

Messages with this problem have become sufficiently prevalent that two new pairs of channel keywords have been added to deal with the problem - interpretation of content-transfer-encoding fields on multiparts and message/rfc822 parts can be enabled or disabled. The first pair is interpretmultipartencoding and ignoremultipartencoding and the second is interpretmessageencoding and ignoremessageencoding. The defaults are interpretmultipartencoding and interpretmessageencoding.