13.5.3 To Control Conversion Processing (Sun Java System Messaging Server 6.3 Administration Guide)

13.5.3 To Control Conversion Processing

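This section describes how to control conversion processing. Its subsections cover the conversion channel information flow, environmental variables, output options, headers in an enclosing MESSAGE/RFC822 part, and mapping table callouts.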

When a message is sent to the conversion channel, it is processed body part by body part. Processing is controlled by the MTA conversions file, which is specified by the IMTA_CONVERSION_FILE option in the imta_tailor file (default: msg-svr-base/conversions). The conversions file consists of line-separated entries that 1) qualify which types of body parts will be processed, and 2) specify how they will be processed.

Each entry consists of one or more lines containing one or more name=value parameter clauses, where the name in each parameter clause is one of the parameters in Table 11-5. The values in the parameter clauses conform to MIME conventions. Every line except the last must end with a semicolon (;). A physical line in this file is limited to 252 characters. You can split a logical line into multiple physical lines using the backslash (\) continuation character. Entries are terminated either by a line that does not end in a semicolon, one or more blank lines, or both.

Below is a simple example of a conversion file entry:

Example 13–1 `conversions` File Entry

out-chan=ims-ms; in-type=application; in-subtype=wordperfect5.1;
  out-type=application; out-subtype=msword; out-mode=block;
  command="/usr/bin/convert -in=wordp -out=msword 'INPUT_FILE' 'OUTPUT_FILE'"

The clauses out-chan=ims-ms; in-type=application; in-subtype=wordperfect5.1 qualify the body part. That is, they specify the type of part to be converted. The header of each part is read and its Content-Type: and other header information is extracted. The entries in the conversion file are then scanned in order from first to last; any in-* parameters present, and the OUT-CHAN parameter, if present, are checked. If all of these parameters match the corresponding information for the body part being processed, then the conversion specified by the command= or delete= clause is performed, and the out-* parameters are set.

If no match occurs, then the part is matched against the next conversions file entry. Once all body parts have been scanned and processed (assuming there is a qualifying match), then the message is sent onwards to the next channel. If there are no matches, no processing occurs, and the message is sent to the next channel.
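The in-order, first-match scan described above can be sketched in shell. This is a simplified model rather than the MTA's implementation: each entry is reduced to a hypothetical "in-type in-subtype action" triple, only the type and subtype are checked, and the names ENTRIES and match_part are illustrative. Case-insensitive comparison is assumed here because MIME type and subtype names are case-insensitive.

```sh
#!/bin/sh
# Simplified model of scanning conversions entries from first to last.
# ENTRIES is a hypothetical stand-in for the conversions file: one entry per
# line, written as "in-type in-subtype action". "*" matches any value.
ENTRIES='application wordperfect5.1 convert
application * scan'

# match_part TYPE SUBTYPE: print the action of the first matching entry.
# Prints nothing when no entry matches (the part is left unprocessed).
match_part() {
    # Normalize to lower case before comparing.
    p_type=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    p_subtype=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    printf '%s\n' "$ENTRIES" | while read -r e_type e_subtype action; do
        # An unquoted variable in a case pattern is treated as a glob,
        # so "*" in an entry matches any type or subtype.
        case $p_type in $e_type) ;; *) continue ;; esac
        case $p_subtype in $e_subtype) ;; *) continue ;; esac
        printf '%s\n' "$action"
        break   # first match wins; later entries are not consulted
    done
}
```

With these two entries, a part labeled APPLICATION/wordperfect5.1 selects the first entry even though the second would also match, which is why more specific entries belong earlier in the file.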

out-chan=ims-ms specifies that only message parts destined for the ims-ms channel will be converted. in-type=application and in-subtype=wordperfect5.1 specify that the MIME Content-type header for the message part must be application/wordperfect5.1.

Message parts can be further qualified with additional in-* parameters. (See Table 13–6.) The entry above will trigger conversion actions on a message part which has the following MIME header lines:

Content-type: APPLICATION/wordperfect5.1;name=Draft1.wpc
Content-transfer-encoding: BASE64
Content-disposition: attachment; filename=Draft1.wpc
Content-description: "Project documentation Draft1 wordperfect format"

After the three qualifying parameters in Example 13–1, the next two parameters, out-type=application and out-subtype=msword, specify replacement MIME header lines to be attached to the “processed” body part: the MIME Content-type of the outgoing body part becomes application/msword.

Note that because the in-type and out-type parameters are the same, out-type=application is not necessary: the conversion channel defaults to the original MIME labels for outgoing body parts. Additional MIME labels for outgoing body parts can be specified with additional output parameters.

out-mode=block (Example 13–1) specifies the file type that the site-supplied program will return. In other words, it specifies how the file will be stored and how the conversion channel should read back the returned file. For example, an HTML file is stored in text mode, while an .exe program or a zip file is stored in block/binary mode. The mode describes the storage format of the file being read.

The final parameter in Example 13–1 specifies the action to take on the body part:

command="/usr/bin/convert -in=wordp -out=msword 'INPUT_FILE' 'OUTPUT_FILE'"

The command= parameter specifies that a program will execute on the body part. /usr/bin/convert is the hypothetical command name; -in=wordp and -out=msword are hypothetical command line arguments specifying the format of the input text and output text; INPUT_FILE and OUTPUT_FILE are conversion channel environmental variables (see 13.5.3.2 To Use Conversion Channel Environmental Variables) that name the file containing the original body part and the file where the program should store its converted body part.
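As a rough illustration of what a site-supplied program looks like, the following sketch reads the original body part and writes a converted one. It assumes the file names arrive as the two command-line arguments, as in Example 13–1. The script path, the function name convert_part, and the tr-based "conversion" are placeholders, not part of the product.

```sh
#!/bin/sh
# Hypothetical site-supplied converter, invoked from a conversions entry such as
#   command="/usr/local/bin/convert-part.sh 'INPUT_FILE' 'OUTPUT_FILE'"
# The path above and the conversion below are illustrative placeholders.

# convert_part IN OUT: read the original body part from IN and write the
# converted body part to OUT. Returns non-zero if IN cannot be read.
convert_part() {
    [ -r "$1" ] || return 1
    # Stand-in for a real format converter: strip carriage returns.
    tr -d '\r' < "$1" > "$2"
}

# Run only when both file names were supplied on the command line.
if [ "$#" -eq 2 ]; then
    convert_part "$1" "$2"
fi
```

A real converter would replace the tr call with the actual transformation. The overall shape stays the same: read the input file, write the output file.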

Note –

Envelope originator and recipient information is now provided as x-envelope-from and x-envelope-to fields respectively when a file containing the outer message header is requested by a regular conversion entry.

Instead of executing a command on the body part, the message part can simply be deleted by substituting DELETE=1 in place of the command parameter.

Note –

Whenever the conversions file is modified, you must recompile the configuration (see 10.1 Compiling the MTA Configuration).

13.5.3.1 Conversion Channel Information Flow

The flow of information is as follows: a message containing body parts comes into the conversion channel. The conversion channel parses the message, and processes the parts one by one. The conversion channel then qualifies the body part, that is, it determines if it should be processed or not by comparing its MIME header lines to the qualifying parameters (Table 13–6). If the body part qualifies, the conversion processing commences.

If MIME or body part information is to be passed to the conversion script, it is stored in an environmental variable (13.5.3.2 To Use Conversion Channel Environmental Variables) as specified by information passing parameters (Table 13–6).

At this point, an action specified by an action parameter (Table 13–6) is taken on the body part. Typically the action is that the body part be deleted or that it be passed to a program wrapped in a script. The script processes the body part and then sends it back to the conversion channel for reassembling into the post-processed message. The script can also send information to the conversion channel by using the conversion channel output options (Table 13–4). This can be information such as new MIME header lines to add to the output body part, error text to be returned to the message sender, or special directives instructing the MTA to initiate some action such as bouncing, deleting, or holding a message.

Finally, the conversion channel replaces the header lines for the output body part as specified by the output parameters (Table 13–6).

13.5.3.2 To Use Conversion Channel Environmental Variables

When operating on message body parts, it is often useful to pass MIME header line information, or entire body parts, to and from the site-supplied program. For example, a program may require Content-type and Content-disposition header line information as well as a message body part. Typically a site-supplied program’s main input is a message body part which is read from a file. After processing the body part, the program will need to write it to a file from which the conversion channel can read it. This type of information passing is done by using conversion channel environmental variables.

Environmental variables can be created in the conversions file using the parameter-symbol-* parameter or by using a set of pre-defined conversion channel environmental variables (see Table 13–3).

The following conversions file entry and incoming header show how to pass MIME information to the site-supplied program using environment variables.

conversions file entry:

in-channel=*; in-type=application; in-subtype=*;
  parameter-symbol-0=NAME; parameter-copy-0=*;
  dparameter-symbol-0=FILENAME; dparameter-copy-0=*;
  message-header-file=2; original-header-file=1;
  override-header-file=1; override-option-file=1;
  command="/bin/viro-scan500.sh 'INPUT_FILE' 'OUTPUT_FILE'"

Incoming header:

Content-type: APPLICATION/msword; name=Draft1.doc
Content-transfer-encoding: BASE64
Content-disposition: attachment; filename=Draft1.doc
Content-description: "Project documentation Draft1 msword format"

in-channel=*; in-type=application; in-subtype=* specify that a message body part from any input channel of type application will be processed.

parameter-symbol-0=NAME specifies that the value of the Content-type parameter name, if present (Draft1.doc in our example), be stored in an environment variable called NAME.

parameter-copy-0=* specifies that all Content-type parameters of the input body part be copied to the output body part.

dparameter-symbol-0=FILENAME specifies that the value of the Content-disposition parameter filename (Draft1.doc in our example) be stored in an environment variable called FILENAME.

dparameter-copy-0=* specifies that all Content-disposition parameters of the input body part be copied to the output body part.

message-header-file=2 specifies that the original header of the message as a whole (the outermost message header) be written to the file specified by the environment variable MESSAGE_HEADERS.

original-header-file=1 specifies that the original header lines of the enclosing MESSAGE/RFC822 part be written to the file specified by the environment variable INPUT_HEADERS.

override-header-file=1 specifies that MIME headers are read from the file specified by the environmental variable OUTPUT_HEADERS, overriding the original MIME header lines in the enclosing MIME part. $OUTPUT_HEADERS is an on-the-fly temporary file created at the time conversion runs. A site-supplied program would use this file to store MIME header lines changed during the conversion process. The conversion channel would then read the MIME header lines from this file when it reassembles the body part. Note that only MIME header lines can be modified. Other general, non-MIME header lines cannot be altered by the conversion channel.

override-option-file=1 specifies that the conversion channel read conversion channel options from the file named by the OUTPUT_OPTIONS environmental variable. See 13.5.3.3 To Use Conversion Channel Output Options.

command="msg-svr-base/bin/viro-scan500.sh" specifies the command to execute on the message body part.
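A site-supplied script for an entry like the one above might use these variables as sketched below. This is an assumption-laden illustration, not the contents of viro-scan500.sh: the function name write_override_headers and the replacement Content-description text are invented for the example. The only behavior taken from this guide is that the OUTPUT_HEADERS file holds actual MIME header lines followed by a blank line.

```sh
#!/bin/sh
# Sketch of header handling in a site-supplied script. FILENAME is set by
# dparameter-symbol-0=FILENAME, and OUTPUT_HEADERS is available because the
# entry sets override-header-file=1. The description text is illustrative.

# write_override_headers FILE NAME: write a replacement MIME header line to
# FILE, followed by the blank line that ends the header block.
write_override_headers() {
    {
        printf 'Content-description: "Scanned copy of %s"\n' "$2"
        printf '\n'
    } > "$1"
}

# Act only when the conversion channel has supplied both variables.
if [ -n "${OUTPUT_HEADERS:-}" ] && [ -n "${FILENAME:-}" ]; then
    write_override_headers "$OUTPUT_HEADERS" "$FILENAME"
fi
```

Only MIME header lines such as Content-description belong in this file. General message header lines written there would not be applied.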

Table 13–3 Conversion Channel Environment Variables


Environment Variable	Description
`ATTACHMENT_NUMBER`	Attachment number for the current part. This has the same format as the ATTACHMENT-NUMBER conversion match parameter.
`CONVERSION_TAG`	The current list of active conversion tags. This corresponds to the TAG conversion match parameter.
`INPUT_CHANNEL`	The channel that enqueued the message to the conversion channel. This corresponds to the IN-CHANNEL conversion match parameter.
`INPUT_ENCODING`	Encoding originally present on the body part.
`INPUT_FILE`	Name of the file containing the original body part. The site-supplied program should read this file.
`INPUT_HEADERS`	Name of the file containing the original header lines for the body part. The site-supplied program should read this file.
`INPUT_TYPE`	MIME `Content-type` of the input message part.
`INPUT_SUBTYPE`	MIME content subtype of the input message part.
`INPUT_DESCRIPTION`	MIME `content-description` of the input message part.
`INPUT_DISPOSITION`	MIME `content-disposition` of the input message part.
`MESSAGE_HEADERS`	Name of the file containing the original outermost header for an enclosing message (not just the body part) or the header for the part’s most immediately enclosing MESSAGE/RFC822 part. The site-supplied program should read this file.
`OUTPUT_CHANNEL`	The channel the message is headed for. This corresponds to the OUT-CHANNEL conversion match parameter.
`OUTPUT_FILE`	Name of the file where the site-supplied program should store its output. The site-supplied program should create and write this file.
`OUTPUT_HEADERS`	Name of the file where the site-supplied program should store MIME header lines for an enclosing part. The site-supplied program should create and write this file. Note that the file should contain actual MIME header lines (not `option=value` lines) followed by a blank line as its final line. Note also that only MIME header lines can be modified. Other general, non-MIME header lines cannot be altered by the conversion channel.
`OUTPUT_OPTIONS`	Name of the file to which the site-supplied program should write conversion channel options. The conversion channel reads the options from this file. See 13.5.3.3 To Use Conversion Channel Output Options.
`PART_NUMBER`	The part number for the current part. This has the same format as the PART-NUMBER conversion match parameter.
`PART_SIZE`	The size in bytes of the part being processed.
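The variables in Table 13–3 are ordinary environment variables as far as a shell script is concerned. The following sketch prints a one-line summary of the part being processed, which can be handy when debugging a conversions entry. The function name describe_part and the output layout are illustrative choices.

```sh
# describe_part: print a one-line summary built from the conversion channel
# environment variables. Any variable that is unset or empty prints as "-".
describe_part() {
    printf 'part=%s type=%s/%s size=%s from=%s to=%s\n' \
        "${PART_NUMBER:--}" "${INPUT_TYPE:--}" "${INPUT_SUBTYPE:--}" \
        "${PART_SIZE:--}" "${INPUT_CHANNEL:--}" "${OUTPUT_CHANNEL:--}"
}
```

A script could append this line to a log file of the site's choosing before it processes INPUT_FILE.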

Mail Conversion Tags

Mail conversion tags are special tags which are associated with a particular recipient or sender. When a message is being delivered, the tag is visible to the conversion channel program, which may make use of it for special processing. Conversion tags are stored in the LDAP directory.

Mail conversion tags could be used as follows: the administrator can set up selected users with a mail conversion tag value of harmonica. The administrator then sets up a conversion channel program which, when processing mail for those users, detects the presence of the tag and the value harmonica. When that happens, the program performs some site-defined function.

Mail conversion tags can be set on a per user or a per domain basis. The recipient LDAP attribute at the domain level is MailDomainConversionTag (modifiable with the MTA option LDAP_DOMAIN_ATTR_CONVERSION_TAG). At the user level it is MailConversionTag (modifiable with the MTA option LDAP_CONVERSION_TAG). Both of these attributes can be multivalued with each value specifying a different tag. The set of tags associated with a given recipient is cumulative, that is, tags set at the domain level are combined with tags set at the user level.

Sender-based conversion tags can be set with the MTA options LDAP_SOURCE_CONVERSION_TAG and LDAP_DOMAIN_ATTR_SOURCE_CONVERSION_TAG, which specify user-level and domain-level LDAP attributes, respectively, for conversion tags associated with the source address. There is no default attribute for either of these options.

Two new actions are available to system Sieves: addconversiontag and setconversiontag. Both accept a single argument: a string or list of conversion tags. addconversiontag adds the conversion tag(s) to the current list of tags, while setconversiontag empties the existing list before adding the new ones. Note that these actions are performed very late in processing, so setconversiontag can be used to undo all other conversion tag setting mechanisms. These actions allow you to set conversion tags from Sieve filters.

The Sieve envelope test accepts conversiontag as an envelope field specifier value. The test checks the current list of tags, one at a time. Note that the :count modifier, if specified, allows checking of the number of active conversion tags. This type of envelope test is restricted to system Sieves. Also note that this test only sees the set of tags that were present prior to Sieve processing—the effects of setconversiontag and addconversiontag actions are not visible.

Including Conversion Tag Information in Various Mapping Probes

A new MTA option, INCLUDE_CONVERSIONTAG, has been added to selectively enable the inclusion of conversion tag information in various mapping probes. This is a bit-encoded value. The bit assignments are shown in the table below. In all cases the current set of tags appears in the probe as a comma-separated list.

Position	Value	Mapping
0	1	`CHARSET_CONVERSION` - added as `;TAG=` field before `;CONVERT`.
1	2	`CONVERSION` - added as `;TAG=` field before `;CONVERT`.
2	4	`FORWARD` - added just before current address (| delim).
3	8	`ORIG_SEND_ACCESS` - added at end of probe (| delim).
4	16	`SEND_ACCESS` - added at end of probe (| delim).
5	32	`ORIG_MAIL_ACCESS` - added at end of probe (| delim).
6	64	`MAIL_ACCESS` - added at end of probe (| delim).
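Because INCLUDE_CONVERSIONTAG is bit-encoded, the value to configure is the bitwise OR of the values for each mapping you want to enable. The helper below is an illustrative sketch (the function name is not part of the product) that combines bit positions from the table into that integer.

```sh
# include_conversiontag_value POSITION...: combine bit positions from the
# table above into the integer to assign to INCLUDE_CONVERSIONTAG.
include_conversiontag_value() {
    value=0
    for position in "$@"; do
        # Set the bit for this position (position 1 -> 2, position 4 -> 16).
        value=$(( value | (1 << position) ))
    done
    printf '%s\n' "$value"
}
```

For example, enabling tag information in the CONVERSION (position 1) and SEND_ACCESS (position 4) probes means calling include_conversiontag_value 1 4, which prints 18.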

13.5.3.3 To Use Conversion Channel Output Options

Conversion channel output options (Table 13–4) are dynamic variables used to pass information and special directives from the conversion script to the conversion channel. For example, during body part processing the script may want to send a special directive asking the conversion channel to bounce the message and to add some error text to the returned message stating that the message contained a virus.

The output options are enabled by setting OVERRIDE-OPTION-FILE=1 in the desired conversion entry. The script then sets output options as needed by writing them to the file named by the OUTPUT_OPTIONS environmental variable. When the script has finished processing the body part, the conversion channel reads the options from the OUTPUT_OPTIONS file.

The OUTPUT_OPTIONS variable is the name of the file from which the conversion channel reads options. Typically it is used as an on-the-fly temporary file to pass information. The example below shows a script that uses output options to return an error message to a sender who mailed a virus.

/usr/local/bin/viro_screen2k "$INPUT_FILE"   # run the virus screener

if [ $? -eq 1 ]; then
   # Virus found: set the bounce text and force the message to be returned
   echo "OUTPUT_DIAGNOSTIC='Virus found and deleted.'" > "$OUTPUT_OPTIONS"
   echo "STATUS=178029946" >> "$OUTPUT_OPTIONS"
else
   cp "$INPUT_FILE" "$OUTPUT_FILE"   # Message part is OK
fi

In this example, the system diagnostic message and status code are added to the file defined by $OUTPUT_OPTIONS. If you read the $OUTPUT_OPTIONS temporary file out you would see something like:

OUTPUT_DIAGNOSTIC='Virus found and deleted.'
STATUS=178029946

The line OUTPUT_DIAGNOSTIC='Virus found and deleted.' tells the conversion channel to include the text "Virus found and deleted." in the message returned to the sender.

178029946 is the PMDF__FORCERETURN status defined in the pmdf_err.h file, which is found at msg-svr-base/include/deprecated/pmdf_err.h. This status code directs the conversion channel to bounce the message back to the sender. (For more information on using special directives, refer to 13.5.4 To Bounce, Delete, Hold, Retry Messages Using the Conversion Channel Output.)

A complete list of the output options is shown below.

Table 13–4 Conversion Channel Output Options


Option	Description
`OUTPUT_TYPE`	MIME content type of the output message part.
`OUTPUT_SUBTYPE`	MIME content subtype of the output message part.
`OUTPUT_DESCRIPTION`	MIME content description of the output message part.
`OUTPUT_DIAGNOSTIC`	Text to include as part of the message sent to the sender if a message is forcibly bounced by the conversion channel.
`OUTPUT_DISPOSITION`	MIME `content-disposition` of the output message part.
`OUTPUT_ENCODING`	MIME content transfer `encoding` to use on the output message part.
`OUTPUT_MODE`	MIME `Mode` with which the conversion channel should write the output message part, hence the mode with which recipients should read the output message part.
`STATUS`	Exit status for the converter. This is typically a special directive initiating some action by the conversion channel. A complete list of directives can be viewed in `msg-svr-base/include/deprecated/pmdf_err.h`.
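Output options are not limited to bouncing messages. A script can also relabel the output body part. The sketch below writes option=value lines for three of the options in Table 13–4. The function name write_output_options is illustrative, and quoting the description in single quotes is an assumption modeled on the OUTPUT_DIAGNOSTIC example in this section.

```sh
# write_output_options FILE TYPE SUBTYPE DESCRIPTION: write option=value
# lines that relabel the output body part. FILE would normally be the file
# named by the OUTPUT_OPTIONS environment variable.
write_output_options() {
    {
        printf 'OUTPUT_TYPE=%s\n' "$2"
        printf 'OUTPUT_SUBTYPE=%s\n' "$3"
        printf "OUTPUT_DESCRIPTION='%s'\n" "$4"
    } > "$1"
}
```

Inside a conversion script the call would look like write_output_options "$OUTPUT_OPTIONS" application msword "Converted draft", and the conversion entry must set OVERRIDE-OPTION-FILE=1 for the file to be read.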

13.5.3.4 Headers in an Enclosing MESSAGE/RFC822 Part

When performing conversions on a message part, the conversion channel has access to the header in an enclosing MESSAGE/RFC822 part, or to the message header if there is no enclosing MESSAGE/RFC822 part. Information in the header may be useful for the site-supplied program.

If an entry is selected that has ORIGINAL-HEADER-FILE=1, then all the original header lines of the enclosing MESSAGE/RFC822 part are written to the file represented by the INPUT_HEADERS environment variable. If OVERRIDE-HEADER-FILE=1, then the conversion channel will read the contents of the file represented by the OUTPUT_HEADERS environment variable and use them as the header on that enclosing part.
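A site-supplied program can pull individual fields out of the header file it is given. The sketch below is a deliberately simple reader: the function name header_value is illustrative, it returns only the first occurrence of a field, and it does not unfold continuation lines.

```sh
# header_value FILE FIELD: print the value of the first FIELD header line in
# FILE. Field names are compared case-insensitively. Folded continuation
# lines are not joined in this sketch.
header_value() {
    field=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    awk -v field="$field" '
        /^$/ { exit }    # a blank line ends the header
        {
            colon = index($0, ":")
            if (colon == 0) next
            name = tolower(substr($0, 1, colon - 1))
            if (name == field) {
                value = substr($0, colon + 1)
                sub(/^[ \t]+/, "", value)    # drop leading whitespace
                print value
                exit
            }
        }
    ' "$1"
}
```

For instance, header_value "$INPUT_HEADERS" Subject would print the subject recorded in the original header file listed in Table 13–3.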

13.5.3.5 To Call Out to a Mapping Table from a Conversion Entry

out-parameter-* values may be stored and retrieved in an arbitrarily named mapping table. This feature is useful for renaming attachments sent by clients that send all attachments with a generic name like att.dat regardless of whether they are postscript, msword, text or whatever. This is a generic way to relabel the part so that other clients (Outlook for example) will be able to open the part by reading the extension.

The syntax for retrieving a parameter value from a mapping table is as follows:

'mapping-table-name:mapping-input[$Y, $N]'

$Y returns a parameter value. If there is no match found or the match returns $N, then that parameter in the conversions file entry is ignored or treated as a blank string. Lack of a match or a $N does not cause the conversion entry itself to be aborted.

Consider the following mapping table:

X-ATT-NAMES

   postscript       temp.PS$Y
   wordperfect5.1   temp.WPC$Y
   msword           temp.DOC$Y

The following conversion entry for the above mapping table results in substituting generic file names in place of specific file names on attachments:

out-chan=tcp_local; in-type=application; in-subtype=*;
   in-parameter-name-0=name; in-parameter-value-0=*;
   out-type=application; out-subtype='INPUT-SUBTYPE';
   out-parameter-name-0=name;
   out-parameter-value-0="'X-ATT-NAMES:\\'INPUT_SUBTYPE\\''";
   command="cp 'INPUT_FILE' 'OUTPUT_FILE'"

In the example above, out-chan=tcp_local; in-type=application; in-subtype=* specifies that a message part to be processed must be destined for the tcp_local channel and have a Content-type header of application/* (* specifies that any subtype will do).

in-parameter-name-0=name; in-parameter-value-0=* additionally specifies that the message must have a content-type parameter called name=* and that any value for that parameter will be accepted (again, * specifies that any parameter value would do.)

out-type=application; specifies that the MIME Content-type parameter for the post-processing message be application.

out-subtype='INPUT-SUBTYPE'; specifies that the MIME subtype parameter for the post-processing body part be the INPUT-SUBTYPE environmental variable, which is the original value of the input subtype. Thus, if you wanted to change

Content-type: application/xxxx; name=foo.doc

to

Content-type: application/msword; name=foo.doc

then you would use

out-type=application; out-subtype=msword

out-parameter-name-0=name; specifies that the output body part will have a MIME Content-type name= parameter.

out-parameter-value-0="'X-ATT-NAMES:\\'INPUT_SUBTYPE\\''"; says to take the value of the INPUT_SUBTYPE variable (that is, the original Content-type subtype of the original body part) and search the mapping table X-ATT-NAMES. If a match is found, the Content-type parameter specified by out-parameter-name-0 (that is, name) receives the new value specified in the X-ATT-NAMES mapping table. Thus, if the original subtype was msword, the value of the name parameter will be temp.DOC.
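The effect of the callout can be modeled outside the MTA with a small shell function. This is only a sketch of the lookup semantics described in this section, not how the MTA evaluates mapping tables. The function name x_att_names is illustrative, and matching here is exact.

```sh
# x_att_names SUBTYPE: model of probing the X-ATT-NAMES mapping table.
# Prints the replacement name on a match. Prints nothing when there is no
# match, mirroring how an unmatched callout is treated as a blank string.
x_att_names() {
    case $1 in
        postscript)     printf '%s\n' 'temp.PS' ;;
        wordperfect5.1) printf '%s\n' 'temp.WPC' ;;
        msword)         printf '%s\n' 'temp.DOC' ;;
    esac
}
```

Calling x_att_names msword prints temp.DOC, the value the name parameter would receive. An unlisted subtype prints nothing, in which case the parameter is ignored and the conversion entry still runs.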