Sun Java System Messaging Server 6 2005Q4 MTA Developer's Reference

Chapter 5 Decoding Messages

The MTA has facilities for parsing and decoding single and multipart messages formatted using the MIME Internet messaging format. Additionally, these facilities can convert messages with other formats to MIME: for example, messages with BINHEX or UUENCODE data, messages in the RFC 1154 format, and messages in many proprietary formats. The mtaDecodeMessage() routine provides access to these facilities, parsing either a queued message or a message from an arbitrary source such as a disk file or a data stream.

This chapter discusses the following topics:

Usage Modes for mtaDecodeMessage()

There are two usage modes for mtaDecodeMessage(). In the first mode, messages are simply parsed, any encoded content is decoded, and each resulting atomic message part is presented to an inspection routine. This mode is primarily of use to channels that interface the MTA to non-Internet mail systems such as SMS and X.400. The second mode allows the message to be rewritten after inspection. The output destination for this rewriting may be either the MTA channel queues or an arbitrary destination reached through a caller-supplied output routine. During the inspection process in this second mode, individual atomic message parts may be discarded or replaced with text. This mode is primarily of use to intermediate processing channels that need to scan message content or perform content conversions, such as virus scanners and encryption software. A Simple Decoding Example illustrates the first usage mode, while A Simple Virus Scanner Example illustrates the second.

For the first usage mode, the calling routine must supply the following items:

  1. An input source for the message.

  2. An inspection routine which will be passed each atomic message part of the parsed and decoded message.

For the second usage mode, the calling routine must supply the same two items as for the first usage mode and, in addition, a third item:

  3. An output destination to direct the resulting message to.

The input source can be either a queued message file, represented by a dequeue context, or it can be provided by a caller-supplied input routine. Use the former when processing queued messages and the latter when processing data from disk files, data streams, or other arbitrary input sources. Since the parser and decoder require only a single, sequential pass over their input data, it is possible to stream data to mtaDecodeMessage().

The output destination can be a message being enqueued and represented either by an enqueue context, or by a caller-supplied output routine. Use an enqueue context when submitting the message to the MTA. In all other cases, use a caller-supplied output routine.

The following are some common usage cases and their associated input sources and output destinations.

The Input Source

The message to be decoded is provided as either a dequeue context or a caller-supplied routine.

Dequeue Context

When using a dequeue context, you must observe the following:

  1. Pass the dequeue context from mtaDequeueStart() to mtaDecodeMessage() along with the MTA_DECODE_DQ item code.

  2. The recipient list of the message being dequeued must have already been read by mtaDequeueRecipientNext() before calling mtaDecodeMessage().

  3. mtaDequeueMessageFinish() must not yet have been called for the dequeue context.

After using a dequeue context with mtaDecodeMessage(), further calls to mtaDequeueRecipientNext() can’t be made. Calls to mtaDequeueLineNext() can only be performed after a call to mtaDequeueRewind().

Caller-Supplied Input Routine

To use a caller-supplied input routine, pass the address of the input routine along with the MTA_DECODE_PROC item code to mtaDecodeMessage(). In Example 5–1, the caller-supplied routine's name is decode_read().

When using a caller-supplied input routine, each block of data returned by the routine must be a single line of the message. This is the default expectation of mtaDecodeMessage() and corresponds to the MTA_TERM_NONE item code. If, instead, one of the MTA_TERM_CR, MTA_TERM_CRLF, MTA_TERM_LF, or MTA_TERM_LFCR item codes is specified, then the block of data need not correspond to a single, complete line of message data; it may be a portion of a line, multiple lines, or even the entire message.

On each successful call, the input routine should return a status code of zero (MTA_OK). When there is no more message data to provide, then the input routine should return MTA_EOF. The call that returns the last byte of data should return zero; it is the subsequent call that must return MTA_EOF. In the event of an error, the input routine should return a non-zero status code other than MTA_EOF (for example, MTA_NO). This terminates the message parsing process and mtaDecodeMessage() returns an error.
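The return-status protocol described above can be sketched without the SDK. The following is a minimal illustration, not part of the shipped examples, of an input routine that hands out one line per call from an in-memory buffer, which is the behavior expected by default (MTA_TERM_NONE). The names DEMO_OK, DEMO_EOF, DEMO_NO, line_source_t, and demo_read_line() are hypothetical stand-ins; a real routine would return MTA_OK, MTA_EOF, and MTA_NO from mtasdk.h. The parameter list mirrors that of decode_read() in Example 5–1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the SDK status codes MTA_OK, MTA_EOF and
 * MTA_NO; the real values come from mtasdk.h. */
enum { DEMO_OK = 0, DEMO_EOF = 1, DEMO_NO = 2 };

/* Private context: current position in an in-memory, LF-delimited message. */
typedef struct {
    const char *cur;
    const char *end;
} line_source_t;

/* Return the next line, without its LF terminator, on each call. */
static int demo_read_line(void *ctx, const char **line, size_t *line_len)
{
    line_source_t *src = (line_source_t *)ctx;
    const char *nl;

    if (!src || !line || !line_len)
        return DEMO_NO;              /* error: terminates parsing */
    if (src->cur >= src->end)
        return DEMO_EOF;             /* only after the last data was returned */

    nl = memchr(src->cur, '\n', (size_t)(src->end - src->cur));
    *line = src->cur;
    *line_len = (size_t)((nl ? nl : src->end) - src->cur);
    src->cur = nl ? nl + 1 : src->end;
    return DEMO_OK;                  /* zero, even for the final line */
}
```

Note that the call returning the last line still reports success; the end-of-data status is returned only by the call after it.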

The Inspection Routine

Whenever mtaDecodeMessage() is called, an “inspection” routine must be supplied by the caller. In Example 5–1, the inspection routine’s name is decode_inspect().

As the message is parsed and decoded, mtaDecodeMessage() presents each atomic message part to the inspection routine one line at a time. The presentation begins with the part’s header lines. Once all of the header lines have been presented, the lines of content are presented.

So that the inspection routine can tell whether it is being presented with a line from the header or from the content of the message, a data type indicator is supplied to the inspection routine each time it is called. For lines of the message’s content, the data type indicator discriminates between text and binary content. Text content is any content with a MIME content type of text or message (for example, text/plain, text/html, message/rfc822), while binary content is content of any other MIME content type (for example, application, image, and audio).
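The text versus binary rule stated above can be written down as a small predicate. This is only an illustration of the rule, not the MTA's implementation; in a real inspection routine the data type indicator already carries the distinction. The names equals_ignore_case() and demo_is_text_type() are hypothetical, and treating a missing type as text is an assumption that follows the documented default of text for parts that lack a Content-type: header line.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of two NUL-terminated ASCII strings. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Return 1 when a major MIME content type counts as text content and
 * 0 when it counts as binary content. */
static int demo_is_text_type(const char *major_type)
{
    /* Assumption: a missing type is treated like the default, "text". */
    if (!major_type || !*major_type)
        return 1;
    return equals_ignore_case(major_type, "text") ||
           equals_ignore_case(major_type, "message");
}
```

The comparison ignores case because MIME media type names are case-insensitive.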

When writing an inspection routine for use with mtaDecodeMessage(), the following points apply:

A Simple Decoding Example

The sample program in Example 5–1 decodes a MIME-formatted message using mtaDecodeMessage(). It is not a channel program: the message to be decoded is compiled into the program rather than being drawn from a channel queue.

After the Messaging Server product is installed, these programs can be found in the following location:

msg_server_base/examples/mtasdk/

Some lines of code are immediately preceded by a comment of the format:

/* See explanatory comment N */

where N is a number. The numbers refer to the corresponding explanatory text in the section that follows this code; see Explanatory Text for Numbered Comments in the Simple Decoding Example.

For the sample output generated by this program, see MIME Message Decoding Simple Example Output.


Example 5–1 Decoding MIME Messages Simple Example


/*
 *  decode_simple.c
 *
 *    Decode a multipart MIME message.
 *
 */
#include <stdio.h>
#include <string.h>
#include "mtasdk.h"

/* 
 *  Inline data for a sample message to decode
 * See explanatory comment 1 
 */
static const char message[] = 
  "From: sue@siroe.com\n"
  "Date: 31 Mar 2003 09:32:47 -0800\n"
  "Subject: test message\n"
  "Content-type: multipart/mixed; boundary=BoundaryMarker\n"
  "\n\n"
  "--BoundaryMarker\n"
  "Content-type: text/plain; charset=us-ascii\n"
  "Content-disposition: inline\n"
  "\n"
  "This is a\n"
  "  test message!\n"
  "--BoundaryMarker\n"
  "Content-type: application/postscript\n"
  "Content-disposition: attachment; filename=\"a.ps\"\n"
  "Content-transfer-encoding: base64\n"
  "\n"
  "IyFQUwoxMDAgMTAwIG1vdmV0byAzMDAgMzAwIGxpbmV0byBzdHJva2UKc2hv"
  "d3Bh\n"
  "Z2UK\n"
  "--BoundaryMarker--\n";

static mta_decode_read_t decode_read;
static mta_decode_inspect_t decode_inspect;
typedef struct {
     const char *cur_position;
     const char *end_position;
} position_t;

int main()
{
     int ires;
     position_t pos;

     /*
      * Initialize the MTA SDK
      */
     if ((ires = mtaInit(0)))
     {
         mtaLog("mtaInit() returned %d; %s\n", ires, 
                mtaStrError(ires, 0));
         return(1);
     }

     /* 
      *  For a context to pass to mtaDecodeMessage(), we pass a
      *  pointer to the message data to be parsed.  The 
      *  decode_read() routine uses this information when 
      *  supplying data to mtaDecodeMessage().
      *  See explanatory comment 2 
      */
     pos.cur_position = message;
     pos.end_position = message + strlen(message);

     /*
      *  Invoke mtaDecodeMessage():
      *    1. Use decode_read() as the input routine to supply the 
      *       message to be MIME decoded,
      *    2. Use decode_inspect() as the routine to inspect each 
      *       MIME decoded message part,
      *    3. Do not specify an output routine to write the 
      *       resulting, MIME message, and
      *    4. Indicate that the input message source uses LF 
      *       record terminators. 
      *  See explanatory comment 3 
      */
     mtaDecodeMessage((void *)&pos, MTA_DECODE_PROC, 
                      (void *)decode_read,
                      0, NULL, decode_inspect, MTA_TERM_LF, 0);
     return(0);
}

/* 
 *  decode_read -- Provide message data to mtaDecodeMessage(). 
 *                 The entire message could just as easily be 
 *                 given to mtaDecodeMessage() at once. However, 
 *                 for illustration purposes, the message is 
 *                 provided in 200 byte chunks.
 *  See explanatory comment 4 
 */
static int decode_read(void *ctx, const char **line, size_t 
                       *line_len)
{
     position_t *pos = (position_t *)ctx;

     if (!pos)
          return(MTA_NO);
     else if (pos->cur_position >= pos->end_position)
          return(MTA_EOF);
     *line = pos->cur_position;
     *line_len = ((pos->cur_position + 200) < 
                 pos->end_position) ? 200 : 
                 (pos->end_position - pos->cur_position);
     pos->cur_position += *line_len;
     return(MTA_OK);
}

/* 
 *  decode_inspect -- Called by mtaDecodeMessage() to output a 
 *                    line of the parsed message.  The line is 
 *                    simply output with additional information 
 *                    indicating whether the line comes from a 
 *                    header, text part, or binary part.
 *  See explanatory comment 5 
 */
static int decode_inspect (void *ctx, mta_decode_t *dctx, int 
                           data_type, const char *data,
                           size_t data_len)
{
     static const char *types[] = {"N", "H", "T", "B"};

     /* See explanatory comment 6 */
     if (data_type == MTA_DATA_NONE)
          return(MTA_OK);

     /* See explanatory comment 7 */
     /* The precision argument for "%.*s" must be an int */
     printf("%d%s: %.*s\n",
            mtaDecodeMessageInfoInt(dctx, 
                                    MTA_DECODE_PART_NUMBER), 
            types[data_type], (int)data_len, data);

     return(MTA_OK);
}

Explanatory Text for Numbered Comments in the Simple Decoding Example

The following numbered explanatory text corresponds to the numbered comments in Example 5–1.

  1. The MIME message to be decoded. It is a multipart message with two parts. The first part contains text; the second part is a PostScript attachment.

  2. The private context to be passed to mtaDecodeMessage() and, in turn, passed by it to the supplied input routine, decode_read(). The input routine uses this context to track how many bytes of the input message it has supplied to mtaDecodeMessage().

  3. The call to mtaDecodeMessage(). An input routine, decode_read(), is supplied to provide the message to be decoded. Since the message source has each record terminated by line feeds, the MTA_TERM_LF option is also specified. The routine decode_inspect() is passed for use as an inspection routine.

  4. The input routine, decode_read(). This routine provides the message to be decoded 200 bytes at a time. Note that providing only 200 bytes at a time is arbitrary: the routine could, if it chose, provide the entire message, or 2000 bytes at a time, or a random number of bytes on each call. After the entire message has been supplied, subsequent calls to decode_read() return the MTA_EOF status.

  5. The inspection routine, decode_inspect(). For each atomic message part, this routine is called repeatedly. The repeated calls provide, line by line, the part’s header and decoded content.

  6. For a given message part, the final call to decode_inspect() provides no part data. This final call serves to give decode_inspect() a last chance to accept or discard the part when outputting the final form of the message via an optional output routine supplied to mtaDecodeMessage(). That optional routine is not used here.

  7. The part number for this message part is obtained with a call to mtaDecodeMessageInfoInt().

MIME Message Decoding Simple Example Output

The following shows the output generated by the program in Example 5–1.


1H: Content-type: text/plain; charset=us-ascii
1H: Content-disposition: inline
1T: This is a
1T:   test message!
2H: Content-type: application/postscript
2H: Content-transfer-encoding: base64
2H: Content-disposition: attachment; filename="a.ps"
2B: #!PS
100 100 moveto 300 300 lineto stroke
showpage

The Output Destination

When an optional output destination is supplied to mtaDecodeMessage(), the processed input message is subsequently written to the output destination. When conversion to MIME is requested, the output message will be the result of the conversion. Additionally, the written message will reflect any changes made by the inspection routine with mtaDecodeMessagePartDelete(). That routine may be used to delete an atomic part or replace the part with new, caller-supplied content.

The output destination can be either a message submission to the MTA (that is, an ongoing enqueue) or an arbitrary destination represented by a caller-supplied output routine.

Enqueue Context

When using a message enqueue context, you must do the following:

  1. Supply the enqueue context along with the MTA_DECODE_NQ item code.

  2. Specification of the message’s recipient list must have already been completed with mtaEnqueueTo() before calling mtaDecodeMessage().

  3. mtaEnqueueFinish() must not yet have been called for the enqueue context.

After the call to mtaDecodeMessage() has completed successfully, complete the message enqueue with mtaEnqueueFinish(). In the event of an error, cancel the message submission by calling mtaEnqueueFinish() with the MTA_ABORT item code, as Example 5–2 does. mtaDecodeMessage() writes the entire message header and content, so there is no need for the caller to write anything to the message’s header or content.

Caller-Supplied Output Routine

To use a caller-supplied output routine (for example, decode_write()), supply the address of the output routine along with the MTA_DECODE_PROC item code to mtaDecodeMessage().

Each line passed to the output routine represents a complete line of the message to be output. The output routine must add to the line any line terminators required by the output destination (for example, carriage return, line feed pairs if transmitting over the SMTP protocol, line feed terminators if writing to a UNIX® text file, and so forth).
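The terminator handling described above can be sketched as a small helper that an output routine might call before handing the bytes to its destination. This is a minimal sketch under stated assumptions: the name demo_terminate_line() and its parameters are hypothetical, and the signature of the output routine itself is defined by the SDK and is not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Copy one complete message line into out[] and append the terminator
 * the destination needs ("\r\n" for SMTP, "\n" for a UNIX text file).
 * Returns the number of bytes stored, or 0 if out[] is too small.
 * The result is not NUL-terminated. */
static size_t demo_terminate_line(const char *line, size_t line_len,
                                  const char *terminator,
                                  char *out, size_t out_size)
{
    size_t term_len = strlen(terminator);

    if (line_len > out_size || term_len > out_size - line_len)
        return 0;
    memcpy(out, line, line_len);
    memcpy(out + line_len, terminator, term_len);
    return line_len + term_len;
}
```

An output routine writing to an SMTP connection would pass "\r\n" as the terminator, while one writing to a UNIX text file would pass "\n".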

Decode Contexts

When mtaDecodeMessage() calls either a caller-supplied inspection or output routine, it passes a decode context to those routines. Through SDK routine calls, this decode context can be queried to obtain information about the message part currently being processed, as shown in the following table:

Message Code              Description

MTA_DECODE_CCHARSET       The character set specified with the CHARSET parameter of the part’s Content-type: header line. If the part lacks a CHARSET specification, then the value us-ascii will be returned. Obtain with mtaDecodeMessageInfoString().

MTA_DECODE_CDISP          Value of the Content-disposition: header line, less any optional parameters. Will be a zero length string if the part lacks a Content-disposition: header line. Obtain with mtaDecodeMessageInfoString().

MTA_DECODE_CDISP_PARAMS   Parameter list to the Content-disposition: header line, if any. The parsed list is returned as a pointer to an option context. For further information, see mtaDecodeMessageInfoParams().

MTA_DECODE_CSUBTYPE       The content subtype specified with the part’s Content-type: header line (for example, plain for text/plain, gif for image/gif). Defaults to plain when the part lacks a Content-type: header line. Obtain with mtaDecodeMessageInfoString().

MTA_DECODE_CTYPE          The major content type specified with the part’s Content-type: header line (for example, text for text/plain, image for image/gif). Defaults to text when the part lacks a Content-type: header line. Obtain with mtaDecodeMessageInfoString().

MTA_DECODE_CTYPE_PARAMS   Parameter list to the Content-type: header line, if any. The parsed list is returned as a pointer to an option context. For further information, see mtaDecodeMessageInfoParams().

MTA_DECODE_DTYPE          Data type associated with this part. Obtain with mtaDecodeMessageInfoInt().

MTA_DECODE_PART_NUMBER    Sequential part number for the current part. The first message part is part 0, the second part is 1, the third part is 2, and so on. Obtain with mtaDecodeMessageInfoInt().

A Simple Virus Scanner Example

Example 5–2 that follows shows how to use the mtaDecodeMessage() routine to write an intermediate processing channel that converts messages with formats other than MIME, for example UUENCODE content, to MIME output. It then decodes the MIME message, scanning it for potentially harmful attachments. (In this example, an attachment is any message part.) Any harmful attachments are removed from the message after which it is re-enqueued for delivery. The list of harmful MIME media types and file name extensions is read from a channel option file. An example option file for the channel is shown in Example Option File.

In this example, the MIME Content-type: and Content-disposition: header lines are used to detect potentially harmful message attachments such as executable files. This example could be extended to also scan the content of the attachments, possibly passing the contents to a virus scanner. Further, the example could be modified to return as undeliverable any messages containing harmful attachments.
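Example 5–2 matches media types and file name extensions against its option lists by wrapping every list entry in 0x01 delimiter bytes and then searching the list with strstr(), so that only whole entries can match. The following standalone sketch isolates that technique. The names demo_build_list() and demo_in_list() and the 128-byte key buffer are hypothetical choices made for illustration; the example program uses its own routines and the SDK's BIGALFA_SIZE constant.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DEMO_DELIM '\x01'

/* Translate "A,B,C" into "<0x01>a<0x01>b<0x01>c<0x01>".
 * Returns 0 on success, -1 if out[] is too small. */
static int demo_build_list(const char *csv, char *out, size_t out_size)
{
    size_t n = strlen(csv);
    size_t i;

    if (out_size < n + 3)   /* leading and trailing delimiter plus NUL */
        return -1;
    *out++ = DEMO_DELIM;
    for (i = 0; i < n; i++)
        *out++ = (csv[i] == ',') ? DEMO_DELIM
                                 : (char)tolower((unsigned char)csv[i]);
    *out++ = DEMO_DELIM;
    *out = '\0';
    return 0;
}

/* Return 1 if item occurs as a whole entry in a list built above. */
static int demo_in_list(const char *list, const char *item)
{
    char key[128];
    size_t n = strlen(item);
    size_t i;

    if (n == 0 || n + 3 > sizeof(key))
        return 0;
    key[0] = DEMO_DELIM;
    for (i = 0; i < n; i++)
        key[i + 1] = (char)tolower((unsigned char)item[i]);
    key[n + 1] = DEMO_DELIM;
    key[n + 2] = '\0';
    return strstr(list, key) != NULL;
}
```

Because the search key carries a delimiter on both sides, a partial name such as text/x does not match an entry such as text/x-sh.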


Note –

To configure the MTA to run this channel, see Running Your Enqueue and Dequeue Programs. The PMDF_CHANNEL_OPTION environment variable must give the absolute file path to the channel’s option file. Also, for a discussion on configuring special rewrite rules for re-enqueuing dequeued mail, see Preventing Mail Loops when Re-enqueuing Mail.


For the output generated by this sample program, see Decoding MIME Messages Complex Example Output.

After the Messaging Server product is installed, these programs can be found in the following location:

msg_server_base/examples/mtasdk/

Some lines of code are immediately preceded by a comment of the format:

/* See explanatory comment N */

where N is a number. The numbers refer to the corresponding explanatory text in the section that follows this code; see Explanatory Text for Numbered Comments in the Decoding MIME Messages Complex Example.


Example 5–2 Decoding MIME Messages Complex Example


/*
 *  virus_scanner_simple.c
 *
 *    Remove potentially harmful content from queued messages.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "mtasdk.h"

/*
 *  A structure to store our channel options
 */
typedef struct {
     /* Produce debug output?       */
     int     debug; 
     /* Unwanted MIME content types */
     char    bad_mime_types[BIGALFA_SIZE+3]; 
     /* Length of bmt string        */
     size_t  bmt_len; 
     /* Unwanted file types         */
     char    bad_file_types[BIGALFA_SIZE+3]; 
     /* Length of bft string        */
     size_t  bft_len; 
} our_options_t;

/*
 *  Forward declarations
 */
static void error_exit(int ires, const char *msg);
static void error_report(our_options_t *options, int ires, const
                         char *func);
static int  is_bad_mime_type(our_options_t *options, mta_decode_t
                             *dctx, char *buf, size_t maxbuflen);
static int is_bad_file_type(our_options_t *options, mta_opt_t 
                            *params, const char *param_name, 
                            char *buf, size_t maxbuflen);
static int  load_options(our_options_t *options);

static mta_dq_process_message_t process_message;
static mta_decode_read_t decode_read;
static mta_decode_inspect_t decode_inspect;

/*
 *  main() -- Initialize the MTA SDK, load our options, and then 
 *            start the message processing loop.
 */
int main()
{
      int ires;
      our_options_t options;

      /* 
      *  Initialize the MTA SDK
      *  See explanatory comment 1 
      */
      if ((ires = mtaInit(0)))
          error_exit(ires, "Unable to initialize the MTA SDK");

      /* 
       *  Load our channel options
       *  See explanatory comment 2 
       */
      if ((ires = load_options(&options)))
          error_exit(ires, "Unable to load our channel options");

      /* 
       *  Now process the queued messages.  Be sure to indicate a
       *  thread stack size sufficient to accommodate message 
       *  enqueue processing.
       *  See explanatory comment 3 
       */
      if ((ires = mtaDequeueStart((void *)&options, 
                                  process_message, NULL, 0)))
          error_exit(ires, "Error during dequeue processing");

      /*
       * All done
       */
      mtaDone();
      return(0);
}

/* 
 *  process_message() -- This routine is called by 
 *                       mtaDequeueStart() to process each queued
 *                       message.  We don't make use of ctx2, but 
 *                       ctx1 is a pointer to our channel options.
 *  See explanatory comment 4 
 */
static int process_message(void **ctx2, void *ctx1, mta_dq_t *dq,
                           const char *env_from, size_t 
                           env_from_len)
{
     const char *adr;
     int disp, ires;
     size_t len;
     mta_nq_t *nq;
     our_options_t *options = (our_options_t *)ctx1;

     /*
      *  Initializations
      */
     nq = NULL;

     /*
      *  A little macro to do error checking on mta*() calls
      */
#define CHECK(f,x) \
     if ((ires = x)) { error_report(options, ires, f); goto \
                       done_bad; }

     /* 
      *  Start a message enqueue.  Use the dequeue context to copy
      *  envelope flags from the current message to this new 
      *  message being enqueued.
      *  See explanatory comment 5 
      */
     CHECK("mtaEnqueueStart",
           mtaEnqueueStart(&nq, env_from, env_from_len,
           MTA_DQ_CONTEXT, dq, 0));

     /*
      *  Process the envelope recipient list
      *  See explanatory comment 6 
      */
     while (!(ires = mtaDequeueRecipientNext(dq, &adr, &len, 0)))
     {
          /*
           *  Add this envelope recipient address to the message 
           *  being enqueued.  Use the dequeue context to copy
           *  envelope flags for this recipient from the current
           *  message to the new message.
           */
          ires = mtaEnqueueTo(nq, adr, len, MTA_DQ_CONTEXT, 
                              dq, MTA_ENV_TO, 0);
          /*  See explanatory comment 7 */
          disp = (ires) ? MTA_DISP_DEFERRED : MTA_DISP_RELAYED;
          CHECK("mtaDequeueRecipientDisposition",
                mtaDequeueRecipientDisposition(dq, adr, len, 
                                               disp, 0));
     }

     /*
      *  A normal exit from the loop occurs when 
      *  mtaDequeueRecipientNext() returns an MTA_EOF status.
      *  Any other status signifies an error.
      */
     if (ires != MTA_EOF)
     {
         error_report(options, ires, "mtaDequeueRecipientNext");
         goto done_bad;
      }

      /* 
       *  Begin the MIME decode of the message
       *  See explanatory comment 8 
       */
      CHECK("mtaDecodeMessage",
            mtaDecodeMessage(
              /* Private context is our options */
              (void *)options, 
              /* Input is the message being dequeued */
              MTA_DECODE_DQ, (void *)dq,
              /* Output is the message being enqueued */
              MTA_DECODE_NQ, (void *)nq,
              /* Inspection routine */
              decode_inspect, 
              /* Convert non-MIME formats to MIME */
              MTA_DECODE_THURMAN, 0)); 

      /* 
       *  Finish the enqueue
       *  NOTE: IT'S IMPORTANT TO DO THIS before DOING THE 
       *  DEQUEUE. YOU WILL LOSE MAIL IF YOU DO THE DEQUEUE FIRST 
       *  and then THE ENQUEUE FAILS. 
       *  See explanatory comment 9 
       */
      CHECK("mtaEnqueueFinish", mtaEnqueueFinish(nq, 0));
      nq = NULL;

      /*
       *  Finish the dequeue
       */
      CHECK("mtaDequeueMessageFinish",
            mtaDequeueMessageFinish(dq, 0));

      /*
       *  All done with this message
       */
      return(MTA_OK); 

done_bad:
     /*
      *  Abort any ongoing enqueue or dequeue
      */
     if (nq)
          mtaEnqueueFinish(nq, MTA_ABORT, 0);
     if (dq)
          mtaDequeueMessageFinish(dq, MTA_ABORT, 0);

     /*
      *  And return our error status
      */
     return(ires);
}

#undef CHECK

/* 
 *  decode_inspect() -- This is the routine that inspects each 
 *                      message part, deciding whether to accept 
 *                      or reject it.
 *  See explanatory comment 10 
 */
static int decode_inspect(void *ctx, mta_decode_t *dctx, 
                          int data_type,const char *data, 
                          size_t data_len)
{
     char buf[BIGALFA_SIZE * 2 + 10];
     int i;
     our_options_t *options = (our_options_t *)ctx;

     /*
      *  See if the part has:
      *
      *    1. A bad MIME content-type,
      *    2. A bad file name extension in the (deprecated) 
      *       NAME= content-type parameter, or
      *    3. A bad file name extension in the 
      *       FILENAME= content-disposition parameter.
      */
     i = 0;
     if ((i = is_bad_mime_type(ctx, dctx, buf, sizeof(buf))) ||
         is_bad_file_type(ctx,
                          mtaDecodeMessageInfoParams(dctx,
                                  MTA_DECODE_CTYPE_PARAMS, NULL),
                          "NAME", buf, sizeof(buf)) ||
         is_bad_file_type(ctx,
                          mtaDecodeMessageInfoParams(dctx,
                                  MTA_DECODE_CDISP_PARAMS, NULL),
                          "FILENAME", buf, sizeof(buf)))
     {
         char msg[BIGALFA_SIZE*4 + 10];

         /* 
          *  Replace this part with a text message indicating 
          *  that the part's content has been deleted.
          *  See explanatory comment 11 
          */
         if (i)
              /* Skip the 0x01 delimiters around the media type */
              i = sprintf(msg,
         "The content of this message part has been removed.\n"
         "It contained a potentially harmful media type of %.*s",
                          (int)(strlen(buf)-2), buf+1);

         else
              i = sprintf(msg,
      "The content of this message part has been removed.\n"
      "It contained a potentially harmful file named '%s'", buf);
         return(mtaDecodeMessagePartDelete(dctx, 
                              MTA_REASON, msg, i,
                              MTA_DECODE_CTYPE, "text", 4,
                              MTA_DECODE_CSUBTYPE, "plain", 5,
                              MTA_DECODE_CCHARSET, "us-ascii", 8,
                              MTA_DECODE_CDISP, "inline", 6,
                              MTA_DECODE_CLANG, "en", 2, 0));
     }
     else
          /*
           *  Keep the part
           *  See explanatory comment 12 
           */
          return(mtaDecodeMessagePartCopy(dctx, 0)); 
}

/* 
 *  is_bad_mime_type() -- See if the part's media type is in our
 *                        bad MIME content types, for example:
 *                        application/vbscript
 *  See explanatory comment 13 
 */
static int is_bad_mime_type(our_options_t *options,
                            mta_decode_t *dctx, char *buf,
                            size_t maxbuflen)
{
     const char *csubtype, *ctype;
     size_t i, len1, len2;
     char *ptr;

     /*
      *  Sanity checks
      */
      if (!options || !options->bmt_len || 
          !options->bad_mime_types[0] ||
          !dctx)
           return(0);

      /*
       *  Get the MIME content type
       */
      ctype = mtaDecodeMessageInfoString(dctx, MTA_DECODE_CTYPE,
                                         NULL, &len1);
      csubtype = mtaDecodeMessageInfoString(dctx, 
                                            MTA_DECODE_CSUBTYPE,
                                            NULL, &len2);

      /*
       *  Build the string: <0x01>type/subtype<0x01><0x00>
       */
      ptr = buf;
      *ptr++ = (char)0x01;
      for (i = 0; i < len1; i++)
           *ptr++ = tolower(*ctype++);
      *ptr++ = '/';
      for (i = 0; i < len2; i++)
           *ptr++ = tolower(*csubtype++);
      *ptr++ = (char)0x01;
      *ptr = '\0';

      /*
       *  Now see if the literal just built occurs in the list of 
       *  bad MIME content types
       */
      return((strstr(options->bad_mime_types, buf)) ? -1 : 0);
}

/* 
 *  is_bad_file_type() -- See if the part has an associated file 
 *                        name whose file extension is in our list 
 *                        of bad file names, such as .vbs.
 *  See explanatory comment 14 
 */
static int is_bad_file_type(our_options_t *options,
                            mta_opt_t *params,
                            const char *param_name, char *buf,
                            size_t maxbuflen)
{
      const char *ptr1;
      char fext[BIGALFA_SIZE+2], *ptr2;
      size_t i, len;

      /*
       *  Sanity checks
       */
      if (!options || !options->bft_len || !params || !param_name)
           return(0);

      len = 0;
      buf[0] = '\0';
      if (mtaOptionString(params, param_name, 0, buf, &len,
                          maxbuflen - 1) ||
          !len || !buf[0])
          /*
           *  No file name parameter specified
           */
          return(0);

      /*
       *  A file name parameter was specified.  Parse it to 
       *  extract the file extension portion, if any.
       */
      ptr1 = strrchr(buf, '.');
      if (!ptr1)
           /*
            *  No file extension specified
            */
           return(0);

      /*
       *  Now store the extension, wrapped in 0x01 delimiters, in
       *  fext[].  Note that we drop the '.' from the extension.
       */
      ptr1++; /* Skip over the '.' */
      ptr2 = fext;
      *ptr2++ = (char)0x01;
      len = len - (ptr1 - buf);
      for (i = 0; i < len; i++)
           *ptr2++ = tolower(*ptr1++);
      *ptr2++ = (char)0x01;
      *ptr2++ = '\0';

      /*
       *  Now return -1 if the string occurs in 
       *  options->bad_file_types.
       */
      return((strstr(options->bad_file_types, fext))
             ? -1 : 0);
}

/* 
 *  load_options() -- Load our channel options from the channel's 
 *                    option file 
 *  See explanatory comment 15 
 */
static int load_options(our_options_t *options)
{
     char buf[BIGALFA_SIZE+1];
     size_t buflen, i;
     mta_opt_t *channel_opts;
     int ires;
     const char *ptr0;
     char *ptr1;

     /*
      *  Initialize our private channel option structure
      */
     memset(options, 0, sizeof(our_options_t));

     /* 
      *  Access the channel's option file
      *  See explanatory comment 16 
      */
     channel_opts = NULL;
     if ((ires = mtaOptionStart(&channel_opts, NULL, 0, 0)))
     {
          mtaLog("Unable to access our channel option file");
          return(ires);
     }

     /*
      *  DEBUG=0|1
      */
     options->debug = 0;
     mtaOptionInt(channel_opts, "DEBUG", 0, &options->debug);
     if (options->debug)
         mtaDebug(MTA_DEBUG_SDK, 0);

     /*
      *  BAD_MIME_TYPES=type1/subtype1[,type2/subtype2[,...]]
      */
     buf[0] = '\0';
     buflen = 0;
     mtaOptionString(channel_opts, "BAD_MIME_TYPES", 0, buf,
                     &buflen, sizeof(buf));

     /*
      *  Now translate the comma separated list:
      * 
      *    Type1/Subtype1[,Type2/Subtype2[,...]]
      *
      *  to
      *
      *<0x01>type1/subtype1[<0x01>type2/subtype2[<0x01>...]]<0x01>
      */

     ptr0 = buf;
     ptr1 = options->bad_mime_types;
     *ptr1++ = (char)0x01;
     for (i = 0; i < buflen; i++)
     {
          if (*ptr0 != ',')
              *ptr1++ = tolower(*ptr0++);
          else
          {
               *ptr1++ = (char)0x01;
               ptr0++;
          }
     }
     *ptr1++ = (char)0x01;
     *ptr1   = '\0';
     options->bmt_len = ptr1 - options->bad_mime_types;

     /*
      *  BAD_FILE_TYPES=["."]Ext1[,["."]Ext2[,...]]
      */
     buf[0] = '\0';
     buflen = 0;
     mtaOptionString(channel_opts, "BAD_FILE_TYPES", 0, buf,
                     &buflen, sizeof(buf));

     /*
      *  Now translate the comma separated list:
      *    ["."]Ext1[,["."]Ext2[,...]]
      * 
      *  to
      *
      *    <0x01>ext1[<0x01>ext2[<0x01>...]]<0x01>
      */
     ptr0 = buf;
     ptr1 = options->bad_file_types;
     *ptr1++ = (char)0x01;
     for (i = 0; i < buflen; i++)
     { 
          switch(*ptr0)
          {
          default :   /* copy after translating to lower case */
               *ptr1++ = tolower(*ptr0++);
               break;
          case '.' :  /* discard, but still advance past it */
               ptr0++;
               break;
          case ',' :  /* end current type */
               *ptr1++ = (char)0x01;
               ptr0++;
               break;
          }
     }
     *ptr1++ = (char)0x01;
     *ptr1   = '\0';
     options->bft_len = ptr1 - options->bad_file_types;

     /* 
      *  Dispose of the mta_opt_t context
      *  See explanatory comment 17 
      */
     mtaOptionFinish(channel_opts);

     /*
      *  And return a success
      */
     return(MTA_OK);
}

/*
 *  error_report() -- Report an error condition when debugging is
 *                    enabled. 
 */
static void error_report(our_options_t *options, int ires, 
                         const char *func)
{
     if (options->debug)
         mtaLog("%s() returned %d; %s",
                (func ? func : "?"), ires, mtaStrError(ires));
}

/*
 *  error_exit() -- Exit with an error status and error message.
 */
static void error_exit(int ires, const char *msg)
{
     mtaLog("%s%s%s", (msg ? msg : ""), (msg ? "; " : ""),
            mtaStrError(ires));
     exit(1);
}

Example Option File

This example lists the MIME media types and file extensions this program is to consider potentially harmful.


DEBUG=1
BAD_MIME_TYPES=application/vbscript
BAD_FILE_TYPES=bat,com,dll,exe,vb,vbs
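
When load_options() reads these lists, it stores each one in lower case with a 0x01 byte before, between, and after the entries, so that a later substring search can only match a whole entry (vb does not match vbs). The following is a minimal stand-alone sketch of that translation for the BAD_FILE_TYPES list. The helper name build_delimited_list() and its bounds check are assumptions made for illustration; they are not part of the MTA SDK or of the example program.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DELIM '\x01'

/*
 *  build_delimited_list() -- hypothetical helper, not an SDK routine.
 *  Translates   ["."]Ext1[,["."]Ext2[,...]]
 *  to           <0x01>ext1[<0x01>ext2[<0x01>...]]<0x01>
 *  Returns the number of bytes written (excluding the terminating
 *  null), or 0 if the output buffer is too small.
 */
static size_t build_delimited_list(const char *in, char *out,
                                   size_t outsize)
{
     size_t n = 0;
     const char *p;

     assert(in != NULL && out != NULL);

     /* Worst case: two delimiters, every input byte, and a null */
     if (outsize < strlen(in) + 3)
          return(0);

     out[n++] = DELIM;
     for (p = in; *p != '\0'; p++)
     {
          if (*p == ',')
               out[n++] = DELIM;      /* end current entry */
          else if (*p != '.')
               out[n++] = (char)tolower((unsigned char)*p);
          /* a '.' is discarded */
     }
     out[n++] = DELIM;
     out[n] = '\0';
     return(n);
}
```

Given the BAD_FILE_TYPES value shown above, this sketch would produce the six extensions separated and surrounded by 0x01 bytes. The MIME type list is handled the same way in the example program, except that periods are kept, since they can legitimately appear in a subtype name.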

Sample Input Message

The example that follows is the text of a sample input message for the program in Example 5–2 to process. The second message part is a file attachment named trojan_horse.vbs. Because the file extension .vbs is in the list of harmful file extensions, the channel should remove the attachment when it processes this message. The sample program replaces the attachment with a text attachment indicating that the content has been deleted.


Received: from [129.153.12.22] ([129.153.12.22])
 by frodo.siroe.com (Sun Java System Messaging Server 6 2004Q2 (built Apr 7
 2003)) with SMTP id <0HD7001023OYDA00@frodo.siroe.com> for
 sue@sesta.com; Fri, 11 Apr 2003 13:03:23 -0700 (PDT)
Date: Fri, 11 Apr 2003 13:03:08 -0700
From: sue@sesta.com
Subject: test message
Message-id: <0HD7001033P1DA00@frodo.siroe.com>
Content-type: multipart/mixed; boundary=BoundaryMarker

--BoundaryMarker
Content-type: text/plain; charset=us-ascii
Content-disposition: inline

This is a
  test message!

--BoundaryMarker
Content-type: application/octet-stream
Content-disposition: attachment; filename="trojan_horse.vbs"
Content-transfer-encoding: base64

IyFQUwoxMDAgMTAwIG1vdmV0byAzMDAgMzAwIGxpbmV0byBzdHJva2UKc2hvd3Bh
Z2UK

--BoundaryMarker--

Explanatory Text for Numbered Comments in the Decoding MIME Messages Complex Example

  1. The MTA SDK is explicitly initialized. This call is not strictly necessary, as the MTA SDK will implicitly initialize itself when mtaDequeueStart() is called. However, for debugging purposes, it can be useful to make this call at the start of a program so that an initialization failure shows clearly in the diagnostic output. If the call is omitted, the failure is still noted in the diagnostic output, but it is less obvious because it appears under whichever routine call triggered the implicit initialization.

  2. Channel options are loaded via a call to the load_options() routine. That routine is part of this example and, as discussed later, uses the SDK routines for obtaining channel option values from the channel’s option file.

  3. The message dequeue processing loop is initiated with a call to mtaDequeueStart().

  4. For each queued message to be processed, process_message() will be called by mtaDequeueStart().

  5. A message enqueue is started. This enqueue is used to re-enqueue the queued message currently being processed. As the message is processed, its non-harmful content will be copied to the new message being enqueued.

  6. The envelope recipient list is copied from the queued message to the new message being enqueued.

  7. Since this is an intermediate channel, that is, it doesn’t effect final delivery of a message, successful processing of a recipient address is associated with a disposition of MTA_DISP_RELAYED.

  8. After processing the message’s envelope, mtaDecodeMessage() is invoked to decode the message, breaking it into individual MIME message parts. mtaDecodeMessage() is told to use the current dequeue context as the input source for the message to decode. This supplies the queued message being processed as input to the MIME decoder. Further, the current enqueue context is supplied as the output destination for the resulting message. This directs mtaDecodeMessage() to output the resulting parsed message to the message being enqueued, less any harmful attachments that are explicitly deleted by the inspection routine. The routine decode_inspect() is supplied as the inspection routine. If the call to mtaDecodeMessage() fails, the CHECK() macro causes the queued message to be deferred and the message enqueue to be cancelled.

  9. After a successful call to mtaDecodeMessage(), the message enqueue is committed. It is important that this be done before committing the dequeue. If the operations are done in the other order (dequeue finish followed by enqueue finish), then mail may be lost. For example, the message would be lost if the dequeue succeeds and deletes the underlying message file, and the enqueue then fails for some reason, such as insufficient disk space.

  10. The inspection routine, decode_inspect(), checks the MIME header lines of each message part for any indication that the part may contain harmful content.

  11. Message parts with harmful content are discarded with a call to mtaDecodeMessagePartDelete(). The discarded message part is replaced with a text message part containing a warning about the discarded harmful content.

  12. Message parts with safe content are kept by copying them to the output message with mtaDecodeMessagePartCopy().

  13. Using the configured channel options, this routine determines if a message part’s media type is in the list of harmful types.

  14. Using the configured channel options, this routine determines if a filename appearing in the MIME header lines has an extension considered harmful.

  15. The load_options() routine is used to load the channel’s site-configured options from a channel option file.

  16. The channel option file, if any, is opened and read by mtaOptionStart(). Since an explicit file path is not supplied, the file path specified with the PMDF_CHANNEL_OPTION environment variable gives the name of the option file to read.

  17. After loading the channel’s options, the option file context is disposed of with a call to mtaOptionFinish().
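
The file extension check described in item 14 can be sketched without the SDK as follows. This is an illustration under stated assumptions rather than the example program's own code: the helper name has_bad_extension() and the fixed 64-byte key buffer are hypothetical, and the list argument is assumed to already be in the 0x01-delimited, lower case form that load_options() produces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 *  has_bad_extension() -- hypothetical helper, not an SDK routine.
 *  bad_types must have the form <0x01>ext1<0x01>ext2<0x01>.
 *  Returns -1 if the extension of filename is in the list, and 0
 *  otherwise.
 */
static int has_bad_extension(const char *filename,
                             const char *bad_types)
{
     char key[64];
     const char *ext;
     size_t len, i;

     assert(filename != NULL && bad_types != NULL);

     /* Locate the last '.'; no '.' means no extension */
     ext = strrchr(filename, '.');
     if (!ext)
          return(0);
     ext++; /* Skip over the '.' */

     /*
      *  Assumption of this sketch: an empty extension, or one too
      *  long for key[], is treated as not listed.
      */
     len = strlen(ext);
     if (len == 0 || len > sizeof(key) - 3)
          return(0);

     /* Build <0x01>ext<0x01> so only a whole entry can match */
     key[0] = '\x01';
     for (i = 0; i < len; i++)
          key[i + 1] = (char)tolower((unsigned char)ext[i]);
     key[len + 1] = '\x01';
     key[len + 2] = '\0';

     return(strstr(bad_types, key) ? -1 : 0);
}
```

Wrapping the search key in delimiters is what keeps a file named notes.vb from matching a list that contains only vbs. A single strstr() call then replaces a loop over the individual entries.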

Decoding MIME Messages Complex Example Output

The example that follows shows the output generated by the MIME decoding program found in Example 5–2.


Received: from sesta.com by frodo.siroe.com
 (Sun Java System Messaging Server Version 6 2004 Q2 (built Apr 7 2003))
 id <0HDE00C01BFK6500@frodo.siroe.com> for sue@sesta.com; Tue, 11
 Apr 2003 13:03:29 -0700 (PDT)
Received: from [129.153.12.22] ([129.153.12.22])
 by frodo.siroe.com (Sun Java System Messaging Server 6 2004 Q2 (built Apr 7
 2003)) with SMTP id <0HD7001023OYDA00@frodo.siroe.com> for
 sue@sesta.com; Fri, 11 Apr 2003 13:03:23 -0700 (PDT)
Date: Fri, 11 Apr 2003 13:03:08 -0700
From: sue@sesta.com
Subject: test message
To: sue@sesta.com
Message-id: <0HD7001033P1DA00@frodo.siroe.com>
Content-type: multipart/mixed;
 boundary="Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)"

--Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)
Content-type: text/plain; charset=us-ascii
Content-disposition: inline

This is a
 test message!

--Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)
Content-type: text/plain; charset=us-ascii
Content-language: en
Content-disposition: inline

The content of this message part has been removed.
It contained a potentially harmful file named "trojan_horse.vbs"

--Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)--