Sun Java System Messaging Server 6 2005Q4 MTA Developer's Reference

A Simple Virus Scanner Example

Example 5–2 shows how to use the mtaDecodeMessage() routine to write an intermediate processing channel that converts messages in formats other than MIME, for example UUENCODE content, to MIME output. The channel then decodes the MIME message and scans it for potentially harmful attachments. (In this example, an attachment is any message part.) Any harmful attachments are removed from the message, after which the message is re-enqueued for delivery. The list of harmful MIME media types and file name extensions is read from a channel option file. An example option file for the channel is shown in Example Option File.

In this example, the MIME Content-type: and Content-disposition: header lines are used to detect potentially harmful message attachments such as executable files. This example could be extended to also scan the content of the attachments, possibly passing the contents to a virus scanner. Further, the example could be modified to return as undeliverable any messages containing harmful attachments.


Note –

To configure the MTA to run this channel, see Running Your Enqueue and Dequeue Programs. The PMDF_CHANNEL_OPTION environment variable must give the absolute file path to the channel’s option file. Also, for a discussion on configuring special rewrite rules for re-enqueuing dequeued mail, see Preventing Mail Loops when Re-enqueuing Mail.


For the output generated by this sample program, see Decoding MIME Messages Complex Example Output.

After the Messaging Server product is installed, these programs can be found in the following location:

msg_server_base/examples/mtasdk/

Some lines of code are immediately preceded by a comment of the format:

/* See explanatory comment N */

where N is a number. Each number corresponds to explanatory text in the section that follows this code; see Explanatory Text for Numbered Comments in the Decoding MIME Messages Complex Example.


Example 5–2 Decoding MIME Messages Complex Example


/*
 *  virus_scanner_simple.c
 *
 *    Remove potentially harmful content from queued messages.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "mtasdk.h"

/*
 *  A structure to store our channel options
 */
typedef struct {
     /* Produce debug output?       */
     int     debug; 
     /* Unwanted MIME content types */
     char    bad_mime_types[BIGALFA_SIZE+3]; 
     /* Length of bmt string        */
     size_t  bmt_len; 
     /* Unwanted file types         */
     char    bad_file_types[BIGALFA_SIZE+3]; 
     /* Length of bft string        */
     size_t  bft_len; 
} our_options_t;

/*
 *  Forward declarations
 */
static void error_exit(int ires, const char *msg);
static void error_report(our_options_t *options, int ires, const
                         char *func);
static int  is_bad_mime_type(our_options_t *options, mta_decode_t
                             *dctx, char *buf, size_t maxbuflen);
static int is_bad_file_type(our_options_t *options, mta_opt_t 
                            *params, const char *param_name, 
                            char *buf, size_t maxbuflen);
static int  load_options(our_options_t *options);

static mta_dq_process_message_t process_message;
static mta_decode_read_t decode_read;
static mta_decode_inspect_t decode_inspect;

/*
 *  main() -- Initialize the MTA SDK, load our options, and then 
 *            start the message processing loop.
 */
int main()
{
      int ires;
      our_options_t options;

      /* 
      *  Initialize the MTA SDK
      *  See explanatory comment 1 
      */
      if ((ires = mtaInit(0)))
          error_exit(ires, "Unable to initialize the MTA SDK");

      /* 
       *  Load our channel options
       *  See explanatory comment 2 
       */
      if ((ires = load_options(&options)))
          error_exit(ires, "Unable to load our channel options");

      /* 
       *  Now process the queued messages.  Be sure to indicate a
       *  thread stack size sufficient to accommodate message 
       *  enqueue processing.
       *  See explanatory comment 3 
       */
      if ((ires = mtaDequeueStart((void *)&options, 
                                  process_message, NULL, 0)))
          error_exit(ires, "Error during dequeue processing");

      /*
       * All done
       */
      mtaDone();
      return(0);
}

/* 
 *  process_message() -- This routine is called by 
 *                       mtaDequeueStart() to process each queued
 *                       message.  We don't make use of ctx2, but 
 *                       ctx1 is a pointer to our channel options.
 *  See explanatory comment 4 
 */
static int process_message(void **ctx2, void *ctx1, mta_dq_t *dq,
                           const char *env_from, size_t 
                           env_from_len)
{
     const char *adr;
     int disp, ires;
     size_t len;
     mta_nq_t *nq;
     our_options_t *options = (our_options_t *)ctx1;

     /*
      *  Initializations
      */
     nq = NULL;

     /*
      *  A little macro to do error checking on mta*() calls
      */
#define CHECK(f,x) \
     if ((ires = x)) { error_report(options, ires, f); goto \
                       done_bad; }

     /* 
      *  Start a message enqueue.  Use the dequeue context to copy
      *  envelope flags from the current message to this new 
      *  message being enqueued.
      *  See explanatory comment 5 
      */
     CHECK("mtaEnqueueStart",
           mtaEnqueueStart(&nq, env_from, env_from_len,
           MTA_DQ_CONTEXT, dq, 0));

     /*
      *  Process the envelope recipient list
      *  See explanatory comment 6 
      */
     while (!(ires = mtaDequeueRecipientNext(dq, &adr, &len, 0)))
     {
          /*
           *  Add this envelope recipient address to the message 
           *  being enqueued.  Use the dequeue context to copy
           *  envelope flags for this recipient from the current
           *  message to the new message.
           */
          ires = mtaEnqueueTo(nq, adr, len, MTA_DQ_CONTEXT, 
                              dq, MTA_ENV_TO, 0);
          /*  See explanatory comment 7 */
          disp = (ires) ? MTA_DISP_DEFERRED : MTA_DISP_RELAYED;
          CHECK("mtaDequeueRecipientDisposition",
                mtaDequeueRecipientDisposition(dq, adr, len, 
                                               disp, 0));
     }

     /*
      *  A normal exit from the loop occurs when 
      *  mtaDequeueRecipientNext() returns an MTA_EOF status.
      *  Any other status signifies an error.
      */
     if (ires != MTA_EOF)
     {
         error_report(options, ires, "mtaDequeueRecipientNext");
         goto done_bad;
      }

      /* 
       *  Begin the MIME decode of the message
       *  See explanatory comment 8 
       */
      CHECK("mtaDecodeMessage",
            mtaDecodeMessage(
              /* Private context is our options */
              (void *)options, 
              /* Input is the message being dequeued */
              MTA_DECODE_DQ, (void *)dq,
              /* Output is the message being enqueued */
              MTA_DECODE_NQ, (void *)nq,
              /* Inspection routine */
              decode_inspect, 
              /* Convert non-MIME formats to MIME */
              MTA_DECODE_THURMAN, 0)); 

      /* 
       *  Finish the enqueue
       *  NOTE: IT'S IMPORTANT TO DO THIS before DOING THE 
       *  DEQUEUE. YOU WILL LOSE MAIL IF YOU DO THE DEQUEUE FIRST 
       *  and then THE ENQUEUE FAILS. 
       *  See explanatory comment 9 
       */
      CHECK("mtaEnqueueFinish", mtaEnqueueFinish(nq, 0));
      nq = NULL;

      /*
       *  Finish the dequeue
       */
      CHECK("mtaDequeueMessageFinish",
            mtaDequeueMessageFinish(dq, 0));

      /*
       *  All done with this message
       */
      return(MTA_OK); 

done_bad:
     /*
      *  Abort any ongoing enqueue or dequeue
      */
     if (nq)
          mtaEnqueueFinish(nq, MTA_ABORT, 0);
     if (dq)
          mtaDequeueMessageFinish(dq, MTA_ABORT, 0);

     /*
      *  And return our error status
      */
     return(ires);
}

#undef CHECK

/* 
 *  decode_inspect() -- This is the routine that inspects each 
 *                      message part, deciding whether to accept 
 *                      or reject it.
 *  See explanatory comment 10 
 */
static int decode_inspect(void *ctx, mta_decode_t *dctx, 
                          int data_type,const char *data, 
                          size_t data_len)
{
     char buf[BIGALFA_SIZE * 2 + 10];
     int i;
     our_options_t *options = (our_options_t *)ctx;

     /*
      *  See if the part has:
      *
      *    1. A bad MIME content-type,
      *    2. A bad file name extension in the (deprecated) 
      *       NAME= content-type parameter, or
      *    3. A bad file name extension in the 
      *       FILENAME= content-disposition parameter.
      */
     i = 0;
     if ((i = is_bad_mime_type(ctx, dctx, buf, sizeof(buf))) ||
         is_bad_file_type(ctx,
                          mtaDecodeMessageInfoParams(dctx,
                                  MTA_DECODE_CTYPE_PARAMS, NULL),
                          "NAME", buf, sizeof(buf)) ||
         is_bad_file_type(ctx,
                          mtaDecodeMessageInfoParams(dctx,
                                  MTA_DECODE_CDISP_PARAMS, NULL),
                          "FILENAME", buf, sizeof(buf)))
     {
         char msg[BIGALFA_SIZE*4 + 10];

         /* 
          *  Replace this part with a text message indicating 
          *  that the part's content has been deleted.
          *  See explanatory comment 11 
          */
         if (i)
              i = sprintf(msg,
         "The content of this message part has been removed.\n"
         "It contained a potentially harmful media type of %.*s",
                          (int)(strlen(buf)-2), buf+1);

         else
              i = sprintf(msg,
      "The content of this message part has been removed.\n"
      "It contained a potentially harmful file named '%s'", buf);
         return(mtaDecodeMessagePartDelete(dctx, 
                              MTA_REASON, msg, i,
                              MTA_DECODE_CTYPE, "text", 4,
                              MTA_DECODE_CSUBTYPE, "plain", 5,
                              MTA_DECODE_CCHARSET, "us-ascii", 8,
                              MTA_DECODE_CDISP, "inline", 6,
                              MTA_DECODE_CLANG, "en", 2, 0));
     }
     else
          /*
           *  Keep the part
           *  See explanatory comment 12 
           */
          return(mtaDecodeMessagePartCopy(dctx, 0)); 
}

/* 
 *  is_bad_mime_type() -- See if the part's media type is in our
 *                        bad MIME content types, for example:
 *                        application/vbscript
 *  See explanatory comment 13 
 */
static int is_bad_mime_type(our_options_t *options,
                            mta_decode_t *dctx, char *buf,
                            size_t maxbuflen)
{
     const char *csubtype, *ctype;
     size_t i, len1, len2;
     char *ptr;

     /*
      *  Sanity checks
      */
      if (!options || !options->bmt_len || 
          !options->bad_mime_types[0] ||
          !dctx)
           return(0);

      /*
       *  Get the MIME content type
       */
      ctype = mtaDecodeMessageInfoString(dctx, MTA_DECODE_CTYPE,
                                         NULL, &len1);
      csubtype = mtaDecodeMessageInfoString(dctx, 
                                            MTA_DECODE_CSUBTYPE,
                                            NULL, &len2);

      /*
       *  Build the string: <0x01>type/subtype<0x01><0x00>
       */
      ptr = buf;
      *ptr++ = (char)0x01;
      for (i = 0; i < len1; i++)
           *ptr++ = tolower((unsigned char)*ctype++);
      *ptr++ = '/';
      for (i = 0; i < len2; i++)
           *ptr++ = tolower((unsigned char)*csubtype++);
      *ptr++ = (char)0x01;
      *ptr = '\0';

      /*
       *  Now see if the literal just built occurs in the list of 
       *  bad MIME content types
       */
      return((strstr(options->bad_mime_types, buf)) ? -1 : 0);
}

/* 
 *  is_bad_file_type() -- See if the part has an associated file 
 *                        name whose file extension is in our list 
 *                        of bad file names, such as .vbs.
 *  See explanatory comment 14 
 */
static int is_bad_file_type(our_options_t *options,
                            mta_opt_t *params,
                            const char *param_name, char *buf,
                            size_t maxbuflen)
{
      const char *ptr1;
      char fext[BIGALFA_SIZE+2], *ptr2;
      size_t i, len;

      /*
       *  Sanity checks
       */
      if (!options || !options->bft_len || !params || !param_name)
           return(0);

      len = 0;
      buf[0] = '\0';
      if (mtaOptionString(params, param_name, 0, buf, &len,
                          maxbuflen - 1) ||
          !len || !buf[0])
          /*
           *  No file name parameter specified
           */
          return(0);

          /*
           *  A file name parameter was specified.  Parse it to 
           *  extract the file extension portion, if any.
           */
          ptr1 = strrchr(buf, '.');
          if (!ptr1)
               /*
                *  No file extension specified
                */
               return(0);

          /*
           *  Now store the delimited, lower-cased extension in fext[] 
           *  Note that we drop the '.' from the extension.
           */
          ptr1++; /* Skip over the '.' */
          ptr2 = fext;
          *ptr2++ = (char)0x01;
          len = len - (ptr1 - buf);
          for (i = 0; i < len; i++)
               *ptr2++ = tolower((unsigned char)*ptr1++);
          *ptr2++ = (char)0x01;
          *ptr2++ = '\0';

          /*
           *  Now return -1 if the string occurs in 
           *  options->bad_file_types.
           */
          return((strstr(options->bad_file_types, fext))
                 ? -1 : 0);
}

/* 
 *  load_options() -- Load our channel options from the channel's 
 *                    option file 
 *  See explanatory comment 15 
 */
static int load_options(our_options_t *options)
{
     char buf[BIGALFA_SIZE+1];
     size_t buflen, i;
     mta_opt_t *channel_opts;
     int ires;
     const char *ptr0;
     char *ptr1;

     /*
      *  Initialize our private channel option structure
      */
     memset(options, 0, sizeof(our_options_t));

     /* 
      *  Access the channel's option file
      *  See explanatory comment 16 
      */
     channel_opts = NULL;
     if ((ires = mtaOptionStart(&channel_opts, NULL, 0, 0)))
     {
          mtaLog("Unable to access our channel option file");
          return(ires);
     }

     /*
      *  DEBUG=0|1
      */
     options->debug = 0;
     mtaOptionInt(channel_opts, "DEBUG", 0, &options->debug);
     if (options->debug)
         mtaDebug(MTA_DEBUG_SDK, 0);

     /*
      *  BAD_MIME_TYPES=type1/subtype1[,type2/subtype2[,...]]
      */
     buf[0] = '\0';
     buflen = 0;
     mtaOptionString(channel_opts, "BAD_MIME_TYPES", 0, buf,
                     &buflen, sizeof(buf));

     /*
      *  Now translate the comma separated list:
      * 
      *    Type1/Subtype1[,Type2/Subtype2[,...]]
      *
      *  to
      *
      *<0x01>type1/subtype1[<0x01>type2/subtype2[<0x01>...]]<0x01>
      */

     ptr0 = buf;
     ptr1 = options->bad_mime_types;
     *ptr1++ = (char)0x01;
     for (i = 0; i < buflen; i++)
     {
          if (*ptr0 != ',')
              *ptr1++ = tolower((unsigned char)*ptr0++);
          else
          {
               *ptr1++ = (char)0x01;
               ptr0++;
          }
     }
     *ptr1++ = (char)0x01;
     *ptr1   = '\0';
     options->bmt_len = ptr1 - options->bad_mime_types;

     /*
      *  BAD_FILE_TYPES=["."]Ext1[,["."]Ext2[,...]]
      */
     buf[0] = '\0';
     buflen = 0;
     mtaOptionString(channel_opts, "BAD_FILE_TYPES", 0, buf,
                     &buflen, sizeof(buf));

     /*
      *  Now translate the comma separated list:
      *    ["."]Ext1[,["."]Ext2[,...]]
      * 
      *  to
      *
      *    <0x01>ext1[<0x01>ext2[<0x01>...]]<0x01>
      */
     ptr0 = buf;
     ptr1 = options->bad_file_types;
     *ptr1++ = (char)0x01;
     for (i = 0; i < buflen; i++)
     { 
          switch(*ptr0)
          {
          default :   /* copy after translating to lower case */
               *ptr1++ = tolower((unsigned char)*ptr0++);
               break;
          case '.' :  /* discard, but still advance the input */
               ptr0++;
               break;
          case ',' :  /* end current type */
               *ptr1++ = (char)0x01;
               ptr0++;
               break;
          }
     }
     *ptr1++ = (char)0x01;
     *ptr1   = '\0';
     options->bft_len = ptr1 - options->bad_file_types;

     /* 
      *  Dispose of the mta_opt_t context
      *  See explanatory comment 17 
      */
     mtaOptionFinish(channel_opts);
     /*
      *  And return a success
      */

     return(MTA_OK);
}

/*
 *  error_report() -- Report an error condition when debugging is
 *                    enabled. 
 */
static void error_report(our_options_t *options, int ires, 
                         const char *func)
{
     if (options->debug)
         mtaLog("%s() returned %d; %s",
                (func ? func : "?"), ires, mtaStrError(ires));
}

/*
 *  error_exit() -- Exit with an error status and error message.
 */
static void error_exit(int ires, const char *msg)
{
     mtaLog("%s%s%s", (msg ? msg : ""), (msg ? "; " : ""),
            mtaStrError(ires));
     exit(1);
}

Example Option File

This example lists the MIME media types and file extensions this program is to consider potentially harmful.


DEBUG=1
BAD_MIME_TYPES=application/vbscript
BAD_FILE_TYPES=bat,com,dll,exe,vb,vbs
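
The listing stores each option list as a single string in which every entry is wrapped in 0x01 delimiters, so that one strstr() call can test for a whole-entry match. The following is a minimal, standalone sketch of that technique for the file extension list, using only the standard C library. The names DELIM, build_ext_list(), and list_contains() are hypothetical helpers introduced for illustration; they are not part of the MTA SDK or of the sample program.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DELIM '\001'   /* hypothetical name for the 0x01 separator */

/*
 * Translate "Ext1,.Ext2" into "<0x01>ext1<0x01>ext2<0x01>".
 * Returns the length written, or 0 if dst is too small.
 */
static size_t build_ext_list(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;

    if (dstlen < strlen(src) + 3)   /* two outer delimiters + NUL */
        return 0;
    dst[n++] = DELIM;
    for (; *src; src++) {
        if (*src == '.')
            continue;               /* drop optional dots */
        dst[n++] = (*src == ',') ? DELIM
                 : (char)tolower((unsigned char)*src);
    }
    dst[n++] = DELIM;
    dst[n] = '\0';
    return n;
}

/* Return 1 if item occurs as a whole entry of list, else 0. */
static int list_contains(const char *list, const char *item)
{
    char key[64];
    size_t i, len = strlen(item);

    if (len == 0 || len + 3 > sizeof(key))
        return 0;
    key[0] = DELIM;
    for (i = 0; i < len; i++)
        key[i + 1] = (char)tolower((unsigned char)item[i]);
    key[len + 1] = DELIM;
    key[len + 2] = '\0';
    return strstr(list, key) != NULL;
}
```

With a list built from the BAD_FILE_TYPES value above, list_contains(list, "VBS") is true, while list_contains(list, "v") is false: the delimiters on both sides of the key keep a short string from matching inside a longer entry such as vb or vbs.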

Sample Input Message

The example that follows is the text of a sample input message that the program in Example 5–2 is to process. The second message part is a file attachment named trojan_horse.vbs. Consequently, when the channel processes this message, it should remove the attachment because the file extension .vbs is in the list of harmful file extensions. The sample program replaces the attachment with a text attachment indicating that the content has been deleted.


Received: from [129.153.12.22] ([129.153.12.22])
 by frodo.siroe.com (Sun Java System Messaging Server 6 2004Q2 (built Apr 7
 2003)) with SMTP id <0HD7001023OYDA00@frodo.siroe.com> for
 sue@sesta.com; Fri, 11 Apr 2003 13:03:23 -0700 (PDT)
Date: Fri, 11 Apr 2003 13:03:08 -0700
From: sue@sesta.com
Subject: test message
Message-id: <0HD7001033P1DA00@frodo.siroe.com>
Content-type: multipart/mixed; boundary=BoundaryMarker

--BoundaryMarker
Content-type: text/plain; charset=us-ascii
Content-disposition: inline

This is a
  test message!

--BoundaryMarker
Content-type: application/octet-stream
Content-disposition: attachment; filename="trojan_horse.vbs"
Content-transfer-encoding: base64

IyFQUwoxMDAgMTAwIG1vdmV0byAzMDAgMzAwIGxpbmV0byBzdHJva2UKc2hvd3Bh
Z2UK

--BoundaryMarker--

Explanatory Text for Numbered Comments in the Decoding MIME Messages Complex Example

  1. The MTA SDK is explicitly initialized. This call is not really necessary as the MTA SDK will implicitly initialize itself when mtaDequeueStart() is called. However, for debugging purposes, it can be useful to make this call at the start of a program so that an initialization failure will show clearly in the diagnostic output. If the call is omitted, initialization failure will be less obvious. The failure will still be noted in the diagnostic output, but it will be obscured through the routine call that triggered implicit initialization.

  2. Channel options are loaded via a call to the load_options() routine. That routine is part of this example and, as discussed later, uses the SDK routines for obtaining channel option values from the channel’s option file.

  3. The message dequeue processing loop is initiated with a call to mtaDequeueStart().

  4. For each queued message to be processed, process_message() will be called by mtaDequeueStart().

  5. A message enqueue is started. This enqueue is used to re-enqueue the queued message currently being processed. As the message is processed, its non-harmful content will be copied to the new message being enqueued.

  6. The envelope recipient list is copied from the queued message to the new message being enqueued.

  7. Since this is an intermediate channel, that is, it doesn’t effect final delivery of a message, successful processing of a recipient address is associated with a disposition of MTA_DISP_RELAYED.

  8. After processing the message’s envelope, mtaDecodeMessage() is invoked to decode the message, breaking it into individual MIME message parts. mtaDecodeMessage() is told to use the current dequeue context as the input source for the message to decode. This supplies the queued message being processed as input to the MIME decoder. Further, the current enqueue context is supplied as the output destination for the resulting message. This directs mtaDecodeMessage() to output the resulting parsed message to the message being enqueued, less any harmful attachments that are explicitly deleted by the inspection routine. The routine decode_inspect() is supplied as the inspection routine. If the call to mtaDecodeMessage() fails, the CHECK() macro causes the queued message to be deferred and the message enqueue to be cancelled.

  9. After a successful call to mtaDecodeMessage(), the message enqueue is committed. It is important to do this before committing the dequeue. If the operations are done in the other order (dequeue finish followed by enqueue finish), mail may be lost. For example, the message would be lost if the dequeue succeeded and deleted the underlying message file, and the enqueue then failed for some reason, such as insufficient disk space.

  10. The inspection routine, decode_inspect(), checks the MIME header lines of each message part for indications that the part may contain harmful content.

  11. Message parts with harmful content are discarded with a call to mtaDecodeMessagePartDelete(). The discarded message part is replaced with a text message part containing a warning about the discarded harmful content.

  12. Message parts with safe content are kept by copying them to the output message with mtaDecodeMessagePartCopy().

  13. Using the configured channel options, this routine determines if a message part’s media type is in the list of harmful types.

  14. Using the configured channel options, this routine determines if a filename appearing in the MIME header lines has an extension considered harmful.

  15. The load_options() routine is used to load the channel’s site-configured options from a channel option file.

  16. The channel option file, if any, is opened and read by mtaOptionStart(). Since an explicit file path is not supplied, the file path specified with the PMDF_CHANNEL_OPTION environment variable gives the name of the option file to read.

  17. After loading the channel’s options, the option file context is disposed of with a call to mtaOptionFinish().
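
Item 14 relies on taking the text after the last '.' in a file name and lower-casing it before comparing it with the configured list. The following standalone sketch shows that step in isolation, using only the standard C library. The function name file_extension() is hypothetical and not part of the MTA SDK. Unlike the listing, this sketch treats a name that ends in '.' as having no extension and checks the destination size before copying; both are choices made for this illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the lower-cased extension of name (the text after the last
 * '.') into ext.  Returns 1 on success, or 0 if there is no
 * extension or ext is too small to hold it.
 */
static int file_extension(const char *name, char *ext, size_t extlen)
{
    const char *dot = strrchr(name, '.');
    size_t i, len;

    if (!dot || dot[1] == '\0')
        return 0;                 /* no dot, or nothing after it */
    dot++;                        /* skip over the '.' */
    len = strlen(dot);
    if (len + 1 > extlen)
        return 0;                 /* would not fit with its NUL */
    for (i = 0; i < len; i++)
        ext[i] = (char)tolower((unsigned char)dot[i]);
    ext[len] = '\0';
    return 1;
}
```

Because only the last '.' is considered, a name such as Invoice.PDF.EXE yields exe, which is the extension that matters for this kind of check. A name with no '.' yields no extension, so such a part is judged by its media type alone.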

Decoding MIME Messages Complex Example Output

The example that follows shows the output generated by the MIME decoding program found in Example 5–2.


Received: from sesta.com by frodo.siroe.com
 (Sun Java System Messaging Server Version 6 2004 Q2(built Apr 7 2003))
 id <0HDE00C01BFK6500@frodo.siroe.com> for sue@sesta.com; Tue, 11
 Apr 2003 13:03:29 -0700 (PDT)
Received: from [129.153.12.22] ([129.153.12.22])
 by frodo.siroe.com (Sun Java System Messaging Server 6 2004 Q2 (built Apr 7
 2003)) with SMTP id <0HD7001023OYDA00@frodo.siroe.com> for
 sue@sesta.com; Fri, 11 Apr 2003 13:03:23 -0700 (PDT)
Date: Fri, 11 Apr 2003 13:03:08 -0700
From: sue@sesta.com
Subject: test message
To: sue@sesta.com
Message-id: <0HD7001033P1DA00@frodo.siroe.com>
Content-type: multipart/mixed;
 boundary="Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)"

--Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)
Content-type: text/plain; charset=us-ascii
Content-disposition: inline

This is a
 test message!

--Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)
Content-type: text/plain; charset=us-ascii
Content-language: en
Content-disposition: inline

The content of this message part has been removed.
It contained a potentially harmful file named 'trojan_horse.vbs'

--Boundary_(ID_XIIwKLBET2/DDbPzRI7yzQ)--
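
The replacement text in the output above is for a part rejected by file name. When a part is rejected by media type instead, the listing formats the notice from the delimited string <0x01>type/subtype<0x01>, using a %.*s conversion to print only the characters between the two delimiters. The following standalone sketch shows that formatting step with snprintf() from the standard C library. The function name media_type_notice() is hypothetical and not part of the MTA SDK, and the bounds checking is an addition made for this illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format the media-type notice into msg.  delimited has the form
 * "<0x01>type/subtype<0x01>".  The precision argument of %.*s must
 * be an int, hence the cast.  Returns the number of characters
 * written, or -1 on bad input or if msg is too small.
 */
static int media_type_notice(char *msg, size_t msglen,
                             const char *delimited)
{
    size_t len = strlen(delimited);
    int n;

    if (len < 2)
        return -1;                /* both delimiters are required */
    n = snprintf(msg, msglen,
                 "The content of this message part has been removed.\n"
                 "It contained a potentially harmful media type of %.*s",
                 (int)(len - 2), delimited + 1);
    return (n < 0 || (size_t)n >= msglen) ? -1 : n;
}
```

Starting at delimited + 1 skips the leading delimiter, and a precision of len - 2 stops before the trailing one, so the notice contains only the type/subtype text.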