CHAPTER 12 SIMS Monitoring and Logging

Logging Facilities

250

Logging and Tracking Messages in the IMTA

253

SIMS Monitoring Utilities

261

SNMP Monitoring

268

Monitoring enables you to manage your SIMS server more effectively. Through monitoring, problems are detected earlier so that you can take corrective actions before your mail service is affected.

SIMS provides a number of ways of monitoring the messaging components. The three primary ways are:

	1.	Monitoring utilities
	2.	SNMP support
	3.	Log files

In addition, there are several categories of information that you can monitor, these include:

Availability: Is a service or component operating?

Connectivity: Is the mail service accessible? (POP, IMAP etc.)

Response time: How is the service performing from an end user perspective?

Processes: Are critical processes running?

Resources: Is there sufficient disk space for the message store? Is there enough virtual memory? What is the system load?

Security: Are there denial of service attacks in progress? Is the service subject to a SPAM attack? If so, from where?

The monitoring capabilities built into SIMS allow you to answer these questions. This chapter explains how to monitor a SIMS based e-mail service, and how to automate the monitoring process so that you are notified as problem situations that need attention develop.

Logging Facilities

Message Store/Message Access Log Files

250

Sun Directory Service Log Files

251

IMTA Log Files

252

Message Store/Message Access Log Files	250
Sun Directory Service Log Files	251
IMTA Log Files	252

SIMS captures a lot of information about significant events in the operation of the email service. This information is logged to various log files that are listed in TABLE 12-3. The type of information that is collected can be controlled. Some components provide a configuration file where the logging detail can be set (refer to the appropriate section in the SIMS Reference Manual).

Since an active SIMS server can generate a lot of logging information, manual analysis of the log files is often impractical. For this reason, SIMS includes a number of monitoring tools that extract interesting statistics from the log files (see "SIMS Monitoring Utilities" on page 261). In addition useful analysis can be performed using Solaris shell scripts. For example, here's how to determine the number of successful IMAP logins today:

% grep -c "ims.imaccessd\[[0-9]*\]\: imap\[[0-9]*]\: Auth" /var/log/syslog252

and how to determine how many IMAP connections were lost:

% grep -c "ims.imaccessd\[[0-9]*\]\: imap\[[0-9]*]\: Connection broken" /var/log/syslog13

Message Store/Message Access Log Files

he MSMA logs all imaccessd and message store utility errors and events. Log messages are in the /var/log/syslog file and prefixed with SUNWmail.ims. Note that system messages are logged every time a user checks for mail. It logs the pop3 and imap login and logout. This can cause a clutter syslog messages. To stop login/logout messages from clogging you syslog file, change the syslog logging level in /etc/syslog.conf from mail.debug to mail.notice and restart the syslog daemon after the change. See the syslog(3) man page for details.

Sun Directory Service Log Files

If configured, logs various information such as protocol trace information for function calls and debug information about packets. Log files are in the /var/opt/SUNWconn/ldap/log directory and are shown in TABLE 12-1. Refer to Netscape Directory Service documentation for information on Netscape log files.

Note - Note that if you are using the Netscape Directory Service, the log files will be different. Refer to product documentation.

TABLE 12-1 Overview of Sun Directory Service Logging Files
Log File
Description
Further Information

dsserv.log

Directory events

Sun Directory Services documentation

dsweb.log.log

Directory Web Gateway events

Sun Directory Services documentation

dsnmpserv.log

Directory SNMP events

Sun Directory Services documentation

**TABLE 12-1** Overview of Sun Directory Service Logging Files
Log File	Description	Further Information
dsserv.log	Directory events	Sun Directory Services documentation
dsweb.log.log	Directory Web Gateway events	Sun Directory Services documentation
dsnmpserv.log	Directory SNMP events	Sun Directory Services documentation

By default, the directory server daemon logs information about connections and operations in /var/opt/SUNWconn/ldap/log/dsserv.log. In the event that you need more diagnostic ability than is provided by SIMS default Directory configuration, you may run dsservd from a shell manually, providing additional diagnostics. See the -d and -s arguments to dsservd, documented in dsservd(8) man page. Note that running with additional diagnostics turned on will run dsservd in the foreground, and will use up disk space quickly if you have all optional diagnostic output enabled.

In addition, certain messages may be logged to the console of the machine on which dsservd is running, and/or to /var/adm/messages.

TABLE 12-2 contains an example of the information that is logged during a search operation. This example shows two interactions with the directory, the first to add an entry, and the second to search the directory for all entries that have a commonName attribute. The log includes details of the bind and unbind, and operational information.

TABLE 12-2 dsserv.log Example


Thu May 15 16:03: conn=9 fd=15 connection from unknown (127.0.0.1)
Thu May 15 16:03: conn=9 op=0 BIND dn="CN=admin,O=sun,C=us" method=128
Thu May 15 16:03: conn=9 op=0 RESULT err=0 tag=97 nentries=0
Thu May 15 16:03: conn=9 op=1 ADD dn="O=sun,C=us"
Thu May 15 16:03: conn=9 op=1 RESULT err=0 tag=105 nentries=0
Thu May 15 16:03: conn=9 op=2 UNBIND
Thu May 15 16:03: conn=9 op=2 fd=15 closed errno=0
Thu May 15 16:26: conn=10 fd=15 connection from unknown (127.0.0.1)
Thu May 15 16:26: conn=10 op=0 BIND dn="CN=admin,O=sun,C=us" method=128
Thu May 15 16:26: conn=10 op=0 RESULT err=0 tag=97 nentries=0
Thu May 15 16:26: conn=10 op=1 SRCH base="O=sun,C=us" scope=2 filter="(cn=*)"
Thu May 15 16:26: conn=10 op=1 RESULT err=0 tag=101 nentries=150
Thu May 15 16:26: conn=10 op=-1 fd=15 closed errno=0
Thu May 15 16:26: conn=10 op=2 UNBIND

IMTA Log Files

If configured, an IMTA channel logs in each email as it enters and is removed from the channel queue. Unless otherwise indicated in the table below, IMTA log file names are located in the directory /var/opt/SUNWmail/imta/log.

TABLE 12-3 Overview of IMTA Logging Files
Log File
Description
Further Information

mail.log

Cumulative IMTA events

"Diagnostics Output" on page 105,
"Message Logging" on page 107
"Logging Messages Passing Through the IMTA" on page 254
"Viewing Enqueued Messages" on page 116

mail.log_current

Today's IMTA events

"Logging and Tracking Messages in the IMTA" on page 253

mail.log_yesterday

Yesterday's IMTA events

"Logging and Tracking Messages in the IMTA" on page 253

dispatcher.log-*

SMTP dispatcher

SIMS Reference Manual: log files. IMTA service dispatcher

job_controller.log-*

Job controller

SIMS Reference Manual: job controller

tcp_smtp_server.log-*

SMTP server

"Check Processing Log Files" on page 281

/var/log/syslog

prefixed with SUNWmail.imta

"Troubleshooting the IMTA" on page 278

channel_master.log-*

Master program output (usually client) for channel

"Troubleshooting the IMTA" on page 278

channel_slave.log-*

Output of slave program (usually server) for channel

"Troubleshooting the IMTA" on page 278

server-name_server.log-*

Logging for server server-name

"Troubleshooting the IMTA" on page 278

server-name_thread.log-*

Per thread errors for server-name

"Troubleshooting the IMTA" on page 278

post.log-*

Log output for the periodic IMTA delivery job

"Troubleshooting the IMTA" on page 278

return.log-*

Log output for the periodic IMTA message bouncer job

"Troubleshooting the IMTA" on page 278

Channel log files are not created unless an error occurs or debugging output is enabled for the channel with the master_debug channel keyword or slave_debug channel keyword.

Each new log file is created with a unique id to avoid overwriting an earlier log written by the same channel. You can use the imta find utility to aid in finding the desired "version" of a log file. You can purge back older log files using the imta purge command.

**TABLE 12-3** Overview of IMTA Logging Files
Log File	Description	Further Information
mail.log	Cumulative IMTA events	"Diagnostics Output" on page 105, "Message Logging" on page 107 "Logging Messages Passing Through the IMTA" on page 254 "Viewing Enqueued Messages" on page 116
mail.log_current	Today's IMTA events	"Logging and Tracking Messages in the IMTA" on page 253
mail.log_yesterday	Yesterday's IMTA events	"Logging and Tracking Messages in the IMTA" on page 253
dispatcher.log-*	SMTP dispatcher	SIMS Reference Manual: log files. IMTA service dispatcher
job_controller.log-*	Job controller	SIMS Reference Manual: job controller
tcp_smtp_server.log-*	SMTP server	"Check Processing Log Files" on page 281
/var/log/syslog	prefixed with SUNWmail.imta	"Troubleshooting the IMTA" on page 278
channel_master.log-*	Master program output (usually client) for channel	"Troubleshooting the IMTA" on page 278
channel_slave.log-*	Output of slave program (usually server) for channel	"Troubleshooting the IMTA" on page 278
server-name_server.log-*	Logging for server server-name	"Troubleshooting the IMTA" on page 278
server-name_thread.log-*	Per thread errors for server-name	"Troubleshooting the IMTA" on page 278
post.log-*	Log output for the periodic IMTA delivery job	"Troubleshooting the IMTA" on page 278
return.log-*	Log output for the periodic IMTA message bouncer job	"Troubleshooting the IMTA" on page 278

Logging and Tracking Messages in the IMTA

Identifying the Source of Incoming SMTP Messages

254

Logging Messages Passing Through the IMTA

254

Snapshots of Message Traffic through the IMTA

257

Identifying the Source of Incoming SMTP Messages	254
Logging Messages Passing Through the IMTA	254
Snapshots of Message Traffic through the IMTA	257

This section points out some message logging and tracking techniques.

Identifying the Source of Incoming SMTP Messages

When the identtcp or identtcpnumeric channel keyword is set in the tcp_local channel, the IMTA attempts to use an IDENT query to identify incoming SMTP connections. If the sending system is running an IDENT server, it will return to the IMTA the SMTP sender's identity for the IMTA to insert in the Received: header the IMTA constructs. If the sending system is not running an IDENT server, the IMTA will just use the port number (port 25) and the sending system IP address or name.

With identtcpnumeric, the IMTA uses the IDENT information (if any) and the actual IP address of the sending system; with identtcp, the IMTA also attempts to translate the IP address to a system name by performing a DNS reverse lookup.Thus identtcpnumeric incurs slightly less overhead because it does not do the DNS reverse lookup, and the actual IP address might be considered somewhat more authoritative that the name resulting from a DNS query. However, using the system name as with identtcp may be considered more user-friendly.

Identifying information in Received headers can assist in detecting spoofed email and in holding the senders of such spoofed email accountable. Note that user friendly identifying information is a not insignificant feature: even a naive user may notice that a Received header in a suspicious message contains an unexpected address, for example, anonymous@SpoofersAreUs.edu, but only a fairly sophisticated user is able to pay attention to any IP addresses showing up in Received headers. So a choice between these keywords may be affected by whether you are looking to provide forewarning to users that they may have received spoofed email, or whether you merely wish to preserve the identifying information for use in investigating cases of spoofed email.

Logging Messages Passing Through the IMTA

The IMTA provides facilities for logging each message as it is enqueued and dequeued. All log entries are made to the mail.log_current file in the IMTA log directory, /var/opt/SUNWmail/imta/log. Logging is controlled on a per-channel basis. The logging keyword activates logging for a particular channel while the nologging keyword disables it. Logging is enabled on all channels by default.

When logging is turned on, the cumulative mail.log file in the IMTA log directory will continue to grow and it is up to you to periodically write it to backup and delete it, or truncate it, or whatever your site prefers.

The log file is written as normal ASCII text and the format is quite simple. By default, each entry contains eight or nine fields. An example is shown in the

CODE EXAMPLE 12-1 mail.log_current File. date/time src-channel dest-ch. type size(kb) sender rcpt 3-May-1999 08:48:56 tcp_local sims-ms E 1 usr042@akaba.eng.cat.com rfc822;usr041@sims-ms-daemon 3-May-1999 08:48:58 sims-ms D 1 usr.042@eng.cat.com rfc822;usr041@sims-ms-daemon 3-May-1999 08:54:47 tcp_local sims-ms E 1 usr042@akaba.eng.cat.com rfc822;usr041@sims-ms-daemon 3-May-1999 08:54:47 tcp_local pipe E 1 usr042@akaba.eng.cat.com rfc822;usr1+autoreply@pipe-daemon 3-May-1999 08:54:49 sims-ms D 1 usr.042@eng.cat.com rfc822;usr041@sims-ms-daemon 3-May-1999 08:54:49 pipe D 1 usr.042@eng.cat.com rfc822;usr1+autoreply@pipe-daemon * For SMTP channels during dequeue only there will be an active form of the envelope To: address and a delivery status. Type: E: enqueued R: returned D: delivered Q: transient failure on delivery

You can also view the message queue by running:

% imta cache -view <channel_name>

The imta test -rewrite command verifies that an address is properly handled by the IMTA.

If you still cannot find a lost message, check whether it's being held, by looking for *.HELD messages in the channel queues.

The fields are described in TABLE 12-4:

TABLE 12-4 Log File Fields

date/time

Date and time when the entry was made.

src-channel

Channel name for the source channel.

dest-ch.

Channel name of the destination channel. For SMTP channels when LOG_CONNECTION is enabled, a minus sign (-) indicates inbound to the SMTP server, a plus sign (+) indicates outbound via the SMTP client.

Type

Type of entry. See TABLE 12-5.

Size

Size of the message. This is expressed in kilobytes by default, although this default can be changed by using the BLOCK_SIZE keyword in the IMTA option file.

Sender

Envelope From address. Note that for messages with an empty envelope From address, such as notification messages, this field will be blank.

Rcpt

Original form of the envelope To address.

smtp:250<marlow@alpha.com>

Active (current) form of the envelope To address (SMTP channels during dequeue only).

Recipient ok

Delivery status (SMTP channels only during dequeue).

The logging entry codes are described in TABLE 12-5:

TABLE 12-5 Logging Entry Codes
Entry
Description

General

D

Successful dequeue.

E

Enqueue.

J

Access control mapping of attempted enqueue.

Q

Temporary failure to dequeue.

R

Recipient address rejected on attempted dequeue.

Z

Some successful recipients, but this recipient was temporarily unsuccessful; the original message file of all recipients was dequeued, and in its place a new message file for this and other unsuccessful recipients will be immediately re-enqueued.

SMTP Channels' LOG_CONNECTION + or - entries

C

Connection closed.

O

Connection opened.

X

Connection rejected.

Y

Connection try failed before being established.

**TABLE 12-4** Log File Fields
date/time	Date and time when the entry was made.
src-channel	Channel name for the source channel.
`dest-ch.`	Channel name of the destination channel. For SMTP channels when `LOG_CONNECTION` is enabled, a minus sign (`-`) indicates inbound to the SMTP server, a plus sign (`+`) indicates outbound via the SMTP client.
`Type`	Type of entry. See TABLE 12-5.
`Size`	Size of the message. This is expressed in kilobytes by default, although this default can be changed by using the `BLOCK_SIZE` keyword in the IMTA option file.
`Sender`	Envelope `From` address. Note that for messages with an empty envelope `From` address, such as notification messages, this field will be blank.
`Rcpt`	Original form of the envelope `To` address.
`smtp:250<marlow@alpha.com>`	Active (current) form of the envelope `To` address (SMTP channels during dequeue only).
`Recipient ok`	Delivery status (SMTP channels only during dequeue).

**TABLE 12-5** Logging Entry Codes
Entry	Description
General
D	Successful dequeue.
E	Enqueue.
J	Access control mapping of attempted enqueue.
Q	Temporary failure to dequeue.
R	Recipient address rejected on attempted dequeue.
Z	Some successful recipients, but this recipient was temporarily unsuccessful; the original message file of all recipients was dequeued, and in its place a new message file for this and other unsuccessful recipients will be immediately re-enqueued.
SMTP Channels' LOG_CONNECTION + or - entries
C	Connection closed.
O	Connection opened.
X	Connection rejected.
Y	Connection try failed before being established.

Extra Logging Detail

In addition to the base set of logging enabled by the logging channel keyword, the IMTA has options that cause additional information to be included in the entries written to the mail.log* files. Note that logging such additional information tends to incur additional overhead.

Particularly likely to be of interest are the LOG_MESSAGE_ID, LOG_FILENAME, and LOG_CONNECTION options. Enabling LOG_MESSAGE_ID allows correlation of which entries relate to which message.

Enabling LOG_FILENAME makes it easier to immediately spot how many times delivery of a particular message file has been retried, and can be useful in understanding when the IMTA does or does not split a message to multiple recipients into separate message file copies on disk.

Enabling LOG_CONNECTION causes the IMTA to log TCP/IP connections, as well as message traffic, to the mail.log files by default; alternatively, the SEPARATE_CONNECTION_LOG option may be used to specify that connection log entries instead be written to connection.log files. That is, if SEPARATE_CONNECTION_LOG and LOG_CONNECTION are set, the connection log entries will be written to the connection.log_current file in the IMTA log directory.

Setting LOG_HEADER=1 may be of interest if you wish to save certain message headers to the mail.log* files.

Additionally, setting LOG_PROCESS=1 and LOG_USERNAME=1 on an IMTA firewall system generally results in fairly monotonous extra information being logged: the process ID of the process enqueuing a message on an IMTA firewall system would normally be that of an IMTA SMTP server process (for SMTP messages), and the user name would normally just be the user name of the user who last restarted the IMTA Service Dispatcher. Enable these options if you wish to confirm that the process IDs and user names of processes enqueuing messages are as expected.

Note - It is up to the individual sites whether or not to implement a log cleaning policy. By default, mail.log is never removed; this can potentially fill up your disk space.

Snapshots of Message Traffic through the IMTA

Command: imta-counters

The IMTA has channel statistics counters to collect and monitor channel counters based upon the SNMP Mail Monitoring MIB, RFC 1566. These counters tabulate on a per channel basis the twelve items described in TABLE 12-6.

TABLE 12-6 Channel Counters
Field name
Description

received_messages

The number of messages enqueued to the channel

submitted_messages

The number of messages enqueued by the channel

stored_messages

The total number of messages currently stored for the channel

delivered_messages

The number of messages dequeued by the channel

received_volume

Volume of messages enqueued to the channel as measured in IMTA kilobytes

submitted_volume

Volume of messages enqueued by the channel as measured in IMTA kilobytes

stored_volume

Volume of messages currently stored for the channel as measured in IMTA kilobytes

delivered_volume

Volume of messages dequeued by the channel as measured in IMTA kilobytes

received_recipients

Number of recipients specified in all messages enqueued to the channel

submitted_recipients

Number of recipients specified in all messages enqueued by the channel

stored_recipients

Number of recipients specified in all messages currently stored for the channel

delivered_recipients

Number of recipients specified in all messages dequeued by the channel

next

Pointer to the next list entry of channel counters.

rejected_messages

Count of messages which, upon trying to be enqueued to the channel, were rejected.

failed_messages

Count of messages enqueued to the channel which, when processed, failed to be delivered for one or more recipients owing to permanent errors of some sort (for example, invalid recipient address).

attempted_messages

Count of messages enqueued to the channel whose delivery has been attempted.

rejected_volume

Cumulative volume of messages which, upon trying to be enqueued to the channel, were rejected.

failed_volume

Cumulative volume of messages enqueued to the channel which, when processed, failed to be delivered for one or more recipients owing to permanent errors of some sort (for example, invalid recipient address).

attempted_volume

Cumulative volume of messages enqueued to the channel whose delivery has been attempted.

rejected_recipients

Cumulative count of recipient addresses which, upon trying to be enqueued to the channel, were rejected.

failed_recipients

Cumulative count of recipients enqueued to the channel which, when processed, failed to be delivered owing to permanent errors of some sort (e.g., invalid recipient address).

attempted_recipients

Cumulative count of recipients enqueued to the channel whose delivery has been attempted.

delivered_first_messages

Cumulative count of messages enqueued to the channel which were successfully delivered (or returned as undeliverable) on their first processing attempt.

delivered_first_queue_count

Cumulative count of first message delivery attempts made by the channel. When this value is less then received messages, it means that delivery has not yet been attempted for all received messages. This is not unusual: this value is expected to lag behind received messages.

delivered_first_queue_time

Cumulative count of elapsed seconds between when a message is enqueued and when processing of its first delivery attempt completes. The result of dividing delivered first queue time by delivered first queue count gives the average amount of time in seconds spent by a message in the processing queues as it awaits its initial delivery attempt.

delivered-queue_count

Cumulative count of message delivery attempts made by the channel.

delivered_queue_time

Cumulative count of elapsed seconds between when a message is enqueued and when it is finally removed from the channel queue. The result of dividing delivered queue time by delivered queue count gives the average amount of time in seconds spent by a message in the processing queues.

It is important to note that these counters generally need to be looked at over time noting the minimum values seen. The minimums may actually be negative for some channels. Such a negative value merely means that there were messages queued for a channel at the time that its counters were zeroed (e.g., the cluster-wide database of counters created). When those messages were dequeued, the associated counters for the channel were decremented therefore leading to a negative minimum. For such a counter, the correct "absolute" value is the current value less the minimum value that counter has ever held since being initialized.

**TABLE 12-6** Channel Counters
Field name	Description
`received_messages`	The number of messages enqueued to the channel
`submitted_messages`	The number of messages enqueued by the channel
`stored_messages`	The total number of messages currently stored for the channel
`delivered_messages`	The number of messages dequeued by the channel
`received_volume`	Volume of messages enqueued to the channel as measured in IMTA kilobytes
`submitted_volume`	Volume of messages enqueued by the channel as measured in IMTA kilobytes
`stored_volume`	Volume of messages currently stored for the channel as measured in IMTA kilobytes
`delivered_volume`	Volume of messages dequeued by the channel as measured in IMTA kilobytes
`received_recipients`	Number of recipients specified in all messages enqueued to the channel
`submitted_recipients`	Number of recipients specified in all messages enqueued by the channel
`stored_recipients`	Number of recipients specified in all messages currently stored for the channel
`delivered_recipients`	Number of recipients specified in all messages dequeued by the channel
`next`	Pointer to the next list entry of channel counters.
`rejected_messages`	Count of messages which, upon trying to be enqueued to the channel, were rejected.
`failed_messages`	Count of messages enqueued to the channel which, when processed, failed to be delivered for one or more recipients owing to permanent errors of some sort (for example, invalid recipient address).
`attempted_messages`	Count of messages enqueued to the channel whose delivery has been attempted.
`rejected_volume`	Cumulative volume of messages which, upon trying to be enqueued to the channel, were rejected.
`failed_volume`	Cumulative volume of messages enqueued to the channel which, when processed, failed to be delivered for one or more recipients owing to permanent errors of some sort (for example, invalid recipient address).
`attempted_volume`	Cumulative volume of messages enqueued to the channel whose delivery has been attempted.
`rejected_recipients`	Cumulative count of recipient addresses which, upon trying to be enqueued to the channel, were rejected.
`failed_recipients`	Cumulative count of recipients enqueued to the channel which, when processed, failed to be delivered owing to permanent errors of some sort (e.g., invalid recipient address).
`attempted_recipients`	Cumulative count of recipients enqueued to the channel whose delivery has been attempted.
`delivered_first_messages`	Cumulative count of messages enqueued to the channel which were successfully delivered (or returned as undeliverable) on their first processing attempt.
`delivered_first_queue_count`	Cumulative count of first message delivery attempts made by the channel. When this value is less then received messages, it means that delivery has not yet been attempted for all received messages. This is not unusual: this value is expected to lag behind received messages.
`delivered_first_queue_time`	Cumulative count of elapsed seconds between when a message is enqueued and when processing of its first delivery attempt completes. The result of dividing delivered first queue time by delivered first queue count gives the average amount of time in seconds spent by a message in the processing queues as it awaits its initial delivery attempt.
`delivered-queue_count`	Cumulative count of message delivery attempts made by the channel.
`delivered_queue_time`	Cumulative count of elapsed seconds between when a message is enqueued and when it is finally removed from the channel queue. The result of dividing delivered queue time by delivered queue count gives the average amount of time in seconds spent by a message in the processing queues.

Purpose and Use of Counters

The IMTA channel counters are intended for indicating the trend and health of your e-mail system. The IMTA channel counters are not designed nor intended to provide an accurate accounting of message traffic. For precise accounting, see "Logging Messages Passing Through the IMTA" on page 254. The IMTA's channel counters are implemented using the lightest weight mechanisms available, namely a shared memory section on each system.

Example of counters interpretation

The example below shows a sample excerpt of counters data, as might be seen using the imta counters -show utility.

Channel Messages Recipients Blocks ------- -------- ---------- ------ directory Received 6523 9042 69694 (1) Stored 4 4 149 (2) Delivered 6519 9038 69545 (3) Submitted 6811 9019 71123 (4)

In this example:

	1.	The "Received" value represents a count of messages coming from any channel to the channel named directory. That is, messages enqueued to the directory channel by any other channel.
	2.	The "Stored" value represents a count of messages stored in the channel queue to be delivered. This will generally correspond to the number of entries currently stored for the channel in the IMTA queue cache database.
	3.	The "Delivered" value represents a count of messages which have been processed (dequeued) by the channel directory; a dequeue operation may either correspond to a successful "delivery" (that is, an enqueue to another channel), or to a dequeue due to the message being returned to the sender. This will generally correspond to the number Received minus the number Stored.
	4.	The "Submitted" value represents a count of messages which have been enqueued from the channel directory to any other channel.

Note that in this example, the number of messages Submitted is greater than the number delivered. This is often the case, since each message the channel dequeues (delivers) will result in at least one new message enqueued (submitted) but possibly more than one. For example, if a message has two recipients reached via different channels, then two enqueues will be required. Or if a message bounces, a copy will go back to the sender and another copy may be sent to the postmaster. Usually that will be two submissions (unless both are reached through the same channel).

SIMS Monitoring Utilities

SIMS Monitoring Plan

261

SIMS Monitoring Examples

263

SIMS Monitoring Plan	261
SIMS Monitoring Examples	263

SIMS provides a set of command line monitoring tools (see TABLE 12-7) that provide detailed information about the status and performance of a SIMS server. For additional information refer to the man pages.

TABLE 12-7 SIMS Monitoring Commands
Command
Component Monitored
Description

immonitor-queue

IMTA

Displays the sizes of IMTA channel queues, domains for which there have been delivery problems.

immonitor-access

MSMA

Determines the responsiveness of the POP, IMAP and LDAP services and mail delivery round trip time.

immonitor-system

SIMS System

Monitors the status of the system resources: swap space, disk utilization and network status.

immonitor-users

IMTA

Provides a summary of the top mail submitter, useful for SPAM detection.

immonitor-reenqueue

IMTA

Allows messages to be moved from one channel to another to perform queue load balancing or to delay the delivery of messages to domains that are temporarily inaccessible

imta-counters

IMTA channel counters

Provides information about IMTA channel counters. Refer to the imta-counters man page and logging section of the SIMS Reference Manual.

imta dispatcher_stats_tty

SMTP

Outputs the current open incoming SMTP connections and by which SMTP server process they are handled. (no man page)

**TABLE 12-7** SIMS Monitoring Commands
Command	Component Monitored	Description
immonitor-queue	IMTA	Displays the sizes of IMTA channel queues, domains for which there have been delivery problems.
immonitor-access	MSMA	Determines the responsiveness of the POP, IMAP and LDAP services and mail delivery round trip time.
immonitor-system	SIMS System	Monitors the status of the system resources: swap space, disk utilization and network status.
immonitor-users	IMTA	Provides a summary of the top mail submitter, useful for SPAM detection.
immonitor-reenqueue	IMTA	Allows messages to be moved from one channel to another to perform queue load balancing or to delay the delivery of messages to domains that are temporarily inaccessible
imta-counters	IMTA channel counters	Provides information about IMTA channel counters. Refer to the imta-counters man page and logging section of the SIMS Reference Manual.
imta dispatcher_stats_tty	SMTP	Outputs the current open incoming SMTP connections and by which SMTP server process they are handled. (no man page)

SIMS Monitoring Plan

The key attributes of a SIMS mail server that determine server performance are:

Message Transport Queue Size. Messages passing through a SIMS server are stored in the IMTA message queues. The performance of the IMTA depends on the size of the queue. An extremely large Queue size can slow down the IMTA.

Accessibility and responsiveness of core SIMS services (SMTP, IMAP, POP, LDAP). A large number of connections (POP, IMAP, SMTP. LDAP) may indicate that connections are being abnormally terminated, either because of a failure of a SIMS service or a deliberate denial of service attack. Either of these can lead to reduced performance of the SIMS server.

System Resource Utilization. Normal SIMS operation contributes to system load and resource consumption, such as cpu, disk and virtual memory resources. To keep SIMS performing at an optimal level it is important to monitor system resources.

To ensure high availability of the e-mail service, the key attributes of the system should be continually monitored. This will allow the system administrator to be alerted automatically whenever a potential problem develops. We recommend monitoring at least the attributes listed in TABLE 12-8.

Each monitoring utility allows thresholds to be set which define the upper bounds on the resource utilization, for example, the maximum amount of disk space used by the message store directory. When the threshold is exceeded, the utility can generate an alert or initiate the notification of a system operator via a pager (must be e-mail accessible). Establishing a reasonable set of thresholds depends on the performance characteristics of the system on which the component is running (e.g., low-end or high-end server) and the usage profile of your e-mail service. TABLE 12-8 shows some example default thresholds settings that we recommend monitoring.

TABLE 12-8 SIMS Monitoring Methods and Example Thresholds
Resource
monitoring method
Threshold

IMTA queue size1

immonitor-queue

500 messages

IMTA HELD messages2

immonitor-queue

1 messages

POP, IMAP, SMTP, LDAP connect time

immonitor-access

5 seconds

# IMAP, POP connections3

immonitor-system

200

virtual memory

immonitor-system

60%

disk space: message store directory

immonitor-system

75%

disk space: imta channel queue directory

immonitor-system

50%

1IMTA queue size - This threshold will depend on the function of the IMTA. If the IMTA is responsible for routing outbound messages into the Internet, 500 is too low. 1000-2000 messages in the mail queue due to DNS problems or remote IMTA problems that are beyond the control of the local ISP are not uncommon. This number will also likely vary based on size of the ISP. If you have more customers, the chances that a mail to an address that is having problems will increase.

For internal SMTP routers, or IMTA's whose only function it is to deliver to the local message store, 500 would be too high. Except for a broadcast delivery, if the queue gets to 500 on this type of IMTA there is a problem that needs to be diagnosed.

2IMTA HELD messages - Although any number of messages may be held in the .HELD file, you should investigate even a single message caught in this file.

3Number of IMAP, POP connections - The IMAP and POP are going to be different since IMAP is a persistent connection and higher than POP.

An alternative approach to these thresholds would be to run the monitoring commands at a time of day when the service is heavily loaded, and over a period of an hour, measure the values of these resources every 5 minutes. You can then set thresholds to the highest value plus 10%.

It is likely that you will need to tune these thresholds. You can tell if they are set too low, because the thresholds will be frequently exceeded even though the service is performing satisfactorily. On the other hand, of the service fails and you were not notified in advance, the threshold may be set too high. For example if you set the disk space threshold for the message store to 95% you might not be notified in time before the disk space is exhausted.

**TABLE 12-8** SIMS Monitoring Methods and Example Thresholds
Resource	monitoring method	Threshold
IMTA queue size1	`immonitor-queue`	500 messages
IMTA HELD messages2	`immonitor-queue`	1 messages
POP, IMAP, SMTP, LDAP connect time	`immonitor-access`	5 seconds
# IMAP, POP connections3	`immonitor-system`	200
virtual memory	`immonitor-system`	60%
disk space: message store directory	`immonitor-system`	75%
disk space: imta channel queue directory	`immonitor-system`	50%

SIMS Monitoring Examples

The monitoring utilities take snapshots of the status of the SIMS components. To provide ongoing coverage we recommend that the utilities be run periodically using a facility such as cron(1M).

The monitoring utilities use a threshold based alerting system. The administrator needs to set some threshold values and take some corrective action when they are exceeded.

For example, if the sizes of any of the IMTA queues are unusually large or there are a large number of messages in the .HELD queue, this can indicate message delivery or addressing problems. The monitoring utilities can be configured to automatically detect these scenarios and send alerts.

Scenario 1: SIMS and LDAP Server Configured on a Single Machine

In this case the various SIMS components (IMTA, message access and store, directory server, admin server) are installed on the same system. So all the monitoring utilities are scheduled to run on a single system. Also let us assume that inetmail (the SIMS Postmaster) is the user to be notified when any threshold is exceeded. Note that in this example the administrator has set up a special account named 'test_user' with a password of `passwd' for the purposes of monitoring.

Monitored Value

Threshold

IMTA queue size

500 messages

HELD messages

100 messages

Message store size

2 gigabytes

"/" directory size

200 megabytes

SMTP connections

2000

LDAP response time

5 minutes

Mail sending time

2 minutes

POP/IMAP retrieval time

5 minutes

Schedule interval (cron)

30 minutes

Total round-trip time

5 minutes

Monitored Value	Threshold
IMTA queue size	500 messages
HELD messages	100 messages
Message store size	2 gigabytes
"/" directory size	200 megabytes
SMTP connections	2000
LDAP response time	5 minutes
Mail sending time	2 minutes
POP/IMAP retrieval time	5 minutes
Schedule interval (cron)	30 minutes
Total round-trip time	5 minutes

Scenario 1 Commands:

The alert mechanism sends e-mail alerts to inetmail and to all SIMS administrators, so if the SIMS server is having mail delivery problems the alerts may not be received. Use the -A option to specify an alternate SMTP server for the alert delivery channel. For optimal monitoring, schedule the utilities to run at an interval of 30 minutes (see the crontab(1) man page). Assuming that /opt/SUNWmail/sbin is in the PATH, the corresponding crontab entries for monitoring utilities using the thresholds listed above are as follows:

To alert inetmail when the IMTA queue size exceeds 500 messages or when the number of HELD messages exceeds 100:

30 * * * * immonitor queue -s 500 -H 100 -r inetmail > /dev/null 2>&1

To alert inetmail when utilization of /var/opt/SUNWmail/ims exceeds 2 gigabytes or when the utilization of "/" exceeds 200 Megabytes OR when the number of established SMTP connections exceeds 2000. The back slash (\) before the "%" is required since "%" is a keyword for crontab.

30 * * * * immonitor system -f /var/opt/SUNWmail/ims=2g -f /=200m -m 90\% -p smtp=2000 -r inetmail > /dev/null 2>&1

Alerts inetmail when the LDAP response time exceeds 5 minutes OR when mail sending time exceeds 2 minutes or when the mail retrieval time with POP/IMAP exceeds 5 minutes OR when the total round-trip time exceeds 5 minutes, use:

30 * * * * immonitor access -u test_user -w passwd -L localhost:=300 -S localhost:=120 -I localhost:=300 -P localhost:=300 -D 300 -r inetmail > /dev/null 2>&1

Scenario 2: The SIMS and LDAP Server on Different Machines

In this scenario a pure proxy server (proxy-svr) acts as a proxy for IMAP and POP requests but not SMTP. The back-end server (msma-svr) contains the message store and supports SMTP in addition to IMAP and POP protocols. Finally a dedicated server (dir-svr) provides directory services. For additional information on proxy servers, see Appendix A, "Configuring SIMS as a Proxy Message Access Server.

FIGURE 12-1 Monitoring Scenario 2 Environment.

Since each server isn't running the full set of SIMS services, we need to tailor the monitoring configuration on each server. In this case the services we need to monitor in the various systems are:

	1.	`proxy-svr:` IMAP/POP access and response time, resource utilization (cpu, disk).
	2.	`msma-svr:` IMAP/POP/SMTP access and response time, message round-trip delivery time, message store capacity and cpu utilization.
	3.	`dir-svr:` LDAP access and response time, directory capacity and CPU utilization.

We will assume the threshold values listed in the table below. As in the previous example, the alert mechanism sends e-mail alerts to inetmail and to all SIMS administrators, so if the SIMS server is having mail delivery problems the alerts may not be received. Use the -A option to specify an alternate SMTP server for the alert delivery channel. For optimal monitoring, schedule the utilities to run at an interval of 30 minutes (see the crontab(1)).

Thresholds and crontab entries for proxy-svr

Monitored Value

Threshold

1) IMAP: login, select inbox, close time

120 seconds

2) POP: login, stat inbox, quit time

150 seconds

3) Number of IMAP connections

500

4) Number of POP connections

250

5) Solaris: virtual memory usage

90%

6) Solaris: /var/opt used capacity

2 gigabytes

Monitored Value	Threshold
1) IMAP: login, select inbox, close time	120 seconds
2) POP: login, stat inbox, quit time	150 seconds
3) Number of IMAP connections	500
4) Number of POP connections	250
5) Solaris: virtual memory usage	90%
6) Solaris: /var/opt used capacity	2 gigabytes

The access times for IMAP and POP (items 1 and 2) can be checked with a single command:

30 * * * * immonitor access -u test_user -w passwd -I proxy-svr:=120 -P proxy-svr:=150 -r inetmail > /dev/null 2>&1

The number of established IMAP and POP connections (items 3 and 4) can be checked with the command:

40 * * * * immonitor system -p IMAP=500 -p POP3=250 -r inetmail > /dev/null 2>&1

Virtual memory usage (5) and the utilization of the /var/opt partition (6) can be measured with the following entry:

0 * * * * immonitor system -f /var/opt=2g -m 20g -m 90\‰ -r inetmail > /dev/null 2>&1

Thresholds and crontab entries for msma-svr

Monitored Value

Threshold

1) SMTP send message tim

60 seconds

2) IMAP login, select inbox, delete time

90 seconds

3) POP login, retrieve message, close time

120 seconds

4) Number of IMAP connections

500

5) Number of POP connections

250

6) Number of SMTP connections

100

7) Message store size

2 gigabytes

8) Solaris: virtual memory usage

90%

crontab entries: The message delivery time comprises the sum of the times to deliver a message using SMTP and then retrieve the message using IMAP or POP. So, items (2) and (3) above are measured in conjunction with the SMTP message send time (1).

SMTP / IMAP:

20 * * * * immonitor access -u test_user -w passwd -S msma-svr:=60 -I msma-svr:=90 -k "Testing" -r inetmail > /dev/null 2>&1

SMTP / POP:

40 * * * * immonitor access -u test_user -w passwd -S msma-svr:=60 -P msma-svr:=120 -k "Testing" -r inetmail > /dev/null 2>&1

If the IMAP or POP threshold is exceeded, messages sent using the -S option will not be deleted, so it is necessary to delete these messages periodically using another crontab entry:

0 * * * * immonitor access -u test_user -w passwd -I msma-svr:=600 -z -k "Testing" > /dev/null 2>&1

Established connections for POP, IMAP and SMTP can be checked with one command:

10 * * * * immonitor system -p POP3=200 -p IMAP=500 -p SMTP=100 -r inetmail > /dev/null 2>&1

Virtual memory usage and the size of the message store (/var/opt/SUNWmail/ims) can be measured with the following crontab entry:

10 * * * * immonitor system -f /var/opt/SUNWmail/ims=2g -m 90\‰ -r inetmail > /dev/null 2>&1

Thresholds and crontab entries for dir-svr

Monitored Value

Threshold

1) LDAP bind, search time

10 seconds

2) Directory size

200 megabytes

3) Solaris: virtual memory usage

90%

LDAP:

20 * * * * immonitor access -u test_user -L dir-svr:=10 -r inetmail > /dev/null 2>&1

Virtual memory usage (3) and the size of the directory (2)
(/var/opt/SUNWconn/ldap) can be measured with the following crontab entry:

10 * * * * immonitor system -f /var/opt/SUNWconn/ldap=200m -m 90\‰ -r inetmail > /dev/null 2>&1

SNMP Monitoring

The Sun Directory Service is SNMP enabled. It supports the X.500 Directory Monitoring MIB (rfc1567). It can therefore be monitored by any SNMP enabled management tool. Refer to Netscape Directory Service documentation for information on Netscape SNMP support.

To enable SNMP monitoring of the Sun Directory Service you should first enable the SNMP agent dsnmpserv(1m). The following describes how to enable this service:

Configure the SNMP configuration file /etc/opt/SUNWconn/ldap/current/dsnmpserv.conf by running the following command:

/opt/SUNWconn/sbin/dsnmpcfg

Start the agent:

/etc/init.d/init.dnsmpserv start

Monitored Value	Threshold
1) SMTP send message tim	60 seconds
2) IMAP login, select inbox, delete time	90 seconds
3) POP login, retrieve message, close time	120 seconds
4) Number of IMAP connections	500
5) Number of POP connections	250
6) Number of SMTP connections	100
7) Message store size	2 gigabytes
8) Solaris: virtual memory usage	90%

Monitored Value	Threshold
1) LDAP bind, search time	10 seconds
2) Directory size	200 megabytes
3) Solaris: virtual memory usage	90%

CHAPTER 12

SIMS Monitoring and Logging Logging Facilities 250 Logging and Tracking Messages in the IMTA 253 SIMS Monitoring Utilities 261 SNMP Monitoring 268

Logging Facilities Message Store/Message Access Log Files 250 Sun Directory Service Log Files 251 IMTA Log Files 252