67 Monitoring Disk Space

This chapter describes configuration options for monitoring disk and partition usage and for generating warnings about disk space availability.

Overview

Inadequate disk space, or inadequate space within a disk partition, is among the most common causes of mail server problems and failures. Typical causes of inadequate space are:

  • Message store quotas are not enforced and the message store outgrows the disk space available for a partition.

  • MTA message queues grow too long.

  • Log files are not adequately monitored and kept within defined limits. (Note that there are several kinds of log files, such as LDAP, MTA, and Message Access, and that each kind can be stored on a different disk.)

Symptoms of Insufficient Disk Space

Symptoms of insufficient disk space include:

  • MTA queues overflow and reject SMTP connections.

  • Messages remain in the ims-master queue and are not delivered to the message store.

  • Log files overflow.

If a message store partition fills up, message access daemons can fail and message store data can be corrupted. Message store maintenance utilities such as imexpire and reconstruct can repair the damage and reduce disk usage. However, these utilities themselves require additional disk space, and repairing a partition that has filled an entire disk can cause downtime.

Monitoring Disk Space

Depending on the system configuration, you may need to monitor several disks and partitions. For example, MTA queues may reside on one disk or partition, message stores on another, and log files on yet another. Each of these locations requires monitoring, and the monitoring method may differ for each.
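
As a minimal sketch of checking several locations at once, the following Python snippet reports the percentage of space in use on the file system that contains each path. It is a generic operating-system check, not a Messaging Server facility, and the function names and the example path are assumptions; substitute the actual locations of your MTA queues, message store partitions, and log directories.

```python
import shutil


def percent_used(path):
    """Return the percentage of the file system containing `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return 100.0 * usage.used / usage.total


def report(paths):
    """Map each path to the percent-used figure of its file system."""
    return {path: round(percent_used(path), 1) for path in paths}


if __name__ == "__main__":
    # Hypothetical example: "/" stands in for your queue, store, and log locations.
    for path, used in report(["/"]).items():
        print(f"{path}: {used}% used")
```

Running such a check from a scheduler against every relevant disk or partition gives a single view of all the locations, even when each is monitored by a different mechanism.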

Oracle Communications Messaging Server provides specific methods for monitoring message store disk usage and preventing partitions from filling up all available disk space.

You can take the following steps to monitor the message store's use of disk space:

  • Set options to monitor message store disk usage

  • Lock message store partitions when a disk-usage threshold is reached

Monitoring the Message Store

You can monitor message store disk usage by configuring the following attributes with the msconfig utility:

  • alarm.system:<alarmtype>.statinterval

    Specifies the length of time, in seconds, between disk availability checks. For example, to check disk space every 600 seconds, enter the following command:

    msconfig set alarm.system:diskavail.statinterval 600
    
  • alarm.system:<alarmtype>.threshold

    Specifies the percentage of disk space that must remain available; if availability falls below this value, a warning is generated. For example, it is recommended that disk-space usage not exceed 75%; correspondingly, the following command generates a warning whenever the amount of disk space available falls below 25%:

    msconfig set alarm.system:diskavail.threshold 25
    
  • alarm.system:<alarmtype>.warninginterval

    Specifies the interval, in hours, between repeated disk availability alarms. For example, the following command sets an interval of one hour between one disk availability warning and the next:

    msconfig set alarm.system:diskavail.warninginterval 1
    
  • alarm.system:<alarmtype>.description

    Specifies a description of the disk availability alarm. For example, the following command sets the description for a disk availability alarm on a message-queue partition:

    msconfig set alarm.system:diskavail.description "Percentage message-queue partition diskspace available"
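
To illustrate how the three numeric options relate, here is a minimal Python sketch of the behavior described above: a check runs every statinterval seconds, a warning is raised when available space falls below threshold percent, and repeated warnings are spaced at least warninginterval hours apart. The function and its parameter names are hypothetical; the sketch models only the documented semantics, not the server's actual implementation.

```python
def should_warn(percent_available, threshold, now, last_warning, warninginterval_hours):
    """Decide whether a disk availability warning is due.

    `now` and `last_warning` are timestamps in seconds; `last_warning` is
    None if no warning has been sent yet.
    """
    if percent_available >= threshold:
        return False  # enough space available, no alarm condition
    if last_warning is None:
        return True  # first warning for this condition
    # Repeat the warning only after the configured interval has elapsed.
    return now - last_warning >= warninginterval_hours * 3600


# Values from the examples above: statinterval 600, threshold 25, warninginterval 1.
STATINTERVAL = 600
last = None
sent_at = []
for now in range(0, 2 * 3600 + 1, STATINTERVAL):  # simulate two hours of checks
    if should_warn(20, 25, now, last, 1):  # assume 20% available throughout
        sent_at.append(now)
        last = now
print(sent_at)  # [0, 3600, 7200]
```

In this simulation the disk is checked thirteen times, but only three warnings are produced, because the warning interval suppresses repeats within the same hour.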
    

Monitoring Message Store Partitions

By default, partition monitoring is in effect so that when a message-store partition uses more than a specified percentage of available disk space, the partition is locked and any incoming messages are held in the MTA message queue. Two msconfig options control partition monitoring:

  • checkdiskusage

    checkdiskusage enables partition monitoring. It takes a boolean value; the default is 1 (monitoring is enabled).

  • diskusagethreshold

    diskusagethreshold specifies a disk-usage threshold beyond which the partition is locked. It takes an integer value from 1 to 99; the default value is 99.

As a partition approaches the threshold specified in diskusagethreshold, the message-store daemon checks the partition with increasing frequency, ranging from once every 100 minutes to once every minute. If disk usage goes higher than the threshold specified in diskusagethreshold, the message-store daemon:

  • Locks the partition; incoming messages are held in the MTA message queue and are not delivered to mailboxes in the message store partition until it is unlocked.

  • Logs a message to the default log file.

  • Sends an email notification to the postmaster. (You can change the recipient of the notification by setting the msconfig alarm.noticercpt option.)
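
The locking rule above can be sketched in a few lines of Python. The function name and return values are hypothetical; the defaults and the valid range come from the option descriptions in this section, and the sketch does not reproduce the daemon's actual logic.

```python
def partition_action(percent_used, diskusagethreshold=99, checkdiskusage=True):
    """Return "lock" if usage exceeds the threshold, otherwise "deliver"."""
    if not 1 <= diskusagethreshold <= 99:
        raise ValueError("diskusagethreshold must be an integer from 1 to 99")
    if checkdiskusage and percent_used > diskusagethreshold:
        return "lock"  # incoming messages are held in the MTA message queue
    return "deliver"


print(partition_action(99.5))      # lock
print(partition_action(90, 85))    # lock
print(partition_action(80, 85))    # deliver
```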

In setting the diskusagethreshold option, specify a usage percentage that is low enough to allow time for repartitioning or assigning more disk space to the local message store. For example, if a partition fills up disk space at a rate of 2 percent per hour and it takes an hour to allocate additional disk space, set the disk-usage threshold to a value lower than 98 percent.
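
The arithmetic in this example can be sketched as follows. The function name and parameters are hypothetical; the function subtracts the space consumed during the response time from 100 percent and clamps the result to the option's valid range of 1 to 99. The threshold you configure should be lower than the value it returns.

```python
def threshold_upper_bound(fill_rate_percent_per_hour, response_hours):
    """Return the usage percentage from which the partition would become full
    by the time more disk space is available; set diskusagethreshold below it."""
    consumed = fill_rate_percent_per_hour * response_hours
    bound = 100 - consumed
    # diskusagethreshold accepts integers from 1 to 99.
    return max(1, min(99, bound))


print(threshold_upper_bound(2, 1))  # 98
```

A slower response time or a faster fill rate lowers the bound accordingly; for instance, at the same 2 percent per hour with a four-hour response time, the bound drops to 92.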