17
Monitoring and Troubleshooting

eMail Server provides several tools and reports to assist you in monitoring and troubleshooting your messaging system. Using the Administration Tool or OOMGR, you can monitor the operation of your eMail Server system. These tools offer a wide variety of tests that you can run automatically to monitor message flow and database space usage and report any system problems.

This chapter describes how to use those tools and generate reports.

Using Server Process Logs

The eMail Server server process logs provide a continuously running account of system operations and events. They contain entries for all normal operations and for all errors that occur. Log files are useful for monitoring system performance, but unlike monitor reports, the information is not sent to a user account, so you must check the log files periodically to identify problems.

Finding the Log Files

All log files are located in $ORACLE_HOME/office/log/<node_sid>.

The log filename format is <hostname_server>_<process_name><instance_number>.log, where hostname_server is the name of the host computer for the database, process_name is the process type, and instance_number is the number of the process instance you're checking.

Following are examples of the process log file names for instance 1 of each process. The database has the SID acme and is located on the acmehost computer.

Process Type	Log Filename
collector (previously called Garbage Collector)	acme_acmehost_collector01.log
guardian	acme_acmehost_guardian01.log
monitor	acme_acmehost_monitor01.log
postman	acme_acmehost_postman01.log
replicator	acme_acmehost_replicator01.log
statistics	acme_acmehost_statistics01.log
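
As a rough illustration of the naming pattern in the examples above, the following Python sketch assembles a log filename from its parts. The function name `log_filename` is hypothetical, and the two-digit, zero-padded instance number is an assumption inferred from the examples (such as `collector01`), not a documented rule.

```python
def log_filename(sid: str, host: str, process: str, instance: int) -> str:
    """Build a process log filename following the pattern of the examples above.

    Assumption: the instance number is zero-padded to two digits, as in
    acme_acmehost_collector01.log.
    """
    return f"{sid}_{host}_{process}{instance:02d}.log"


if __name__ == "__main__":
    # Print the expected filename for instance 1 of each process type.
    for process in ("collector", "guardian", "monitor",
                    "postman", "replicator", "statistics"):
        print(log_filename("acme", "acmehost", process, 1))
```

A helper like this can be convenient when a script needs to check the same process log on several hosts.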

See Also:
The Oracle eMail Server Installation Guide for more information about the tnsnames.ora file

Setting the Log Level

Most processes have parameters that specify the level of logging you want to see in the log files. eMail Server provides standards for log levels, but keep in mind that these standards are applied to the processes in different ways. For example, some processes may require only the first two log levels, or two processes may display different types of information for log level 3 (medium information).

Log Level	Description
`0`	No logging
`1`	Error messages only
`2`	Minimum information and warnings
`3`	Medium information
`4`	Maximum information
`5`	Debugging information

See Also:
Chapter 11, "Process Parameter Reference", for more information about setting the log level parameter for a specific process

Note:
When setting log levels, remember that error messages and other types of information are appended to the log file for as long as a process is running. Log files can therefore grow quite large and become difficult to manage. You may choose to archive the log files periodically.
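
To decide which log files are due for archiving, a small script can list the files that have grown past a size limit. The Python sketch below only reports candidates and does not move or truncate anything; the function name `oversized_logs` and the size threshold are hypothetical choices, and the directory you pass in is assumed to be your log directory.

```python
from pathlib import Path


def oversized_logs(log_dir: str, max_bytes: int) -> list[str]:
    """Return the names of .log files in log_dir that are larger than max_bytes."""
    directory = Path(log_dir)
    return sorted(
        path.name
        for path in directory.glob("*.log")
        if path.is_file() and path.stat().st_size > max_bytes
    )
```

For example, `oversized_logs("/path/to/logs", 50 * 1024 * 1024)` would list the log files larger than 50 MB in that directory.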

Reading the Log Files

eMail Server provides standards for reporting messages in log files so that you can easily find and interpret the information you need to monitor your system effectively.

The standards are different for different types of messages, but all messages contain a time stamp and type code. The type code indicates to which log level the message belongs. You can use this information to determine whether the log level for a parameter is displaying the type of information you want to see. The type codes are as follows:

Type Code	Log Level	Description
`ERR`	`1`	Error messages only
`INF`	`2`	Minimum information and warnings
`DIAG`	`3` and `4`	Medium or Maximum information
`DBG`	`5`	Debugging information

For example, if the log level for a postman process is set to 4, the log file will display messages containing the ERR, INF, and DIAG codes.
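
The relationship between log levels and type codes can be sketched in a few lines of Python. This is an illustration only: the names `TYPE_CODE_LEVELS` and `codes_for_level` are hypothetical, and the sketch assumes that levels are cumulative, as the postman example suggests. Individual processes may apply the standards differently.

```python
# Type codes and the log levels they belong to, taken from the table above.
TYPE_CODE_LEVELS = {
    "ERR": (1,),
    "INF": (2,),
    "DIAG": (3, 4),
    "DBG": (5,),
}


def codes_for_level(log_level: int) -> list[str]:
    """Return the type codes expected in a log written at the given level.

    Assumption: levels are cumulative, so a code appears once the configured
    level reaches the lowest level that the code belongs to.
    """
    if not 0 <= log_level <= 5:
        raise ValueError("log level must be between 0 and 5")
    return [
        code
        for code, levels in TYPE_CODE_LEVELS.items()
        if min(levels) <= log_level
    ]
```

Calling `codes_for_level(4)` yields the ERR, INF, and DIAG codes, matching the postman example.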

Normal Operation Messages

Log messages reporting normal operations contain the date, time, type code, and description of the log entry.

For example:

11/16 18:13:23 INF: Guardian process started.

Error Messages

Error messages that appear in the log files contain the date, time, type code, component ID, error number, and description of the log entry.

For example:

11/16 18:13:23 ERR ORA-942: Table or view does not exist

To display a cause and action for this error message, you can enter the following command at the command line (make sure ORACLE_HOME is set correctly first):

$ oerr ora 942

The error number (in this example, ORA-942) corresponds to the errors listed in Chapter 19, "Error Codes and Messages". The component ID indicates which server process experienced the error.

User Action Messages

Log messages that reflect user actions for the protocol server processes, such as the POP3SRV process and the IMAP4SRV process, contain the date, time, type code, user ID, thread ID, and description of the log entry.

For example:

11/16 18:13:23 INF jdoe.10: Login succeeded
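
Because every message starts with a date, a time, and a type code, log lines can be split into fields with a short script. The Python sketch below is based only on the three example lines in this section; the names `LogEntry`, `parse_line`, and `errors_only` are hypothetical, and real log files may contain lines in other layouts, which the sketch skips.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    date: str
    time: str
    type_code: str
    text: str


# Date, time, and type code, then an optional colon and the rest of the line.
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<code>ERR|INF|DIAG|DBG):?\s*(?P<text>.*)$"
)


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the expected layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return LogEntry(match["date"], match["time"], match["code"], match["text"])


def errors_only(lines):
    """Yield the parsed entries whose type code is ERR."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.type_code == "ERR":
            yield entry
```

Passing the lines of a log file to `errors_only` gives a quick way to pull out the entries that need attention without lowering the log level.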

Using the OEM eMail Server Capacity Monitoring Pack

eMail Server provides a monitoring pack that tracks the capacity of the system through Oracle Enterprise Manager (OEM). The monitoring pack displays the length of message queues and number of concurrent IMAP server connections in a series of charts.

Note:
Oracle Enterprise Manager with the Diagnostics Pack must be installed prior to using the eMail Server capacity monitoring pack.

See Also:
Oracle eMail Server Installation Guide for more information.

To use the eMail Server capacity monitoring pack, you must first configure your system.

Shut down the Oracle agent:
```
lsnrctl dbsnmp_stop
```
Start all eMail Server processes:
```
IOFCMGR> startup all;
```
Restart the Oracle agent:
```
lsnrctl dbsnmp_start
```
Start Oracle Enterprise Manager.
From the Oracle Enterprise Manager Console Administrator screen, select Navigator --> Discover Nodes. The Discovery Wizard screen displays.
In Step 1 of the Specify Nodes screen, enter the hostname of the system where eMail Server is installed and select Next. The Progress screen displays.
Verify that the information displayed is correct and select Finish. A confirmation screen displays.
From the Oracle Enterprise Manager Console Administrator screen, expand the Mail Servers folder. Your mail servers should be displayed.
Select a mail server and select Tools --> Diagnostics Pack --> Performance Manager. This launches the Performance Manager application.
From the Oracle Performance Manager screen, expand IMAP or Message Queue.

Note:
The IMAP Connection Chart can only be used with mail servers whose names begin with IMAP_. The Message Queue Length Chart can be used with all other mail servers.

Select IMAP Connection Chart or Message Queue Length.

If you select IMAP Connection Chart, the Selected Data Sources section will display a list of message stores. If you select Message Queue Length, the Selected Data Sources section will display message queues.

Once the Data Sources have been selected, select Show Chart. The selected chart displays.

See Also:
Oracle Enterprise Manager Administrator's Guide for more information on Oracle Enterprise Manager.

Using Monitoring Reports

eMail Server provides a default user account called ORAPOST. This account receives error messages from the monitor reports and statistics tasks, as well as notifications about messages that could not be sent. This is a standard user account, so you can check the messages using the same client software that your users use to check their messages. You can also change the recipient account by modifying certain process parameters.

Monitor reports identify problem areas and suggest ways to avoid possible problems, such as low disk space, before they occur. You can choose from a wide variety of monitoring reports to help you keep your system running smoothly.

You select tests from the list provided, and specify how often to run them. Once the tests are run, the monitor sends a report. Monitor reports are sent as e-mail messages to an account that you specify (the default is ORAPOST). You can save these messages, write them to a file, print them, or do anything with them that you can do with an e-mail message.

See Also:

"Monitoring Messages in the ORAPOST Account" for more information about changing the ORAPOST account

"Running the Monitor Tests and Statistics Tasks" for more information on monitor tests and statistics tasks

Monitoring Messages in the ORAPOST Account

Messages from the monitor and statistics processes are directed to a user account in your directory called ORAPOST. If there are no problems, the monitor sends a "No Problem" report to this account. A No Problem report is a blank message with a subject line that you specify in the goodSubj parameter for the monitor process. If a test discovers a problem or potential problem, the monitor sends a "Problem" report to this account. Problem reports contain information about any problems found, as well as suggestions for fixing the problems. Problem reports consist of subreports that correspond to particular tests.

The ORAPOST account also receives messages from the postman process.

You can change the user account to which this information is sent by modifying the postmaster parameter for the postman process, or the probRecips and noProbRecips parameters for the monitor and statistics processes.

You can view the ORAPOST account by logging on to the system with a messaging client as you would for any other user. You can change the password for the ORAPOST account by changing the password attribute for the ORAPOST directory entry. Refer to "Setting a User's Password" for instructions.

See Also:
Chapter 9, "Managing Processes", for more information about changing parameters

Running the Monitor Tests and Statistics Tasks

You must run the monitor and statistics processes to collect data and run the tests used to create reports. The monitor process checks message flow and database space usage, and the statistics process collects information about delivery time and database space usage that is used by the monitor tests to create reports.

You can start the monitor and statistics processes and run them at configurable intervals, or you can use the following instructions to see immediate results in the Administration Tool.

See Also:
"Starting a Registered Process" for more information about the configurations of the monitor and statistics processes

Steps for Running the Monitor Tests and Statistics Tasks

This task can only be performed through the Administration Tool GUI.

In the Administration Tool GUI

Start the Administration Tool GUI.
In the navigation tree, select Messaging System > Nodes > node_name.
In the menu, choose Message System > Diagnose...
In the Diagnose dialog box, select the node you want to diagnose from the Node list box.
Click the Start Diagnosis button.
Click Yes, then OK in the confirmation boxes.
Wait for the diagnostic tests to complete.
When you see the message "Background command is complete!", click OK.

The results appear in the Diagnose dialog box. The Test Fail? column contains an X for any tests that failed.

See Also:
Chapter 18, "Monitor Test Reference", for more information about the monitor test results

Activating the Monitor Tests

After selecting the tests you want to run, you must activate the tests. Once activated, the tests run automatically whenever the monitor process is running.

See Also:
Chapter 18, "Monitor Test Reference", for a complete list of the available tests

Prerequisites to Activating the Monitor Tests

In the Administration Tool, you must run the statistics tasks and monitor tests once before you can activate a monitor test. Refer to "Running the Monitor Tests and Statistics Tasks" for instructions.

Steps for Activating the Monitor Tests

This task can be performed through either the Administration Tool GUI, or the OOMGR command-line interface.

In the Administration Tool GUI

Start the Administration Tool GUI.
In the navigation tree, select Messaging System > Nodes > node_name.
In the right pane, select the Monitor Tests Results tab.
In the Monitor Tests Results tab, double-click the test that you want to activate.
In the dialog box, select Yes from the Active list box.
Click OK.

In OOMGR

Start OOMGR.
For each test that you want to activate, enter the following command at the OOMGR prompt:
```
IOFCMGR>modify monitor name=<test_name> to active=Y; 
```

Specifying the Tasks for the Statistics Process

You can specify which tasks you want the statistics process to perform and how often these tasks are performed.

Note:
The statistics process only gathers data for the tests that are active. Refer to "Activating the Monitor Tests" for instructions on activating tests.

Steps for Specifying the Tasks for the Statistics Process

This task can only be performed through the OOMGR command-line interface.

In OOMGR

Start OOMGR.

For each task that you want to perform, enter the following command at the OOMGR prompt:

IOFCMGR>modify statistics name=<task_name>
2>to active=Y frequency=<task_performance_interval_in_minutes>;

If you want to cancel a task that is already active, enter the following command at the OOMGR prompt:
```
IOFCMGR>modify statistics name=<task_name> to active=N; 
```
See Also:
"Task Names of the Statistics Process" for more information about the values to enter

If the statistics process was running when you made these changes, refresh the process so that the changes can take effect. Refer to "Refreshing a Process" for instructions.

Task Names of the Statistics Process

The statistics process can perform the following tasks:

Task Name	Description
`segments`	Gathers information for the database space tests, except the `tablespace_full` and `tablespace_frag` tests.
`free_space`	Gathers information for the database space tests, except the `tablespace_full` and `tablespace_frag` tests.
`tablespaces`	Gathers information for the `tablespace_full` and `tablespace_frag` tests.
`queues`	Gathers information for the `queue_status` test.
`delivery`	Gathers information for the message flow tests, except the `queue_status` test.
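
When several tasks need to be configured, the OOMGR commands can be generated rather than typed by hand. The Python sketch below builds the command text shown in "Steps for Specifying the Tasks for the Statistics Process" for the task names listed above. The function name `statistics_command` is hypothetical, and requiring a frequency whenever a task is activated is an assumption made for this example.

```python
from typing import Optional

# Task names listed in the table above.
STATISTICS_TASKS = ("segments", "free_space", "tablespaces", "queues", "delivery")


def statistics_command(task: str, active: bool,
                       frequency_minutes: Optional[int] = None) -> str:
    """Build one OOMGR 'modify statistics' command as a single line of text."""
    if task not in STATISTICS_TASKS:
        raise ValueError(f"unknown statistics task: {task}")
    if not active:
        return f"modify statistics name={task} to active=N;"
    # Assumption for this sketch: an activated task always gets a frequency.
    if frequency_minutes is None or frequency_minutes <= 0:
        raise ValueError("an active task needs a positive frequency in minutes")
    return (f"modify statistics name={task} "
            f"to active=Y frequency={frequency_minutes};")
```

The generated lines can then be entered at the OOMGR prompt. Validating the task name first catches typing mistakes before they reach the server.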

Deactivating a Monitor Test

You can deactivate a test if you no longer want it to run automatically.

See Also:
Chapter 18, "Monitor Test Reference", for a complete list of the available tests

Steps for Deactivating a Monitor Test

This task can be performed through either the Administration Tool GUI, or the OOMGR command-line interface.

In the Administration Tool GUI

Start the Administration Tool GUI.
In the navigation tree, select Messaging System > Nodes > node_name.
In the right pane, select the Monitor Tests Results tab.
In the Monitor Tests Results tab, double-click the test that you want to deactivate.
In the dialog box, select No from the Active list box.
Click OK.

In OOMGR

Determine which test you want to deactivate. To see which tests are currently activated, use SQL*Plus to query the om_mon_test table. The Active column in this table contains a Y if the test is active.
Start OOMGR.

Enter the following command at the OOMGR prompt:

IOFCMGR>modify monitor name=<test_name> to active=N;

Checking All Tablespaces on a Node

In addition to the information available through the database space tests, you can also check all tablespaces on a given node. This procedure displays the bytes used, bytes free, and the maximum free extents for each tablespace.

See Also:
"Database Space Tests" for information on database space tests

Steps for Checking All Tablespaces on a Node

This task can only be performed through the OOMGR command-line interface.

In OOMGR

Start OOMGR.
Enter the following command at the OOMGR prompt:
```
IOFCMGR>display db_space;
```

Checking Space Used by an Individual User

In addition to the information available through the database space tests, you can also check the space used by an individual user.

See Also:
"Database Space Tests" for information on database space tests

Steps for Checking Space Used by an Individual User

This task can only be performed through the OOMGR command-line interface.

In OOMGR

Start OOMGR.
Enter the following command at the OOMGR prompt:
```
IOFCMGR>display quota user=<username>;
```

Monitoring Protocol Server Processes

Protocol server tests probe the listener to retrieve run-time information about the protocol server processes. You can use this information to determine whether eMail Server is configured for optimal performance. For example, if you notice that some of the database connections are not being used regularly, you can lower the minimum number of connections for the protocol server and reduce the memory usage.

See Also:
"Understanding the Probe Results" for information about how to interpret the listener probe results

Prerequisites to Monitoring Protocol Server Processes

Before running the listener probe, determine what kind of information you want to see. The following information is available:

Connections	Displays information about the protocol server process connections with the database
Users	Displays information about the users connected to the protocol server processes
Shared Memory Info	Displays information about how much memory all protocol server processes are using

Steps for Monitoring Protocol Server Processes

This task can be performed through either the Administration Tool GUI, or through a telnet session.

In the Administration Tool GUI

Start the Administration Tool GUI.
In the navigation tree, select Messaging System > Nodes > node_name > Processes > Instances > process_instance > Running Hosts > host_name.
In the right pane, select the tab for the type of information you want to monitor.
In the toolbar, click (Probe).

You can click this button whenever you want to refresh the information in the right pane.

In a Telnet Session

Open a shell tool.

Enter the following command at the shell prompt:

$ telnet <server> <diagport>

The default diagport is 5010.

You should see something like the following:

Trying 111.11.1.111...
Connected to <server>.
Escape character is '^]'.
usage: probe <class_ID> <instance_ID> [action=dbinfo | clientinfo | all]
probe 0 0 action=shminfo

See Also:
"IOLISTENER Process Parameters" for more information about the diagport parameter

To use the listener probe, enter one of the following commands:

To perform this task:	Enter this command:
Display database connection information	`probe <class_ID> <instance_ID> action=dbinfo`
Display client connection information	`probe <class_ID> <instance_ID> action=clientinfo`
Display both database and client connection information	`probe <class_ID> <instance_ID> action=all`
Display shared memory information	`probe 0 0 action=shminfo`

Shared memory information is for all protocol server processes, so you must enter zeros for the class ID and instance ID.

See Also:
"Parameters for Monitoring Protocol Server Processes" for more information about the parameters available with this command

Parameters for Monitoring Protocol Server Processes

Use the following values when running the Probe through a telnet session:

Value	Description
`class_ID`	Unique server identification. Valid values: `27` (IMAP4), `23` (POP3)
`instance_ID`	Instance number of the protocol server process you want to check
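
The class ID values above can be combined with the probe syntax to build command strings in a script. The following Python sketch only produces the text of the command, which you would then enter in the telnet session. The names `CLASS_IDS`, `probe_command`, and `shminfo_command` are hypothetical.

```python
# Class IDs for the protocol server processes, from the table above.
CLASS_IDS = {"IMAP4": 27, "POP3": 23}
ACTIONS = ("dbinfo", "clientinfo", "all")


def probe_command(protocol: str, instance_id: int, action: str) -> str:
    """Build a listener probe command for one protocol server process instance."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return f"probe {CLASS_IDS[protocol]} {instance_id} action={action}"


def shminfo_command() -> str:
    """Shared memory information covers all processes, so both IDs are zero."""
    return "probe 0 0 action=shminfo"
```

For example, `probe_command("IMAP4", 1, "dbinfo")` produces the command for instance 1 of the IMAP4 protocol server.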

Understanding the Probe Results

The following example shows typical results of the probe command for displaying database connection information (probe <class_ID> <instance_ID> action=dbinfo).

Tue May 19 11:53:34 1998
Database: msgdata
  Con  Audsid    St   Access#   T#   F#   UserName.ThreadId   XCur FCur
  ---  --------  --   -------   --   --   -----------------   ---- ----
  0    18275860  0    3875      0    0    llane5.43
  1    18275862  5    1154      1    0    ckent1.12           112
  2    18275863  1    398       0    1    jolsen2.143               27
  3    18275864  0    4         0    0    bwayne1.12
  4    18275866  0    0         0    0
  5    18275867  0    0         0    0
  6    18275868  0    0         0    0
  7    18275869  0    0         0    0
  8    18275870  0    0         0    0

These results show that there are nine connections (numbered 0 through 8) to the msgdata database. They also show run-time usage information about these connections.
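
To spot connections that are not being used, the report can be read by a script. The Python sketch below parses the six numeric columns and the optional UserName.ThreadId column of a report laid out like the example above, then lists the connections whose Access# is zero. The names `Connection`, `parse_dbinfo`, and `unused_connections` are hypothetical. The XCur and FCur columns are ignored because they cannot be told apart reliably after splitting on whitespace.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    con: int
    audsid: int
    state: int
    access_count: int
    transactions: int
    fetches: int
    user_thread: str  # empty string when no user has accessed the connection


def parse_dbinfo(output: str) -> list[Connection]:
    """Parse the data rows of a dbinfo report, skipping headers and separators."""
    connections = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with six integers: Con, Audsid, St, Access#, T#, F#.
        if len(fields) < 6 or not all(f.isdigit() for f in fields[:6]):
            continue
        numbers = [int(f) for f in fields[:6]]
        user_thread = fields[6] if len(fields) > 6 else ""
        connections.append(Connection(*numbers, user_thread))
    return connections


def unused_connections(connections: list[Connection]) -> list[int]:
    """Return the Con numbers of connections with an Access# of zero."""
    return [c.con for c in connections if c.access_count == 0]
```

If the same connections stay unused across repeated probes, that may be a sign that the minimum number of connections for the protocol server can be lowered.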

Database Connections

The following parameters appear in the listener probe output for database connection information (probe <class_ID> <instance_ID> action=dbinfo).

Parameter	Description
`access#`	Number of times this connection is accessed for SQL processing.
`audsid`	Auditing session identifier used by the database.
`con`	Unique number assigned to the connection for record keeping.
`f#`	Number of active database fetches.
`fcur`	List of eMail Server cursor IDs that have active fetches.
`st`	Connection state. Valid values: `1` (Locked), `2` (Dynamic connection), `4` (Active transaction). If a connection is in multiple states, the values are added together. For example, a connection that is locked and has an active transaction has a state of `5`.
`t#`	Number of active database transactions.
`username.threadId`	Most recent user and protocol server thread that accessed this connection.
`xcur`	eMail Server cursor ID that started the transaction.
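
Because the state values are distinct powers of two, a summed state can be split back into its parts with a bitwise test. The Python sketch below illustrates this; the names `STATE_FLAGS` and `decode_state` are hypothetical.

```python
# State values from the description of the st parameter.
STATE_FLAGS = {1: "Locked", 2: "Dynamic connection", 4: "Active transaction"}


def decode_state(st: int) -> list[str]:
    """Return the names of the states whose values add up to st."""
    return [name for flag, name in STATE_FLAGS.items() if st & flag]
```

For a state of 5, `decode_state(5)` returns the Locked and Active transaction states, and a state of 0 returns an empty list.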

Users

The following parameters appear in the listener probe output for client connection information (probe <class_ID> <instance_ID> action=clientinfo).

Parameter	Description
`cur`	eMail Server cursor ID associated with the user.
`database`	Connect string for the database to which the user is connected.
`ip_address`	IP address associated with the user.
`key`	Unique number assigned to the user for record keeping.
`login_time`	Last time a database connection was accessed.
`session_ID`	Auditing session identifier used by the database.
`username.threadId`	User and protocol server thread that accessed a connection.

Shared Memory

The following parameters appear in the listener probe output for shared memory information (probe 0 0 action=shminfo).

Parameter	Description
`class`	Protocol server process class ID. Valid values: `27` (IMAP4), `23` (POP3)
`instance`	Instance ID for the protocol server process.
`load`	Number of active client connections for this instance of the protocol server process.
`max_cli`	Maximum number of clients that can connect to this instance of the protocol server process.
`pid`	UNIX process ID number for this instance of the protocol server process.
`port_ID`	Port receiving incoming messages for this instance of the protocol server process. Default values: `143` (IMAP4 clients), `110` (POP3 clients)
`status`	Status of this instance of the protocol server process. Valid values: `active`, `conn_lost`

Unlocking the User's INBOX

If a client is disconnected from the e-mail server unexpectedly, the user's INBOX may be locked so the user cannot access it after logging in again.

Steps for Unlocking the User's INBOX

This task can only be performed through SQL*Plus.

In SQL*Plus

1. Use SQL*Plus to connect to the database as user OO.
2. Run the $ORACLE_HOME/admin/rsql/OM_inboxlock.sql script.
3. When the script asks for the user name, enter the user name for the user who has the locked INBOX. The script hangs if the user's INBOX is locked by another session. The script also returns the session ID (SID) number. Write down this number to use in step 6.
4. To release the INBOX, open another shell tool and use SQL*Plus to connect to the database as user SYS.
5. Run the $ORACLE_HOME/admin/rsql/OM_inboxlockrel.sql script.
6. When the script asks for the session ID, enter the SID from step 3.

The script tells you what SQL statement to use to unlock the INBOX. In the following example, 11 represents the SID, and 2474 is the serial number.
```
ALTER SYSTEM KILL SESSION '11, 2474';
```
7. Enter the SQL statement from step 6 (ALTER SYSTEM KILL SESSION '11, 2474';). This removes the lock so the user can access the INBOX.
8. Commit your changes and exit from all SQL*Plus sessions.
