3 Stopping and Starting Messaging Server

This chapter describes how to stop and start Oracle Communications Messaging Server services.

Starting and Stopping Services

To stop and start a Cassandra message store, see "Restarting a Cassandra Message Store Cluster". To stop and start Messaging Server services installed in a highly available environment, see "Stopping and Starting Messaging Services in an HA Environment".

To Start and Stop Messaging Server Services

Start and stop Messaging Server services from the command line by using the following commands:

MessagingServer_home/bin/start-msg
MessagingServer_home/bin/stop-msg

Although you can start and stop services individually by using the command form MessagingServer_home/bin/stop-msg service (where service can be mta, imap, pop, store, http, ens, sched, purge, mfagent, snmp, mmp, sms, metermaid, cert, dispatcher, job_controller, or watcher), do not do so except where a specific task calls for it. Certain services depend on other services and must be started in a prescribed order, so complications can arise when you start services on their own. For this reason, start and stop all the services together by using the start-msg and stop-msg commands.

Note:

You must first enable services before starting or stopping them. See "Enabling and Disabling Services" for more information.

Important:

If a server process crashes, other processes might hang as they wait for locks held by the server process that crashed. If you are not using automatic restart (see "Automatic Restart of Failed or Unresponsive Services"), and if any server process crashes, it is generally safer to stop all processes, then restart all processes. This includes the POP, IMAP, and MTA processes, as well as the stored (message store) process, and any utilities that modify the message store, such as mboxutil, deliver, reconstruct, readership, or upgrade.

To Start Up, Shut Down, or View the Status of Messaging Services

Do not shut down individual services except in the specific tasks as described. Certain services have dependencies on other services and must be started in a prescribed order. Complications can arise when trying to start services on their own. For this reason, you should start and stop all the services together by using the start-msg and stop-msg commands.

However, when you make a configuration change that requires a service restart and you want to minimize service disruption, use stop-msg service followed by start-msg service. For example, when you change an MTA option that requires a restart of the dispatcher but not of the job_controller, it is better to run stop-msg dispatcher; start-msg dispatcher to avoid unnecessarily flushing the job_controller's cache. See "start-msg" and "stop-msg" for more information.
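
As a minimal sketch of the stop-then-start sequence described above, the following shell function wraps the two commands for a single service. The function name restart_service and the MSG_HOME variable (standing in for MessagingServer_home) are assumptions made for illustration; they are not part of Messaging Server.

```shell
# Hypothetical helper, not shipped with Messaging Server.
# MSG_HOME is an assumed variable that must point at MessagingServer_home.
# Usage: restart_service service    (for example: restart_service dispatcher)
restart_service() {
    if [ -z "${MSG_HOME:-}" ] || [ -z "${1:-}" ]; then
        echo "usage: MSG_HOME=MessagingServer_home restart_service service" >&2
        return 2
    fi
    # Start the service again only if the stop command succeeded.
    "$MSG_HOME/bin/stop-msg" "$1" && "$MSG_HOME/bin/start-msg" "$1"
}
```

Chaining the commands with && rather than ; is a design choice in this sketch: if stop-msg reports an error, start-msg is not attempted, which leaves the failure visible instead of masking it.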

The services must be enabled to stop or start them. See "To Specify What Services Can Be Started" for more information.

To Specify What Services Can Be Started

By default the following services are started with start-msg:

start-msg
Connecting to watcher ...
Launching watcher ... 9347
Starting store server .... 9356
Checking store server status ..... ready
Starting purge server .... 9413
Starting imap server .... 9420
Starting pop server .... 9425
Starting http server .... 9437
Starting sched server ... 9451
Starting dispatcher server .... 9461
Starting job_controller server .... 9466

The complete list of scopes with an "enable" option is as follows. Some of these, such as notifytarget, service, and task, are named scopes. See the discussion on scope syntax in "msconfig Command" for more information.

autorestart
notifytarget
ens
folderquota
imap
indexer
messagetype
msghash
dbreplicate
mta
metermaid
mmp
pab
pop
purge
relinker
schedule
service
smime
sms_gateway
snmp
store
task
typequota
watcher
http 

Set both imap.enable and imap.enablesslport to 0 to disable IMAP. The same goes for POP and HTTP. See Messaging Server Reference for details.
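
The two-option rule above can be sketched as a small shell function. The function name disable_protocol is hypothetical, and the sketch assumes that msconfig is on the PATH and that the pop and http scopes use the same enable and enablesslport option names as imap, as the text indicates.

```shell
# Hypothetical helper: disable IMAP, POP, or HTTP by clearing both options.
# Assumes msconfig is on the PATH.
# Usage: disable_protocol imap|pop|http
disable_protocol() {
    case "${1:-}" in
        imap|pop|http) ;;
        *) echo "disable_protocol: expected imap, pop, or http" >&2
           return 2 ;;
    esac
    # Both the plain port and the SSL port must be turned off.
    msconfig set "$1.enable" 0 &&
    msconfig set "$1.enablesslport" 0
}
```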

Starting and Stopping a Messaging Server Running in MTA-only Mode

To start an MTA-only system, you should also start imsched. Before you do this, remove any scheduled jobs that are not appropriate to your installation.

imsched is an individual component of Messaging Server that must be started separately if you are not starting all of Messaging Server. If you start your MTA-only system by using start-msg imta or start-msg mta, then you do not run the imsched process.

To run Messaging Server in MTA-only mode (no store, imap, or pop processes), either select only the MTA to be installed and configured when you run the Messaging Server configuration after the initial installation (MessagingServer_home/bin/configure), or manually disable the message store and mshttp processes by using the following commands:

msconfig set store.enable 0
msconfig set http.enable 0

Once you have disabled HTTP and other store processes, you can then start Messaging Server by running the following command:

start-msg
Connecting to watcher ...
Launching watcher ... 4034
Starting ens server ... 4035
Starting sched server ... 4036
Starting dispatcher server .... 4038
Starting job_controller server .... 4042

All the appropriate processes are started, including imsched and imta. This way you do not have to remember to start the sched process.

Stopping and Starting Messaging Services in an HA Environment

While Messaging Server is running under HA control, you cannot use the normal Messaging Server start, restart, and stop commands to control individual Messaging Server services. For example, if you attempt a stop-msg in an HA deployment, the system warns that it has detected an HA setup and informs you how to properly stop the system.

The appropriate HA start, stop, and restart commands are shown in Table 3-1, Table 3-2, Table 3-3, and Table 3-4. There are no specific HA commands to individually start, restart, or stop other Messaging Server services (for example, SMTP). However, you can run a stop-msg service command to stop or restart individual servers such as imap, pop, or sched.

The finest granularity in Oracle Solaris Cluster (formerly known as Sun Cluster) is that of an individual resource. Because Messaging Server is known to Oracle Solaris Cluster as a resource, the Oracle Solaris Cluster scswitch commands affect all Messaging Server services as a whole.

Table 3-1 Start, Stop, Restart in an Oracle Solaris Cluster 3.0/3.1 Environment

Action   Individual Resource        Entire Resource Group
Start    scswitch -e -j resource    scswitch -Z -g resource_group
Restart  scswitch -n -j resource    scswitch -R -g resource_group
         scswitch -e -j resource
Stop     scswitch -n -j resource    scswitch -F -g resource_group


Table 3-2 Start, Stop, Restart in an Oracle Solaris Cluster 3.2 Environment

Action   Individual Resource      Entire Resource Group
Start    clrs online resource     clrg online resource_group
Restart  clrs disable resource    clrg restart resource_group
         clrs enable resource
Stop     clrs offline resource    clrg disable resource_group


Table 3-3 Start, Stop, Restart in Veritas 3.5, 4.0, 4.1 and 5.0 Environments

Action   Individual Resource                             Entire Resource Group
Start    hares -online resource_name -sys system_name    hagrp -online group_name -sys system_name
Restart  hares -offline resource_name -sys system_name   hagrp -offline service_group -sys system_name
         hares -online resource_name -sys system_name    hagrp -online service_group -sys system_name
Stop     hares -offline resource_name -sys system_name   hagrp -offline service_group -sys system_name


Table 3-4 Start, Stop, Restart in an Oracle Clusterware 12.1 Environment

Action   Individual Resource
Start    crsctl start resource MS resource -n system
Restart  crsctl stop resource MS resource -n system
         crsctl start resource MS resource -n system
Stop     crsctl stop resource MS resource -n system


Automatic Restart of Failed or Unresponsive Services

This section describes how Messaging Server monitors and automatically restarts unresponsive services.

Overview of Messaging Server Monitoring Processes

Messaging Server provides two processes, watcher and msprobe, that transparently monitor services and automatically restart them if they crash or become unresponsive (that is, the service hangs). watcher monitors server crashes. msprobe monitors unresponsive server processes by checking their response times. When a server fails or stops responding to requests, it is automatically restarted. Table 3-5 shows the services monitored by each utility.

Table 3-5 Services Monitored by watcher and msprobe

watcher (crash):

IMAP, POP, HTTP, job controller, dispatcher, message store (stored), imsched, MMP. (LMTP/SMTP servers are monitored by the dispatcher and LMTP/SMTP clients are monitored by the job_controller.) The watcher also monitors all processes that access the message store in such a way that they could hold outstanding message store locks when they crash. This includes ims_master, lmtp_server, and store utilities.

msprobe (unresponsive hang):

IMAP, POP, HTTP, cert, job controller, message store (stored), imsched, ENS, LMTP, SMTP


When the watcher is enabled (watcher.enable 1, the default), it monitors process failures and unresponsive services and logs error messages indicating the specific failures to the default log file. To enable automatic server restart, use msconfig to set base.autorestart.enable to 1. By default, this option is set to no (0).

If any of the message store services fail or freeze, all message store services that were enabled at start-up are restarted. For example, if imapd fails, at the least, stored and imapd are restarted. If other message store services were running, such as the POP or HTTP servers, then those are restarted as well, whether or not they failed.

Automatic restart also works if a message store utility fails or freezes. For example, if mboxutil fails or freezes, the system automatically restarts all the message store servers. However, it does not restart the utility.

msprobe runs every 10 minutes. A service or process is restarted at most once within a 10-minute period (this period is configurable by using base.autorestart.timeout). If a server fails more than once during this period, the system stops trying to restart it. If this happens in an HA system, Messaging Server is shut down and a failover to the other system occurs.
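
The retry rule above can be modeled in a few lines of shell. This is an illustrative model only, not Messaging Server's actual logic: the function name should_restart, its arguments, and the treatment of a failure that lands exactly on the window boundary are all assumptions.

```shell
# Illustrative model of the documented retry rule; not product code.
# Usage: should_restart LAST_FAILURE NOW [TIMEOUT]
#   LAST_FAILURE  epoch seconds of the previous failure, or "" if none
#   NOW           epoch seconds of the current failure
#   TIMEOUT       failure retry time-out in seconds (defaults to 600)
should_restart() {
    last="$1"
    now="$2"
    timeout="${3:-600}"
    if [ -n "$last" ] && [ $((now - last)) -lt "$timeout" ]; then
        echo "give-up"    # second failure inside the window
    else
        echo "restart"    # first failure, or the window has elapsed
    fi
}
```

For example, should_restart "" 1000 prints restart, while should_restart 1000 1300 prints give-up because the second failure falls 300 seconds after the first, inside the 600-second window.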

Whether or not base.autorestart.enable is set, the system still monitors the services and sends failure or non-response error messages to the console and to DataRoot/log/watcher. The watcher listens on port 49994 by default, but this is configurable with the watcher.port option.

A watcher log file is generated in DataRoot/log/watcher. This log file is not managed by the logging system (no rollover or purging) and records all server starts and stops. The following is an example log:

watcher process 13425 started at Mon June 4 11:23:54 2012

Watched 'imapd' process 13428 exited abnormally
Received request to restart:  store imap pop http
Connecting to watcher ...
Stopping http server 13440 .... done
Stopping pop server 13431 ... done
Stopping pop server 13434 ... done
Stopping pop server 13435 ... done
Stopping pop server 13433 ... done
imap server is not running
Stopping store server 13426 .... done
Starting store server .... 13457
checking store server status ...... ready
Starting imap server ..... 13459
Starting pop server ....... 13462
Starting http server ...... 13471

See "Monitoring Using msprobe and watcher Functions" for more details on how to configure this feature.
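
Because the watcher log is not managed by the logging system, it can be useful to scan it by hand. The following is a minimal sketch that extracts abnormal exits, assuming the log lines follow the "Watched 'name' process pid exited abnormally" format of the example above. The function name list_abnormal_exits is hypothetical.

```shell
# Hypothetical helper: print "name pid" for each abnormal exit in a watcher log.
# Assumes the line format shown in the example log.
# Usage: list_abnormal_exits LOGFILE
list_abnormal_exits() {
    sed -n "s/^Watched '\([^']*\)' process \([0-9][0-9]*\) exited abnormally.*/\1 \2/p" "$1"
}
```

Run against the example log, this prints a single line containing the process name imapd and its process ID 13428.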

msprobe is controlled by imsched. If imsched crashes, watcher detects this event and triggers a restart (if autorestart is enabled). However, in the rare case that imsched hangs, you must kill it with a kill imsched_pid command, which causes the watcher to restart it.

Automatic Restart in High Availability Deployments

Table 3-6 shows the configuration options to be set for automatic restart in high availability deployments:

Table 3-6 HA Automatic Restart Options

Option   Description/HA Value

watcher.enable

Enable watcher on start-msg startup. Default is enabled (1).

base.autorestart.enable

Enable automatic restart of failed or frozen (unresponsive) servers, including IMAP, POP, HTTP, job controller, dispatcher, and MMP servers. Default is enabled.

base.autorestart.timeout

Failure retry time-out. If a server fails more than once during this period, the system stops trying to restart it. If this happens in an HA system, Messaging Server is shut down and a failover to the other system occurs. The value (in seconds) should be longer than the msprobe interval (see schedule.task:msprobe.crontab in the next row). Default is 600.

schedule.task:msprobe.crontab

The msprobe run schedule, expressed as a crontab-style schedule string. Default is 5,15,25,35,45,55 * * * * lib/msprobe. To disable msprobe, run the following command: msconfig set schedule.task:msprobe.enable 0
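
The options in Table 3-6 can be applied with msconfig set, following the same command form used elsewhere in this chapter. The following is a minimal sketch: the function name configure_ha_autorestart is hypothetical, msconfig is assumed to be on the PATH, and the time-out argument defaults to the documented value of 600 seconds.

```shell
# Hypothetical helper: apply the Table 3-6 automatic restart settings.
# Assumes msconfig is on the PATH.
# Usage: configure_ha_autorestart [TIMEOUT_SECONDS]
configure_ha_autorestart() {
    timeout="${1:-600}"    # 600 is the documented default
    case "$timeout" in
        ''|*[!0-9]*) echo "timeout must be a whole number of seconds" >&2
                     return 2 ;;
    esac
    msconfig set watcher.enable 1 &&
    msconfig set base.autorestart.enable 1 &&
    msconfig set base.autorestart.timeout "$timeout"
}
```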