61 Message Store Automatic Failover with Database Replication

This chapter describes Oracle Communications Messaging Server database replication and classic message store failover.

Overview of Message Store Database Replication

Berkeley Database provides support for building high availability (HA) applications based on replication. Message store database replication uses Berkeley Database HA facilities and low cost NFS storage devices to build an HA message store.

The mboxlist database is a transactional database. Database changes are written to the transaction logs. You can replicate the database by transporting the transaction log records from one site to another. Berkeley Database HA architecture supports single writer (master) and multiple reader replication. You must perform all database updates on the master. Replicas are available for read only activity. When the master fails, an election takes place, and one of the replicas will take over as master.

The database replication and message store failover feature requires that all replicas be on the same CPU architecture: either all replicas are SPARC or all replicas are x86_64. Messaging Server does not support replication between the two architectures. It is also best practice for all replicas to be homogeneous (same hardware, CPU, and OS), unless you are in the middle of a rolling hardware upgrade.

A message store replication group consists of one or more message store nodes, typically running on different physical hosts. The mboxlist database is replicated on every node and can be stored locally. The mailbox partitions are stored on remote storage devices running NFS servers, and the NFS file systems are always mounted on all of the nodes in a replication group. You can configure each message store node with one or more remote hosts; if remote hosts are configured, the message store contacts a remote host to retrieve the replication group data on startup. If a master has not been established in a group, an election is called. Each node is assigned a priority value. When an election is held, the node with the most up-to-date log record and the highest priority becomes the new master. A node with priority 0 cannot be elected.
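For example, node priority is controlled with the store.dbreplicate.dbpriority option from Table 61-1. A sketch (the values are illustrative, not prescriptive):

```shell
# On a node that must never become master (read-only replica),
# set the priority to 0 so it cannot win an election.
msconfig set store.dbreplicate.dbpriority 0

# On a preferred node, set a priority above the default of 100
# so it is favored when log records are equally up to date.
msconfig set store.dbreplicate.dbpriority 200
```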

When the master fails, the replicas automatically hold an election to select a new master. You are responsible for monitoring the master. The automatic failover facility redirects incoming connections to the new master. The message store replicas run in read-only mode; any attempt to modify a mailbox on a replica returns an error. You can perform read-only operations such as mboxutil -l and imsbackup on the replicas. You can install multiple message store nodes on the same host in different zones, but each message store node must have a unique base.listenaddr.

Berkeley Database maintains an internal database to keep track of the replication group data. Starting the first node in a replication group for the first time initializes this database; to do so, run start-msg -m. A database transaction commit blocks until it either receives enough acknowledgments or the acknowledgment timeout expires. You can configure the acknowledgment policy and timeout.
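Using the options from Table 61-1, a sketch of tuning the acknowledgment behavior on a node might look like this (the values shown are examples, not recommendations):

```shell
# Require acknowledgment from a quorum of electable peers
# before a commit returns (policy 3, the default).
msconfig set store.dbreplicate.ackpolicy 3

# Wait up to 2 seconds (default is 1) for acknowledgments
# before the commit proceeds without them.
msconfig set store.dbreplicate.acktimeout 2
```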

A two-site replication group is particularly vulnerable to duplicate masters when there is a disruption to communications between the sites. Two-site replication is therefore disabled by default. You can enable it with the store.dbreplicate.twosites option. When this option is disabled, the remaining message store cannot take over as master if the original master fails in a replication group with only two sites. If this happens, the message store cluster is unavailable for write access.
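If you accept the duplicate-master risk described above, a two-site group is enabled with a single option from Table 61-1:

```shell
# Allow a two-site replication group to elect a new master.
# Off by default because two sites cannot form a true quorum.
msconfig set store.dbreplicate.twosites 1
```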

When a replication-aware client, such as the MMP (POP or IMAP), LMTP client, or ENS subscriber, connects to a replication group, it must know the hosts in the replication group so that, if one is down, it can attempt to connect to the next one. This is accomplished by setting the proxy:repgroup.storehostlist option to the same value on all replication-aware clients and back ends. These clients also remember the last host to which they connected in the replication group, so it is not necessary to fail over on every connection. In addition, replication-aware clients that perform write operations need to connect to the replication group master; this is accomplished by having the back-end servers refer the client to the replication group master. Details about how each replication-aware client works are as follows:

  • IMAP server

    Advertises OK login referrals (RFC 2221) that point to the master. Third-party use of such referrals is supported. Note that login referrals are also provided when a user logs in to an IMAP server that does not contain their INBOX (as may be necessary to access shared folders). Login referrals of the latter form point to a host that is not in the storehostlist for the replication group.

  • POP server

    Advertises referrals with a SYS/REFER/hostname extended error on login. This is an Oracle extension to the protocol based on RFC 3206. Third-party use of such referrals is supported.

  • LMTP server

    The referral is indicated via a private protocol extension in the LMTP greeting. Oracle does not support third-party use of our LMTP server.

  • ENS server

    No changes have been made.

  • Message Store ENS publisher

    Publishes to all available hosts in the replication group for the current store by default. The recommended deployment for ENS with store failover is to have enpd running on every message store master and replica.

  • ENS and JMQ publishers for Message Store

    The hostname attribute will use the replication group name instead of the local host name.

  • ENS C Client API

    If storehostlist is configured, it will perform failover and cache the last successful host in the host list.

  • MMP IMAP client

    Performs failover, caches the last successful master, and follows referrals.

  • MMP POP client

    Performs failover, caches the last successful master, and follows referrals.

  • MTA LMTP client

    When the affinitylist channel option is set, this performs failover, caches the last successful master, and follows referrals.

  • MTA BURL IMAP client

    Performs failover and caches the last successful host. As this is read-only, it does not follow referrals.

  • Shared folder IMAP client in imapd

    Performs failover, caches the last successful master, and follows referrals.

  • mshttpd IMAP client

    Performs failover, caches the last successful master, and follows referrals.

  • Glassfish MQ (aka JMQ)

    No changes other than the hostname attribute mentioned above. Support for Glassfish MQ is deprecated.
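As a hedged illustration of the IMAP and POP referral mechanisms described above (host names are hypothetical and the human-readable text varies; only the bracketed response codes follow RFC 2221 and the RFC 3206 syntax), a client connecting to a replica might see exchanges such as:

```
C: a001 LOGIN jdoe secret
S: a001 OK [REFERRAL imap://jdoe;AUTH=*@master.example.com/] connect to the replication group master

C: PASS secret
S: -ERR [SYS/REFER/master.example.com] this host is a replica; reconnect to the master
```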

Configuration Options


Table 61-1 shows the configuration options (only supported in Unified Configuration), their descriptions, data types, and defaults.

Table 61-1 Configuration Options

store.dbreplicate.enable
  Description: Enable database replication
  Data type: boolean
  Default: 0

store.dbreplicate.port
  Description: Replication port number. Note that storehostlist does not support non-default port numbers.
  Data type: integer
  Default: 55000

store.dbreplicate.dbremotehost
  Description: Remote host name list; this option is deprecated in 8.0.1.
  Data type: host[:port] [host[:port]]...
  Default: In 8.0.1 and later, defaults to the value of proxy:repgroup.storehostlist with the current host omitted from the list

store.dbreplicate.dbpriority
  Description: Host priority
  Data type: integer
  Default: 100

store.dbreplicate.ackpolicy
  Description: Replication acknowledgment policy
  Data type: 0=none, 1=one, 2=one peer, 3=quorum, 4=all peers, 5=all available, 6=all clients
  Default: 3 (quorum)

store.dbreplicate.acktimeout
  Description: Replication acknowledgment timeout
  Data type: number of seconds
  Default: 1 second

store.dbreplicate.twosites
  Description: Enable a two-site replication group
  Data type: boolean
  Default: 0

proxy:repgroup.storehostlist
  Description: Sets the list of hosts in each store replication group for all relevant servers in the deployment. The preferred master should be listed first.
  Data type: host [host]...
  Default: Value of the LDAP mailHost attribute for a user

proxy:repgroup.imapport
  Description: The port used to connect to IMAP for this replication group
  Data type: unsigned 16-bit integer
  Default: Value of the base.proxyimapport option

proxy:repgroup.imapadmin
  Description: The administrative user name used when connecting to this replication group
  Data type: non-empty string
  Default: Value of the base.proxyadmin option

proxy:repgroup.imapadminpass
  Description: The administrative password used to connect to this replication group
  Data type: password
  Default: Value of the base.proxyadminpass option


Command-line Utilities

  • The imcheck subsystem option prints the database replication statistics. See "imcheck" for additional information.

    To print the database replication statistics, run the following command:

    imcheck -s rep
    

    Note:

    The imcheck -s command is only valid for classic message store.
  • The start-msg -m option starts the message store as a replication master. See the Messaging Server Reference Guide for additional information on the start-msg command.

    To start the message store as a replication master, run the following command:

    start-msg -m
    
  • The stored -d site option removes a replication site from the replication database. See "stored" for additional information.

    To delete an mboxlist replication site called grumpy from the cluster, run the following command:

    stored -d grumpy
    

Configuring Message Store Database Replication

The following configuration examples show you how to configure message store database replication.

To Configure a Three Node Cluster for HA

The following example sets up a cluster with three electable nodes (huey, dewey, and louie at example.com). This example assumes LMTP has been configured on these back ends. The message store partition is on shared storage mounted at /zfssa/primary.

On huey.example.com:

msconfig set store.dbreplicate.enable 1
msconfig set proxy:cluster1.storehostlist "huey.example.com dewey.example.com louie.example.com"
msconfig set partition:primary.path /zfssa/primary
msconfig set task:snapshot.enable 0
msconfig set task:snapshotverify.enable 0
start-msg -m

On dewey.example.com:

msconfig set store.dbreplicate.enable 1
msconfig set proxy:cluster1.storehostlist "huey.example.com dewey.example.com louie.example.com"
msconfig set partition:primary.path /zfssa/primary
msconfig set task:snapshot.enable 0
msconfig set task:snapshotverify.enable 0
start-msg

On louie.example.com:

msconfig set store.dbreplicate.enable 1
msconfig set proxy:cluster1.storehostlist "huey.example.com dewey.example.com louie.example.com"
msconfig set partition:primary.path /zfssa/primary
msconfig set task:snapshot.enable 0
msconfig set task:snapshotverify.enable 0
start-msg

In addition, the storehostlist must also be set on all front-end servers, and the LMTP client must be configured on the front ends with the affinitylist channel option. Due to the complexity of setting up LMTP, we recommend copying the example recipe file LMTPBackendFailover.rcp and modifying it with appropriate settings for the back-end stores.
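On each front-end server, the storehostlist is set the same way as on the back ends. A sketch, reusing the host names from the example above:

```shell
# Give the front end the same ordered host list as the back ends,
# so it can fail over to the next host if one is down.
msconfig set proxy:cluster1.storehostlist "huey.example.com dewey.example.com louie.example.com"
```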

To Change the DB Replication Local Instance Port

The replication group information is maintained by Berkeley Database. The store.dbreplicate.port option applies to the local site only.

To change the port number on one node with the cluster running:

  1. Stop the message store.

    stop-msg store
    
  2. Remove the local site from the group.

    stored -d
    
  3. Change the port number on the local site in Unified Configuration:

    msconfig set store.dbreplicate.port <newport>
    

    Or, in legacy configuration:

    configutil -o store.dbreplicate.port -v <newport>
    
    
  4. Restart the message store.

    start-msg
    
  5. Run imcheck -s rep on the other sites. You should see the new port for this site.

    imcheck -s rep
    

    Note:

    The imcheck -s command is only valid for classic message store.

Message Store Automatic Failover

This section describes Messaging Server's message store automatic failover feature and its configuration.

Basic Requirements

To use message store automatic failover, the following requirements must be met:

  • Messaging Server must be running in Unified Configuration mode.

  • Messaging Server must be deployed using LMTP.

Overview of Message Store Automatic Failover

The message store automatic failover feature is useful for customers who do not have 24/7 operators in their machine room.

The basic model is that all Messaging Server hosts in the deployment must be manually configured with an ordered list of hostnames for each mailStore. Each hostname corresponds to a separate product installation; all hosts in a given mailStore list must use a shared disk (for example, NFS or a filer) for the product data, and have configurations that are largely identical except for the "base.hostname" setting. The first host in the list is the primary host for that mailStore. The primary host is running, while the secondary hosts are on standby (not running).

The MMP, LMTP client, and imapd shared folder support will now automatically use the secondary host for requests related to the primary mailHost; there is no need to refresh or restart these services for this to happen.

Note that a standby hostname does not require unused hardware. If multiple IP addresses are used on the same server, that server can support multiple installations. However, if the primary host is dedicated and the secondary host is shared for a mailStore, service performance may degrade when automatic failover happens. Sites need to consider how much spare capacity is needed to service users during a hardware outage.

To ensure safety during rolling version upgrades, where a node is deliberately shut down during the upgrade, we recommend having at least 3 hostnames associated with each mailHost.

Configuring Message Store Automatic Failover

For this example, assume the following hosts are in a deployment:

LDAP mailStore "store1.example.com"

store1a.example.com primary

store1b.example.com

store1c.example.com

LDAP mailStore "store2.example.com"

store2a.example.com primary

store2b.example.com

store2c.example.com

mta.example.com - an MTA configured to use LMTP

mmp.example.com - an MMP

To avoid typographical errors when configuring multiple machines, we recommend that customers use Unified Configuration recipes to set up automatic failover. Run a recipe similar to the following on all machines in the deployment:

set_option("proxy:store1\\.example\\.com.storehostlist",
"store1a.example.com store1b.example.com store1c.example.com");
set_option("proxy:store2\\.example\\.com.storehostlist",
"store2a.example.com store2b.example.com store2c.example.com");

Save the recipe above to a plain text file, for example, automatic-failover.rcp.

Some extra configuration must be added to the LMTP server and to all clients that communicate with it.

To Configure the LMTP Server

  1. Add the automatic-failover.rcp file configuration to the existing LMTP server configuration. Alternatively, there is a sample recipe, LMTPBackendFailover.rcp, that configures a back-end LMTP server for use with failover; it is available in the Messaging Server installed location MessagingServer_home/lib/recipes/LMTPBackendFailover.rcp. To use it, copy the recipe script and manually add your LMTP client IP addresses and mail store proxy information.

  2. Run the recipe script on all back-end machines in the deployment by executing the following command (substitute LMTPBackendFailover.rcp if you used that recipe):

    msconfig run automatic-failover.rcp
    
  3. If you are running a compiled configuration, recompile by running:

    imsimta cnbuild
    
  4. Start the Messaging Server by running:

    start-msg
    

To Configure the Client

  1. Add the automatic-failover.rcp file configuration to all clients' configurations. For the LMTP client, you must add some extra configuration.

  2. Run the recipe script on all client machines (MMP, LMTP client, and so on) in the deployment by executing the following command:

    msconfig run automatic-failover.rcp
    
  3. For the LMTP client only, you must set the affinity list channel option on the LMTP client channel:

    msconfig set channel:tcp_lmtpcs.affinitylist
    imsimta cnbuild
    stop-msg
    start-msg
    
  4. Stop the Messaging Server client by running:

    stop-msg mmp
    

    or, for the MTA:

    stop-msg mta
    
  5. Start the Messaging Server client by running:

    start-msg mmp
    

    or, for the MTA:

    start-msg mta