Sun ONE Messaging Server 6.0 Installation Guide for Solaris Operating Systems
Chapter 3
Configuring High Availability Solutions
This chapter helps you determine which high availability (HA) model is right for you and describes how to set up your system to run high availability with Messaging Server.
For more information on High Availability models supported with Messaging Server, the following product documentation is recommended:
High Availability Models
There are different high availability models that can be used with Messaging Server. Three of the more common ones are:
- Asymmetric (hot standby)
- Symmetric
- N+1 (N over 1)
Each of these models is described in greater detail in the following subsections, which are followed by guidance on choosing a model and on system down time calculations.
Note that different HA products may or may not support different models. Refer to the HA documentation to determine which models are supported.
Asymmetric
The basic asymmetric or “hot standby” high availability model (Figure 3-1) consists of two clustered host machines or “nodes.” A logical IP address and associated host name are designated to both nodes.
In this model, only one node is active at any given time; the backup or hot standby node remains idle most of the time. A single shared disk array between both nodes is configured and is mastered by the active or “primary” node. The message store partitions and Message Transfer Agent (MTA) queues reside on this shared volume.
Figure 3-1 Asymmetric High Availability Model
Before failover, the active node is Physical-A. Upon failover, Physical-B becomes the active node and the shared volume is switched so that it is mastered by Physical-B. All services are stopped on Physical-A and started on Physical-B.
The advantage of this model is that the backup node is dedicated and completely reserved for the primary node; there is no resource contention on the backup node when a failover occurs. However, this model also means that the backup node stays idle most of the time, so this resource is underutilized.
Symmetric
The basic symmetric or “dual services” high availability model consists of two hosting machines, each with its own logical IP address. Each logical node is associated with one physical node, and each physical node controls one disk array with two storage volumes. One volume is used for its local message store partitions and MTA queues, and the other is a mirror image of its partner’s message store partitions and MTA queues.
In the symmetric high availability model (Figure 3-2), both nodes are active concurrently, and each node serves as a backup node for the other. Under normal conditions, each node runs only one instance of the messaging server.
Figure 3-2 Symmetric High Availability Model
Upon failover, the services on the failing node are shut down and restarted on the backup node. At this point, the backup node is running Messaging Server from both nodes and is managing two separate volumes.
The advantage of this model is that both nodes are active simultaneously, thus fully utilizing machine resources. However, during a failure, the backup node will have more resource contention as it runs services for Messaging Server from both nodes. Therefore, you should repair the failed node as quickly as possible and switch the servers back to their dual services state.
This model also provides a backup storage array; in the event of a disk array failure, its mirror image can be picked up by the service on its backup node.
To configure a symmetric model, you need to install shared binaries on your shared disk. Note that doing so may prevent you from performing rolling upgrades, a feature planned for future releases that will allow you to update your system during Messaging Server patch releases.
N+1 (N Over 1)
The N + 1 or “N over 1” model operates in a multi-node asymmetrical configuration. N logical host names and N shared disk arrays are required. A single backup node is reserved as a hot standby for all the other nodes. The backup node is capable of concurrently running Messaging Server from the N nodes.
Figure 3-3 illustrates the basic N + 1 high availability model.
Figure 3-3 N + 1 High Availability Model
Upon failover of one or more active nodes, the backup node picks up the failing node’s responsibilities.
The advantages of the N + 1 model are that the server load can be distributed to multiple nodes and that only one backup node is necessary to sustain all the possible node failures. Thus, the machine idle ratio is 1/N as opposed to 1/1, as is the case in a single asymmetric model.
To configure an N+1 model, you need to install shared binaries on your shared disk, as with symmetric models. Note that doing so may prevent you from performing rolling upgrades, a feature planned for future releases that will allow you to update your system during Messaging Server patch releases.
Which High Availability Model Is Right for You?
Table 3-1 summarizes the advantages and disadvantages of each high availability model. Use this information to help you determine which model is right for you.
System Down Time Calculations
Table 3-2 illustrates the probability that on any given day the mail service will be unavailable due to system failure. These calculations assume that on average, each server goes down for one day every three months due to either a system crash or server hang, and that each storage device goes down one day every 12 months. They also ignore the small probability of both nodes being down simultaneously.
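The stated assumptions can be turned into a rough estimate. The following is only a sketch and not a reproduction of Table 3-2: it assumes a 365-day year, reads "one day every three months" as 4 days per year, and treats server and storage failures as independent events that do not overlap. The symbols p_srv and p_sto are introduced here for illustration only.

```latex
\begin{align*}
p_{\text{srv}} &= \frac{4}{365} \approx 0.0110
  && \text{(a given server is down on a given day)} \\
p_{\text{sto}} &= \frac{1}{365} \approx 0.0027
  && \text{(a given storage device is down on a given day)} \\
p_{\text{srv}} + p_{\text{sto}} &= \frac{5}{365} \approx 0.0137
  && \text{(single server, no failover)} \\
p_{\text{srv}}^{2} &= \left(\frac{4}{365}\right)^{2} \approx 0.00012
  && \text{(both nodes down at once; the ignored term)}
\end{align*}
```

Under these assumptions, a failover pair that shares a single disk array is exposed mainly to the storage term (about 0.27% on any given day), while a configuration with mirrored storage removes most of that term as well.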
Configuring High Availability
This section provides the information you need to configure the Veritas Cluster Server or Sun Cluster high availability clustering software and prepare it for use with the Messaging Server. (Refer to your Veritas or Sun Cluster Server documentation for detailed installation instructions, required patches, and other information as needed.)
Table 3-3 lists the versions of Sun Cluster Server and Veritas Cluster Server that are currently supported with Messaging Server:
For the latest updates on version support, refer to the Sun ONE Messaging Server 6.0 Release Notes.
Cluster Agent Installation
A cluster agent is a Messaging Server program that runs under the cluster framework.
The Sun Cluster Messaging Server agent (SUNWscims) is installed when you select Sun Cluster 3.1 through the Java Enterprise System installer. The Veritas Cluster Messaging Server agent (SUNWmsgvc) can be found in the Messaging Server Product subdirectory on the Java Enterprise System CD. (Note that you must use the pkgadd(1M) command to install the VCS cluster agent.)
Note the following when installing Messaging Server for high availability (these points apply to both Veritas Cluster Server and Sun Cluster):
- High Availability cluster agents for the Messaging Server are not installed by default; be sure to select the appropriate agent package during the Java Enterprise System installation process.
- Before running the Java Enterprise System installer, make sure that the HA logical host names and associated IP addresses for Messaging Server are active. This is necessary because portions of the installation make TCP connections using them. Run the installation on the cluster node to which the HA logical host name for Messaging Server currently points.
- When you are asked for the msg_svr_base in the Java Enterprise System installer, be sure that the msg_svr_base is on the local disk. In other words, you need to install the Messaging Server package locally on each node. However, your configuration and data should be on a disk that is shared between the nodes. Otherwise, if you install your configuration and data on a local disk and one node fails over to the other node, the servers will not see the data accumulated by the server on the failed node.
- After you configure the Administration Server through the Java Enterprise System installer, be sure that the IP address of the Administration Server is associated with the logical IP address for the machine, not the IP address of the physical host.
- After running the Messaging Server Initial Runtime Configuration (see Create the Initial Messaging Server Runtime Configuration), be sure to manually configure the fully-qualified HA logical host name of the cluster for Messaging Server.
Using the useconfig utility
The useconfig utility allows you to share a single configuration between multiple nodes in an HA environment. This utility is not meant to upgrade or update an existing configuration.
For example, if you are upgrading your first node, you will install through the Java Enterprise System installer and then configure Messaging Server (see Chapter 2, "Installing Messaging Server"). You will then fail over to the second node, where you will install the Messaging Server package through the Java Enterprise System installer, but you will not have to run the Initial Runtime Configuration Program (configure) again. Instead, you can use the useconfig utility.
To enable the utility, run useconfig and point it to your previous Messaging Server configuration settings file, configure_YYYYMMDDHHMMSS.
On a brand new node, you can find the configure_YYYYMMDDHHMMSS file in the msg_svr_base/data/setup directory on the shared disk.
The following sections on Veritas Cluster Server Agent Installation and Sun Cluster Agent Installation describe when you can use the useconfig utility.
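When several settings files have accumulated in the setup directory, it can help to pick out the most recent one before pointing useconfig at it. The following is a minimal sketch, assuming the files follow the configure_YYYYMMDDHHMMSS naming described above; latest_settings_file is a hypothetical helper and not part of Messaging Server, and the example path is a placeholder.

```shell
#!/bin/sh
# Sketch: locate the most recent configure_YYYYMMDDHHMMSS settings file.
# latest_settings_file is a hypothetical helper, not a Messaging Server tool.
latest_settings_file() {
    setup_dir=$1
    # The timestamp suffix sorts lexicographically in chronological order,
    # so the last entry after sorting is the newest file.
    for f in "$setup_dir"/configure_[0-9]*; do
        [ -f "$f" ] && echo "$f"
    done | sort | tail -n 1
}

# Example call (placeholder path for msg_svr_base on the shared disk):
# latest_settings_file /shared/msg_svr_base/data/setup
```

The function prints nothing if the directory contains no matching file, so an empty result indicates that no previous configuration is available.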
Veritas Cluster Server Agent Installation
Messaging Server can be configured with Veritas Cluster Server 1.3, 2.0, and 3.5. The instructions in this section only cover Veritas Cluster 3.5; for Veritas 1.3 and 2.0, review the Messaging Server 5.2 Installation Guide.
Be sure to review the Veritas Cluster Server documentation prior to following these procedures.
After installing Messaging Server through the Java Enterprise System installer and configuring HA, be sure to review Binding IP Addresses on a Server for additional steps associated with configuring HA support.
Veritas Cluster Server Requirements
- Veritas Cluster Software is already installed and configured.
- As described in the following instructions (in VCS 3.5 Installation and Configuration Notes), you will install the Veritas Cluster Agent package for Messaging Server along with the Messaging Server software on both nodes.
VCS 3.5 Installation and Configuration Notes
The following instructions describe how to configure Messaging Server as an HA service, by using Veritas Cluster Server 3.5.
The default main.cf configuration file sets up a resource group called ClusterService that launches the VCSweb application. This group includes network logical host IP resources like csgnic and webip. In addition, the ntfr resource is created for event notification.
- Launch Cluster Explorer from one of the nodes.
Note that these Veritas Cluster Server instructions assume you are using the graphical user interface to configure Messaging Server as an HA service.
To launch Cluster Explorer, run the following command:
The VRTScscm package must be installed in order to use the GUI.
- Add the s1ms_dg disk group resource of type DiskGroup and enable it.
- Add the s1ms_mt mount resource of type Mount.
- Create a link between s1ms_mt and s1ms_dg. Enable the resource s1ms_mt.
See the following dependency tree:
- Run the Java Enterprise System installer, selecting Messaging Server.
- Run the Messaging Server Initial Runtime Configuration (see Chapter 2, "Installing Messaging Server") from the primary node (for example, Node_A) to install Messaging Server.
- Install the Veritas Cluster Server agent package, SUNWmsgvc, (located in the Messaging Server Product subdirectory on the Java Enterprise System CD) by using the pkgadd(1M) command.
- Check to make sure that the logical host name and the logical IP address are specified whenever a host name or an IP address is required during the installation.
Messaging Server and the Veritas agent are now installed on Node_A.
- Switch to the backup node (for example, Node_B).
- Run the Java Enterprise System installer to install Messaging Server on the backup node (Node_B).
- After installing Messaging Server, you can use the useconfig utility to obviate the need for creating an additional initial runtime configuration on the backup node (Node_B). The useconfig utility allows you to share a single configuration between multiple nodes in an HA environment. This utility is not meant to upgrade or update an existing configuration. See Using the useconfig utility.
The Veritas agent is now installed on Node_B.
- From Cluster Explorer, select Import Types... from the File menu; a file selection box is displayed.
- Import the MsgSrvTypes.cf type file from the /etc/VRTSvcs/conf/config directory. Note that you need to be on a cluster node to find this file.
- Now create a resource of type MsgSrv (for example, Mail). This resource requires the logical host name property to be set.
- The Mail resource depends on s1ms_mt and webip. Create links between the resources as shown in the following dependency tree:
- Switch over to Node_A and check if the High Availability configuration is working.
- Change the group attribute OnlineRetryLimit from 3 to 0; otherwise, the failed-over service might restart on the same node.
MsgSrv Attributes
This section describes the additional MsgSrv attributes that govern the behavior of the mail resource. To configure Messaging Server with Veritas Cluster Server, see Table 3-4.
Sun Cluster Agent Installation
This section describes how to install and configure the Messaging Server as a Sun Cluster Highly Available (HA) Data Service. These installation instructions apply to both Sun Cluster 3.0 Update 3 and Sun Cluster 3.1.
Documentation for Sun Cluster 3.0 Update 3 and for Sun Cluster 3.1 can be found at:
http://docs.sun.com/db/prod/cluster#hic
Note that Veritas File System (VxFS) is supported with Sun Cluster 3.0 Update 3 and with Sun Cluster 3.1.
Sun Cluster Requirements
This section presumes the following:
Configuring Messaging Server HA Support for Sun Cluster
This section describes how to configure HA support for Sun ONE Messaging Server for Sun Cluster 3.0 Update 3 and 3.1 through a simple example.
After configuring HA, be sure to review Binding IP Addresses on a Server for additional steps associated with HA support.
The following example assumes that the messaging server has been configured with an HA logical host name and IP address. The physical host names are assumed to be mail-1 and mail-2, with an HA logical host name of budgie. Figure 3-4 depicts the nested dependencies of the different HA resources you will create in configuring Messaging Server HA support.
Figure 3-4 A Simple Sun ONE Messaging Server HA configuration
- Become the superuser and open a console.
All of the following Sun Cluster commands require that you have logged in as superuser. You will also want to have a console or window for viewing messages output to /dev/console.
- Add required resource types.
Configure Sun Cluster to know about the resource types we will be using. This is done with the scrgadm -a -t command:
# scrgadm -a -t SUNW.HAStorage
# scrgadm -a -t SUNW.ims
- Create a resource group for the Messaging Server.
If you have not done so already, create a resource group and make it visible on the cluster nodes which will run the Messaging Server. The following command creates a resource group named MAIL-RG, making it visible on the cluster nodes mail-1 and mail-2:
# scrgadm -a -g MAIL-RG -h mail-1,mail-2
You may, of course, use whatever name you wish for the resource group.
- Create an HA logical host name resource and start resource group.
If you have not done so already, create and enable a resource for the HA logical host name, placing it in the resource group. The following command does so using the logical host name budgie. Since the -j switch is omitted, the name of the resource created will also be budgie.
# scrgadm -a -L -g MAIL-RG -l budgie
# scswitch -Z -g MAIL-RG
- Create an HA storage resource.
Next, you need to create an HA storage resource for the file systems on which Messaging Server is dependent. The following command creates an HA storage resource named disk-rs and places the file systems disk_sys_mount_point-1 and disk_sys_mount_point-2 under its control:
# scrgadm -a -j disk-rs -g MAIL-RG \
-t SUNW.HAStorage \
-x ServicePaths=disk_sys_mount_point-1,disk_sys_mount_point-2
The comma-separated list of ServicePaths gives the mount points of the cluster file systems on which Messaging Server is dependent. In the above example, only two mount points, disk_sys_mount_point-1 and disk_sys_mount_point-2, are specified. If one of the servers has additional file systems on which it is dependent, then you can create an additional HA storage resource and, in Step 8, indicate that additional dependency.
- Install and configure Messaging Server (Chapter 2, "Installing Messaging Server"); be sure to use the HA logical host name created in Step 4.
- In the Initial Runtime Configuration, you are asked to specify a configuration directory in Step 3 of Create the Initial Messaging Server Runtime Configuration. Be sure to use the shared disk directory path of your HA Storage resource (or HAStoragePlus resource, described in Enabling HAStoragePlus).
- Be sure to run the following command to enable the watcher process under Sun Cluster:
For more information on the watcher process, refer to the Sun ONE Messaging Server 6.0 Administrator’s Guide.
- Run the ha_ip_config script to set service.listenaddr and service.http.smtphost and to configure the dispatcher.cnf and job_controller.cnf files for high availability. The script will ensure that the logical IP address is set for these parameters and files, rather than the physical IP address.
For instructions on running the script, see Binding IP Addresses on a Server.
The ha_ip_config script should only be run once on the machine with the shared disk (for configuration and data).
- Create an HA Messaging Server resource.
It’s now time to create the HA Messaging Server resource and add it to the resource group. This resource is dependent upon the HA logical host name and HA disk resource.
In creating the HA Messaging Server resource, we need to indicate the path to the Messaging Server top-level directory, the msg_svr_base path. This is done with the IMS_serverroot extension property, as shown in the following command.
# scrgadm -a -j mail-rs -t SUNW.ims -g MAIL-RG \
-x IMS_serverroot=msg_svr_base \
-y Resource_dependencies=disk-rs,budgie
The above command creates an HA Messaging Server resource named mail-rs for the Messaging Server installed in the msg_svr_base directory, which is specified through IMS_serverroot. The HA Messaging Server resource is dependent upon the HA disk resource disk-rs as well as the HA logical host name budgie.
If the Messaging Server has additional file system dependencies, then you can create an additional HA storage resource for those file systems. Be sure to include that additional HA storage resource name in the Resource_dependencies option of the above command.
- Enable the Messaging Server resource.
It’s now time to activate the HA Messaging Server resource, thereby bringing the messaging server online. To do this, use the command
# scswitch -e -j mail-rs
The above command enables the mail-rs resource of the MAIL-RG resource group. Since the MAIL-RG resource group was previously brought online, the above command also brings mail-rs online.
- Verify that things are working.
Use the scstat command to see if the MAIL-RG resource group is online. You may want to look at the output directed to the console device for any diagnostic information. Also look in the syslog file, /var/adm/messages.
- Fail the resource group over to another cluster node to make sure that failover works properly.
Manually fail the resource group over to another cluster node. Use the scstat command to see which node the resource group is currently online on. For instance, if it is online on mail-1, then fail it over to mail-2 with the command:
# scswitch -z -g MAIL-RG -h mail-2
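The commands from this example can be collected into a small script for review before anything is run on the cluster. The following is a minimal sketch, assuming the names used in the example (MAIL-RG, mail-1, mail-2, budgie, disk-rs, mail-rs); run, DRY_RUN, ha_prepare, and ha_finish are hypothetical helpers introduced here, and msg_svr_base and the mount points remain placeholders. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: the Sun Cluster commands from the example, collected in order.
# run(), DRY_RUN, ha_prepare and ha_finish are hypothetical helpers; the
# scrgadm and scswitch invocations are the ones shown in the steps above.
DRY_RUN=${DRY_RUN:-1}      # 1 = only print the commands (default)
RG=MAIL-RG                 # resource group
NODES=mail-1,mail-2        # physical host names
LH=budgie                  # HA logical host name
PATHS=disk_sys_mount_point-1,disk_sys_mount_point-2   # placeholders
BASE=msg_svr_base          # placeholder for the real installation path

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Resource types, resource group, logical host name and storage resource:
# run before installing and configuring Messaging Server.
ha_prepare() {
    run scrgadm -a -t SUNW.HAStorage &&
    run scrgadm -a -t SUNW.ims &&
    run scrgadm -a -g "$RG" -h "$NODES" &&
    run scrgadm -a -L -g "$RG" -l "$LH" &&
    run scswitch -Z -g "$RG" &&
    run scrgadm -a -j disk-rs -g "$RG" -t SUNW.HAStorage \
        -x "ServicePaths=$PATHS"
}

# Messaging Server resource: run after Messaging Server has been installed
# and configured and ha_ip_config has been run.
ha_finish() {
    run scrgadm -a -j mail-rs -t SUNW.ims -g "$RG" \
        -x "IMS_serverroot=$BASE" \
        -y "Resource_dependencies=disk-rs,$LH" &&
    run scswitch -e -j mail-rs
}
```

Calling ha_prepare or ha_finish with DRY_RUN left at its default prints one command per line, which can be compared against the steps above. The split into two functions reflects the fact that Messaging Server is installed between creating the storage resource and creating the Messaging Server resource.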
Enabling HAStoragePlus
SUNW.HAStoragePlus is a resource type that can be used to make locally mounted file systems highly available within a Sun Cluster environment. Any file system resident on a Sun Cluster global device group can be used with HAStoragePlus. Unlike a globally mounted file system used with HAStorage, a file system under HAStoragePlus is available on only one cluster node at any given time. These locally mounted file systems can only be used in failover mode and in failover resource groups. HAStoragePlus offers FFS (failover file system), in contrast to HAStorage’s GFS (global file system).
HAStoragePlus has a number of benefits:
- HAStoragePlus bypasses the global file service layer completely. For disk-IO intensive data services, this leads to a significant performance increase.
- HAStoragePlus can work with any file system (like UFS, VxFS, and so forth), even those that might not work with the global file service layer. If a file system is supported by the Solaris operating system, it will work with HAStoragePlus.
For more information on HAStoragePlus, read the Sun Cluster 3.1 Data Service Planning and Administration Guide.
To enable HAStoragePlus on your cluster:
- Disable your messaging and storage resources.
# scswitch -n -j mail-rs
# scswitch -n -j disk-rs
- Remove your messaging and storage resources.
# scrgadm -r -j mail-rs
# scrgadm -r -j disk-rs
- Add the resource type SUNW.HAStoragePlus.
# scrgadm -a -t SUNW.HAStoragePlus
- Create your disk resource and resource dependencies with HAStoragePlus.
HA Storage Resource
# scrgadm -a -j disk-rs -g MAIL-RG \
-t SUNW.HAStoragePlus \
-x FileSystemMountPoints=file_sys_mount_point-1
Messaging Server Resource
# scrgadm -a -j mail-rs -t SUNW.ims -g MAIL-RG \
-x IMS_serverroot=msg_svr_base \
-y Resource_dependencies=disk-rs,budgie
- Remove the term global from the mount options in the /etc/vfstab file, and set the mount at boot field to no. For more information, refer to your Sun Cluster 3.1 documentation.
Before the vfstab file is enabled with HAStoragePlus, you may first need to unmount the file systems that are currently global file systems. You can then enable the vfstab file with HAStoragePlus and remount the file systems.
- Start your cluster server.
# scswitch -Z -g MAIL-RG
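The /etc/vfstab change in the procedure above can be previewed with a small filter before the file is edited by hand. The following is a minimal sketch, assuming the standard seven-field Solaris vfstab layout; localize_vfstab is a hypothetical helper, and it writes the adjusted lines to standard output rather than modifying /etc/vfstab.

```shell
#!/bin/sh
# Sketch: preview vfstab entries adjusted for use with HAStoragePlus.
# localize_vfstab is a hypothetical helper; it reads vfstab lines on stdin
# and writes the result to stdout (it does not edit /etc/vfstab).
# Assumed field layout: device-to-mount, device-to-fsck, mount-point,
# FS-type, fsck-pass, mount-at-boot, mount-options.
localize_vfstab() {
    awk '
    /^#/ || NF < 7 { print; next }    # keep comments and short lines as-is
    {
        n = split($7, opts, ",")
        kept = ""
        found = 0
        for (i = 1; i <= n; i++) {
            if (opts[i] == "global") { found = 1; continue }
            if (kept == "") kept = opts[i]
            else kept = kept "," opts[i]
        }
        if (!found) { print; next }   # entries without global are untouched
        if (kept == "") kept = "-"    # a dash stands for no mount options
        $6 = "no"                     # mount at boot: no
        $7 = kept
        print
    }'
}

# Example call:
# localize_vfstab < /etc/vfstab > /tmp/vfstab.preview
```

Only entries that carry the global option are rewritten; those lines are emitted with single spaces between fields, and every other line passes through unchanged.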
Binding IP Addresses on a Server
If you are using the Symmetric or N + 1 high availability models, there are some additional things you should be aware of during configuration in order to prepare the Sun Cluster Server for Messaging Server.
Messaging Server must bind to the correct IP address on the server where it runs. This is required for proper configuration of Messaging Server in an HA environment.
Part of configuring Messaging Server for HA involves configuring the interface address on which the Messaging Servers bind and listen for connections. By default, the servers bind to all available interface addresses. However, in an HA environment, you want the servers to bind specifically to the interface address associated with an HA logical host name.
A script is therefore provided to configure the interface address used by the servers belonging to a given Messaging Server instance. Note that the script identifies the interface address by means of the IP address which you have associated or will associate with the HA logical host name used by the servers.
The script effects the configuration changes by modifying or creating the following configuration files. For the file
msg_svr_base/config/dispatcher.cnf
it adds or changes the INTERFACE_ADDRESS option for the SMTP and SMTP Submit servers. For the file
msg_svr_base/config/job_controller.cnf
it adds or changes the INTERFACE_ADDRESS option for the Job Controller.
Finally, it sets the configutil service.listenaddr and service.http.smtphost parameters used by the POP, IMAP, and Messenger Express HTTP servers.
Note that the original configuration files, if any, are renamed to *.pre-ha.
Run the script as follows:
- Become superuser.
- Execute msg_svr_base/sbin/ha_ip_config
- The script presents the questions described below. The script may be aborted by typing control-d in response to any of the questions. Default answers to the questions will appear within square brackets, [ ]. To accept the default answer, simply press the RETURN key.
- Logical IP address: Specify the IP address assigned to the logical host name which the Messaging Server will be using. The IP address must be specified in dotted decimal form, for example, 123.456.78.90.
The logical IP address is automatically set in the configutil parameter service.http.smtphost which allows you to see which machine is running your messaging system in a cluster. For example, if you are using Messenger Express, your server will be able to determine from which mail host to send outgoing mail.
- Messaging Server Base (msg_svr_base): Specify the absolute path to the top-level directory in which Messaging Server is installed.
- Do you wish to change any of the above choices: answer “no” to accept your answers and effect the configuration change. Answer “yes” if you wish to alter your answers.
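Because the script expects the logical IP address in dotted decimal form, it can be worth checking the value before starting ha_ip_config. The following is a minimal sketch; is_dotted_decimal is a hypothetical helper and not part of Messaging Server. Note that a sample such as 123.456.78.90 only illustrates the format and would fail this check, because 456 is larger than 255.

```shell
#!/bin/sh
# Sketch: check that an address is in dotted decimal form (four fields,
# each between 0 and 255). is_dotted_decimal is a hypothetical helper.
# The body runs in a subshell so that the IFS change stays local.
is_dotted_decimal() (
    case $1 in
        "" | *[!0-9.]* | .* | *. | *..*) exit 1 ;;
    esac
    IFS=.
    set -- $1                 # split the address on dots
    [ $# -eq 4 ] || exit 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || exit 1
    done
    exit 0
)

# Example call (192.0.2.10 is a documentation address, used as a placeholder):
# is_dotted_decimal 192.0.2.10 && echo "format looks valid"
```

The check covers the format only; it does not confirm that the address is actually assigned to the HA logical host name.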
Unconfiguring High Availability
This section describes how to unconfigure high availability. To uninstall high availability, follow the instructions in your Veritas or Sun Cluster documentation.
The High Availability unconfiguration instructions differ depending on whether you are removing Veritas Cluster Server or Sun Cluster.
Unconfiguring Veritas Cluster Server
To unconfigure the high availability components for Veritas Cluster Server:
- Bring the iMS5 service group offline and disable its resources.
- Remove the dependencies between the mail resource, the logical_IP resource, and the mountshared resource.
- Bring the iMS5 service group back online so the sharedg resource is available.
- Delete all of the Veritas Cluster Server resources created during installation.
- Stop the Veritas Cluster Server and remove the following files on both nodes:
/etc/VRTSvcs/conf/config/MsgSrvTypes.cf
/opt/VRTSvcs/bin/MsgSrv/online
/opt/VRTSvcs/bin/MsgSrv/offline
/opt/VRTSvcs/bin/MsgSrv/clean
/opt/VRTSvcs/bin/MsgSrv/monitor
/opt/VRTSvcs/bin/MsgSrv/sub.pl
- Remove the Messaging Server entries from the /etc/VRTSvcs/conf/config/main.cf file on both nodes.
- Remove the /opt/VRTSvcs/bin/MsgSrv/ directory from both nodes.
Unconfiguring Messaging Server HA Support for Sun Cluster 3.x
This section describes how to undo the HA configuration for Sun Cluster. This section assumes the simple example configuration (described in the Sun Cluster Agent Installation). For other configurations, the specific commands (for example, Step 3) may be different but will otherwise follow the same logical order.
- Become the superuser.
All of the following Sun Cluster commands require that you be running as superuser.
- Bring the resource group offline.
To shut down all of the resources in the resource group, issue the command
# scswitch -F -g MAIL-RG
This shuts down all resources within the resource group (for example, the Messaging Server and the HA logical host name).
- Disable the individual resources.
Next, disable the resources one-by-one with the commands:
# scswitch -n -j mail-rs
# scswitch -n -j disk-rs
# scswitch -n -j budgie
- Remove the individual resources from the resource group.
Once the resources have been disabled, you may remove them one-by-one from the resource group with the commands:
# scrgadm -r -j mail-rs
# scrgadm -r -j disk-rs
# scrgadm -r -j budgie
- Remove the resource group.
Once all the resources have been removed from the resource group, the resource group itself may be removed with the command:
# scrgadm -r -g MAIL-RG
- Remove the resource types (optional).
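Should you need to remove the resource types from the cluster, issue the commands below (if you registered SUNW.HAStorage rather than SUNW.HAStoragePlus, name that type in the second command):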
# scrgadm -r -t SUNW.ims
# scrgadm -r -t SUNW.HAStoragePlus
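The order of the teardown matters: the resource group is taken offline, every resource is disabled, and only then are the resources and the group removed. The following is a minimal sketch of that order, assuming the resource names from the simple example; run, DRY_RUN, and ha_teardown are hypothetical helpers introduced here, and by default the script only prints the commands. The optional removal of the resource types is left out.

```shell
#!/bin/sh
# Sketch: print (or run) the teardown commands in the order described above.
# run(), DRY_RUN and ha_teardown are hypothetical helpers; the scswitch and
# scrgadm invocations are the ones shown in the steps above.
DRY_RUN=${DRY_RUN:-1}            # 1 = only print the commands (default)
RG=MAIL-RG
RESOURCES="mail-rs disk-rs budgie"

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

ha_teardown() {
    run scswitch -F -g "$RG" || return 1
    for r in $RESOURCES; do      # first pass: disable every resource
        run scswitch -n -j "$r" || return 1
    done
    for r in $RESOURCES; do      # second pass: remove the disabled resources
        run scrgadm -r -j "$r" || return 1
    done
    run scrgadm -r -g "$RG"      # finally remove the empty resource group
}
```

Using two separate passes keeps the sketch faithful to the procedure, in which all resources are disabled before any of them is removed.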