This chapter describes the procedure for configuring a failover Processing Engine system that will immediately take over processing of network traffic in the event that the primary Processing Engine system becomes unavailable. Note that the described procedure assumes that the primary Processing Engine system has been installed, configured, and is fully operational.
The procedure to configure failover Reporter and Collector systems is described in Section 9, "Configuring a Failover Reporter System" and Section 10, "Configuring a Failover Collector System".
The configuration of a secondary (or failover) Processing Engine system offers the advantage that it can seamlessly take over processing of monitored traffic in the event that the primary Processing Engine system becomes unavailable. In this way, a high level of operational reliability is achieved. The configuration of a failover Processing Engine system is shown in Figure 11-1.
Figure 11-1 Failover Processing Engine Configuration
At server level, a crossover cable connects the primary and secondary Processing Engine systems. As long as a regular "heartbeat" continues between the primary and secondary servers, the secondary server will not initiate processing of traffic. However, the secondary server will immediately take over the processing task of the primary server as soon as it detects an alteration in the "heartbeat" of the primary server. This process is referred to as failover.
Note that failback (that is, the process of restoring the RUEI installation to its original state) must be performed manually. The procedure is described in Section 11.5, "Instigating Processing Engine Failback".
In order to configure a failover Processing Engine installation, the following conditions must be met:
The primary and secondary Processing Engine systems must be directly connected via a crossover cable. In addition, both systems must be connected to a local or public network in order to connect to the Reporter, remote Collector, and database systems.
The database and Collector instances used by the RUEI installation must both be remote.
The primary and secondary Processing Engine systems must share the same storage (such as SAN or NFS). In particular, the RUEI_DATA/processor/data and RUEI_DATA/processor/sslkeys directories must be located on this shared storage.
Make the RUEI_DATA/processor/data and RUEI_DATA/processor/sslkeys directories available on a shared storage location.
Stop all processing on the primary Processing Engine system by issuing the following command as the RUEI_USER user:
project -stop
Mount the shared Processing Engine location on the primary Processing Engine system. To do so, edit the /etc/fstab file so that it is mounted at boot. For example:
10.6.5.9:/home/nfs /processing_share nfs rsize=1024,wsize=1024 0 0
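Before moving any data, it can be worth confirming that the share is actually mounted. The following is a minimal sketch, assuming the /processing_share mount point from the example above. The is_mounted function and its optional mounts-file argument are illustrative helpers and are not part of RUEI.

```shell
# is_mounted MOUNT_POINT [MOUNTS_FILE]
# Succeeds if MOUNT_POINT appears as a mount point in MOUNTS_FILE
# (defaults to /proc/mounts). Hypothetical helper, not part of RUEI.
is_mounted() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    # Field 2 of each line in /proc/mounts is the mount point.
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"
}

# Example usage (mount point taken from the fstab example):
# is_mounted /processing_share && echo "share is mounted"
```

If the check fails after editing /etc/fstab, running mount -a as the root user mounts all entries listed in the file without requiring a reboot.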
Move the existing data and sslkeys directories to the shared Processing Engine location. For example:
mv RUEI_DATA/processor/data /processing_share
mv RUEI_DATA/processor/sslkeys /processing_share
where processing_share specifies the shared location for data and SSL keys on the primary and secondary Processing Engine systems.
The installation procedure for a secondary Processing Engine system is almost identical to that of a standalone Processing Engine system. Note that the Initial Setup Wizard should not be run. Do the following:
When starting the installation procedure for the secondary Processing Engine system, ensure that the /etc/ruei.conf file is identical to that of the primary Processing Engine system.
Install the Linux operating system and Processing Engine software on the secondary Processing Engine system. The procedure to do this is described in Chapter 2, "Installing the RUEI Software". Specifically:
Follow the instructions described in Chapter 2, "Installing the RUEI Software" up to and including Section 2.7.4, "Installing the Zend Decoder".
Copy the following files from the RUEI_DATA directory on the primary Processing Engine system to the secondary Processing Engine system: cwallet.sso, ewallet.p12, sqlnet.ora, and tnsnames.ora. You should ensure that the ownerships and permissions of these files are identical on both Processing Engine systems.
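One way to compare ownerships and permissions is to print them in a fixed format on each system and compare the two listings. The following is a minimal sketch, assuming GNU stat (as found on Linux). The file_attrs function name is illustrative and is not part of RUEI.

```shell
# file_attrs DIR
# Prints "owner:group mode name" for the four copied files in DIR.
# Hypothetical helper; relies on GNU stat's -c option.
file_attrs() {
    dir=$1
    for f in cwallet.sso ewallet.p12 sqlnet.ora tnsnames.ora; do
        # %U = owner name, %G = group name, %a = permissions in octal.
        stat -c "%U:%G %a $f" "$dir/$f"
    done
}

# Example usage: run on each system, save the output, and compare with diff.
# file_attrs "$RUEI_DATA" > /tmp/attrs.txt
```

Because the listing contains only the file name and not the full path, the output from the two systems can be compared directly even if the files are inspected from different working directories.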
Follow the instructions described in steps 1-5 in Section 2.7.6, "Installation of the Reporter Software".
Follow the instructions described in Section 2.8, "Configuring the Network Interface".
If you performed the instructions described in Section 2.9, "Enabling International Fonts (Optional, but Recommended)" through Section 2.12, "Configuring Automatic Browser Redirection (Optional)" for the primary Processing Engine system, then you will need to repeat them for the secondary Processing Engine system.
Do the following:
If you have not already done so, log in to the primary Processing Engine system as the RUEI_USER user, and issue the following command to stop all processing of monitored traffic:
project -stop
Copy the .ssh directory of the RUEI_USER user on the primary Processing Engine system, created while performing the procedure described in Section 2.13, "Configuring Reporter Communication (Split-Server Setup Only)", to the secondary Processing Engine system. Note that it must be copied to the same location.
Ensure that the uid and gid settings of the RUEI_USER user are the same on both the primary and secondary Processing Engine systems. For example:
id moniforce
uid=501(moniforce) gid=502(moniforce) groups=502(moniforce)
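The comparison can be scripted by reducing the output of id on each system to its numeric values. The following is a minimal sketch, assuming id output in the format shown above. The uid_gid function and the host name secondary-host are illustrative and are not part of RUEI.

```shell
# uid_gid ID_OUTPUT
# Extracts the numeric "uid:gid" pair from one line of `id` output.
# Hypothetical helper, not part of RUEI.
uid_gid() {
    printf '%s\n' "$1" |
        sed -n 's/^uid=\([0-9]*\)[^ ]* gid=\([0-9]*\).*/\1:\2/p'
}

# Example usage (secondary-host is a placeholder host name):
# primary=$(id moniforce)
# secondary=$(ssh secondary-host id moniforce)
# [ "$(uid_gid "$primary")" = "$(uid_gid "$secondary")" ] && echo "IDs match"
```

Matching numeric IDs matter here because both systems access the same files on shared storage, where ownership is recorded by number rather than by name.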
Configure the static IP addresses on both Processing Engine systems used for the crossover cable. This can be done using a utility such as system-config-network.
Edit the /etc/fstab file so the RUEI_DATA/processor/data and RUEI_DATA/processor/sslkeys directories are mounted at boot. For example:
10.6.5.9:/home/nfs /reporter_share nfs rsize=1024,wsize=1024 0 0
where reporter_share specifies the shared location for data and SSL keys on the primary and secondary Processing Engine systems.
Replace the local data and sslkeys directories on the secondary Processing Engine system with symbolic links to the shared Processing Engine location by issuing the following commands:
rm -rf RUEI_DATA/processor/data
rm -rf RUEI_DATA/processor/sslkeys
ln -s /reporter_share/data RUEI_DATA/processor/data
ln -s /reporter_share/sslkeys RUEI_DATA/processor/sslkeys
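After creating the links, a quick check that each one points at the intended target can catch a mistyped path. The following is a minimal sketch. The link_points_to function is an illustrative helper and is not part of RUEI.

```shell
# link_points_to LINK TARGET
# Succeeds if LINK is a symbolic link whose stored target is exactly TARGET.
# Hypothetical helper, not part of RUEI.
link_points_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Example usage (assumes RUEI_DATA is set, for example by loading /etc/ruei.conf):
# link_points_to "$RUEI_DATA/processor/data" /reporter_share/data || echo "data link is wrong"
# link_points_to "$RUEI_DATA/processor/sslkeys" /reporter_share/sslkeys || echo "sslkeys link is wrong"
```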
Log in to the secondary Processing Engine system as the RUEI_USER user, and issue the following command:
project -new -fromdb UX
This creates the secondary Processing Engine's on-disk configuration files using the primary Processing Engine's database configuration.
Edit the /etc/ruei.conf file on both the primary and secondary Processing Engines to specify the virtual, primary, and standby IP addresses. For example:
export RUEI_REP_FAILOVER_PRIMARY_IP=192.168.56.201
export RUEI_REP_FAILOVER_STANDBY_IP=192.168.56.202
export RUEI_REP_FAILOVER_VIRTUAL_IP=10.11.12.23
export RUEI_REP_FAILOVER_VIRTUAL_DEV=eth0
export RUEI_REP_FAILOVER_VIRTUAL_MASK=255.255.255.0
The RUEI_REP_FAILOVER_PRIMARY_IP and RUEI_REP_FAILOVER_STANDBY_IP settings should specify the IP addresses of the crossover cable between the two Processing Engine systems. See Section 2.4.1, "The RUEI Configuration File" for an explanation of these settings. Note that the settings specified on both Processing Engine systems must be identical except for the RUEI_REP_FAILOVER_VIRTUAL_DEV setting.
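The requirement that the failover settings match on both systems can be checked by extracting them from each copy of /etc/ruei.conf and comparing the results. The following is a minimal sketch. The failover_settings function and the file name /tmp/ruei.conf.secondary are illustrative and are not part of RUEI.

```shell
# failover_settings CONF_FILE
# Prints the RUEI_REP_FAILOVER_* lines from CONF_FILE in sorted order,
# leaving out RUEI_REP_FAILOVER_VIRTUAL_DEV (the one setting allowed to differ).
# Hypothetical helper, not part of RUEI.
failover_settings() {
    grep 'RUEI_REP_FAILOVER_' "$1" |
        grep -v 'RUEI_REP_FAILOVER_VIRTUAL_DEV' |
        sort
}

# Example usage, after copying the secondary system's file to a local path
# (the path below is a placeholder):
# [ "$(failover_settings /etc/ruei.conf)" = "$(failover_settings /tmp/ruei.conf.secondary)" ] \
#     && echo "failover settings match"
```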
Issue the following command to restart processing of monitored traffic on the primary Processing Engine system:
project -start
Install the ruei-reporter-failover.sh script on both Processing Engine systems, for example in the /usr/local/sbin directory. The script is located in the RUEI zip file (see Section 2.3, "Unpacking the RUEI Software").
Add the following entry to the root user's crontab file of both the primary and secondary Processing Engine systems:
* * * * * /usr/local/sbin/ruei-reporter-failover.sh
This causes the secondary Processing Engine to send a heartbeat signal to the primary Processing Engine every 60 seconds, and to take over processing of RUEI monitored traffic in the event that the primary Processing Engine becomes unavailable.
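To confirm that the entry is present and not commented out, the crontab listing can be filtered for the script name. The following is a minimal sketch. The has_failover_cron function is an illustrative helper and is not part of RUEI.

```shell
# has_failover_cron
# Reads a crontab listing on standard input and succeeds if it contains
# an active (not commented-out) line that runs ruei-reporter-failover.sh.
# Hypothetical helper, not part of RUEI.
has_failover_cron() {
    grep -Eq '^[[:space:]]*[^#[:space:]].*ruei-reporter-failover\.sh'
}

# Example usage, as the root user on each Processing Engine system:
# crontab -l | has_failover_cron && echo "failover cron entry found"
```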
Wait at least 60 seconds.
Ensure that all user access to the Reporter GUI is via the specified virtual IP address. This is necessary to ensure automatic failover to the secondary Processing Engine system in the event that the primary Processing Engine system becomes unavailable.
Check the RUEI_DATA/processor/log/failover.log file on both Processing Engine systems. These files contain the results of the "ping" commands. Ensure that there are no error messages (for example, about unspecified failover configuration settings).
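The log check can be done with a simple text search. The following is a minimal sketch. The exact format of failover.log is not described here, so the assumption that problems are reported on lines containing the word "error" is a guess, and the log_errors function is an illustrative helper that is not part of RUEI.

```shell
# log_errors LOG_FILE
# Prints every line of LOG_FILE that contains "error" (case-insensitive).
# Succeeds only if no such line is found.
# Hypothetical helper; assumes errors are logged with the word "error".
log_errors() {
    if grep -i 'error' "$1"; then
        return 1
    fi
    return 0
}

# Example usage (assumes RUEI_DATA is set, for example by loading /etc/ruei.conf):
# log_errors "$RUEI_DATA/processor/log/failover.log" && echo "no errors logged"
```

A clean result from this search does not replace reading the log itself, since messages may be worded differently from what the search expects.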
Check the output of the /sbin/ifconfig command on the primary Processing Engine to ensure that the virtual IP address has been correctly configured. For example:
/sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 08:00:27:F7:B0:14
          inet addr:192.168.56.201  Bcast:192.168.56.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fef7:b014/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:80 errors:0 dropped:0 overruns:0 frame:0
          TX packets:311 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:12793 (12.4 KiB)  TX bytes:26268 (25.6 KiB)

eth0:0    Link encap:Ethernet  HWaddr 08:00:27:F7:B0:14
          inet addr:10.11.12.23  Bcast:192.168.56.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
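The same check can be scripted by searching the ifconfig output for the virtual IP address. The following is a minimal sketch, assuming the older "inet addr:" output format shown in the example (newer versions of ifconfig print addresses differently). The has_inet_addr function is an illustrative helper and is not part of RUEI.

```shell
# has_inet_addr IP
# Reads ifconfig output on standard input and succeeds if some interface
# reports "inet addr:IP". The trailing space in the pattern prevents a
# match on a longer address that merely starts with IP.
# Hypothetical helper, not part of RUEI.
has_inet_addr() {
    grep -Fq "inet addr:$1 "
}

# Example usage, after loading /etc/ruei.conf:
# /sbin/ifconfig | has_inet_addr "$RUEI_REP_FAILOVER_VIRTUAL_IP" && echo "virtual IP is up"
```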
Shut down the primary Processing Engine system, and verify that the secondary Processing Engine begins processing monitored traffic. A warning that the primary system is unreachable and that the secondary system is being activated is reported in the Event log. Note that after doing so, you must perform a failback to return your RUEI installation to its original state.
Failback to the primary Processing Engine system must be performed manually in order to return your RUEI installation to its original state. Do the following:
Load your global RUEI configuration settings using the following command as the root user:
. /etc/ruei.conf
Ensure that the heartbeat mechanism between the primary and secondary Processing Engine systems is functioning correctly. To do so, verify that they can 'ping' each other on the RUEI_REP_FAILOVER_PRIMARY_IP and RUEI_REP_FAILOVER_STANDBY_IP IP addresses.
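The reachability check can be sketched as a small function that pings both crossover addresses. It assumes that the RUEI_REP_FAILOVER_PRIMARY_IP and RUEI_REP_FAILOVER_STANDBY_IP variables have been loaded from /etc/ruei.conf, and that the Linux ping command with its -c (count) and -W (timeout) options is available. The check_heartbeat_link function and its optional argument for substituting a different ping command are illustrative and are not part of RUEI.

```shell
# check_heartbeat_link [PING_CMD]
# Pings the primary and standby crossover addresses once each.
# PING_CMD defaults to "ping -c 1 -W 2" and can be overridden.
# Hypothetical helper, not part of RUEI.
check_heartbeat_link() {
    ping_cmd=${1:-ping -c 1 -W 2}
    for ip in "$RUEI_REP_FAILOVER_PRIMARY_IP" "$RUEI_REP_FAILOVER_STANDBY_IP"; do
        # $ping_cmd is deliberately unquoted so that its options are split into words.
        $ping_cmd "$ip" >/dev/null 2>&1 || { echo "unreachable: $ip"; return 1; }
    done
    echo "both addresses reachable"
}

# Example usage:
# . /etc/ruei.conf
# check_heartbeat_link
```

Running the function on both systems covers both directions of the link.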
To instigate the failback, remove the active-failover-server file, and shut down the virtual interface on the secondary server by issuing the following commands:
rm $RUEI_DATA/processor/data/active-failover-server
ifconfig $RUEI_REP_FAILOVER_VIRTUAL_DEV:0 down
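After issuing these commands, it can be useful to confirm that the file has actually been removed. The following is a minimal sketch that only tests for the presence of the file named above and makes no other assumption about how RUEI uses it. The failover_marker_present function is an illustrative helper and is not part of RUEI.

```shell
# failover_marker_present DATA_ROOT
# Succeeds if DATA_ROOT/processor/data/active-failover-server exists.
# Hypothetical helper, not part of RUEI.
failover_marker_present() {
    [ -e "$1/processor/data/active-failover-server" ]
}

# Example usage, after loading /etc/ruei.conf:
# failover_marker_present "$RUEI_DATA" && echo "file is still present"
```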