Oracle® Real User Experience Insight Installation Guide
11g Release 1 for Linux x86-64

Part Number E22308-03

8 Configuring a Failover Collector System

This chapter describes the procedure for configuring a failover remote Collector system that will take over monitoring of network traffic in the event that the primary Collector system becomes unavailable. Note that the described procedure assumes that the primary Collector system has been installed, configured, and is fully operational.

The procedure to configure a failover Reporter system is described in Chapter 7, "Configuring a Failover Reporter System".

Introduction

The configuration of a secondary (or failover) Collector system offers the advantage that it can seamlessly take over monitoring of network traffic in the event that the primary Collector system becomes unavailable. In this way, a high level of operational reliability is achieved. Note that this facility is only available for remote Collectors. The configuration of a failover Collector system is shown in Figure 8-1.

Figure 8-1 Failover Collector Configuration

At server level, a crossover network cable connects the primary and secondary Collector systems. As long as a regular "heartbeat" continues between the primary and secondary servers, the secondary server will not initiate monitoring of the network traffic. However, the secondary server will take over the monitoring task of the primary Collector as soon as it detects a failure in the "heartbeat" of the primary server. This process is referred to as failover. The secondary Collector will take over the primary Collector's virtual IP address, and it is through this address that the Reporter system will communicate with it.

Note that failback (that is, the process of restoring the primary Collector to its original state) must be performed manually. The procedure is described in Initiating Collector Failback.
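
The failover decision described above can be sketched in a few lines of shell. This is only an illustration of the logic and is not the ruei-collector-failover.sh script shipped with RUEI. The function names heartbeat_ok and secondary_action are hypothetical, and the use of ping as the heartbeat probe is an assumption made for the example.

```shell
#!/bin/sh
# Illustrative sketch only; this is not the ruei-collector-failover.sh script.

# heartbeat_ok PEER_IP: succeeds if the peer answers one ping within 2 seconds.
# Using ping as the heartbeat probe is an assumption made for this example.
heartbeat_ok() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# secondary_action HEARTBEAT: prints what the secondary Collector does next.
# HEARTBEAT is "ok" while the primary still responds, anything else otherwise.
secondary_action() {
    if [ "$1" = "ok" ]; then
        echo "remain-passive"   # primary is alive: do not monitor traffic
    else
        echo "take-over"        # heartbeat lost: claim the virtual IP and monitor
    fi
}
```

On the secondary system, the two functions could be combined as `if heartbeat_ok "$RUEI_COL_FAILOVER_PRIMARY_IP"; then secondary_action ok; else secondary_action lost; fi`. Note that nothing in this sketch models failback, which remains a manual procedure.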

Prerequisites

In order to configure a failover Collector installation, the following conditions must be met:

Important

When configuring a failover Collector system, be aware of the following:

Installing the Secondary Collector

The installation procedure for a secondary Collector system is identical to that of a remote Collector system.

  1. Install the Linux operating system and the RUEI Collector software on both Collector systems. The procedure to do so is described in Prerequisites.

  2. When starting the installation procedure for the secondary Collector system, ensure that the /etc/ruei.conf file is identical to that of the primary Collector system.
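
One way to confirm that the two configuration files really are identical is a byte-for-byte comparison. The sketch below assumes that a copy of the primary system's /etc/ruei.conf file has already been transferred to the secondary system (for example, with scp). The function name same_config and the path of the copied file are hypothetical.

```shell
#!/bin/sh
# same_config FILE_A FILE_B: succeeds only if both files are byte-identical.
# cmp -s compares silently and reports the result through its exit status.
same_config() {
    cmp -s "$1" "$2"
}

# Example use on the secondary Collector (the /tmp path is hypothetical):
#   same_config /etc/ruei.conf /tmp/ruei.conf.primary \
#       && echo "configuration files match" \
#       || echo "configuration files differ"
```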

Configuring the Secondary Collector

Do the following:

  1. Copy the .ssh directory (created when following the procedure described in Configuring Reporter Communication (Split-Server Setup Only)) on the primary Collector to the secondary Collector. Note that it must be copied to the same location.

  2. On the primary Collector system, issue the following commands to bring up the virtual IP address, so that the "host keys" for the Collector can then be added to the global known_hosts file on the Reporter system:

    . /etc/ruei.conf
    ifconfig ${RUEI_COL_FAILOVER_VIRTUAL_DEV}:0 $RUEI_COL_FAILOVER_VIRTUAL_IP \
    netmask $RUEI_COL_FAILOVER_VIRTUAL_MASK up
    sleep 2
    arping -c 3 -A -I $RUEI_COL_FAILOVER_VIRTUAL_DEV $RUEI_COL_FAILOVER_VIRTUAL_IP
    

    On the Reporter system, use an arp -a or ping command to check that you can reach the virtual IP address on the primary Collector system.

    Then, issue the following command:

    ssh-keyscan -t rsa,dsa Collector-virt-ip-address >> /etc/ssh/ssh_known_hosts
    

    As the RUEI_USER user, ensure that the virtual Collector IP address is not specified in the ~/.ssh/known_hosts file.

    Attempt to establish an SSH connection as the RUEI_USER user from the Reporter system to the primary Collector system. Note that you should not receive any warning or prompt about the host key, and you should be logged in automatically.

    On the primary Collector system, bring down the virtual IP address using the following command:

    ifconfig ${RUEI_COL_FAILOVER_VIRTUAL_DEV}:0 $RUEI_COL_FAILOVER_VIRTUAL_IP \
    netmask $RUEI_COL_FAILOVER_VIRTUAL_MASK down
    

    Repeat the above procedure for the secondary Collector system. Upon completion, four keys should be specified in the /etc/ssh/ssh_known_hosts file for the virtual IP address.

  3. Ensure that the uid and gid settings of the RUEI_USER user are the same on both the primary and secondary Collector systems. For example:

    $ id moniforce
    uid=501(moniforce) gid=502(moniforce) groups=502(moniforce)
    

    Important

    If you need to change the UID of the RUEI_USER user on an operational Collector system, you should:

    • Issue the following commands as the RUEI_USER user:

      appsensor stop wg
      sslloadkeys -f
      

      Note that you should enter yes (written in full) when prompted.

    • Change the user:group ownership of all files and directories under /var/opt/ruei/collector to the new UID.

    • Issue the following command as the root user:

      /etc/init.d/crond restart
      
  4. Configure the static IP addresses on both Collector systems used for the crossover cable. This can be done using a utility such as system-config-network.

  5. Mount the shared storage on the $RUEI_DATA/collector directory, and edit the /etc/fstab file so that it is mounted at boot. For example:

    10.6.5.9:/home/nfs /var/opt/ruei/collector/data nfs rsize=1024,wsize=1024  0 0
    

    Important:

    Note that if the Collector is already operational before this step, and the $RUEI_DATA/collector directory is not shared, the existing directory content must be copied to the mount point specified above. Security Officers should be aware that this copying process includes server SSL keys.

    Alternatively, if your shared storage does not provide sufficient bandwidth to keep up with the storage of replay data, you can symlink the REPLAY directories to a local location instead. In this case, only the HTTP log files and logs will be written to the shared disk. However, be aware that if you specify this configuration, replay data recorded before failover is initiated will be lost, and only sessions after the failover are accessible. In addition, these links will be reset to factory defaults and, therefore, the directories do not currently exist in the initial Collector setup.

  6. Edit the /etc/ruei.conf file on both the primary and secondary Collector systems to specify the virtual, primary, and standby IP addresses. For example:

    RUEI_COL_FAILOVER_PRIMARY_IP=192.168.56.201 # crossover cable primary
    RUEI_COL_FAILOVER_STANDBY_IP=192.168.56.202 # crossover cable secondary
    RUEI_COL_FAILOVER_VIRTUAL_IP=10.11.12.23    # (virtual) IP to access Collector
    RUEI_COL_FAILOVER_VIRTUAL_DEV=eth0
    RUEI_COL_FAILOVER_VIRTUAL_MASK=255.255.255.0
    

    The RUEI_COL_FAILOVER_PRIMARY_IP and RUEI_COL_FAILOVER_STANDBY_IP settings should specify the IP addresses of the crossover cable between the two Collector systems. See The RUEI Configuration File for an explanation of these settings. Note that the settings specified on both Collector systems must be identical.

  7. Ensure that all communication between the Reporter and the Collector is via the specified virtual IP address. This is necessary to ensure automatic failover to the secondary Collector system in the event that the primary Collector system becomes unavailable. Note that this may require you to reconfigure existing Collector systems.

  8. Install the ruei-collector-failover.sh script on both Collector systems (for example, in the /usr/local/bin directory). The script is located in the RUEI zip file (see Unpacking the RUEI Software).

  9. Add the following entry to the root user's crontab file of both the primary and secondary Collector systems:

    * * * * * /usr/local/bin/ruei-collector-failover.sh
    

    This causes the secondary Collector to send a heartbeat signal to the primary Collector every 60 seconds, and to take over processing of RUEI monitored traffic in the event that the primary Collector becomes unavailable.

    Wait at least 60 seconds.

  10. Check the output of the /sbin/ifconfig command on the primary Collector to ensure that the virtual IP address has been correctly configured. For example:

    $ /sbin/ifconfig
    eth0      Link encap:Ethernet  HWaddr 08:00:27:F7:B0:14
              inet addr:192.168.56.201  Bcast:192.168.56.255  Mask:255.255.255.0
              inet6 addr: fe80::a00:27ff:fef7:b014/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:80 errors:0 dropped:0 overruns:0 frame:0
              TX packets:311 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:12793 (12.4 KiB)  TX bytes:26268 (25.6 KiB)
    eth0:0    Link encap:Ethernet  HWaddr 08:00:27:F7:B0:14
              inet addr:10.11.12.23  Bcast:192.168.56.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    
  11. Unregister the primary remote Collector with the Reporter, and re-register it using the virtual IP address.

  12. Shut down the primary Collector system, and verify that the secondary Collector begins processing monitored traffic. A warning that the primary system is unreachable and that the secondary system is being activated should be reported in the event log. Note that after doing so, you must perform a failback to return your RUEI installation to its original state.
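
Because the procedure above depends on the same five failover settings being present on both Collector systems, it can be useful to check the configuration file before testing failover. The following is a minimal sketch, not part of the RUEI distribution. The function name check_failover_conf and the FAILOVER_VARS helper variable are hypothetical; the setting names are those shown in step 6.

```shell
#!/bin/sh
# Settings that step 6 requires in /etc/ruei.conf on both Collector systems.
FAILOVER_VARS="RUEI_COL_FAILOVER_PRIMARY_IP RUEI_COL_FAILOVER_STANDBY_IP
RUEI_COL_FAILOVER_VIRTUAL_IP RUEI_COL_FAILOVER_VIRTUAL_DEV
RUEI_COL_FAILOVER_VIRTUAL_MASK"

# check_failover_conf FILE: reports every failover setting that FILE leaves
# unset or empty. Runs in a subshell so the caller's environment is untouched.
# Exit status: 0 = all settings present, 1 = something missing, 2 = unreadable.
check_failover_conf() (
    conf=$1
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; exit 2; }
    unset $FAILOVER_VARS          # ignore values inherited from the environment
    . "$conf"                     # pass a path containing a slash, e.g. /etc/ruei.conf
    missing=0
    for name in $FAILOVER_VARS; do
        eval "value=\${$name:-}"  # indirect lookup of the variable called $name
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    exit "$missing"
)
```

Running `check_failover_conf /etc/ruei.conf` on each Collector system lists any setting that still needs to be specified. It does not verify that the values are identical on both systems; that comparison must still be made separately.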

Initiating Collector Failback

Failback to the primary Collector system must be performed manually in order to return your RUEI installation to its original state. Do the following:

  1. On the primary Collector system, issue the following commands:

    . /etc/ruei.conf 
    echo $RUEI_COL_FAILOVER_PRIMARY_IP > /var/opt/ruei/collector/active-failover-server
    
  2. On the secondary Collector system, issue the following commands:

    . /etc/ruei.conf
    ifconfig ${RUEI_COL_FAILOVER_VIRTUAL_DEV}:0 $RUEI_COL_FAILOVER_VIRTUAL_IP \
    netmask $RUEI_COL_FAILOVER_VIRTUAL_MASK down
    
  3. On the primary Collector system (with the /etc/ruei.conf file still loaded), issue the following commands:

    ifconfig ${RUEI_COL_FAILOVER_VIRTUAL_DEV}:0 $RUEI_COL_FAILOVER_VIRTUAL_IP \
    netmask $RUEI_COL_FAILOVER_VIRTUAL_MASK up
    sleep 2
    arping -c 3 -A -I $RUEI_COL_FAILOVER_VIRTUAL_DEV $RUEI_COL_FAILOVER_VIRTUAL_IP
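
Since failback runs nearly identical ifconfig commands on two different systems, it may help to preview the exact command before issuing it as the root user. The sketch below only prints the command and never runs it. The function name virtual_ip_cmd is hypothetical, and the sketch assumes that the /etc/ruei.conf file has already been loaded so that the RUEI_COL_FAILOVER_* settings are defined.

```shell
#!/bin/sh
# virtual_ip_cmd STATE: prints (does not run) the ifconfig command that brings
# the virtual IP address "up" or "down", built from the loaded ruei.conf values.
virtual_ip_cmd() {
    case $1 in
        up|down) ;;
        *) echo "usage: virtual_ip_cmd up|down" >&2; return 1 ;;
    esac
    echo "ifconfig ${RUEI_COL_FAILOVER_VIRTUAL_DEV}:0 ${RUEI_COL_FAILOVER_VIRTUAL_IP} netmask ${RUEI_COL_FAILOVER_VIRTUAL_MASK} $1"
}

# Example use after ". /etc/ruei.conf":
#   virtual_ip_cmd down    # preview the command for the secondary Collector
#   virtual_ip_cmd up      # preview the command for the primary Collector
```

With the example settings shown earlier in this chapter, `virtual_ip_cmd down` prints `ifconfig eth0:0 10.11.12.23 netmask 255.255.255.0 down`.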