4 Configuring Oracle Traffic Director for High Availability
High availability is the ability of a system or device to be available when it is needed. A high availability architecture ensures that users can access a system without loss of service. Deploying a high availability system minimizes the time when the system is down, or unavailable, and maximizes the time when it is running, or available.
This chapter describes how to configure Oracle Traffic Director for failover across Oracle Traffic Director instances.
Overview
The high availability solution for Oracle Traffic Director provides the ability to create multiple instances and then configure IP failover for a given virtual IP (VIP) between those instances.
In Oracle Traffic Director, the high availability solution provides redundancy for a given virtual IP address (VIP) by configuring IP failover between two or more instances. The IP failover is configured as a failover group, which is a grouping of a VIP, an instance designated as the primary, and one or more instances designated as backups. The failover is transparent both to the client that sends the traffic and to the Oracle Traffic Director instance that receives it.
Failover Configuration Modes
You can configure the Oracle Traffic Director instances in a failover group to work in the following modes:
- Active-passive: A single VIP address is used. One instance in the failover group is designated as the primary node. If the primary node fails, requests are routed through the same VIP to a backup instance.
- Active-active: A single VIP address is used. One of the nodes is the master node, and the other nodes are backup nodes. Incoming requests to the VIP are distributed among the Oracle Traffic Director instances. If the master node fails, the backup node with the highest priority is chosen as the next master node.
Failover in Active-Passive Mode
In the active-passive setup described here, one node in the failover group is redundant at any point in time.
Oracle Traffic Director provides support for failover between the instances in a failover group by using an implementation of the Virtual Routing Redundancy Protocol (VRRP), such as keepalived for Linux and vrrpd (native) for Solaris. This mode of failover is supported only on Solaris and Linux platforms.
Keepalived provides other features such as load balancing and health checks for origin servers, but Oracle Traffic Director uses only the VRRP subsystem. For more information about Keepalived, see http://www.keepalived.org.
VRRP specifies how routers can fail over a VIP address from one node to another if the first node becomes unavailable for any reason. The IP failover is implemented by a router process running on each of the nodes. In a two-node failover group, the router process on the node to which the VIP is currently assigned is called the master. The master continuously advertises its presence to the router process on the second node.
Note:
On a Linux host that has an Oracle Traffic Director instance configured as a member of a failover group, Oracle Traffic Director should be the only consumer of Keepalived. Otherwise, when Oracle Traffic Director starts and stops the keepalived daemon for effecting failovers during instance downtime, other services using keepalived on the same host can be disrupted.
If the node on which the master router process is running fails, the router process on the second node waits for about three seconds before deciding that the master is down, and then assumes the role of the master by assigning the VIP to its node. When the first node is online again, the router process on that node takes over the master role. For more information about VRRP, see RFC 5798 at http://datatracker.ietf.org/doc/rfc5798.
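The fragment below is a minimal sketch of the kind of keepalived VRRP configuration that implements such a two-node failover group. It is illustrative only: Oracle Traffic Director generates and manages the actual configuration, and the interface name eth0, the router ID, the priorities, and the VIP 10.0.0.100 are assumed values.

```
# /etc/keepalived/keepalived.conf -- illustrative sketch only; Oracle Traffic
# Director manages the real file. eth0, the router ID, the priorities, and
# the VIP are assumed values.
vrrp_instance otd_failover_group {
    state MASTER              # the backup node uses BACKUP here
    interface eth0            # NIC on the subnet shared by both nodes
    virtual_router_id 51      # must match on both nodes
    priority 250              # the backup node uses a lower value, e.g. 100
    advert_int 1              # advertise once per second; roughly three missed
                              # advertisements trigger failover of the VIP
    virtual_ipaddress {
        10.0.0.100            # the failover group VIP
    }
}
```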
Failover in Active-Active Mode
Oracle Traffic Director provides support for failover between instances deployed on nodes that are in the same subnet. One of the nodes is chosen as the active router node and the remaining nodes are the backup router nodes. The traffic is distributed among all the Oracle Traffic Director instances.
This mode of failover is supported only on the Linux platform. The solution uses Keepalived 1.2.13 and Linux Virtual Server (LVS) to perform load balancing and failover tasks. In addition, the following packages are required:
- ipvsadm (1.26 or later)
- iptables (1.4.7 or later)
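As a quick sanity check before configuring active-active failover, you can confirm that these packages are installed at suitable versions. The commands below are a sketch that assumes an RPM-based Linux distribution.

```
# Confirm that the LVS and firewall tooling is installed (RPM-based systems).
rpm -q ipvsadm iptables

# iptables reports its own version; it should be 1.4.7 or later.
iptables --version
```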
Initially, all the nodes are configured as backup nodes and are assigned different priorities. The node with the highest priority is chosen as the master and the other nodes remain backups. If the master node fails, the backup node with the highest priority is chosen as the next master node. The Keepalived master node is also the master node for LVS.
Keepalived does the following (the commands after this list show one way to observe these actions on the master node):
- Plumbs the virtual IP on the master
- Sends out gratuitous ARP messages for the VIP
- Configures LVS (ipvsadm)
- Performs health checks for Keepalived on the other nodes
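To observe these actions on the node that currently holds the master role, you can inspect the network interface and the LVS table directly. This is an illustrative sketch; eth0 and the VIP 10.0.0.100 are assumed values.

```
# The VIP (for example 10.0.0.100) is plumbed on the interface only on the
# node that is currently the master.
ip addr show dev eth0

# List the LVS virtual-server table that Keepalived configured through ipvsadm.
ipvsadm -L -n
```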
LVS does the following:
- Balances the load across the Oracle Traffic Director instances
- Shares existing connection information with the backup nodes through multicasting
- Checks the integrity of the services on each Oracle Traffic Director instance. If an Oracle Traffic Director instance fails, it is removed from the LVS configuration; when it comes back online, it is added again.
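The keepalived.conf fragment below is a minimal sketch of how the LVS side of such an active-active setup is typically expressed: a virtual_server entry for the VIP with one real_server entry per Oracle Traffic Director instance, each protected by a TCP health check. The VIP 10.0.0.100, port 80, and the real-server addresses are illustrative assumptions; Oracle Traffic Director generates the actual configuration.

```
# Illustrative LVS configuration managed through Keepalived; all addresses
# and the port are assumed values.
virtual_server 10.0.0.100 80 {
    delay_loop 6          # seconds between health checks
    lb_algo rr            # round-robin across the OTD instances
    lb_kind DR            # direct-routing LVS forwarding mode
    protocol TCP

    real_server 10.0.0.11 80 {    # OTD instance on the first node
        TCP_CHECK {
            connect_timeout 3     # failed checks remove the instance from LVS
        }
    }
    real_server 10.0.0.12 80 {    # OTD instance on the second node
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
```

When an instance fails its health check, it is dropped from the virtual server's pool and is added back once it responds again, which matches the behavior described in the list above.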
Preparing your System for High Availability
The typical deployment topology for Oracle Traffic Director is a three-node installation in which all nodes are part of a single domain.
- The WebLogic Server administration machine, on which Oracle Traffic Director is collocated with a WebLogic Server and JRF installation. This machine hosts the WebLogic Server Administration Server.
- Two other machines with standalone Oracle Traffic Director installations, which contain only a subset of the WebLogic Server binaries and host the managed Oracle Traffic Director domains.
- The two Oracle Traffic Director instances typically provide high availability for a virtual IP (VIP) by forming a failover group.
Prerequisites
Operating System | Failover Mode | System Requirements
---|---|---
Linux | Active-Active | Keepalived 1.2.13, Linux Virtual Server (LVS), ipvsadm (1.26 or later), iptables (1.4.7 or later)
Linux | Active-Passive | Keepalived
Solaris | Active-Active | Not Supported
Solaris | Active-Passive | vrrpd (native)
Windows | Active-Active | Not Supported
Windows | Active-Passive | Not Supported
AIX | Active-Active | Not Supported
AIX | Active-Passive | Not Supported
Notations
The following tokens are used throughout this chapter; each token always refers to the same data.
Token | Description
---|---
OTD_HOST_1 | Host name of the machine on which the primary instance of the failover group is running.
OTD_HOST_2 | Host name of the machine on which the backup instance of the failover group is running.
 | Name specified while creating the machine corresponding to OTD_HOST_1 in the WebLogic Server console.
 | Name specified while creating the machine corresponding to OTD_HOST_2 in the WebLogic Server console.
VIP | The virtual IP (on the same subnet as OTD_HOST_1 and OTD_HOST_2) used to create a failover group.
 | Host name of the server on which the WebLogic Server Administration Server runs.