2.9 Configuring Fast Connection Failover

To benefit fully from fast instance and database failover and switchover with Oracle RAC and Data Guard, configure Fast Connection Failover. When a database service becomes unavailable, Fast Connection Failover enables clients (mid-tier applications or any program that connects directly to a database) to fail over quickly and seamlessly to an available database service.

Because client failover features have evolved over several Oracle Database releases, the time required for clients to respond to different outages varies by release. The time required for failover in certain cases is directly related to TCP/IP network timeouts.

Table 2-9 shows typical wait times when using client failover features.

Table 2-9 Typical Wait Times for Client Failover

Oracle Database Release                     Client Type  Site Failure  RAC Node Failure  Non-RAC Instance Failure  RAC Instance Failure
------------------------------------------  -----------  ------------  ----------------  ------------------------  --------------------
8.0, 8i, 9i                                 All          TCP timeout   TCP timeout       Seconds to minutes [1]    Seconds
10g Release 1 (10.1)                        JDBC         TCP timeout   Seconds           Seconds to minutes [1]    Seconds
10g Release 1 (10.1)                        OCI          TCP timeout   TCP timeout       Seconds to minutes [1]    Seconds
10g Release 2 (10.2), 11g Release 1 (11.1)  JDBC         Seconds       Seconds           Seconds                   Seconds
10g Release 2 (10.2), 11g Release 1 (11.1)  OCI          Seconds [2]   Seconds           Seconds                   Seconds


Footnote 1 The wait time for a non-RAC instance failure is determined by how long it takes to activate the standby database as the primary database and for the client to establish a connection.

Footnote 2 Excludes ODP.NET clients, which experience an outage equal to the TCP timeout.

Note:

Clients that cannot take advantage of FAN can achieve fast failover by configuring timeouts and application retries, as follows:
  • Configure timeouts using either operating system TCP parameters or custom application timeouts.

  • Configure applications to automatically attempt to reconnect in the event that an exception is generated from a timeout.
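
The retry behavior described in this note can be sketched as follows. This is a minimal illustration, not an Oracle API: connect_with_retry, its parameters, and the connect callable are hypothetical names, and the sketch assumes that the driver in use raises TimeoutError or ConnectionError when a connection attempt times out.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are exhausted.

    `connect` is a hypothetical zero-argument callable that opens a
    database connection and raises TimeoutError or ConnectionError
    when the attempt fails.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Back off exponentially so that retries do not flood a
                # service that is still recovering.
                sleep(base_delay * (2 ** attempt))
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

The timeout itself is set separately, either in the operating system TCP parameters or in the connect callable. The retry loop only decides how often and how long to keep trying after a timeout is reported.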

Use the best practices in the following sections to configure client failover.

See Also:

The MAA white paper "Client Failover Best Practices for Highly Available Oracle Databases" at http://www.oracle.com/technology/deploy/availability/pdf/MAA_WP_10gR2_ClientFailoverBestPractices.pdf

2.9.1 Configure JDBC and OCI Clients for Failover

Delays caused by TCP/IP network timeouts can be overcome for both JDBC clients and OCI clients by using Fast Connection Failover. With Fast Connection Failover, a trigger on the standby database is invoked by the DB_ROLE_CHANGE system event when the standby database transitions to the primary role. The trigger calls a publisher program that notifies clients that the database service is available on the new primary, breaking stalled clients out of their TCP timeout.

For JDBC clients, follow these best practices:

  1. Enable Fast Connection Failover for JDBC clients by setting the DataSource property FastConnectionFailoverEnabled to TRUE.

  2. Configure JDBC clients to use a connect descriptor whose address list contains the VIP address of each node in the cluster and that connects to an existing service.

  3. Configure a remote Oracle Notification Service (ONS) subscription on the JDBC client so that an ONS daemon is not required on the client.
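
As an illustration of step 2, the following sketch assembles a connect descriptor whose address list names each node VIP. The helper functions, the host names node1-vip and node2-vip, the service name oltp_svc, and the default port 1521 are assumptions made for the example. Only the descriptor keywords (DESCRIPTION, ADDRESS_LIST, ADDRESS, CONNECT_DATA, SERVICE_NAME) are standard Oracle Net syntax.

```python
def build_connect_descriptor(vip_hosts, service_name, port=1521):
    """Return an Oracle Net connect descriptor that lists every node VIP."""
    if not vip_hosts:
        raise ValueError("at least one VIP host is required")
    # One ADDRESS entry per cluster node, in the order given.
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))" for host in vip_hosts
    )
    return (
        "(DESCRIPTION="
        f"(ADDRESS_LIST={addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


def build_jdbc_url(vip_hosts, service_name, port=1521):
    """Prefix the descriptor with the JDBC thin driver scheme."""
    return "jdbc:oracle:thin:@" + build_connect_descriptor(vip_hosts, service_name, port)


# Hypothetical two-node cluster and service name.
print(build_jdbc_url(["node1-vip", "node2-vip"], "oltp_svc"))
```

Because the descriptor names a service rather than an instance, the same string remains valid when the service moves to another node.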

For OCI clients, follow these best practices:

  1. Enable Fast Application Notification (FAN) for OCI clients by initializing the environment with the OCI_EVENTS parameter.

  2. Link the OCI client applications with the thread library.

  3. Set the AQ_HA_NOTIFICATIONS parameter to TRUE and configure the transparent application failover (TAF) attributes for services.

2.9.2 Configure Client Failover in an Oracle RAC Environment

For client failover in an Oracle RAC database, use these best practices:

  1. Use Oracle Enterprise Manager to create services.

  2. Add all hosts in the cluster to the Oracle RAC ONS configuration.

2.9.3 Configure Failover in an Oracle Data Guard Environment

The process of configuring for client failover in a Data Guard environment consists of the following general tasks:

  • Service relocation: The database service that the primary application uses to connect to the database should be active only on the primary database. After a failover or switchover, this service should be migrated automatically to the new primary database and stopped on the original primary database.

  • Client notification: After the failover completes and the service is available on the new primary database, the application should be notified that a failover has occurred, and connections should be migrated to the new primary database.

  • Efficient reconnection: Migrated sessions and new connections should be directed quickly to the new primary database; clients should not stall while waiting for timeouts on unavailable hosts or networks.

2.9.4 Prevent Login Storms

The process of failing over an application that has a large number of connections may create a login storm. A login storm is a sudden spike in the number of connections to a database instance, which drains CPU resources. As CPU resources are depleted, application timeouts and application response times are likely to increase.

To control login storms:

  • Implement the Connection Rate Limiter

    The primary method of controlling login storms is to implement the Connection Rate Limiter feature of the Oracle listener. This feature limits the number of connections that can be processed per second. Slowing the rate of connections ensures that CPU resources remain available and that the system remains responsive.

  • Configure Oracle Database for shared server operations

    In addition to implementing the Connection Rate Limiter, some applications can control login storms by configuring Oracle Database for shared server operations. With shared server, the number of processes that must be created at failover time is greatly reduced, which helps avoid a login storm.

    See Also:

    Oracle Database Administrator's Guide for more information about configuring and controlling shared server operations
  • Adjust the maximum number of connections in the mid-tier connection pool

    If your application mid tier provides this capability, limit the number of connections by lowering the maximum size of the mid-tier connection pool.
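
The idea behind rate limiting can be sketched conceptually as follows. This is not the Oracle listener's implementation or configuration syntax. The class name, its parameters, and the fixed one-second window are assumptions chosen to show how capping admissions per second spreads a burst of logins over time.

```python
import time


class ConnectionRateLimiter:
    """Admit at most max_per_second new connections per fixed one-second window."""

    def __init__(self, max_per_second, clock=time.monotonic):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self.max_per_second = max_per_second
        self._clock = clock  # injectable so the behavior can be tested
        self._window_start = clock()
        self._admitted = 0

    def try_admit(self):
        """Return True if a new connection may proceed now, else False."""
        now = self._clock()
        if now - self._window_start >= 1.0:
            # A new one-second window begins; reset the counter.
            self._window_start = now
            self._admitted = 0
        if self._admitted < self.max_per_second:
            self._admitted += 1
            return True
        return False
```

A caller that is refused would wait and try again, so a burst of reconnecting sessions reaches the database as a steady stream rather than all at once.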