Configuring Level 1: Basic Application High Availability

Implement a level of high availability that allows applications to immediately react to instance, node, or database failures, and quickly establish new connections to surviving database instances.

With application HA Level 1, downtime is minimized for unplanned and planned outages. You get these benefits by ensuring that the application configuration implements these recommendations. No code changes are required.

Step 1: Configure High Availability Database Services

Create a non-default, role-based database service to use high-availability features.

A database service is a logical abstraction for managing workloads or a group of applications sharing similar SLAs or types of workloads (for example, OLTP vs. batch). Database services provide location transparency and hide complex aspects of the underlying system from the client.

Your application must connect to a non-default database service to use high-availability features. You must explicitly create a service (or several services as needed for different application workloads) instead of using the default database service or the default PDB service (that is, the service with the same name as the database or PDB).

On Oracle Autonomous Database, services are created for you using recommended attributes.

About Server-Side Configuration for Services

These services are configured by a database administrator to set up services through Oracle Clusterware.

When using Oracle Data Guard and standby databases, create services using the primary role to ensure that applications connect to the primary database for read/write operations, and standby role for services to optionally offload read-only and small infrequent writes to the standby database.

Services start and stop automatically after a Data Guard role transition (for example, switchover or failover) based on their roles.

Configure your services according to your architecture in one of the following sections:

Note:

Services must be started so that they can be used after creating them. Use a command like this:

$ srvctl start service -db mydb -service my_service

See also:

Using Oracle Services in Oracle Real Application Clusters Administration and Deployment Guide

Configure High Availability Services

Create a non-default, role-based database service to use high-availability features.

A service may be configured to direct connections to a single preferred instance, or alternatively, if the preferred instance is down, to an available instance. When a service is available only on one instance, it is called a singleton service. This allows you to isolate workloads among instances in a cluster.

You could also configure a service to put connections on multiple instances of a cluster, to spread work across all instances. Also, if one instance is down, connections can be made on the surviving instances.

There are other combinations where you can configure a subset of instances as "preferred" and another subset of instances as "available". These subsets provide for spreading load across some instances while isolating work from others (and still have instances available in case of a failure).

Example 1: Singleton Service

This example creates a singleton service called MyService for the primary role, where the connections are made on instance inst1, unless that instance is not available. If the instance is not available, connections are made on inst2. It also configures a default drain timeout of 300 seconds to wait for sessions to drain; at the end of that time any remaining sessions are terminated due to the IMMEDIATE option.

The settings for commit_outcome and failovertype enable Transparent Application Continuity (TAC) for the future, if you decide to implement it (this is an advanced feature; see Oracle Application Continuity on Oracle MAA for details). Enabling TAC has no detrimental impact and automatically provides benefits when prerequisites are met, if you should decide to move to HA Level 3.

$ srvctl add service -db mydb -service my_service -pdb mypdb
 –preferred inst1 -available inst2 -commit_outcome TRUE 
 -failovertype AUTO -notification TRUE -drain_timeout 300
 -stopoption IMMEDIATE -role PRIMARY

If you want your application to gracefully switch to another Oracle RAC instance with no application blackout, set the drain_timeout interval to a sufficient timeout that allows your applications to close their connections between transactions and gracefully stop or move to another instance. The drain_timeout interval is best leveraged for short OLTP applications. For large batch operations, it's best defer or suspend these operations before a planned maintenance window.

Example 2: Service with Multiple Instances

This example creates a service that is similar to the singleton above but spreads connections across multiple instances in this cluster:

$ srvctl add service -db mydb -service my_service -pdb mypdb
 –preferred inst1,inst2 -commit_outcome TRUE -failovertype AUTO
 -notification TRUE -drain_timeout 300 -stopoption IMMEDIATE
 -role PRIMARY

Configure High Availability Services for Oracle Active Data Guard or Standby Roles

Create a service used to connect to a standby database (read-only physical standby).

Create a service as shown in the following example:

$ srvctl add service -db mydb -service my_standby_service
 -pdb mypdb –preferred inst1 -available inst2 -notification TRUE
 -drain_timeout 300 -stopoption IMMEDIATE -role PHYSICAL_STANDBY

Step 2: Configure the Connection String for High Availability

Oracle recommends that your application use the connection string configuration shown here to connect successfully during various scenarios including database switchover and failover to other sites.

Example 1: Connect string with Oracle RAC primary database and no standby

Alias = (DESCRIPTION =
(CONNECT_TIMEOUT= 90)(RETRY_COUNT=20)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=1000ms)
    (ADDRESS_LIST =
        (LOAD_BALANCE=on)
        (ADDRESS = (PROTOCOL = TCP)(HOST=clu_site1-scan)(PORT=1521))) 
    (CONNECT_DATA=(SERVICE_NAME = my_service)))

Example 2: Connect string with Oracle RAC primary and standby databases

This example makes connections to an Oracle RAC primary database or a standby database, depending on which one is available.

Alias = (DESCRIPTION =
(CONNECT_TIMEOUT= 90)(RETRY_COUNT=100)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=1000ms)
    (ADDRESS_LIST =
        (LOAD_BALANCE=on)
        (ADDRESS = (PROTOCOL = TCP)(HOST=clu_site1-scan)(PORT=1521)))
    (ADDRESS_LIST =
        (LOAD_BALANCE=on)
        (ADDRESS = (PROTOCOL = TCP)(HOST=clu_site2-scan)(PORT=1521)))      
    (CONNECT_DATA=(SERVICE_NAME = my_service)))

Note:

clu_site1-scan and clu_site2-scan refer to SCAN listeners in a cluster on site1 and site2, respectively.

It's recommended that you use the most recent drivers, but all Oracle drivers from release 12.2 and later should use the example connection strings above. Specific values can be tuned, but the values shown in this example are reasonable starting points, and so usable for almost all cases.

It is highly recommended that you maintain your connect string or URL in a central location, such as LDAP or tnsnames.ora. Do not scatter the connect string or URL in property files or private locations, as doing so makes it extremely difficult to maintain. Using a centralized location helps you preserve standard format, tuning, and service settings. Oracle's solution for this is to use LDAP with the Oracle Unified Directory product.

See also:

Step 3: Ensure That FAN Is Used

FAN provides an intelligent and immediate interrupt when outages occur allowing for a much smaller application impact or brownout.

When a service needs to drain for routine maintenance, or unplanned failures (such as node or network outages), the application needs to be informed in real time, so that it moves connections quickly to another instance or site. This is accomplished using Oracle's Fast Application Notification (FAN) feature. Enable FAN to prevent applications from hanging when physical failures, such as node, network, or site failures occur.

FAN uses Oracle Clusterware's Oracle Notification Service (ONS) to receive events from the cluster. ONS requires ports to be available between the client and the servers, and in some cases this requires a firewall port to be opened.

Registration to receive FAN events is enabled automatically when using the recommended service and connect string in steps 1 and 2 above.

The ONS port (by default, 6200) needs to be opened on all of your database servers, the firewall, and Oracle Active Data Guard nodes. For cloud environments, this step is very important, including for Oracle Autonomous Database on Dedicated Exadata Infrastructure (ADB-D), Exadata Database Service on Dedicated Infrastructure (ExaDB-D), and Oracle Exadata Database Service on Cloud@Customer (ExaDB-C@C), as shown in the example.

Enabling FAN for Clients

There are no application code changes to use FAN. FAN only requires an Oracle driver and the recommended database connect string.

FAN is auto-configured and is enabled out of the box. When connecting to the Oracle database, the database uses the URL or TNS connect string to auto-configure FAN at the client.

It is important to use the TNS formats shown in Step 2 for auto-configuration of FAN; using a different format syntax can prevent FAN from being auto-configured. To use FAN, you must connect to a database service (That you configured in Step 1) and you must be able to receive events from the Oracle Notification Service (ONS), which may require a port to be opened, as mentioned above.

FAN may also be configured manually using connection pool settings (see below), if needed.

See below for configuration requirements with different pool types.

JDBC FAN Requirements

For client drivers using UCP:

  • Use the recommended connection URL/string (see above) for auto-configuration of ONS.
  • Include JDBC JAR files ojdbc8.jar (or later), ons.jar, and simplefan.jar on the CLASSPATH (plus optional wallet jars, if needed: osdt_cert.jar, osdt_core.jar, and oraclepki.jar).
  • Set the pool or driver property to enable Fast Connection Failover (for example, in UCP it is set for the PoolDataSource using setFastConnectionFailoverEnabled(true)).
  • Disable auto-commit connection property (for example, in UCP it is disabled for the PoolDataSource using setConnectionProperty(OracleConnection.CONNECTION_PROPERTY_AUTOCOMMIT, "false"); )
  • For third-party JDBC pools, Oracle recommends using Universal Connection Pool (UCP) as a data source.
  • Open port 6200 for ONS communication from your database server (6200 is the default port, a different port may have been chosen).

If you are not able to use the recommended connect URL/string, configure your clients manually by setting:

oracle.ons.nodes=Node01:6200, Node02:6200, Node03:6200 

Additional settings might be needed when configuring manually. For example walletfile and walletpassword.

Connection pools other than UCP will have analogous requirements.

OCI FAN Requirements

  • For Oracle Call Interface (OCI) clients:

    OCI clients embed FAN at the driver level so that all clients can use them regardless of the pooling solution.

    The database service must have the attribute "-notification TRUE" set

    If oraaccess.xml is in use, ensure that the events tag is TRUE:

    <oraaccess> xmlns="http://xmlns.oracle.com/oci/oraaccess"
     xmlns:oci="http://xmlns.oracle.com/oci/oraaccess"
     schemaLocation="http://xmlns.oracle.com/oci/oraaccess
     http://xmlns.oracle.com/oci/oraaccess.xsd">
     <default_parameters>
     <events>true</events>
     </default_parameters>
    </oraaccess> 
  • For ODP.Net clients

    Specify HA events in the connect string

    "user id=oracle; password=oracle; data source=HA; pooling=true; HA events=true;"

See also:

Overview of Oracle Integrated Clients and FAN in Oracle Real Application Clusters Administration and Deployment Guide

Step 4: Ensure Application Implements Reconnection Logic

Applications should be written to catch connection failure exceptions and errors during database calls so they can obtain new connections and continue with new work.

For JDBC-based apps, the SQLRecoverableException can be caught to distinguish connection errors from typical application or SQL errors. If a connection error is caught, then a new connection should be obtained. This is simpler and more robust than checking for individual Oracle errors (which can adjust by Oracle Database release) in the SQLException class.