Configuring Level 1: Basic Application High Availability
Implement a level of high availability that allows applications to immediately react to instance, node, or database failures, and quickly establish new connections to surviving database instances.
With application HA Level 1, downtime is minimized for unplanned and planned outages. You get these benefits by ensuring that the application configuration implements these recommendations. No code changes are required.
At a high level, the steps to implement Level 1 are:
Step 1: Configure High Availability Database Services
Create a non-default, role-based database service to use high-availability features.
A database service is a logical abstraction for managing workloads or a group of applications sharing similar SLAs or types of workloads (for example, OLTP vs. batch). Database services provide location transparency and hide complex aspects of the underlying system from the client.
Your application must connect to a non-default database service to use high-availability features. You must explicitly create a service (or several services as needed for different application workloads) instead of using the default database service or the default PDB service (that is, the service with the same name as the database or PDB).
On Oracle Autonomous Database, services are created for you using recommended attributes.
About Server-Side Configuration for Services
These services are configured by a database administrator to set up services through Oracle Clusterware.
When using Oracle Data Guard and standby databases, create services using the primary role to ensure that applications connect to the primary database for read/write operations, and standby role for services to optionally offload read-only and small infrequent writes to the standby database.
Services start and stop automatically after a Data Guard role transition (for example, switchover or failover) based on their roles.
Configure your services according to your architecture in one of the following sections:
Note:
Services must be started so that they can be used after creating them. Use a command like this:
$ srvctl start service -db mydb -service
my_service
See also:
Using Oracle Services in Oracle Real Application Clusters Administration and Deployment Guide
Configure High Availability Services
Create a non-default, role-based database service to use high-availability features.
A service may be configured to direct connections to a single preferred instance, or alternatively, if the preferred instance is down, to an available instance. When a service is available only on one instance, it is called a singleton service. This allows you to isolate workloads among instances in a cluster.
You could also configure a service to put connections on multiple instances of a cluster, to spread work across all instances. Also, if one instance is down, connections can be made on the surviving instances.
There are other combinations where you can configure a subset of instances as "preferred" and another subset of instances as "available". These subsets provide for spreading load across some instances while isolating work from others (and still have instances available in case of a failure).
Example 1: Singleton Service
This example creates a singleton service called MyService for the
primary role, where the connections are made on instance inst1, unless that
instance is not available. If the instance is not available, connections are
made on inst2. It also configures a default drain timeout of 300 seconds to wait
for sessions to drain; at the end of that time any remaining sessions are
terminated due to the IMMEDIATE
option.
The settings for commit_outcome
and
failovertype
enable Transparent Application Continuity
(TAC) for the future, if you decide to implement it (this is an advanced
feature; see Oracle Application Continuity on
Oracle MAA for details). Enabling TAC has no detrimental impact and
automatically provides benefits when prerequisites are met, if you should decide
to move to HA Level 3.
$ srvctl add service -db mydb -service my_service -pdb mypdb
–preferred inst1 -available inst2 -commit_outcome TRUE
-failovertype AUTO -notification TRUE -drain_timeout 300
-stopoption IMMEDIATE -role PRIMARY
If you want your application to gracefully switch to another Oracle
RAC instance with no application blackout, set the
drain_timeout
interval to a sufficient timeout that allows
your applications to close their connections between transactions and gracefully
stop or move to another instance. The drain_timeout
interval is
best leveraged for short OLTP applications. For large batch operations, it's
best defer or suspend these operations before a planned maintenance window.
Example 2: Service with Multiple Instances
This example creates a service that is similar to the singleton above but spreads connections across multiple instances in this cluster:
$ srvctl add service -db mydb -service my_service -pdb mypdb
–preferred inst1,inst2 -commit_outcome TRUE -failovertype AUTO
-notification TRUE -drain_timeout 300 -stopoption IMMEDIATE
-role PRIMARY
Configure High Availability Services for Oracle Active Data Guard or Standby Roles
Create a service used to connect to a standby database (read-only physical standby).
Create a service as shown in the following example:
$ srvctl add service -db mydb -service my_standby_service
-pdb mypdb –preferred inst1 -available inst2 -notification TRUE
-drain_timeout 300 -stopoption IMMEDIATE -role PHYSICAL_STANDBY
Step 2: Configure the Connection String for High Availability
Oracle recommends that your application use the connection string configuration shown here to connect successfully during various scenarios including database switchover and failover to other sites.
Example 1: Connect string with Oracle RAC primary database and no standby
Alias = (DESCRIPTION =
(CONNECT_TIMEOUT= 90)(RETRY_COUNT=20)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=1000ms)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=clu_site1-scan)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME = my_service)))
Example 2: Connect string with Oracle RAC primary and standby databases
This example makes connections to an Oracle RAC primary database or a standby database, depending on which one is available.
Alias = (DESCRIPTION =
(CONNECT_TIMEOUT= 90)(RETRY_COUNT=100)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=1000ms)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=clu_site1-scan)(PORT=1521)))
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=clu_site2-scan)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME = my_service)))
Note:
clu_site1-scan and clu_site2-scan refer to SCAN listeners in a cluster on site1 and site2, respectively.It's recommended that you use the most recent drivers, but all Oracle drivers from release 12.2 and later should use the example connection strings above. Specific values can be tuned, but the values shown in this example are reasonable starting points, and so usable for almost all cases.
It is highly recommended that you maintain your connect string or URL in a central
location, such as LDAP or tnsnames.ora
. Do not scatter the connect
string or URL in property files or private locations, as doing so makes it extremely
difficult to maintain. Using a centralized location helps you preserve standard
format, tuning, and service settings. Oracle's solution for this is to use LDAP with
the Oracle Unified Directory product.
See also:
-
Connection Time Estimates During Data Guard Switchover or Failover
-
Oracle Unified Directory in Administering Oracle Unified Directory
-
Overview of Local Naming Parameters in Oracle Database Net Services Reference
Step 3: Ensure That FAN Is Used
FAN provides an intelligent and immediate interrupt when outages occur allowing for a much smaller application impact or brownout.
When a service needs to drain for routine maintenance, or unplanned failures (such as node or network outages), the application needs to be informed in real time, so that it moves connections quickly to another instance or site. This is accomplished using Oracle's Fast Application Notification (FAN) feature. Enable FAN to prevent applications from hanging when physical failures, such as node, network, or site failures occur.
FAN uses Oracle Clusterware's Oracle Notification Service (ONS) to receive events from the cluster. ONS requires ports to be available between the client and the servers, and in some cases this requires a firewall port to be opened.
Registration to receive FAN events is enabled automatically when using the recommended service and connect string in steps 1 and 2 above.
The ONS port (by default, 6200) needs to be opened on all of your database servers, the firewall, and Oracle Active Data Guard nodes. For cloud environments, this step is very important, including for Oracle Autonomous Database on Dedicated Exadata Infrastructure (ADB-D), Exadata Database Service on Dedicated Infrastructure (ExaDB-D), and Oracle Exadata Database Service on Cloud@Customer (ExaDB-C@C), as shown in the example.
Enabling FAN for Clients
There are no application code changes to use FAN. FAN only requires an Oracle driver and the recommended database connect string.
FAN is auto-configured and is enabled out of the box. When connecting to the Oracle database, the database uses the URL or TNS connect string to auto-configure FAN at the client.
It is important to use the TNS formats shown in Step 2 for auto-configuration of FAN; using a different format syntax can prevent FAN from being auto-configured. To use FAN, you must connect to a database service (That you configured in Step 1) and you must be able to receive events from the Oracle Notification Service (ONS), which may require a port to be opened, as mentioned above.
FAN may also be configured manually using connection pool settings (see below), if needed.
See below for configuration requirements with different pool types.
JDBC FAN Requirements
For client drivers using UCP:
- Use the recommended connection URL/string (see above) for auto-configuration of ONS.
- Include JDBC JAR files
ojdbc8.jar
(or later),ons.jar
, andsimplefan.jar
on theCLASSPATH
(plus optional wallet jars, if needed:osdt_cert.jar
,osdt_core.jar
, andoraclepki.jar
). - Set the pool or driver property to enable Fast Connection Failover (for example,
in UCP it is set for the
PoolDataSource
usingsetFastConnectionFailoverEnabled(true)
). - Disable auto-commit connection property (for example, in UCP it is disabled for
the
PoolDataSource
usingsetConnectionProperty(OracleConnection.CONNECTION_PROPERTY_AUTOCOMMIT, "false");
) - For third-party JDBC pools, Oracle recommends using Universal Connection Pool (UCP) as a data source.
- Open port 6200 for ONS communication from your database server (6200 is the default port, a different port may have been chosen).
If you are not able to use the recommended connect URL/string, configure your clients manually by setting:
oracle.ons.nodes=Node01:6200, Node02:6200, Node03:6200
Additional settings might be needed when configuring manually. For example
walletfile
and walletpassword
.
Connection pools other than UCP will have analogous requirements.
OCI FAN Requirements
-
For Oracle Call Interface (OCI) clients:
OCI clients embed FAN at the driver level so that all clients can use them regardless of the pooling solution.
The database service must have the attribute "-notification TRUE" set
If oraaccess.xml is in use, ensure that the events tag is TRUE:
<oraaccess> xmlns="http://xmlns.oracle.com/oci/oraaccess" xmlns:oci="http://xmlns.oracle.com/oci/oraaccess" schemaLocation="http://xmlns.oracle.com/oci/oraaccess http://xmlns.oracle.com/oci/oraaccess.xsd"> <default_parameters> <events>true</events> </default_parameters> </oraaccess>
-
For ODP.Net clients
Specify HA events in the connect string
"user id=oracle; password=oracle; data source=HA; pooling=true; HA events=true;"
See also:
Overview of Oracle Integrated Clients and FAN in Oracle Real Application Clusters Administration and Deployment Guide
Step 4: Ensure Application Implements Reconnection Logic
Applications should be written to catch connection failure exceptions and errors during database calls so they can obtain new connections and continue with new work.
For JDBC-based apps, the SQLRecoverableException
can be caught to
distinguish connection errors from typical application or SQL errors. If a
connection error is caught, then a new connection should be obtained. This is
simpler and more robust than checking for individual Oracle errors (which can adjust
by Oracle Database release) in the SQLException
class.
See also: