6 Enabling Continuous Service for Applications
Applications achieve continuous service easily when the underlying network, systems, and databases are always available.
Achieving continuous service in the face of unplanned outages and planned maintenance activities can be challenging. An MAA database architecture, together with its configuration and operational best practices, is built upon redundancy and the ability to tolerate, prevent, and at times auto-repair failures.
However, applications can incur downtime whenever a failure hits a database instance, a database node, or the entire cluster or data center. Similarly, some planned maintenance activities may require restarting a database instance, a database node, or an entire database server.
In all of these cases, by following a simple checklist, your applications can incur zero or very little downtime, provided that the database service the application is connected to can be moved to another Oracle RAC instance or to another database.
To achieve continuous application uptime during Oracle RAC switchover or failover events, follow these application configuration best practices:
- Use non-default Oracle Clusterware-managed services to connect your application.
- Use recommended connection strings with built-in timeouts, retries, and delays, so that incoming connections do not see errors during outages.
- Configure your connections with Fast Application Notification.
- Drain and relocate services before any planned maintenance requiring an Oracle RAC instance restart.
  Software updates to Exadata Database Host or Exadata Database Guest automatically drain and relocate services. Oracle Cloud and Fleet Patching and Provisioning (FPP) drain and relocate services automatically for Oracle Database, Grid Infrastructure, and Exadata software updates.
- Leverage Application Continuity or Transparent Application Continuity to replay in-flight uncommitted transactions transparently after failures.
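As an illustration of the connection string practice in the checklist above, the following minimal sketch assembles an Oracle Net connect descriptor that carries timeout, retry, and delay settings. The helper name `build_connect_descriptor`, the host and service names, and the numeric defaults are assumptions chosen for illustration only; take the actual values from the recommendations for your release.

```python
def build_connect_descriptor(service_name, scan_hosts, port=1521,
                             connect_timeout=90, retry_count=50,
                             retry_delay=3, transport_connect_timeout=3):
    """Return an Oracle Net connect descriptor as a single string.

    All default values here are illustrative assumptions, not
    authoritative recommendations.
    """
    if not scan_hosts:
        raise ValueError("at least one SCAN host is required")

    # One ADDRESS_LIST per site (for example, primary and standby SCAN names).
    address_lists = "".join(
        "(ADDRESS_LIST=(LOAD_BALANCE=on)"
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        for host in scan_hosts
    )

    # Timeouts, retries, and delays sit at the DESCRIPTION level so that
    # a new connection attempt is retried instead of failing immediately
    # while the service is moving.
    return (
        "(DESCRIPTION="
        f"(CONNECT_TIMEOUT={connect_timeout})"
        f"(RETRY_COUNT={retry_count})"
        f"(RETRY_DELAY={retry_delay})"
        f"(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})"
        f"{address_lists}"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


if __name__ == "__main__":
    # Hypothetical SCAN names and service name.
    print(build_connect_descriptor(
        "app_svc", ["primary-scan.example.com", "standby-scan.example.com"]))
```

The service name passed in should be a non-default Clusterware-managed service, in line with the first item of the checklist.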
Depending on the planned maintenance event, Oracle attempts to automatically drain and relocate application services before restarting any Oracle RAC instance. For OLTP applications, draining and relocating services works very well and results in zero application downtime.
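Where a maintenance activity is not drained automatically, services can be relocated manually before the instance restart. The sketch below only builds the argument list for an `srvctl relocate service` call and does not run it. The function name, database, service, and instance names are hypothetical, and the option spellings (`-db`, `-service`, `-oldinst`, `-newinst`, `-drain_timeout`, `-stopoption`) should be checked against the `srvctl` reference for your release before use.

```python
def relocate_service_command(db, service, old_instance, drain_timeout_s,
                             new_instance=None, stop_option="IMMEDIATE"):
    """Build (but do not execute) an srvctl command that drains and
    relocates one service away from an instance.

    Option names are assumed from the srvctl reference; verify them
    for the release in use.
    """
    if drain_timeout_s < 0:
        raise ValueError("drain timeout must not be negative")

    cmd = ["srvctl", "relocate", "service",
           "-db", db,
           "-service", service,
           "-oldinst", old_instance]
    if new_instance is not None:
        # Without a target, Clusterware picks an available instance.
        cmd += ["-newinst", new_instance]
    # Sessions get drain_timeout_s seconds to finish their work; any that
    # remain are then stopped using the given stop option.
    cmd += ["-drain_timeout", str(drain_timeout_s),
            "-stopoption", stop_option]
    return cmd


if __name__ == "__main__":
    # Hypothetical names; print the command line for review.
    print(" ".join(relocate_service_command("mydb", "app_svc", "mydb1", 120)))
```

Returning a list rather than a shell string keeps the command easy to review and to pass to `subprocess.run` without quoting issues.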
Some workloads, such as long-running batch jobs or reports, may not be able to drain and relocate gracefully or within the maximum drain timeout. For those workloads, Oracle recommends either scheduling the planned maintenance window that contains Oracle RAC rolling activities so that it does not conflict with them, or stopping them before the maintenance window begins. For example, you can move a planned maintenance window outside your batch windows, or stop the problematic batch jobs or reports before the window starts.
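A simple way to check a proposed maintenance window against known batch schedules is an interval-overlap test. The sketch below is a minimal illustration; the job names and times are hypothetical, and real schedules would normally come from a job scheduler rather than a hard-coded dictionary.

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Return True if the half-open intervals [start_a, end_a) and
    [start_b, end_b) share any time."""
    return start_a < end_b and start_b < end_a


def conflicting_jobs(maint_start, maint_end, batch_windows):
    """Return the names of batch jobs whose window overlaps the
    proposed maintenance window.

    batch_windows maps a job name to a (start, end) datetime pair.
    """
    return sorted(
        name
        for name, (start, end) in batch_windows.items()
        if overlaps(maint_start, maint_end, start, end)
    )


if __name__ == "__main__":
    # Hypothetical schedule for one night.
    batch = {
        "nightly_etl": (datetime(2024, 1, 10, 1, 0), datetime(2024, 1, 10, 3, 0)),
        "month_end_report": (datetime(2024, 1, 10, 4, 0), datetime(2024, 1, 10, 5, 0)),
    }
    window = (datetime(2024, 1, 10, 2, 30), datetime(2024, 1, 10, 3, 30))
    print(conflicting_jobs(window[0], window[1], batch))
```

Any job returned by `conflicting_jobs` is a candidate for rescheduling or for being stopped before the window, as described above.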
The following table outlines the planned maintenance events that incur an Oracle RAC instance rolling restart, along with the service drain timeout variables that may affect your application.
Table 6-1 Drain Timeout Variables for Planned Maintenance Events
Planned Maintenance Event | Drain Timeout Variables |
---|---|
Exadata Database Host (Dom0) software changes |
Exadata Host handles operating system (OS) shutdown with maximum timeout of 10 minutes. OS shutdown calls an
Each Clusterware-managed service is also controlled by a
See also: Using RHPhelper to Minimize Downtime During Planned Maintenance on Exadata (Doc ID 2385790.1) |
Exadata Database Guest (DomU) software changes |
Exadata
Each Clusterware-managed service is also controlled by a
See also: Using RHPhelper to Minimize Downtime During Planned Maintenance on Exadata (Doc ID 2385790.1) |
Oracle Grid Infrastructure (GI) software changes or upgrade |
The recommended steps are described in Graceful Application Switchover in RAC with No Application Interruption (Doc ID 1593712.1). Example:
Each Clusterware-managed service is also controlled by a
|
Oracle Database Software changes |
The recommended steps are described in Graceful Application Switchover in RAC with No Application Interruption (Doc ID 1593712.1). Example:
Each Clusterware-managed service is also controlled by a
|
For more information, see Application Checklist for Continuous Service for MAA Solutions for recommendations to experience application-level service uptime similar to that of the database uptime. Oracle recommends testing your application readiness by following the recommendations in Validating Application Failover Readiness (Doc ID 2758734.1).