Oracle Maximum Availability Architecture in Oracle Exadata Cloud Systems

Oracle Maximum Availability Architecture Benefits

Deployment: Oracle Exadata Cloud systems (ExaDB-D and ExaDB-C@C) are deployed using Oracle Maximum Availability Architecture best practices, including configuration best practices for storage, network, operating system, Oracle Grid Infrastructure, and Oracle Database. ExaDB-D is optimized to run enterprise Oracle databases with extreme scalability, availability, and elasticity.
Oracle Maximum Availability Architecture database templates: All Oracle Cloud databases created with Oracle Cloud automation use Oracle Maximum Availability Architecture default settings, which are optimized for ExaDB-D.

Oracle does not recommend that you use custom scripts to create cloud databases. Other than adjusting memory and system resource settings, avoid migrating previous database parameter settings, especially undocumented parameters. One beneficial database data protection parameter, DB_BLOCK_CHECKING, is not enabled by default due to its potential overhead. MAA recommends evaluating the performance impact for your application and enabling this setting if the performance impact is acceptable.
Backup and restore automation: When you configure automatic backup to Oracle Cloud Infrastructure Object Storage, backup copies provide additional protection when multiple availability domains exist in your region, and RMAN validates cloud database backups for any physical corruptions.

Database backups occur daily, with a full backup occurring once per week and incremental backups occurring on all other days. Archive log backups occur frequently to reduce potential data loss in case of disaster. The archive log frequency is typically 30 minutes.
Oracle Exadata Database Machine inherent benefits: Oracle Exadata Database Machine is the best Oracle Maximum Availability Architecture platform that Oracle offers. Exadata is engineered with hardware, software, database, and availability innovations that support the most mission-critical enterprise applications.

Specifically, Exadata provides unique high availability, data protection, and quality-of-service capabilities that set Oracle apart from any other platform or cloud vendor. Sizing Exadata cloud systems to meet your application and database system resource needs (for example, sufficient CPU, memory, and I/O resources) is very important to maintain the highest availability, stability, and performance. Proper sizing is especially important when consolidating many databases on the same cluster.

For a comprehensive list of Oracle Maximum Availability Architecture benefits for Oracle Exadata Database Machine systems, see Exadata Database Machine: Maximum Availability Architecture Best Practices.

Examples of these benefits include:

High availability and low brownout: Fully redundant, fault-tolerant hardware exists in the storage, network, and database servers. Resilient, highly available software, such as Oracle Real Application Clusters (Oracle RAC), Oracle Clusterware, Oracle Database, Oracle Automatic Storage Management, Oracle Linux, and Oracle Exadata Storage Server, enables applications to maintain application service levels through unplanned outages and planned maintenance events.

For example, Exadata has instant failure detection that can detect and repair database node, storage server, and network failures in less than two seconds, and resume application and database service uptime and performance. Other platforms can experience 30 seconds, or even minutes, of blackout and extended application brownouts for the same type of failures. Only the Exadata platform offers a wide range of unplanned outage and planned maintenance tests to evaluate end-to-end application and database brownouts and blackouts.
Data protection: Exadata provides Oracle Database with physical and logical block corruption prevention, detection, and, in some cases, automatic remediation.

The Exadata Hardware Assisted Resilient Data (HARD) checks include support for server parameter files, control files, log files, Oracle data files, and Oracle Data Guard broker files, when those files are stored in Exadata storage. This intelligent Exadata storage validation stops corrupted data from being written to disk when a HARD check fails, which eliminates a large class of failures that the database industry had previously been unable to prevent.

Examples of the Exadata HARD checks include:
- Redo and block checksum
- Correct log sequence
- Block type validation
- Block number validation
- Oracle data structures, such as block magic number, block size, sequence number, and block header and tail data structures
Exadata HARD checks are initiated from Exadata storage software (cell services) and work transparently after you enable the DB_BLOCK_CHECKSUM database parameter, which is enabled by default in the cloud. Exadata is the only platform that currently supports the HARD initiative.

Furthermore, Oracle Exadata Storage Server provides non-intrusive, automatic hard disk scrub and repair. This feature periodically inspects and repairs hard disks during idle time. If bad sectors are detected on a hard disk, then Oracle Exadata Storage Server automatically sends a request to Oracle Automatic Storage Management (ASM) to repair the bad sectors by reading the data from another mirror copy.

Finally, Exadata and Oracle ASM can detect corruptions as data blocks are read into the buffer cache, and automatically repair data corruption with a good copy of the data block on a subsequent database write. This inherent intelligent data protection makes Exadata Database Machine and ExaDB-D the best data protection storage platform for Oracle databases.

For comprehensive data protection, a Maximum Availability Architecture best practice is to use a standby database on a separate Exadata instance to detect, prevent, and automatically repair corruptions that cannot be addressed by Exadata alone. The standby database also minimizes downtime and data loss for disasters that result from site, cluster, and database failures.
Response time quality of service: Only Exadata has end-to-end quality-of-service capabilities to ensure that response time remains low and optimum. Database server I/O latency capping and Exadata storage I/O latency capping ensure that read or write I/O can be redirected to partnered cells when response time exceeds a certain threshold.

If storage becomes unreliable (but not failed) because of poor and unpredictable performance, then the disk or flash cache can be confined offline, and later brought back online if heuristics show that I/O performance is back to acceptable levels. Resource management can help prioritize key database network or I/O functionality, so that your application and database perform at an optimized level.

For example, database log writes get priority over backup requests on Exadata network and storage. Furthermore, rapid response time is maintained during storage software updates by ensuring that partner flash cache is warmed so flash misses are minimized.
End-to-end testing and holistic health checks: Because Oracle owns the entire Oracle Exadata Cloud Infrastructure, end-to-end testing and optimizations benefit every Exadata customer around the world, whether hosted on-premises or in the cloud. Validated optimizations and fixes required to run any mission-critical system are uniformly applied after rigorous testing. Health checks are designed to evaluate the entire stack.

The Exadata health check utility EXACHK is Exadata cloud-aware and highlights any configuration and software alerts that may have occurred because of customer changes. No other cloud platform currently has this kind of end-to-end health check available. For Oracle Autonomous Database, EXACHK runs automatically to evaluate Maximum Availability Architecture compliance. For non-autonomous databases, Oracle recommends running EXACHK at least once a month, and before and after any software updates, to evaluate any new best practices and alerts.
Higher Uptime: The uptime service-level agreement per month is 99.95% (a maximum of 22 minutes of downtime per month), but when you use MAA best practices for continuous service, most months would have zero downtime.

Full list of Exadata features and benefits: What's New in Oracle Exadata Database Machine

Oracle Maximum Availability Architecture best practices paper: Oracle Maximum Availability Architecture (MAA) engineering collaborates with Oracle Cloud teams to integrate Oracle MAA practices that are optimized for Oracle Cloud Infrastructure and security. See MAA Best Practices for the Oracle Cloud for additional information about continuous availability, Oracle Data Guard, Hybrid Data Guard, Oracle GoldenGate, and other Maximum Availability Architecture-related topics.
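The monthly downtime budget implied by the 99.95% uptime service-level agreement mentioned above can be checked with simple arithmetic. The following is a minimal sketch; the function name and the month lengths used are illustrative assumptions, not part of any Oracle tooling.

```python
def allowed_downtime_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Return the maximum downtime, in minutes, that a monthly uptime SLA permits."""
    total_minutes = days_in_month * 24 * 60
    # The downtime budget is the fraction of the month not covered by the SLA.
    return total_minutes * (100 - sla_percent) / 100


if __name__ == "__main__":
    for days in (28, 30, 31):
        budget = allowed_downtime_minutes(99.95, days)
        print(f"{days}-day month: {budget:.2f} minutes")
```

For a 30-day month this works out to about 21.6 minutes, and for a 31-day month to about 22.3 minutes, which is consistent with the rounded figure of 22 minutes.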

Expected Impact with Unplanned Outages

The following table lists various unplanned outages and the associated potential database downtime, application level Recovery Time Objective (RTO), and data loss potential or recovery point objective (RPO). For Oracle Data Guard architectures, the database downtime or service level downtime does not include detection time or the time it takes before a customer initiates the Cloud Console Data Guard failover operation.

Table 35-1 Availability and Performance Impact for Exadata Cloud Software Updates

Failure and Maintenance Events	Database Downtime	Service-Level Downtime (RTO)	Potential Service-Level Data Loss (RPO)
Localized events, including: Exadata cluster network topology failures; storage (disk and flash) failures; database instance failures; database server failures	Zero	Near-zero	Zero
Events that require restoring from backup because a standby database does not exist: data corruptions; full database failures; complete storage failures; availability domain failures	Minutes to hours (without Data Guard)	Minutes to hours (without Data Guard)	30 minutes (without Data Guard)
Events using Data Guard to fail over: data corruptions; full database failures; complete storage failures; availability domain or region failures	Seconds to minutes¹. Zero downtime for physical corruptions because of the automatic block repair feature.	Seconds to minutes¹. The foreground process that detects the physical corruption pauses while automatic block repair completes.	Zero for Max Availability (SYNC). Near-zero for Max Performance (ASYNC).

¹ To protect from regional failure, you will need a standby database in a different region than the primary database.

Expected Impact with Planned Maintenance

The following table lists various software updates and the associated database and application impact. This is applicable for all Oracle Exadata Cloud infrastructures, including Oracle Exadata Cloud@Customer (ExaDB-C@C), Oracle Exadata Cloud Infrastructure (ExaDB-D) Gen2, and Oracle Autonomous Database (ADB).

Table 35-2 Availability and Performance Impact for Oracle Exadata Cloud Software Updates

Software Update	Database Impact	Application Impact	Scheduled By	Performed By
Exadata Network Fabric Switches	Zero downtime with No Database Restart	Zero to single-digit seconds brownout	Oracle schedules based on customer preferences and customer can reschedule	Oracle Cloud for both ADB and non-ADB
Exadata Storage Servers	Zero downtime with No Database Restart	Zero to single-digit seconds brownout. Exadata storage servers are updated in a rolling manner, maintaining redundancy. Oracle Exadata System Software pre-fetches the secondary mirrors of the most frequently accessed OLTP data into the flash cache, maintaining application performance during storage server restarts. Exadata smart flash for database buffers is maintained across storage server restarts. With Exadata 21.2 software, the Persistent Storage Index and Persistent Columnar Cache features enable consistent query performance after a storage server software update.	Oracle schedules based on customer preferences and customer can reschedule	Oracle Cloud for both ADB and non-ADB
Exadata Database Host - Monthly Infrastructure Security Maintenance	Zero downtime with No Host or Database Restart	Zero downtime	Oracle schedules and customer can reschedule	Oracle Cloud for both ADB and non-ADB
Exadata Database Host - Quarterly Infrastructure Maintenance	Zero downtime with Oracle RAC rolling updates	Zero downtime. Exadata Database compute resources are reduced until planned maintenance completes.	Oracle schedules based on customer preferences and customer can reschedule	Oracle Cloud for both ADB and non-ADB
Exadata Database Guest	Zero downtime with Oracle RAC rolling updates	Zero downtime. Exadata Database compute resources are reduced until planned maintenance completes.	Customer for ADB	Oracle Cloud for ADB. Customer using Oracle Cloud Console/APIs for non-ADB.
Oracle Database quarterly update or custom image update	Zero downtime with Oracle RAC rolling updates	Zero downtime. Exadata Database compute resources are reduced until planned maintenance completes. Special consideration is required during rolling database quarterly updates for applications that use database OJVM. See KB137197 for details.	Customer for ADB	Oracle Cloud for ADB. For ADB-D, standby-first patch practices are automatically applied. Customer using Oracle Cloud Console/APIs or dbaascli utility for non-ADB. Software updates can be applied in place through a database home patch, or out of place through a database move. Works for Data Guard and standby databases (see KB50142).
Oracle Grid Infrastructure quarterly update or upgrade	Zero downtime with Oracle RAC rolling updates	Zero downtime. Exadata Database compute resources are reduced until planned maintenance completes.	Customer for ADB	Oracle Cloud for ADB. Customer using Oracle Cloud Console/APIs or dbaascli utility for non-ADB.
Oracle Database upgrade with downtime	Minutes to hours of downtime	Minutes to hours of downtime	Customer for ADB	Oracle Cloud for ADB. Customer using Oracle Cloud Console/APIs or dbaascli utility for non-ADB. Works for Data Guard and standby databases (see KB70571).
Oracle Database upgrade with near zero downtime	Minimal downtime with `DBMS_ROLLING`, Oracle GoldenGate replication, or with pluggable database relocate	Minimal downtime with `DBMS_ROLLING`, Oracle GoldenGate replication, or with pluggable database relocate	Customer for non-ADB	Oracle Cloud for ADB on Shared Exadata Infrastructure (ADB-S) can run pluggable database relocate for upgrade use cases. Customer using dbaascli for non-autonomous databases, leveraging `DBMS_ROLLING` (see KB52602). Customer using generic Maximum Availability Architecture best practices for non-ADB.

Exadata cloud systems have many elastic capabilities that can be used to adjust to database and application performance needs. By rearranging resources as needed, you can direct system resources to targeted databases and applications while minimizing costs. The following table lists elastic Oracle Exadata Cloud Infrastructure and VM Cluster updates, and the impacts associated with those updates on databases and applications. All of these operations can be performed using Oracle Cloud Console or APIs unless specified otherwise.

Table 35-3 Availability and Performance Impact for Exadata Elastic Operations

VM Cluster Changes	Database Impact	Application Impact
Scale Up or Down VM Cluster Memory	Zero downtime with Oracle RAC rolling updates	Zero to single-digit seconds brownout
Scale Up or Down VM Cluster CPU	Zero downtime with No Database Restart	Zero downtime. Application performance and throughput can be impacted by available CPU resources.
Scale Up or Down (resize) ASM Storage for Database usage	Zero downtime with No Database Restart	Zero downtime. Application performance might be minimally impacted.
Scale Up VM Local /u02 File System Size (Exadata X8M and later systems)	Zero downtime with No Database Restart	Zero downtime
Scale Up VM Local /u02 File System Size (Exadata X8 and earlier systems)	Zero downtime with Oracle RAC rolling updates	Zero to single-digit seconds brownout
Scale Down VM Local /u02 File System Size	Zero downtime with Oracle RAC rolling updates for scaling down	Zero to single-digit seconds brownout
Adding Exadata Storage Cells	Zero downtime with No Database Restart	Zero to single-digit seconds brownout. Application performance might be minimally impacted.
Adding Exadata Database Servers	Zero downtime with No Database Restart	Zero to single-digit seconds brownout. Application performance and throughput may increase by adding Oracle RAC instances and CPU resources.
Adding/Dropping Database Nodes in Virtual Machines (VMs) Cluster	Zero downtime with No Database Restart	Zero to single-digit seconds brownout. Application performance and throughput may increase or decrease by adding or dropping Oracle RAC instances and CPU resources.

Because some of these elastic changes may take significant time, and may impact available resources for your application, some planning is required.

Note that “scale down” and “drop” changes will decrease available resources. Care must be taken to not reduce resources below the amount required for database and application stability and to meet application performance targets. Refer to the following table for estimated timings and planning recommendations.

Table 35-4 Customer Planning Recommendations for Exadata Elastic Operations

VM Cluster Changes	Estimated Timings	Customer Planning Recommendations
Scale Up or Down VM Cluster Memory	Time to drain services and perform an Oracle RAC rolling restart. Typically 15-30 minutes per node, but may vary depending on application draining.	Understand application draining. See Achieving Continuous Availability For Your Applications. Before scaling down memory, ensure that database SGAs can still be stored in hugepages and that application performance is still acceptable. To preserve predictable application performance and stability: monitor and scale up before important high workload patterns require the memory resources; avoid memory scale down unless all of your databases' SGA and PGA memory fits into the new memory size and all SGAs are accommodated by the system's hugepages.
Scale Up or Down VM Cluster CPU	Online operation, typically less than 5 minutes per VM cluster. Scaling up from a very low value to a very high value (10+ OCPU increase) may take 10 minutes.	To preserve predictable application performance and stability: monitor and scale up before important high workload patterns require the CPU resources, or when consistently reaching an OCPU threshold for a tolerated amount of time; only scale down if the load average is below a threshold for at least 30 minutes, or scale down based on fixed workload schedules (for example, business hours with 60 OCPUs, non-business hours with 10 OCPUs, and batch with 100 OCPUs); avoid more than one scale-down request within a 2-hour period.
Scale Up or Down (resize) ASM Storage for Database usage	Time varies based on utilized database storage capacity and database activity. The higher the percentage of utilized database storage, the longer the resize operation (which includes ASM rebalance) will take. Typically minutes to hours.	Oracle ASM rebalance is initiated automatically. Storage redundancy is retained. Due to the inherent best practice of using a non-intrusive ASM power limit, application workload impact is minimal. Choose a non-peak window so resize and rebalance operations can be optimized. Since the time may vary significantly, plan for the operation to complete in hours. To estimate the time remaining for an existing resize or rebalance operation per VM cluster, query `GV$ASM_OPERATION`. For example, a customer can run the following query every 30 minutes to evaluate how much work (EST_WORK) and how much more time (EST_MINUTES) is potentially required: `select operation, pass, state, sofar, est_work, est_minutes from gv$asm_operation where operation='REBAL';` Note that the estimated statistics tend to become more accurate as the rebalance progresses, but can vary based on the concurrent workload.
Scale Up VM Local /u02 File System Size (Exadata X8M and later)	Online operation, typically less than 5 minutes per VM cluster	VM local file system space is allocated on local database host disks, which is shared by all VM guests for all VM clusters provisioned on that database host. Do not scale up space for Local /u02 File System unnecessarily on one VM cluster such that no space remains to scale up on other VM clusters on the same Exadata Infrastructure because Local /u02 File System scale down must be performed in a RAC rolling manner, which may cause application disruption.
Scale Up VM Local /u02 File System Size (Exadata X8 and earlier)	Time to drain services and perform an Oracle RAC rolling restart. Typically 15-30 minutes per node, but may vary depending on application draining settings.	Understand application draining. See Achieving Continuous Availability For Your Applications.
Scale Down VM Local /u02 File System Size	Time to drain services and perform an Oracle RAC rolling restart. Typically 15-30 minutes per node, but may vary depending on application draining settings.	Understand application draining. See Achieving Continuous Availability For Your Applications.
Adding Exadata Storage Cells	Online operation to create more available space for the administrator to choose how to distribute. Typically 3-72 hours per operation, depending on the number of VM clusters, database storage usage, and storage activity. With a very active database and heavy storage activity, this can take up to 72 hours. The add storage cell operation has two parts: 1) storage is added to the system as part of the add storage operation; 2) the administrator decides which VM cluster's ASM disk groups to expand, as a separate operation.	Plan to add storage when your storage capacity utilization will hit 80% within a month's time, since the operation may complete in days. Oracle ASM rebalance is initiated automatically. Storage redundancy is retained. Due to the inherent best practice of using a non-intrusive ASM power limit, application workload impact is minimal. Since the time may vary significantly, plan for the operation to complete in days before the storage is available. To estimate the time remaining for an existing resize or rebalance operation per VM cluster, query `GV$ASM_OPERATION`. For example, a customer can run the following query every 30 minutes to evaluate how much work (EST_WORK) and how much more time (EST_MINUTES) is potentially required: `select operation, pass, state, sofar, est_work, est_minutes from gv$asm_operation where operation='REBAL';` Note that the estimated statistics tend to become more accurate as the rebalance progresses, but can vary based on the concurrent workload.
Adding Exadata Database Servers	Online operation to expand your VM cluster. One-step process to add the Database Compute to the ExaDB-D and then expand the VM cluster. Approximately 1 to 6 hours per Exadata Database Server.	Plan to add Database Compute when your Database resource utilization will hit 80% within a month's time. Be aware and plan for this operation to take many hours to a day. Choose a non-peak window so that the add Database Compute operation can complete faster. Each Oracle RAC database registered by Oracle Clusterware and visible in the Oracle Cloud Console is extended. If a database was configured outside the Oracle Cloud Console or without dbaascli, then that database will not be extended.
Adding/Dropping Database Nodes in Virtual Machines (VMs) Cluster	Zero database downtime. Adding Database Nodes in a VM cluster typically takes 3-6 hours, depending on the number of databases in the VM cluster. Dropping Database Nodes in a VM cluster typically takes 1-2 hours, depending on the number of databases in the VM cluster.	Understand that the add/drop operation is not instantaneous and may take several hours to complete. The drop operation reduces Database compute, OCPU, and memory resources, so application performance can be impacted.
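The CPU scale-down guidance in the table above (scale down only when the load average has stayed below a threshold for at least 30 minutes, and avoid more than one scale-down request within 2 hours) can be expressed as a small guard function. This is a sketch only: the function name, the sample format, and the threshold value are hypothetical, and how load averages are collected is left to your monitoring tooling.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple


def may_scale_down(
    samples: Sequence[Tuple[datetime, float]],
    now: datetime,
    load_threshold: float,
    last_scale_down: Optional[datetime] = None,
    quiet_period: timedelta = timedelta(minutes=30),
    min_gap: timedelta = timedelta(hours=2),
) -> bool:
    """Return True only if both scale-down guidelines are satisfied.

    samples is a sequence of (timestamp, load_average) pairs (assumed format).
    """
    # Guideline: avoid more than one scale-down request within a 2-hour period.
    if last_scale_down is not None and now - last_scale_down < min_gap:
        return False

    window_start = now - quiet_period
    recent = [load for ts, load in samples if window_start <= ts <= now]
    # Require at least one sample at or before the window start, so the
    # samples span the full 30-minute window rather than only part of it.
    covers_window = any(ts <= window_start for ts, _ in samples)
    if not recent or not covers_window:
        return False

    # Guideline: load average must stay below the threshold for the whole window.
    return all(load < load_threshold for load in recent)
```

A scheduler could call this check before submitting a scale-down request and simply skip the request when it returns False. Scale-up decisions are deliberately not gated here, because the guidance is to scale up ahead of demand.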

Achieving Continuous Availability For Your Applications

As part of Oracle Exadata Database Service (ExaDB-D and ExaDB-C@C), all software updates (except for non-rolling database upgrades or non-rolling patches) can be done online or with Oracle RAC rolling updates to achieve continuous database uptime. Furthermore, any local failures of storage, Exadata network, or Exadata database server are managed automatically, and database uptime is maintained.

To achieve continuous application uptime during Oracle RAC switchover or failover events, follow these application-configuration best practices:

Use Oracle Clusterware-managed database services to connect your application. For Oracle Data Guard environments, use role based services.
Use recommended connection string with built-in timeouts, retries, and delays, so that incoming connections do not see errors during outages.
Configure your connections with Fast Application Notification.
Drain and relocate services. Refer to the table below and use recommended best practices that support draining, such as testing connections when borrowing them or when starting batches of work, and returning connections to pools between uses.
Leverage Application Continuity or Transparent Application Continuity to replay in-flight uncommitted transactions transparently after failures.

For more details on the above checklist, see Configuring Continuous Availability for Applications. Oracle recommends testing your application readiness by consulting KB59878.
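To illustrate the checklist item about connection strings with built-in timeouts, retries, and delays, the following sketch assembles an Oracle Net connect descriptor using the `CONNECT_TIMEOUT`, `RETRY_COUNT`, `RETRY_DELAY`, and `TRANSPORT_CONNECT_TIMEOUT` parameters. The helper function, host name, service name, and the numeric values are illustrative assumptions; take the recommended values for your environment from Configuring Continuous Availability for Applications.

```python
def build_connect_descriptor(
    host: str,
    service_name: str,
    port: int = 1521,
    connect_timeout: int = 90,          # assumed value, in seconds
    retry_count: int = 50,              # assumed value
    retry_delay: int = 3,               # assumed value, in seconds
    transport_connect_timeout: int = 3, # assumed value, in seconds
) -> str:
    """Build a connect descriptor that retries instead of failing immediately."""
    return (
        "(DESCRIPTION="
        f"(CONNECT_TIMEOUT={connect_timeout})"
        f"(RETRY_COUNT={retry_count})"
        f"(RETRY_DELAY={retry_delay})"
        f"(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})"
        "(ADDRESS_LIST=(LOAD_BALANCE=ON)"
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


if __name__ == "__main__":
    # Hypothetical SCAN host and Clusterware-managed service name.
    print(build_connect_descriptor("example-scan.example.com", "app_service"))
```

The service name should refer to an Oracle Clusterware-managed database service rather than the default database service, in line with the first item in the checklist.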

Depending on the Oracle Exadata Database Service planned maintenance event, Oracle attempts to automatically drain and relocate database services before stopping any Oracle RAC instance. For OLTP applications, draining and relocating services typically work very well and result in zero application downtime.

Some applications, such as long running batch jobs or reports, may not be able to drain and relocate gracefully within the maximum draining time. For those applications, Oracle recommends scheduling the software planned maintenance window around these types of activities or stopping these activities before the planned maintenance window. For example, you can reschedule a planned maintenance window to run outside your batch windows, or stop batch jobs before a planned maintenance window.
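As a planning aid for the recommendation above, a simple interval check can flag which batch windows collide with a proposed maintenance window. This is a minimal sketch; the function name and the window representation (start and end timestamps) are assumptions made for illustration.

```python
from datetime import datetime
from typing import List, Tuple

Window = Tuple[datetime, datetime]  # (start, end), with start < end


def overlapping_batches(maintenance: Window, batches: List[Window]) -> List[Window]:
    """Return the batch windows that overlap the maintenance window."""
    m_start, m_end = maintenance
    # Two half-open intervals overlap when each starts before the other ends.
    return [(start, end) for start, end in batches if start < m_end and m_start < end]
```

An empty result suggests the maintenance window can stay where it is; otherwise, either reschedule the maintenance window or stop the listed batch jobs before it begins.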

Special consideration is required during rolling database quarterly updates for applications that use database OJVM. See KB137197 for details.

The following table lists planned maintenance events that perform Oracle RAC instance rolling restart, and the relevant service drain timeout variables that may impact your application.

Table 35-5 Application Drain Attributes for Exadata Cloud Software Updates and Elastic Operations

Oracle Exadata Database Service Software Updates or Elastic Operations	Drain Timeout Variables
Oracle DBHOME patch apply and database MOVE	Oracle Cloud software automation stops/relocates database services while honoring `drain_timeout` settings defined by database service configuration (for example, srvctl).¹ You can override `drain_timeout` defined on services by using option `--drainTimeoutInSeconds` with command line operation `dbaascli dbHome patch` or `dbaascli database move`. The Oracle Cloud internal maximum draining time supported is 2 hours.
Oracle Grid Infrastructure (GI) patch apply and upgrade	Oracle Cloud software automation stops/relocates database services while honoring `drain_timeout` settings defined by database service configuration (for example, srvctl).¹ You can override `drain_timeout` defined on services by using option `--drainTimeoutInSeconds` with command line operation `dbaascli grid patch` or `dbaascli grid upgrade`. The Oracle Cloud internal maximum draining time supported is 2 hours.
Virtual machine operating system software update (Exadata Database Guest)	Exadata `patchmgr/dbnodeupdate` software program calls drain orchestration (`rhphelper`). Drain orchestration has the following drain timeout settings (see KB146644 for details): `DRAIN_TIMEOUT`: if a service does not have `drain_timeout` defined, then this value is used. Default value is 180 seconds. `MAX_DRAIN_TIMEOUT`: overrides any higher `drain_timeout` value defined by database service configuration. Default value is 300 seconds. There is no maximum value. `DRAIN_TIMEOUT` settings defined by database service configuration are honored during service stop/relocate.
Exadata X8 and earlier systems: scale up or down VM local /u02 file system size; scale up or down VM cluster memory	Exadata X8 and earlier systems local file system resize operation calls drain orchestration (`rhphelper`). Drain orchestration has the following drain timeout settings (see KB146644 for details): `DRAIN_TIMEOUT`: if a service does not have `drain_timeout` defined, then this value is used. Default value is 180 seconds. `MAX_DRAIN_TIMEOUT`: overrides any higher `drain_timeout` value defined by database service configuration. Default value is 300 seconds. `DRAIN_TIMEOUT` settings defined by database service configuration are honored during service stop/relocate. The Oracle Cloud internal maximum draining time supported for this operation is 300 seconds.
Exadata X8M and later systems: scale down VM local file system size	Exadata X8M and later systems call drain orchestration (`rhphelper`). Drain orchestration has the following drain timeout settings (see KB146644 for details): `DRAIN_TIMEOUT`: if a service does not have `drain_timeout` defined, then this value is used. Default value is 180 seconds. `MAX_DRAIN_TIMEOUT`: overrides any higher `drain_timeout` value defined by database service configuration. Default value is 300 seconds. `DRAIN_TIMEOUT` settings defined by database service configuration are honored during service stop/relocate. The Oracle Cloud internal maximum draining time supported for this operation is 300 seconds.
Exadata X8M and later systems: scale up or down VM cluster memory	Exadata X8M and later systems call drain orchestration (`rhphelper`). Drain orchestration has the following drain timeout settings (see KB146644 for details): `DRAIN_TIMEOUT`: if a service does not have `drain_timeout` defined, then this value is used. Default value is 180 seconds. `MAX_DRAIN_TIMEOUT`: overrides any higher `drain_timeout` value defined by database service configuration. Default value is 300 seconds. `DRAIN_TIMEOUT` settings defined by database service configuration are honored during service stop/relocate. The Oracle Cloud internal maximum draining time supported for this operation is 300 seconds.
Oracle Exadata Cloud Infrastructure (ExaDB-D) software update	The ExaDB-D database host calls drain orchestration (`rhphelper`). Drain orchestration has the following drain timeout settings (see KB146644 for details): `DRAIN_TIMEOUT`: if a service does not have `drain_timeout` defined, then this value is used. Default value is 180 seconds. `MAX_DRAIN_TIMEOUT`: overrides any higher `drain_timeout` value defined by database service configuration. Default value is 300 seconds. `DRAIN_TIMEOUT` settings defined by database service configuration are honored during service stop/relocate. The Oracle Cloud internal maximum draining time supported for this operation depends on the system: for Exadata X8 and earlier systems, the timeout is 300 seconds; for Exadata X8M and later systems, the timeout is 500 seconds. Enhanced Infrastructure Maintenance Controls feature: To achieve a draining time longer than the Oracle Cloud internal maximum, leverage the custom action capability of the Enhanced Infrastructure Maintenance Controls feature, which allows you to suspend infrastructure maintenance before the next database server update starts, then directly stop/relocate database services running on the database server, and then resume infrastructure maintenance to proceed to the next database server. This feature is also currently available for Oracle Exadata Cloud@Customer (ExaDB-C@C). See Configure Oracle-Managed Infrastructure Maintenance in Oracle Cloud Infrastructure Documentation for details.
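The interaction between a service's own `drain_timeout`, the `DRAIN_TIMEOUT` default, and the `MAX_DRAIN_TIMEOUT` cap described in the table above can be summarized in a few lines. This sketch only models the rules as stated in the table; it is not the `rhphelper` implementation, and the function and parameter names are hypothetical.

```python
from typing import Optional


def effective_drain_timeout(
    service_drain_timeout: Optional[int],
    default_drain_timeout: int = 180,  # DRAIN_TIMEOUT default, in seconds
    max_drain_timeout: int = 300,      # MAX_DRAIN_TIMEOUT default, in seconds
) -> int:
    """Return the drain timeout (seconds) that applies to a service.

    service_drain_timeout is the service-level drain_timeout, or None
    when the service does not define one.
    """
    if service_drain_timeout is None:
        # No service-level setting: fall back to DRAIN_TIMEOUT.
        chosen = default_drain_timeout
    else:
        chosen = service_drain_timeout
    # MAX_DRAIN_TIMEOUT overrides any higher value.
    return min(chosen, max_drain_timeout)
```

Under these rules, a service with no `drain_timeout` drains for up to 180 seconds, a service configured with 120 seconds keeps its 120 seconds, and a service configured with 600 seconds is capped at 300 seconds unless `MAX_DRAIN_TIMEOUT` is raised.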

Oracle DBHOME patch apply and database MOVE

Oracle Cloud software automation stops/relocates database services while honoring drain_timeout settings defined by database service configuration (for example, srvctl).¹

You can override the drain_timeout defined on services by using the option --drainTimeoutInSeconds with the command-line operations dbaascli dbHome patch or dbaascli database move.

The Oracle Cloud internal maximum draining time supported is 2 hours.
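The override and its 2-hour ceiling can be checked before a patch run is scheduled. The sketch below is illustrative only: `drain_override_args` is a hypothetical helper, it assumes the option is written with two leading hyphens, and it omits every other argument that the dbaascli operations require. Rejecting values above the maximum is an assumption of this sketch; the text does not say how the tooling treats a larger value.

```python
from typing import List

# Internal maximum draining time from the text above: 2 hours, in seconds.
CLOUD_MAX_DRAIN_SECONDS = 2 * 60 * 60


def drain_override_args(operation: List[str], drain_seconds: int) -> List[str]:
    """Build a partial dbaascli argument list that carries a drain override.

    `operation` is the subcommand, for example ["dbHome", "patch"].
    Other required arguments are intentionally left out of this sketch.
    """
    if drain_seconds < 0 or drain_seconds > CLOUD_MAX_DRAIN_SECONDS:
        raise ValueError(
            f"drain timeout must be between 0 and {CLOUD_MAX_DRAIN_SECONDS} seconds"
        )
    return ["dbaascli", *operation, "--drainTimeoutInSeconds", str(drain_seconds)]


if __name__ == "__main__":
    # Request a 10-minute drain for a database home patch.
    print(" ".join(drain_override_args(["dbHome", "patch"], 600)))
```

The same helper applies to the other operations that accept the option, because only the subcommand differs.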

Oracle Grid Infrastructure (GI) patch apply and upgrade

Oracle Cloud software automation stops/relocates database services while honoring drain_timeout settings defined by database service configuration (for example, srvctl).¹

You can override the drain_timeout defined on services by using the option --drainTimeoutInSeconds with the command-line operations dbaascli grid patch or dbaascli grid upgrade.

The Oracle Cloud internal maximum draining time supported is 2 hours.

Virtual machine operating system software update (Exadata Database Guest)

The Exadata patchmgr/dbnodeupdate software calls drain orchestration (rhphelper).

Drain orchestration has the following drain timeout settings (See KB146644 for details):

DRAIN_TIMEOUT: if a service does not have drain_timeout defined, then this value is used. Default value is 180 seconds.
MAX_DRAIN_TIMEOUT: overrides any higher drain_timeout value defined by database service configuration. Default value is 300 seconds. There is no maximum value.

DRAIN_TIMEOUT settings defined by database service configuration are honored during service stop/relocate.
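The interaction between the two settings can be read as a small rule: fall back to DRAIN_TIMEOUT when a service defines nothing, and cap a service-defined value at MAX_DRAIN_TIMEOUT. The following is a minimal sketch of that rule as described here, not the rhphelper implementation; the function name and parameter names are hypothetical.

```python
from typing import Optional

# Default values quoted in the text above (in seconds).
DRAIN_TIMEOUT = 180
MAX_DRAIN_TIMEOUT = 300


def effective_drain_timeout(
    service_drain_timeout: Optional[int],
    default_timeout: int = DRAIN_TIMEOUT,
    max_timeout: int = MAX_DRAIN_TIMEOUT,
) -> int:
    """Return the drain timeout, in seconds, that would apply to one service.

    service_drain_timeout is the drain_timeout from the service
    configuration, or None when the service does not define one.
    """
    if service_drain_timeout is None:
        # No service-level setting: the orchestration default applies.
        return default_timeout
    # A higher service-level value is overridden by the maximum.
    return min(service_drain_timeout, max_timeout)


if __name__ == "__main__":
    for configured in (None, 120, 600):
        print(configured, "->", effective_drain_timeout(configured))
```

Raising `max_timeout` is what allows a service with a long drain_timeout to keep its full value in this model.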

Exadata X8 and earlier systems

Scale up or down VM local /u02 file system size
Scale up or down VM cluster memory

On Exadata X8 and earlier systems, the local file system resize operation calls drain orchestration (rhphelper).

Drain orchestration has the following drain timeout settings (See KB146644 for details):

DRAIN_TIMEOUT: if a service does not have drain_timeout defined, then this value is used. Default value is 180 seconds.
MAX_DRAIN_TIMEOUT: overrides any higher drain_timeout value defined by database service configuration. Default value is 300 seconds.

DRAIN_TIMEOUT settings defined by database service configuration are honored during service stop/relocate.

The Oracle Cloud internal maximum draining time supported for this operation is 300 seconds.

Exadata X8M and later systems

Scale down VM local file system size

Exadata X8M and later systems call drain orchestration (rhphelper).

Drain orchestration has the following drain timeout settings (See KB146644 for details):

DRAIN_TIMEOUT: if a service does not have drain_timeout defined, then this value is used. Default value is 180 seconds.
MAX_DRAIN_TIMEOUT: overrides any higher drain_timeout value defined by database service configuration. Default value is 300 seconds.

DRAIN_TIMEOUT settings defined by database service configuration are honored during service stop/relocate.

The Oracle Cloud internal maximum draining time supported for this operation is 300 seconds.

¹ The minimum software requirements to achieve this service drain capability are 1) Oracle Database 12.2 or later and 2) the latest Oracle Cloud DBaaS tooling software.

Oracle Maximum Availability Architecture Reference Architectures in Oracle Exadata Cloud

Oracle Exadata Cloud (ExaDB-D and ExaDB-C@C) supports all Oracle Maximum Availability Architecture reference architectures, providing support for all Oracle Databases, regardless of their specific high availability, data protection, and disaster recovery service-level agreements. See MAA Best Practices for the Oracle Cloud for more information about Oracle Maximum Availability Architecture in the Oracle Exadata Cloud.
