About Autonomous Data Guard

Autonomous Data Guard creates and maintains two completely separate copies of your database: a primary database that your applications connect to and use, and a standby database that is a synchronized copy of the primary database. Then, should the primary database become unavailable for any reason, Autonomous Data Guard automatically converts the standby database to the primary database, which then begins servicing your applications.

The primary and standby databases are often called peer databases of each other.

Note

Applications must be configured to use Transparent Application Continuity (TAC) to gain the full benefit of the database availability features provided by Autonomous Data Guard.

The following diagram shows how the standby database is kept synchronized with the primary database.


[Diagram: autonomous-data-guard.eps]

Changes made to the primary database are recorded in the primary database's redo log. Autonomous Data Guard transmits these redo records as a stream over the network to the standby database's redo log. Then, the standby database applies these records to the standby database. In this way, the standby database is kept synchronized with the primary database.

Synchronization is nearly instantaneous, but, as the process just described implies, there are two operations that consume time: transporting the redo records to the standby database and applying the redo records to the standby database. The first of these is called the transport lag, and the other is called the apply lag. You can view current lag values for a dedicated Autonomous Database from the database's Details page by choosing Autonomous Data Guard Associations under Resources in the side menu. You can view current lag values across all the dedicated Autonomous Databases in a container database from the container database's Details page in a similar fashion.
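As a rough illustration of the two lags just described, the following Python sketch treats them as separate delays measured on a single redo record. The RedoRecord type and its timestamp fields are hypothetical names introduced for this example, and the metrics the service actually reports may be measured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """One redo record with hypothetical timestamps, in seconds."""
    generated_at: float  # written to the primary database's redo log
    received_at: float   # arrived in the standby database's redo log
    applied_at: float    # applied to the standby database


def transport_lag(record: RedoRecord) -> float:
    """Time spent moving the record from the primary to the standby."""
    return record.received_at - record.generated_at


def apply_lag(record: RedoRecord) -> float:
    """Time the record waited on the standby before being applied."""
    return record.applied_at - record.received_at


record = RedoRecord(generated_at=100.0, received_at=100.25, applied_at=101.0)
print(transport_lag(record))  # 0.25
print(apply_lag(record))      # 0.75
```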

Configuring Autonomous Data Guard

In Oracle Autonomous Database on dedicated infrastructure, you configure and manage Autonomous Data Guard at the Autonomous Container Database level.

When you create an Autonomous Container Database, you select the Enable Autonomous Data Guard option and then provide details about the standby database, most notably the Exadata Infrastructure and Autonomous Exadata VM Cluster resources you want it created in and the protection mode you want to use. For information about the choices you have regarding these two details, see Choosing Autonomous Data Guard Configuration Options.

Role Transitions and Operations

After an Autonomous Container Database is created, you can change the role of the peer databases using either a switchover or a failover operation. Additionally, should the primary database become unavailable for any reason, Autonomous Data Guard automatically performs a failover operation.

A switchover is a role reversal between the primary database and its standby database. A switchover ensures no data loss. During a switchover, the primary database transitions to the standby role, and the standby database transitions to the primary role. In general, you perform a switchover operation to restore the roles of peer databases to their original configuration after a failed primary database has been reinstated as a standby. To perform a switchover operation, see Switch Roles in an Autonomous Data Guard Configuration.

A failover transitions the standby database to the primary role when the primary database is unavailable. In general, you do not need to perform a failover operation manually. However, in the rare case when an automatic failover operation fails, you can perform a manual failover as described in Fail Over to the Standby in an Autonomous Data Guard Configuration.

The availability and status of the database after a failover operation is characterized by two recovery objectives:

  • Recovery Time Objective (RTO). The RTO is the maximum amount of time required for the database to become available to applications after a failover, and is related to some degree to the apply lag at the time of the failure. For Autonomous Data Guard, the RTO ranges from seconds up to two minutes.
  • Recovery Point Objective (RPO). The RPO is the maximum duration of potential data loss from the failed primary database, and is related to some degree to the transport lag at the time of the failure. For Autonomous Data Guard, the RPO is near-zero.

When a failed primary database becomes available again, its role is set to Disabled Standby. At this point, you re-enable it by performing a reinstate operation. To perform a reinstate operation, see Reinstate the Disabled Standby in an Autonomous Data Guard Configuration.
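The role changes described in this section can be pictured as a small state machine. The following Python sketch is only a model of those rules, not the service's implementation. The Role enum, the function names, and the peer names acd_a and acd_b are hypothetical.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "Primary"
    STANDBY = "Standby"
    DISABLED_STANDBY = "Disabled Standby"


def _require_healthy_pair(roles: dict) -> None:
    # Both operations below start from one primary and one standby.
    if sorted(r.value for r in roles.values()) != ["Primary", "Standby"]:
        raise ValueError("expected exactly one primary and one standby")


def switchover(roles: dict) -> dict:
    """Reverse the roles of the two peers."""
    _require_healthy_pair(roles)
    swap = {Role.PRIMARY: Role.STANDBY, Role.STANDBY: Role.PRIMARY}
    return {name: swap[role] for name, role in roles.items()}


def failover(roles: dict) -> dict:
    """Promote the standby; the failed primary becomes Disabled Standby."""
    _require_healthy_pair(roles)
    after = {Role.PRIMARY: Role.DISABLED_STANDBY, Role.STANDBY: Role.PRIMARY}
    return {name: after[role] for name, role in roles.items()}


def reinstate(roles: dict) -> dict:
    """Re-enable a Disabled Standby as a regular standby."""
    if Role.DISABLED_STANDBY not in roles.values():
        raise ValueError("no disabled standby to reinstate")
    return {
        name: Role.STANDBY if role is Role.DISABLED_STANDBY else role
        for name, role in roles.items()
    }


initial = {"acd_a": Role.PRIMARY, "acd_b": Role.STANDBY}
after_failover = failover(initial)       # acd_a: Disabled Standby, acd_b: Primary
after_reinstate = reinstate(after_failover)  # acd_a: Standby, acd_b: Primary
restored = switchover(after_reinstate)   # back to the original roles
print(restored == initial)  # True
```

The final switchover mirrors the typical use noted above: restoring the original roles after a failed primary has been reinstated as a standby.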

Automatic Failover or Fast-Start Failover

With automatic failover, whenever the primary Autonomous Container Database becomes unavailable because of a region failure, an availability domain failure, a failure of the Exadata Infrastructure or Autonomous Exadata VM Cluster, or the failure of the Autonomous Container Database itself, Autonomous Data Guard automatically fails over to the standby Autonomous Container Database. This is also known as Fast-Start Failover.

Automatic failover is not enabled by default. You can enable it by selecting the Enable automatic failover option while configuring Autonomous Data Guard. Both the maximum performance and maximum availability protection modes support automatic failover:
  • In the Maximum availability mode, automatic failover guarantees zero data loss.
  • In the Maximum performance mode, automatic failover ensures that the standby database does not fall behind the primary database beyond the value specified for Fast Start failover lag limit. By default, Fast Start failover lag limit is set to 30 seconds and is applicable only to the Maximum performance mode. In this case, automatic failover is only possible when the configured data loss guarantee can be upheld.

Besides hardware failures, availability domain outages, and regional outages, there are a few more database health conditions that can trigger a Fast-Start Failover, as listed below:

  • Corrupted Controlfile. Controlfile is permanently damaged because of a disk failure.
  • Corrupted Dictionary. Dictionary corruption of a critical database. Currently, this state can be detected only when the database is open.
  • Datafile Write Errors. Write errors are encountered in any data files, including temp files, system data files, and undo files.
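
The protection-mode rules and health conditions above can be summarized in a short Python sketch. It is a simplified model under stated assumptions: the function and constant names are hypothetical, the standby's lag is passed in as a single number of seconds, and Maximum availability is interpreted as a lag limit of zero so that the zero data loss guarantee holds.

```python
MAX_AVAILABILITY = "Maximum availability"
MAX_PERFORMANCE = "Maximum performance"
DEFAULT_LAG_LIMIT_SECONDS = 30.0  # default Fast Start failover lag limit

# Database health conditions listed above as Fast-Start Failover triggers.
HEALTH_CONDITIONS = frozenset({
    "Corrupted Controlfile",
    "Corrupted Dictionary",
    "Datafile Write Errors",
})


def is_health_trigger(condition: str) -> bool:
    """True if the named condition is one of the listed health triggers."""
    return condition in HEALTH_CONDITIONS


def automatic_failover_allowed(
    enabled: bool,
    protection_mode: str,
    standby_lag_seconds: float,
    lag_limit_seconds: float = DEFAULT_LAG_LIMIT_SECONDS,
) -> bool:
    """Decide whether the data loss guarantee permits automatic failover."""
    if not enabled:
        # Automatic failover is not enabled by default.
        return False
    if protection_mode == MAX_AVAILABILITY:
        # Assumption: zero data loss means the standby must not be behind.
        return standby_lag_seconds == 0
    if protection_mode == MAX_PERFORMANCE:
        # The lag limit applies only to this mode.
        return standby_lag_seconds <= lag_limit_seconds
    raise ValueError(f"unknown protection mode: {protection_mode!r}")


print(automatic_failover_allowed(True, MAX_PERFORMANCE, 12.0))  # True
print(automatic_failover_allowed(True, MAX_PERFORMANCE, 45.0))  # False
```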

As a result of automatic failover, the role of the failed primary database becomes Disabled Standby and, after a brief period, the standby database assumes the role of the primary database. After automatic failover concludes, a message is displayed on the details page of the disabled standby database advising you that failover has occurred.

After the service resolves the issues with the former primary Autonomous Container Database, you can perform a manual switchover to return both databases to their initial roles. Once you provision the standby database, you can perform various management tasks related to the standby database, including:
  • Manually switching over a primary database to a standby database.
  • Manually failing over a primary database to a standby database.
  • Reinstating a former primary database to the standby role after a failover.
  • Terminating a standby database.

Accessing Standby Databases from Client Applications

In an Autonomous Data Guard configuration, your client applications normally connect to and perform operations on the primary database.

In addition to this normal connectivity, Autonomous Data Guard provides you the option to connect client applications that perform only read-only operations to the standby database. To take advantage of this option, client applications connect to the database using database service names that include "_RO" (for "read only"), as described in Predefined Database Service Names for Autonomous Databases.
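
As a minimal sketch of this routing choice, the following Python helper picks a service name based on whether a workload is read-only. The service names in the example are placeholders, not actual predefined names; consult Predefined Database Service Names for Autonomous Databases for the real list.

```python
def is_read_only_service(service_name: str) -> bool:
    """True if the name contains an "_RO" component (case-insensitive)."""
    # Components after the first are the ones preceded by an underscore.
    return "RO" in service_name.upper().split("_")[1:]


def choose_service(service_names: list, read_only_workload: bool) -> str:
    """Return the first service whose kind matches the workload."""
    for name in service_names:
        if is_read_only_service(name) == read_only_workload:
            return name
    raise LookupError("no matching service name available")


# Placeholder names for illustration only.
services = ["example_high", "example_high_ro"]
print(choose_service(services, read_only_workload=True))   # example_high_ro
print(choose_service(services, read_only_workload=False))  # example_high
```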

Monitoring Lag Times

As your databases that use Autonomous Data Guard are running, you can monitor transport lag and apply lag times from the database's (or container database's) Details page by choosing Autonomous Data Guard Associations under Resources in the side menu.

You should expect to see minor fluctuations over time as the workload on your database ebbs and flows. However, if you notice a continuing upward trend in lag time, you can take these actions to resolve the situation:

  • Upward Trend in Apply Lag. A continuing upward trend in apply lag indicates that the standby database doesn't have sufficient capacity to keep up with the redo records coming from the primary database. To resolve this situation, scale up the OCPUs of the database, as described in Add CPU or Storage Resources to a Dedicated Autonomous Database.
  • Upward Trend in Transport Lag. A continuing upward trend in transport lag indicates a network performance issue. Oracle Cloud operations staff constantly monitors network performance, so you should see the situation resolve itself without you taking any action. However, if you want, you can bring the situation to the attention of the operations staff by raising a service request, as described in Create a Service Request in My Oracle Support.
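
One way to tell a continuing upward trend from ordinary fluctuation is to fit a line to recent lag samples and check its slope. The Python sketch below does this with a least-squares fit. The function names, the sample values, and the min_points and min_slope thresholds are all hypothetical choices for illustration, and the samples are assumed to be lag values you have recorded yourself at regular intervals.

```python
def slope(samples: list) -> float:
    """Least-squares slope of evenly spaced samples (needs 2 or more)."""
    n = len(samples)
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_trending_up(samples: list, min_points: int = 5,
                   min_slope: float = 0.05) -> bool:
    """True if enough samples exist and the fitted slope exceeds min_slope."""
    if len(samples) < max(min_points, 2):
        return False
    return slope(samples) > min_slope


def suggested_action(lag_kind: str, samples: list) -> str:
    """Map a lag kind and its samples to the action described above."""
    if lag_kind not in ("apply", "transport"):
        raise ValueError(f"unknown lag kind: {lag_kind!r}")
    if not is_trending_up(samples):
        return "no action needed"
    if lag_kind == "apply":
        return "scale up the database's OCPUs"
    return "optionally raise a service request"


print(suggested_action("apply", [1, 2, 3, 4, 5, 6]))      # scale up the database's OCPUs
print(suggested_action("transport", [2, 1, 2, 1, 2, 1]))  # no action needed
```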