Site Failover and Recovery
A site failover is required when an unplanned event leaves the primary site
unavailable and completely inaccessible for long enough to negatively impact
the business. In this scenario, the standby assumes the primary role.
A primary site can become unavailable for a variety of reasons, including but not limited to the following:
- Issues that might prevent the primary database instances from starting, such as failed or extensively corrupted media, or a flawed OS or firmware upgrade
- Infrastructure failures, such as a full power or cooling system outage within the OCI region
- Complete network failures
- Natural disasters such as earthquakes, fires, and floods
While unplanned events are rare, they can and do occur.
Perform Site Failover
Because a true failover is disruptive and may result in a small loss
of data, test your site failover in a test environment first.
The following example uses names from our test environment for the primary database
in Ashburn (CDBHCM_iad1dx) and the standby database in Phoenix (CDBHCM_phx5s).
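A failover of this kind is commonly driven from the Data Guard broker command line (DGMGRL), connected to the standby because the primary is unreachable. The following is a minimal sketch, not the procedure from this guide: it assumes the broker is already configured, and the helper name failover_commands is hypothetical. It only prints the command sequence so that it can be reviewed before anything is run.

```shell
# Sketch only: prints DGMGRL commands for review; nothing is executed against a database.
standby="CDBHCM_phx5s"   # standby name from the test environment described above

# Hypothetical helper: emit the broker commands for a failover to the given standby.
failover_commands() {
  printf 'SHOW CONFIGURATION;\n'        # confirm the broker sees the primary as unavailable
  printf 'VALIDATE DATABASE %s;\n' "$1" # check that the standby is ready for a role change
  printf 'FAILOVER TO %s;\n' "$1"       # the standby assumes the primary role
  printf 'SHOW CONFIGURATION;\n'        # verify the new roles
}

failover_commands "$standby"
```

Once the sequence has been reviewed, one option is to pipe it into a broker session on a standby host, for example `failover_commands "$standby" | dgmgrl /`, assuming operating system authentication is available there.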
Reinstate the Failed Primary as the New Standby
You'll want to protect your new production environment with a standby. Ideally, you'll be able to reinstate the failed primary as a new standby by reinstating both the database and the file systems.
Reinstate the Old Primary Database as the Standby
Oracle Data Guard will prevent the old primary database from opening when it is made available again
after a primary site failure. Any attempt to start the database normally will fail, with
messages written to its alert log indicating reinstatement is required. If
Flashback Database was enabled on this database before the failure, then you can reinstate
the old primary as the new standby.
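As a rough sketch of what reinstatement involves, the old primary is first mounted (not opened) and its Flashback Database status confirmed, and then the broker, connected to the new primary, is asked to reinstate it. The helper names below are hypothetical, the commands are only printed for review, and a clustered database would normally be mounted with its cluster tooling rather than a plain STARTUP MOUNT.

```shell
# Sketch only: prints the statements for review; nothing is executed against a database.
old_primary="CDBHCM_iad1dx"   # old primary name from the test environment

# Step 1 (SQL*Plus on the old primary host): mount the database and
# confirm that Flashback Database was enabled before the failure.
reinstate_sql() {
  printf 'STARTUP MOUNT\n'
  printf 'SELECT flashback_on FROM v$database;\n'
}

# Step 2 (DGMGRL connected to the new primary): reinstate the old primary
# as a standby, then verify the configuration.
reinstate_dgmgrl() {
  printf 'REINSTATE DATABASE %s;\n' "$1"
  printf 'SHOW CONFIGURATION;\n'
}

reinstate_sql
reinstate_dgmgrl "$old_primary"
```

If the query does not report that Flashback Database is on, reinstatement through the broker is not possible and the standby has to be re-created by other means.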
Perform the following steps to reinstate the old primary as a standby of the current production database: