Recovering After a Dual Failure of Both Active and Standby Databases
If both the active and standby databases fail at around the same time and you can reconnect to both of them almost immediately, then restart the replication agents (and cache agents, if applicable) and continue.
- Connect to the failed active database. This triggers recovery from the local transaction logs. If you are replicating a read-only cache group, the autorefresh state is automatically set to PAUSED.
- Verify that the replication agent for the failed active database has restarted. If it has not restarted, then start the replication agent. See Starting and Stopping the Replication Agents.
- Call ttRepStateSet('ACTIVE') on the newly recovered database. If you are replicating a read-only cache group, this action automatically causes the autorefresh state to change from PAUSED to ON for this database.
- Verify that the cache agent for the failed database has restarted. If it has not restarted, then start the cache agent.
- Connect to the failed standby master database. This triggers recovery from the local transaction logs. If you are replicating a read-only cache group, the autorefresh state is automatically set to PAUSED.
- Verify that the replication agent for the failed standby database has restarted. If it has not restarted, then start the replication agent. See Starting and Stopping the Replication Agents.
- Verify that the cache agent for the failed standby database has restarted. If it has not restarted, then start the cache agent.
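The order of the steps above can be sketched as a dry-run plan. This is only an illustration, not part of TimesTen: the DSN names master1 and master2, the Step class, and the function name are hypothetical, and the script prints the commands rather than running them. The ttAdmin -repStart and -cacheStart calls stand in for "start the agent if it has not restarted", so they apply only when the verification shows the agent is down.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    target: str  # DSN the step applies to (hypothetical names in this sketch)
    action: str  # command or built-in call, kept as text and never executed


def simple_dual_recovery_plan(active_dsn: str, standby_dsn: str,
                              uses_cache: bool = True) -> List[Step]:
    """Return the recovery steps in the order the list above gives them."""
    steps = [
        # Connecting triggers recovery from the local transaction logs.
        Step(active_dsn, f"ttIsql {active_dsn}"),
        # Only needed if the replication agent has not restarted by itself.
        Step(active_dsn, f"ttAdmin -repStart {active_dsn}"),
        # Gives the recovered database the ACTIVE role.
        Step(active_dsn, "CALL ttRepStateSet('ACTIVE');"),
    ]
    if uses_cache:
        # Only needed if the cache agent has not restarted by itself.
        steps.append(Step(active_dsn, f"ttAdmin -cacheStart {active_dsn}"))
    steps += [
        Step(standby_dsn, f"ttIsql {standby_dsn}"),
        Step(standby_dsn, f"ttAdmin -repStart {standby_dsn}"),
    ]
    if uses_cache:
        steps.append(Step(standby_dsn, f"ttAdmin -cacheStart {standby_dsn}"))
    return steps


if __name__ == "__main__":
    for number, step in enumerate(simple_dual_recovery_plan("master1", "master2"), 1):
        print(f"{number}. [{step.target}] {step.action}")
```

Keeping the plan as data makes it easy to review the order before anything touches a database. Note that ttRepStateSet('ACTIVE') appears only for the active database; the standby is not assigned a role by hand in this procedure.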
Alternatively, consider the following scenarios where both the active and standby master databases fail:
- The standby database fails. The active database fails before the standby comes back up or before the standby has been synchronized with the active database.
- The active database fails. The standby database becomes ACTIVE, and the rest of the recovery process begins. (See Recovering From a Failure of the Active Database.) The new active database fails before the new standby database is fully synchronized with it.
In these scenarios, the subscribers may have had more changes applied than the standby database.
In this case, you can take one of the following approaches:
Recover the Active Database and Duplicate a New Standby Database
You can recover an active database and then duplicate it to a new standby database.
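As a rough sketch of the duplicate step, the helper below assembles a ttRepAdmin -duplicate command line without running it. The DSNs, host name, and user names are hypothetical, and the helper itself is not part of TimesTen. Passwords are deliberately left out, and the exact set of credential options should be checked against the ttRepAdmin reference for your release.

```python
import shlex
from typing import List, Optional


def duplicate_command(source_dsn: str, source_host: str, dest_dsn: str,
                      uid: str, keep_cache_groups: bool = True,
                      cache_uid: Optional[str] = None) -> List[str]:
    """Build (but do not run) a ttRepAdmin -duplicate argument list."""
    argv = ["ttRepAdmin", "-duplicate",
            "-from", source_dsn,   # database being copied
            "-host", source_host,  # host where the source database resides
            "-uid", uid]           # TimesTen user performing the duplicate
    if keep_cache_groups:
        # Preserve cache groups on the copy (master to master).
        argv.append("-keepCG")
        if cache_uid:
            argv += ["-cacheUid", cache_uid]
    else:
        # Convert cache groups to regular tables (master to subscriber).
        argv.append("-noKeepCG")
    argv.append(dest_dsn)          # DSN of the new database comes last
    return argv


if __name__ == "__main__":
    cmd = duplicate_command("master1", "host1", "master2",
                            uid="ttadmin", cache_uid="cacheadm")
    print(shlex.join(cmd))
```

Building the argument list separately from execution lets you inspect the command first and avoids shell quoting problems if it is later passed to a process launcher.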
Restore the Active Master From a Backup
If both the active and standby masters fail and neither comes up, you can restore the active master if you have a backup.
- Restore the active master from a backup, as described in Backing Up and Restoring a TimesTen Classic Database With Cache Groups in the Oracle TimesTen In-Memory Database Cache Guide.
- Drop the replication configuration using the DROP ACTIVE STANDBY PAIR statement.
- Drop and re-create all AWT cache groups using the DROP CACHE GROUP and CREATE CACHE GROUP statements.
- Start the replication agent and the cache agent, since the cache agent needs to be active to refresh any read-only cache groups and both must be active in order to load the AWT cache groups.
- Refresh all read-only cache groups using the REFRESH CACHE GROUP statement to load the most current committed data from the cached Oracle database tables. Use the REFRESH CACHE GROUP ... PARALLEL n clause to refresh these cache groups concurrently over multiple threads.
- Load all AWT cache groups using the LOAD CACHE GROUP statement to begin the autorefresh process. Use the LOAD CACHE GROUP ... PARALLEL n clause to load these cache groups concurrently over multiple threads.
- Stop both the replication agent and the cache agent in preparation to re-create the active standby pair.
- Re-create the replication configuration using the CREATE ACTIVE STANDBY PAIR statement.
- Call ttRepStateSet('ACTIVE') on the active master database, giving it the ACTIVE role. If you are replicating a read-only cache group, this action automatically causes the autorefresh state to change from PAUSED to ON for this database.
- Set up the replication agent policy and start the replication agent on the active database. See Starting and Stopping the Replication Agents.
- Start the cache agent on the active database.
- Duplicate the active database to the standby database. You can use either the ttRepAdmin -duplicate utility or the ttRepDuplicateEx C function to duplicate a database. Use the -keepCG command line option with ttRepAdmin to preserve the cache group. See Duplicating a Database.
- Set up the replication agent policy on the standby database and start the replication agent on the new standby database. See Starting and Stopping the Replication Agents.
- Wait for the standby database to enter the STANDBY state. Use the ttRepStateGet built-in procedure to check the state.
- Start the cache agent for the standby database using the ttCacheStart built-in procedure or the ttAdmin -cacheStart utility.
- Duplicate all of the subscribers from the standby database. See Duplicating a Master Database to a Subscriber. Use the -noKeepCG command line option with ttRepAdmin in order to convert the cache group to regular TimesTen tables on the subscribers.
- Set up the replication agent policy on the subscribers and start the agent on each of the subscriber databases. See Starting and Stopping the Replication Agents.
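The procedure above depends heavily on ordering, so it can help to hold the steps as data and to script the wait for the STANDBY state. The sketch below is an illustration only: RESTORE_STEPS paraphrases the list above, and wait_for_state is a hypothetical helper that polls a caller-supplied function. How that function obtains the state (for example, by calling ttRepStateGet through a database driver or ttIsql) is left as an assumption and is not shown.

```python
import time
from typing import Callable, List, Tuple

# (where the step runs, what it does), in the order given by the list above.
RESTORE_STEPS: List[Tuple[str, str]] = [
    ("active", "restore from backup"),
    ("active", "DROP ACTIVE STANDBY PAIR"),
    ("active", "DROP CACHE GROUP / CREATE CACHE GROUP (each AWT cache group)"),
    ("active", "start replication agent and cache agent"),
    ("active", "REFRESH CACHE GROUP (each read-only cache group)"),
    ("active", "LOAD CACHE GROUP (each AWT cache group)"),
    ("active", "stop replication agent and cache agent"),
    ("active", "CREATE ACTIVE STANDBY PAIR"),
    ("active", "CALL ttRepStateSet('ACTIVE')"),
    ("active", "set replication agent policy; start replication agent"),
    ("active", "start cache agent"),
    ("standby", "ttRepAdmin -duplicate -keepCG from active"),
    ("standby", "set replication agent policy; start replication agent"),
    ("standby", "wait for STANDBY state (ttRepStateGet)"),
    ("standby", "start cache agent"),
    ("subscribers", "ttRepAdmin -duplicate -noKeepCG from standby"),
    ("subscribers", "set replication agent policy; start replication agent"),
]


def wait_for_state(get_state: Callable[[], str], wanted: str = "STANDBY",
                   timeout_s: float = 60.0, interval_s: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_state() until it returns `wanted` or the timeout expires.

    get_state is supplied by the caller; this helper never talks to a
    database itself. Returns True on success, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state().strip().upper() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    for number, (where, what) in enumerate(RESTORE_STEPS, 1):
        print(f"{number:2d}. [{where}] {what}")
```

Injecting get_state and sleep keeps the helper testable without a running database. The timeout and polling interval are arbitrary defaults for the sketch, not recommended values.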