Recovering From a Failure of the Active Database

This section describes how to recover after the active database fails.

Recovering When the Standby Database is Ready

Use the procedures in this section when the standby database is available and synchronized with the active database. The procedure to follow depends on the return service that replication uses.

When Replication is Return Receipt or Asynchronous

Use this procedure to recover from a failure of the active database when replication uses the return receipt service or is asynchronous.

Complete the following tasks:

  1. Stop the replication agent on the failed database if it has not already been stopped.
  2. On the standby database, call ttRepStateSet('ACTIVE'). This changes the role of the database from STANDBY to ACTIVE.
  3. On the new active database, call ttRepStateSave('FAILED', 'failed_database', 'host_name'), where failed_database is the former active database that failed and host_name is the host of that database. This step is necessary for the new active database to replicate directly to the subscriber databases. During normal operation, only the standby database replicates to the subscribers.
  4. Destroy the failed database.
  5. Duplicate the new active database to the new standby database.
  6. Set up the replication agent policy and start the replication agent on the new standby database. See Starting and Stopping the Replication Agents.
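
The steps above can be sketched as an ordered command plan. This is a minimal sketch, not a definitive script: the Pair fields and the names used with it (master1, master2, host1, host2, ttadmin) are hypothetical, the sketch assumes each DSN matches its database name, and the manual agent policy is only one possible choice. It builds the command strings for review and runs nothing. Check the options shown for ttAdmin, ttDestroy, and ttRepAdmin against the utility reference for your release before using them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """Hypothetical names; substitute the ones from your replication scheme."""
    failed_dsn: str    # former active database that failed
    failed_host: str   # host of the failed database
    standby_dsn: str   # standby database that becomes the new active
    standby_host: str  # host of the standby database
    admin_uid: str     # user allowed to duplicate the new active database


def isql(dsn: str, statement: str) -> str:
    """Format a ttIsql invocation that runs one statement and exits."""
    return f'ttIsql -e "{statement}; exit;" {dsn}'


def failover_plan(p: Pair) -> list[str]:
    """Return the commands for steps 1 to 6, in order, without running them."""
    return [
        # Step 1: stop the replication agent on the failed database.
        f"ttAdmin -repStop {p.failed_dsn}",
        # Step 2: promote the standby database.
        isql(p.standby_dsn, "call ttRepStateSet('ACTIVE')"),
        # Step 3: record the failed database on the new active database.
        isql(p.standby_dsn,
             f"call ttRepStateSave('FAILED', '{p.failed_dsn}', '{p.failed_host}')"),
        # Step 4: destroy the failed database.
        f"ttDestroy {p.failed_dsn}",
        # Step 5: duplicate the new active database to the new standby.
        f"ttRepAdmin -duplicate -from {p.standby_dsn} -host {p.standby_host} "
        f"-uid {p.admin_uid} {p.failed_dsn}",
        # Step 6: set the agent policy (assumed: manual), then start the agent.
        f"ttAdmin -repPolicy manual {p.failed_dsn}",
        f"ttAdmin -repStart {p.failed_dsn}",
    ]


if __name__ == "__main__":
    for command in failover_plan(Pair("master1", "host1", "master2", "host2", "ttadmin")):
        print(command)
```

Step 6 expands to two commands, so the plan has seven entries. Keeping the plan as plain strings lets an operator review each command, and the node on which to run it, before executing anything.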

The new standby database contacts the new active database. The active database stops sending updates to the subscribers. When the standby database is fully synchronized with the active database, the standby database enters the STANDBY state and starts sending updates to the subscribers.

Note:

You can verify that the standby database has entered the STANDBY state by using the ttRepStateGet built-in procedure.
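
The check can be automated by polling until the state is reported. This is a minimal sketch under stated assumptions: get_state is a hypothetical callable that you supply, for example a wrapper that runs call ttRepStateGet through your database driver and returns the state as a string. The timeout and interval defaults are illustrative only.

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str],
                   wanted: str = "STANDBY",
                   timeout_s: float = 300.0,
                   interval_s: float = 5.0,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll get_state() until it returns `wanted` or the timeout expires.

    get_state is hypothetical: it should return the value reported by
    ttRepStateGet for the database being checked.
    """
    deadline = clock() + timeout_s
    while True:
        # Normalize whitespace and case before comparing.
        if get_state().strip().upper() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The sleep and clock parameters exist only so the function can be exercised without real delays; in normal use, leave them at their defaults.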

When Replication is Return Twosafe

Use this procedure to recover from a failure of the active database when replication uses the return twosafe service.

Complete the following tasks:

  1. On the standby database, call ttRepStateSet('ACTIVE'). This changes the role of the database from STANDBY to ACTIVE.
  2. On the new active database, call ttRepStateSave('FAILED', 'failed_database', 'host_name'), where failed_database is the former active database that failed and host_name is the host of that database. This step is necessary for the new active database to replicate directly to the subscriber databases. During normal operation, only the standby database replicates to the subscribers.
  3. Connect to the failed database. This triggers recovery from the local transaction logs. If database recovery fails, you must continue from Step 5 of the procedure for recovering when replication is return receipt or asynchronous. See When Replication is Return Receipt or Asynchronous.
  4. Verify that the replication agent for the failed database has restarted. If it has not restarted, then start the replication agent. See Starting and Stopping the Replication Agents.
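
The branch in steps 3 and 4 can be sketched as follows. This is a minimal sketch, not a definitive implementation: all four callables are hypothetical hooks that you supply (for example, connect could open a connection to the failed database through your driver, and redo_from_duplicate could run Step 5 onward of the return receipt or asynchronous procedure). The sketch treats any exception from connect as a failed recovery, which is an assumption; a real script should inspect the specific error.

```python
from typing import Callable


def recover_failed_twosafe(connect: Callable[[], None],
                           agent_running: Callable[[], bool],
                           start_agent: Callable[[], None],
                           redo_from_duplicate: Callable[[], None]) -> str:
    """Recover the failed database from its local logs, else fall back.

    Returns "recovered" when the connection succeeds, or "duplicated" when
    recovery fails and the fallback procedure is used instead.
    """
    try:
        # Step 3: connecting triggers recovery from the local transaction logs.
        connect()
    except Exception:
        # Recovery failed: continue from Step 5 of the return receipt or
        # asynchronous procedure.
        redo_from_duplicate()
        return "duplicated"
    # Step 4: make sure the replication agent is running.
    if not agent_running():
        start_agent()
    return "recovered"
```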

When the active database determines that it is fully synchronized with the standby database, the standby database enters the STANDBY state and starts sending updates to the subscribers.

Note:

You can verify that the standby database has entered the STANDBY state by using the ttRepStateGet built-in procedure.

Failing Back to the Original Nodes

After a successful failover, you may want to fail back so that the active database and the standby database are on their original nodes.

See Reversing the Roles of the Active and Standby Databases.