About the BothDown State

The Operator provisions, monitors, and manages active standby pairs of TimesTen databases. It detects and reacts to the failure of the active or the standby database. For example, when one database in the active standby pair is down, the Operator does the following:

  • If the active database fails, the Operator promotes the standby to be the active.

  • If the standby database fails, the Operator keeps the active running and repairs the standby.

However, if both databases fail at the same time, it is essential that the databases are brought back up appropriately. TimesTen replication does not atomically commit transactions in both databases simultaneously. Transactions are committed in one database first and are committed in the other database later. (The database on which transactions are committed first is considered the database that is ahead.) Depending on how replication is configured, the active database may be ahead of the standby, or the standby may be ahead of the active. To avoid data loss, the database that is ahead must become the active database after the failure is corrected.
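The idea of one database being "ahead" can be illustrated with a toy model. The sketch below is not how the Operator inspects databases; it assumes, purely for illustration, that each database's committed transactions can be represented as a set of integer IDs. The function name and parameters are hypothetical.

```python
from typing import Optional, Set


def database_that_is_ahead(
    active_commits: Set[int], standby_commits: Set[int]
) -> Optional[str]:
    """Toy model: name the database that holds every committed transaction.

    Returns "active" or "standby" when one database contains all of the
    other's commits, or None when each side holds commits the other lacks.
    When both sets are equal, either choice is safe; "active" is returned.
    """
    if standby_commits <= active_commits:
        return "active"
    if active_commits <= standby_commits:
        return "standby"
    return None


print(database_that_is_ahead({1, 2, 3}, {1, 2}))     # active is ahead
print(database_that_is_ahead({1, 2}, {1, 2, 3}))     # standby is ahead
print(database_that_is_ahead({1, 2, 4}, {1, 2, 3}))  # neither is ahead
```

In this model, promoting the database that is not ahead would discard the transactions that only the other database had committed.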

In most cases, the Operator can determine which database was ahead at the time of the failure. However, there are cases where it cannot. In particular, the Operator cannot determine which database was ahead if all of the following conditions occur:

  • Both databases failed during the polling interval. Specifically, the Operator examined both databases and the TimesTen Pods were in the Healthy state. The Operator waited pollingInterval seconds, and when the Operator examined the databases again (after this pollingInterval), both databases were down and

  • RETURN TWOSAFE replication was configured and

  • DISABLE RETURN or LOCAL COMMIT ACTION COMMIT (or both) were configured.

See TimesTenClassicSpecSpec for more information on the .spec.ttspec.pollingInterval datum and on the RETURN TWOSAFE and DISABLE RETURN replication configuration options. Also, see CREATE ACTIVE STANDBY PAIR in the Oracle TimesTen In-Memory Database SQL Reference and Defining an Active Standby Pair Replication Scheme in the Oracle TimesTen In-Memory Database Replication Guide for information on defining an active standby pair replication scheme.

This combination of events indicates that some transactions may have committed on the standby and not on the active, that some transactions may have committed on the active and not on the standby, or both. Because the Operator cannot safely choose which database to make active, it does not attempt an automatic recovery in this case.
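The three conditions listed above combine as a simple logical AND, with the third condition itself being an OR. A minimal sketch of that rule follows; the class and field names are hypothetical and are not part of the Operator's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureContext:
    """Hypothetical summary of what is known when both databases are down."""

    both_failed_within_one_polling_interval: bool
    return_twosafe: bool              # RETURN TWOSAFE was configured
    disable_return: bool              # DISABLE RETURN was configured
    local_commit_action_commit: bool  # LOCAL COMMIT ACTION COMMIT was configured


def ahead_is_indeterminate(ctx: FailureContext) -> bool:
    """True only when all three documented conditions hold together."""
    return (
        ctx.both_failed_within_one_polling_interval
        and ctx.return_twosafe
        and (ctx.disable_return or ctx.local_commit_action_commit)
    )


ctx = FailureContext(
    both_failed_within_one_polling_interval=True,
    return_twosafe=True,
    disable_return=False,
    local_commit_action_commit=True,
)
print(ahead_is_indeterminate(ctx))
```

If any one of the three conditions is false, this rule no longer applies.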

When both databases fail, the TimesTenClassic object enters the BothDown state. See BothDown for more information on the BothDown state. The Operator must then determine the appropriate action to take. The Operator first examines the value of the .spec.ttspec.bothDownBehavior datum to determine what to do. See TimesTenClassicSpecSpec.

If .spec.ttspec.bothDownBehavior is set to Manual, the TimesTenClassic object immediately enters the ManualInterventionRequired state. The Operator takes no further action even if either TimesTen container subsequently becomes available. See About the ManualInterventionRequired State for information on the ManualInterventionRequired state.

If .spec.ttspec.bothDownBehavior is set to Best (the default setting), the Operator attempts to determine which database was ahead at the time of failure.

  • If the Operator cannot determine which database is ahead, the TimesTenClassic object immediately enters the ManualInterventionRequired state. See About the ManualInterventionRequired State.

  • If the Operator can determine which database is ahead:

    • The TimesTenClassic object enters the WaitingForActive state. The object remains in this state until the Pod containing that database is running and the TimesTen agent located in the tt container within that Pod is responding to the Operator. At this point, the TimesTenClassic object enters the ConfiguringActive state.

    • While the TimesTenClassic object is in the ConfiguringActive state, TimesTen in this Pod is started, and the database is loaded and configured for use as the new active database. If any of these steps fails, the TimesTenClassic object enters the ManualInterventionRequired state. If the database is successfully loaded and configured as the new active, the TimesTenClassic object enters the StandbyDown state. See About Monitoring the Health of an Active Standby Pair of Databases for information on the states of your TimesTenClassic object.

    • You can specify the maximum amount of time (expressed in seconds) that the TimesTenClassic object remains in the WaitingForActive state by specifying a value for the .spec.ttspec.waitingForActiveTimeout datum. If the object is still in the WaitingForActive state after this period of time, it automatically transitions to the ManualInterventionRequired state. The default is 0, which means that there is no timeout and the object remains in the WaitingForActive state indefinitely. See TimesTenClassicSpecSpec for more information on the .spec.ttspec.waitingForActiveTimeout datum.

    • The time to recover the database varies with the size of the database. Consider the size of your database when choosing a value for .spec.ttspec.waitingForActiveTimeout.

    • If the database that is ahead cannot be loaded, the TimesTenClassic object enters the ManualInterventionRequired state. See About the ManualInterventionRequired State.
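The state transitions described in this section can be summarized as three small functions. This is a sketch of the documented behavior only, not the Operator's implementation. The function and parameter names are hypothetical; the state names and the Manual and Best values come from the text above. Two details are assumptions: the sketch treats the timeout as reached when the waited time is greater than or equal to the configured value, and it raises an error for any bothDownBehavior value other than Manual or Best.

```python
from typing import Optional

MANUAL = "ManualInterventionRequired"


def state_after_both_down(both_down_behavior: str, ahead: Optional[str]) -> str:
    """State entered from BothDown; `ahead` is None when it cannot be determined."""
    if both_down_behavior == "Manual":
        return MANUAL
    if both_down_behavior == "Best":
        return MANUAL if ahead is None else "WaitingForActive"
    # Assumption: only the two values described in this section are modeled.
    raise ValueError(f"unexpected bothDownBehavior: {both_down_behavior!r}")


def state_while_waiting(
    agent_responding: bool, seconds_waited: float, waiting_for_active_timeout: int
) -> str:
    """Next state for an object that is currently in WaitingForActive."""
    if agent_responding:  # Pod is running and its TimesTen agent answers
        return "ConfiguringActive"
    # A timeout of 0 (the default) means the object waits indefinitely.
    if waiting_for_active_timeout > 0 and seconds_waited >= waiting_for_active_timeout:
        return MANUAL
    return "WaitingForActive"


def state_after_configuring(loaded_and_configured: bool) -> str:
    """Next state once the ConfiguringActive steps have finished."""
    return "StandbyDown" if loaded_and_configured else MANUAL


print(state_after_both_down("Best", "standby"))
print(state_while_waiting(False, 700.0, 600))
print(state_after_configuring(True))
```

With the default timeout of 0, `state_while_waiting` keeps returning WaitingForActive until the agent responds, which mirrors the indefinite wait described above.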