About the BothDown State
In replicated configurations, the TimesTen Operator provisions, monitors, and manages active standby pairs of TimesTen databases. It detects and reacts to the failure of the active or the standby database. For example, when one database in the active standby pair is down, the TimesTen Operator does the following:
- If the active database fails, the Operator promotes the standby to be the active.
- If the standby database fails, the Operator keeps the active running and repairs the standby.
However, if both databases fail at the same time, it is essential that the databases are brought back up appropriately. TimesTen replication does not atomically commit transactions in both databases simultaneously. Transactions are committed in one database and are committed in the other database later. (The database on which transactions are committed first is considered the database that is ahead.) Depending on how replication is configured, the active database may be ahead of the standby, or the standby may be ahead of the active. To avoid data loss, the database that is ahead must become the active database after the failure is corrected.
In most cases, the TimesTen Operator can determine which database was ahead at the time of the failure. However, there are cases where the Operator cannot determine which database was ahead. In particular, the Operator cannot determine which database is ahead if all of the following conditions occur:
- Both databases failed during the polling interval. Specifically, the Operator examined both databases and the TimesTen Pods were in the Healthy state. The Operator waited pollingInterval seconds, and when the Operator examined the databases again (after this pollingInterval), both databases were down, and
- RETURN TWOSAFE replication was configured, and
- DISABLE RETURN or LOCAL COMMIT ACTION COMMIT (or both) were configured.
See TimesTenClassicSpecSpec for more information on the .spec.ttspec.pollingInterval datum and on the RETURN TWOSAFE and DISABLE RETURN replication configuration options. For information about defining an active standby pair replication scheme, see CREATE ACTIVE STANDBY PAIR in the Oracle TimesTen In-Memory Database SQL Reference and Defining an Active Standby Pair Replication Scheme in the Oracle TimesTen In-Memory Database Replication Guide.
This combination of events indicates that some transactions may have committed on the standby and not on the active and/or some transactions may have committed on the active and not on the standby. The TimesTen Operator takes no action in this case.
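The three conditions above can be read as a single conjunction. The following Python sketch is only an illustrative model of that rule, not the Operator's code: the FailureObservation class, its field names, and the ahead_database_is_ambiguous function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class FailureObservation:
    """Hypothetical summary of what was observed around a double failure."""
    healthy_at_previous_poll: bool    # both Pods were Healthy at the last poll
    both_down_at_current_poll: bool   # both databases down one pollingInterval later
    return_twosafe: bool              # RETURN TWOSAFE was configured
    disable_return: bool              # DISABLE RETURN was configured
    local_commit_action_commit: bool  # LOCAL COMMIT ACTION COMMIT was configured


def ahead_database_is_ambiguous(obs: FailureObservation) -> bool:
    """Return True only when all three conditions from the text hold together."""
    # Condition 1: both databases failed within a single polling interval.
    failed_within_one_interval = (
        obs.healthy_at_previous_poll and obs.both_down_at_current_poll
    )
    # Condition 3: at least one of the two options was configured.
    relaxed_commit = obs.disable_return or obs.local_commit_action_commit
    # Condition 2 (RETURN TWOSAFE) must also hold.
    return failed_within_one_interval and obs.return_twosafe and relaxed_commit
```

In this model, a pair without RETURN TWOSAFE never satisfies the rule, which matches the statement that all of the conditions must occur.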
When both databases fail, the TimesTenClassic object enters the BothDown state. The Operator must then determine the appropriate action to take. The Operator first examines the value of the .spec.ttspec.bothDownBehavior datum to determine what to do.
If .spec.ttspec.bothDownBehavior is set to Manual, the TimesTenClassic object immediately enters the ManualInterventionRequired state. The Operator takes no further action even if either TimesTen container subsequently becomes available. See About the ManualInterventionRequired State for Replicated Objects.
If .spec.ttspec.bothDownBehavior is set to Best (the default setting), the Operator attempts to determine which database was ahead at the time of failure.
- If the Operator cannot determine which database is ahead, the TimesTenClassic object immediately enters the ManualInterventionRequired state. See About the ManualInterventionRequired State for Replicated Objects.
- If the Operator can determine which database is ahead:
  - The TimesTenClassic object enters the WaitingForActive state. The object remains in this state until the Pod containing that database is running and the TimesTen agent located in the tt container within that Pod is responding to the Operator. At this point, the TimesTenClassic object enters the ConfiguringActive state.
  - While the TimesTenClassic object is in the ConfiguringActive state, TimesTen in this Pod is started, the database is loaded and is configured for use as the new active database. If there are any problems with these steps, the TimesTenClassic object enters the ManualInterventionRequired state. If the database is successfully loaded and successfully configured as the new active, the TimesTenClassic object enters the StandbyDown state. See About the High Level State of TimesTenClassic Objects for information on the states of your TimesTenClassic object.
  - You can specify the maximum amount of time (expressed in seconds) that the TimesTenClassic object remains in the WaitingForActive state by specifying a value for the spec.ttspec.waitingForActiveTimeout datum. After this period of time, if the object is still in the WaitingForActive state, the object automatically transitions to the ManualInterventionRequired state. The default is 0, which indicates that there is no timeout, and the object will remain in this state indefinitely. See TimesTenClassicSpecSpec for more information on the spec.ttspec.waitingForActiveTimeout datum.
  - The time to recover the database varies by the size of the database. You should consider the size of your database when deciding the value for spec.ttspec.waitingForActiveTimeout.
  - If the database that is ahead cannot be loaded, the TimesTenClassic object enters the ManualInterventionRequired state. See About the ManualInterventionRequired State for Replicated Objects.
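The state transitions described in this section can be summarized as a small model. The Python sketch below is illustrative only and is not the Operator's implementation: the function names and parameters are hypothetical, and treating the timeout as reached when the elapsed time is greater than or equal to waitingForActiveTimeout is an assumption. The state names, the Manual and Best values, and the meaning of 0 as "no timeout" come from the text above.

```python
from typing import Optional

MANUAL = "ManualInterventionRequired"


def next_state_from_both_down(both_down_behavior: str, ahead: Optional[str]) -> str:
    """Model the first decision after the object enters BothDown.

    `ahead` names the database known to be ahead, or None if unknown.
    """
    if both_down_behavior == "Manual":
        return MANUAL
    if both_down_behavior != "Best":
        raise ValueError(f"unexpected bothDownBehavior: {both_down_behavior!r}")
    # Best: proceed only if the database that is ahead could be determined.
    return "WaitingForActive" if ahead is not None else MANUAL


def next_state_from_waiting(agent_responding: bool,
                            seconds_waited: float,
                            waiting_for_active_timeout: int = 0) -> str:
    """Model WaitingForActive; a timeout of 0 means wait indefinitely."""
    if agent_responding:
        return "ConfiguringActive"
    # Assumption: the timeout fires once the elapsed time reaches the limit.
    if waiting_for_active_timeout > 0 and seconds_waited >= waiting_for_active_timeout:
        return MANUAL
    return "WaitingForActive"


def next_state_from_configuring(loaded_and_configured: bool) -> str:
    """Model ConfiguringActive: success leads to StandbyDown."""
    return "StandbyDown" if loaded_and_configured else MANUAL
```

For example, with the default timeout of 0 the model stays in WaitingForActive no matter how long the Pod takes to come back, which is why a nonzero value should be chosen with the database's recovery time in mind.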