Recovering When a Data Instance Is Down

If the failure is caused by a hardware error on the host, fix the problem with the host and then reload the database with the ttGridAdmin dbLoad command. During the reload, TimesTen Scaleout attempts to recover the element within that data instance.

If a data instance is not running, then all of the elements that the data instance manages are down. Restart a data instance whenever you find that it is down.

You can check whether any data instances are down with the ttGridAdmin dbStatus -all or ttGridAdmin dbStatus -element commands. Both show whether a data instance (and thus its element) is considered down.

% ttGridAdmin dbStatus database1 -element

Database database1 element level status as of Wed Mar 8 14:07:11 PST 2017
 
Host  Instance  Elem Status Date/Time of Event  Message 
----- --------- ---- ------ ------------------- ------- 
host3 instance1    1 opened 2017-03-08 13:58:06         
host4 instance1    2 down                               
host5 instance1    3 opened 2017-03-08 13:58:06         
host6 instance1    4 opened 2017-03-08 13:58:09
host7 instance1    5 opened 2017-03-08 13:58:09
host8 instance1    6 opened 2017-03-08 13:58:09
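
To pick out the down elements from a long status listing, the output can be filtered. The following is a minimal sketch, not part of TimesTen: list_down_elements is a hypothetical helper, and it assumes the column layout shown above (Host, Instance, Elem, Status), which may differ between releases.

```shell
# Hypothetical helper: read `ttGridAdmin dbStatus <db> -element` output on
# stdin and print host.instance for every row whose Status column is "down".
# Assumes the columns are Host, Instance, Elem, Status, as in the sample above.
list_down_elements() {
  # Data rows have a numeric element number in column 3; this skips the
  # title line, the header row, and the dashed separator.
  awk '$3 ~ /^[0-9]+$/ && $4 == "down" { print $1 "." $2 }'
}
```

For example, ttGridAdmin dbStatus database1 -element | list_down_elements prints host4.instance1 for the listing above. The host.instance form matches the hostname[.instancename] argument that the instanceExec -only option accepts.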

When a data instance is down (due to a hardware or software failure), all communication channels to its managed elements are shut down and no new connections are allowed to access these elements until the data instance is restored and the element that it manages is recovered.

If the data instance is down, you restart it by restarting its TimesTen daemon. Once restarted, the data instance connects to a ZooKeeper server. If it does not immediately connect, it continues to try to connect to a ZooKeeper server. After connection, the data instance loads its element.

Note:

If the data instance cannot connect to any ZooKeeper server, it may retry indefinitely and never load its element.

You can manually restart the daemon for that data instance by using the ttGridAdmin instanceExec command to run either the ttDaemonAdmin -start or the ttDaemonAdmin -restart command. To restart a single data instance, specify the instanceExec option -only hostname[.instancename].

% ttGridAdmin instanceExec -only host4.instance1 ttDaemonAdmin -start 
Overall return code: 0
Commands executed on:
  host4.instance1 rc 0
Return code from host4.instance1: 0
Output from host4.instance1:
TimesTen Daemon (PID: 15491, port: 14000) startup OK.
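
The two daemon commands can be combined so that -restart is tried only if -start fails. This is a sketch under stated assumptions: restart_data_instance and the TTGRIDADMIN variable are hypothetical names introduced here, and the fallback relies on ttGridAdmin exiting with a non-zero status when the remote command fails, which you should confirm for your release.

```shell
# Hypothetical helper: start the daemon of one data instance, falling back
# to -restart if -start fails.
# Assumption: ttGridAdmin's exit status reflects the overall return code.
restart_data_instance() {
  target=$1                          # hostname[.instancename], e.g. host4.instance1
  grid=${TTGRIDADMIN:-ttGridAdmin}   # set TTGRIDADMIN=echo for a dry run
  "$grid" instanceExec -only "$target" ttDaemonAdmin -start ||
    "$grid" instanceExec -only "$target" ttDaemonAdmin -restart
}
```

Setting TTGRIDADMIN=echo before calling restart_data_instance host4.instance1 prints the command that would be run without contacting the grid.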

If the data instance does not start using either the ttDaemonAdmin -start or ttDaemonAdmin -restart commands, then you can force a restart of all data instances. The following restarts all data instances and recovers all data up to the last common epoch.

% ttGridAdmin instanceExec -type data ttDaemonAdmin -restart -force

For more information, see Run a Command or Script on Grid Instances (instanceExec) and ttDaemonAdmin in Oracle TimesTen In-Memory Database Reference.

If you know what caused the data instance to fail, fix the problem and then reload the database with the ttGridAdmin dbLoad command.

% ttGridAdmin dbLoad database1

Open the database to continue working.

% ttGridAdmin dbOpen database1

You can verify the results with the ttGridAdmin dbStatus command.
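
The reload, open, and verify steps can be chained so that a failure stops the sequence. The following is a minimal sketch: reload_and_open and the TTGRIDADMIN variable are hypothetical names, the commands are the ones shown above, and the sketch does not wait for the load to complete before issuing dbOpen.

```shell
# Hypothetical helper: reload a database, open it, and print element status.
# Each step runs only if the previous one returned success.
reload_and_open() {
  db=$1                              # database name, e.g. database1
  grid=${TTGRIDADMIN:-ttGridAdmin}   # set TTGRIDADMIN=echo for a dry run
  "$grid" dbLoad "$db" &&
    "$grid" dbOpen "$db" &&
    "$grid" dbStatus "$db" -element
}
```

Run it as reload_and_open database1 after fixing the cause of the failure, then check that the previously down element no longer shows a status of down.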