Writing a Failure Recovery Script

When it detects a failure, the cluster manager should invoke a script that carries out the procedure shown in the following failure recovery pseudocode.

Detect problem {
       if (Master == unavailable) {
          FailedDatabase = Master
          FailedDSN = Master_DSN
          SurvivorDatabase = Subscriber
          switch users to SurvivorDatabase
       }
       else {
          FailedDatabase = Subscriber
          FailedDSN = Subscriber_DSN
          SurvivorDatabase = Master
       }
}
Fix problem....
if (Problem resolved) {
       Get state for FailedDatabase
       if (state == "failed") {
          ttDestroy FailedDatabase
          ttRepAdmin -dsn FailedDSN -duplicate
                  -from SurvivorDatabase -host SurvivorHost
                  -uid ttuser
                  -pwd ttuser
       }
       else {
          ttAdmin -repStart FailedDSN
       }
       while (backlog != 0) {
          wait
       }
}

Switch users back to Master.
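
The branching in the pseudocode above can be sketched in Python as two small functions: one that records which database failed and which survived, and one that chooses the recovery commands from the failed database's state. This is only an illustrative sketch. The names RecoveryPlan, classify_failure, and recovery_commands are hypothetical, and the command arguments are copied as-is from the pseudocode rather than checked against the utility reference, so verify them before running anything. The functions only build the command lists; they do not execute them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecoveryPlan:
    """Hypothetical container for the variables set in 'Detect problem'."""
    failed_database: str
    failed_dsn: str
    survivor_database: str
    switch_users: bool  # True only when the master is the failed database


def classify_failure(master_available: bool) -> RecoveryPlan:
    """Mirror the 'Detect problem' branch of the pseudocode."""
    if not master_available:
        # Master failed: users must be switched to the subscriber.
        return RecoveryPlan("Master", "Master_DSN", "Subscriber", True)
    # Otherwise the subscriber failed and users stay on the master.
    return RecoveryPlan("Subscriber", "Subscriber_DSN", "Master", False)


def recovery_commands(plan: RecoveryPlan, state: str,
                      survivor_host: str) -> List[List[str]]:
    """Build (but do not run) the commands for the 'Problem resolved' branch."""
    if state == "failed":
        # Destroy the failed database, then duplicate it from the survivor.
        # Arguments follow the pseudocode literally (an unverified assumption).
        return [
            ["ttDestroy", plan.failed_database],
            ["ttRepAdmin", "-dsn", plan.failed_dsn, "-duplicate",
             "-from", plan.survivor_database, "-host", survivor_host,
             "-uid", "ttuser", "-pwd", "ttuser"],
        ]
    # Any other state: restart replication on the recovered database.
    return [["ttAdmin", "-repStart", plan.failed_dsn]]
```

Returning the commands as lists keeps the decision logic separate from execution, so a cluster manager hook could log or review them before passing each one to a process launcher.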

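This procedure applies whether the failed database is the master or the subscriber. If the master fails, you may lose some transactions, namely those committed on the master but not yet replicated to the subscriber.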
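
The wait loop in the pseudocode has no exit if the backlog never drains. A minimal Python sketch of that step, with a timeout added as a safeguard, might look like the following. The timeout is an assumption and is not part of the original procedure. The get_backlog callable is hypothetical: the pseudocode does not say how the backlog is measured, so the caller must supply a probe that returns the number of pending items.

```python
import time
from typing import Callable


def wait_for_backlog(get_backlog: Callable[[], int],
                     poll_seconds: float = 1.0,
                     timeout_seconds: float = 300.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the backlog reaches zero.

    Returns True once get_backlog() reports 0, or False if roughly
    timeout_seconds of waiting elapse first. The sleep function is
    injectable so the loop can be exercised without real delays.
    """
    waited = 0.0
    while get_backlog() != 0:
        if waited >= timeout_seconds:
            return False  # Backlog did not drain in time; let the caller decide.
        sleep(poll_seconds)
        waited += poll_seconds
    return True
```

A script would typically switch users back to the master only after this function returns True, and raise an alert for the operator if it returns False.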