Using a Disaster Recovery Subscriber in an Active Standby Pair
TimesTen active standby pair replication provides high availability by allowing for fast switching between databases within a data center.
This includes the ability to automatically change which database propagates changes to an Oracle database using AWT cache groups. However, for additional high availability across data centers, you may require the ability to recover from a failure of an entire site, which can include a failure of both TimesTen master databases in the active standby pair as well as the Oracle database used for the cache groups.
You can recover from a complete site failure by creating a special disaster recovery read-only subscriber as part of the active standby pair replication scheme. The standby database sends updates to cache group tables on the read-only subscriber. This special subscriber is located at a remote disaster recovery site and can propagate updates to a second Oracle database, also located at the disaster recovery site. If the primary site suffers a complete failure, the disaster recovery subscriber can take over as the active database in a new active standby pair at the disaster recovery site. Applications can then connect to the disaster recovery site and continue operating with minimal interruption of service.
Requirements for Using a Disaster Recovery Subscriber With an Active Standby Pair
To use a disaster recovery subscriber, you must:
-
Use an active standby pair configuration with AWT cache groups at the primary site. The active standby pair can also include read-only cache groups in the replication scheme. The read-only cache groups are converted to regular tables on the disaster recovery subscriber. The AWT cache group tables remain AWT cache group tables on the disaster recovery subscriber.
-
Have a continuous WAN connection from the primary site to the disaster recovery site. This connection should have at least enough bandwidth to guarantee that the normal volume of transactions can be replicated to the disaster recovery subscriber at a reasonable pace.
-
Configure an Oracle database at the disaster recovery site to include tables with the same schema as the database at the primary site. Note that this database is intended only for capturing the replicated updates from the primary site, and if any data exists in tables written to by the cache groups when the disaster recovery subscriber is created, that data is deleted.
-
Have the same cache group administrator user ID and password at both the primary and the disaster recovery site.
Though it is not absolutely required, you should have a second TimesTen database configured at the disaster recovery site. This database can take on the role of a standby database, in the event that the disaster recovery subscriber is promoted to an active database after the primary site fails.
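The bandwidth requirement above can be checked with back-of-the-envelope arithmetic. The following is only an illustrative sketch: the function name required_mbps and the sample workload (2000 transactions per second, 500 bytes of replicated data per transaction) are assumptions made for this example, not measurements or TimesTen figures. The result ignores protocol overhead and traffic bursts, so treat it as a lower bound.

```shell
#!/bin/sh
# required_mbps TX_PER_SEC BYTES_PER_TX
# Prints the minimum sustained WAN bandwidth, in whole megabits per second
# (rounded up), needed to ship the given transaction volume.
# Hypothetical helper for illustration; not part of TimesTen.
required_mbps() {
    tx_per_sec="$1"
    bytes_per_tx="$2"
    # bytes/s -> bits/s, then round up to the next whole Mbit/s
    echo $(( (tx_per_sec * bytes_per_tx * 8 + 999999) / 1000000 ))
}

# Assumed workload: 2000 transactions per second, 500 bytes each
required_mbps 2000 500
```

A link that cannot sustain at least this rate during normal operation lets the disaster recovery subscriber fall steadily behind the standby database.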
Rolling Out a Disaster Recovery Subscriber
To create a disaster recovery subscriber, follow these steps:
-
Create an active standby pair with AWT cache groups at the primary site. The active standby pair can also include read-only cache groups. The read-only cache groups are converted to regular tables when the disaster recovery subscriber is rolled out.
-
Create the disaster recovery subscriber at the disaster recovery site using the ttRepAdmin utility with the -duplicate and -initCacheDR options. You must also specify the cache group administrator and password for the Oracle database at the disaster recovery site using the -cacheUid and -cachePwd options.
If your database includes multiple cache groups, you may improve the efficiency of the duplicate operation by using the -nThreads option to specify the number of threads that are spawned to flush the cache groups in parallel. Each thread flushes an entire cache group to the Oracle database and then moves on to the next cache group, if any remain to be flushed. If a value is not specified for -nThreads, only one flushing thread is spawned.
For example, duplicate the standby database mast2, on the system with the host name primary and the cache user ID system and password manager, to the disaster recovery subscriber drsub, using two cache group flushing threads. ttRepAdmin prompts for the values of -uid, -pwd, -cacheUid and -cachePwd.
    ttRepAdmin -duplicate -from mast2 -host primary -initCacheDR -nThreads 2 -connStr "DSN=drsub;UID=;PWD=;"
If you use the ttRepDuplicateEx function in C, you must set the TT_REPDUP_INITCACHEDR flag in ttRepDuplicateExArg.flags and may optionally specify a value for ttRepDuplicateExArg.nThreads4InitDR:
    int rc;
    ttUtilHandle utilHandle;   /* must be a valid handle, for example one obtained from ttUtilAllocEnv */
    ttRepDuplicateExArg arg;

    memset( &arg, 0, sizeof( arg ) );
    arg.size = sizeof( ttRepDuplicateExArg );
    arg.flags = TT_REPDUP_INITCACHEDR;   /* initialize the disaster recovery subscriber */
    arg.nThreads4InitDR = 2;             /* two cache group flushing threads */
    arg.uid = "ttuser";
    arg.pwd = "ttuser";
    arg.cacheuid = "system";
    arg.cachepwd = "manager";
    arg.localHost = "disaster";
    rc = ttRepDuplicateEx( utilHandle, "DSN=drsub", "mast2", "primary", &arg );
After the subscriber is duplicated, TimesTen automatically configures the replication scheme that propagates updates from the AWT cache groups to the Oracle database, truncates the tables in the Oracle database that correspond to the cache groups in TimesTen, and then flushes all of the data in the cache groups to the Oracle database.
-
If you want to set the failure threshold for the disaster recovery subscriber, call the ttCacheAWTThresholdSet built-in procedure and specify the number of transaction log files that can accumulate before the disaster recovery subscriber is considered either dead or too far behind to catch up.
If one or both master databases had a failure threshold configured before the disaster recovery subscriber was created, then the disaster recovery subscriber inherits the failure threshold value when it is created with the ttRepAdmin -duplicate -initCacheDR command. If the master databases have different failure thresholds, then the higher value is used for the disaster recovery subscriber.
-
Start the replication agent for the disaster recovery subscriber using the ttRepStart built-in procedure or the ttAdmin utility with the -repstart option. For example:
    ttAdmin -repstart drsub
Updates are now replicated from the standby database to the disaster recovery subscriber, which then propagates the updates to the Oracle database at the disaster recovery site.
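The roll-out steps above can be strung together in a small script. The following is a minimal sketch rather than a supported procedure. It reuses the names from the example (standby mast2 on host primary, subscriber drsub, two flushing threads). The run helper, the DRY_RUN variable, and the threshold value of 5 log files are assumptions introduced for illustration. With DRY_RUN=1 (the default here) the script only prints the commands it would run; the printed form drops shell quoting.

```shell
#!/bin/sh
# Sketch of the disaster recovery subscriber roll-out (steps 2 to 4).
# DRY_RUN and run() are illustrative helpers, not part of TimesTen.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # print the command instead of executing it
    else
        "$@"
    fi
}

rollout_dr_subscriber() {
    # Step 2: duplicate the standby database to the subscriber and flush
    # the cache groups to the Oracle database with two threads.
    # ttRepAdmin prompts for -uid, -pwd, -cacheUid and -cachePwd.
    run ttRepAdmin -duplicate -from mast2 -host primary -initCacheDR \
        -nThreads 2 -connStr "DSN=drsub;UID=;PWD=;"

    # Step 3 (optional): set the failure threshold.
    # The value 5 is an assumed example, not a recommendation.
    run ttIsql -connStr "DSN=drsub" -e "call ttCacheAWTThresholdSet(5); exit;"

    # Step 4: start the replication agent on the subscriber.
    run ttAdmin -repstart drsub
}

rollout_dr_subscriber
```

Setting DRY_RUN=0 executes the commands instead of printing them. Because ttRepAdmin prompts for credentials, that mode needs an interactive terminal.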
Switching Over to the Disaster Recovery Site
When the primary site has failed, you can switch over to the disaster recovery site.
You can switch over in one of two ways:
-
Creating a New Active Standby Pair After Switching to the Disaster Recovery Site: If your goal is to minimize risk of data loss at the disaster recovery site, you may roll out a new active standby pair using the disaster recovery subscriber as the active database.
-
Switching Over to a Single Database: If the goal is to absolutely minimize the downtime of your applications, at the risk of data loss if the disaster recovery database later fails, you may instead choose to drop the replication scheme from the disaster recovery subscriber and use it as a single non-replicating database. You may deploy an active standby pair at the disaster recovery site later.
Returning to the Original Configuration at the Primary Site
When the primary site is usable again, you may want to move the working active standby pair from the disaster recovery site back to the primary site.
You can do this with a minimal interruption of service by reversing the process that was used to create and switch over to the original disaster recovery site. Follow these steps:
-
Destroy the original active database at the primary site, if necessary, using the ttDestroy utility. For example, to destroy a database called mast1, use:
    ttDestroy mast1
-
Create a disaster recovery subscriber at the primary site, following the steps detailed in Rolling Out a Disaster Recovery Subscriber. Use the original active database for the new disaster recovery subscriber.
-
Switch over to the new disaster recovery subscriber at the primary site, as detailed in Switching Over to the Disaster Recovery Site. Roll out the standby database as well.
-
Roll out a new disaster recovery subscriber at the disaster recovery site, as detailed in Rolling Out a Disaster Recovery Subscriber.
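The first two steps of the return procedure can be sketched in the same way. This is an illustration under stated assumptions, not a verified procedure. It assumes an active standby pair is already running at the disaster recovery site and that the new subscriber is duplicated from that site's standby database, mirroring the original roll-out. The names drstandby and disaster passed at the bottom are hypothetical placeholders, and run and DRY_RUN are illustrative helpers. Steps 3 and 4 follow the sections referenced above and are not scripted here.

```shell
#!/bin/sh
# Sketch of steps 1 and 2 of returning to the primary site.
# DRY_RUN and run() are illustrative helpers, not part of TimesTen.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # print the command instead of executing it
    else
        "$@"
    fi
}

# return_to_primary DR_STANDBY DR_HOST
#   DR_STANDBY: standby database at the disaster recovery site (assumed name)
#   DR_HOST:    host on which that standby database runs (assumed name)
return_to_primary() {
    dr_standby="$1"
    dr_host="$2"

    # Step 1: destroy the original active database at the primary site.
    run ttDestroy mast1

    # Step 2: re-create mast1 as a disaster recovery subscriber of the
    # pair now running at the disaster recovery site, then start its
    # replication agent.
    run ttRepAdmin -duplicate -from "$dr_standby" -host "$dr_host" \
        -initCacheDR -connStr "DSN=mast1;UID=;PWD=;"
    run ttAdmin -repstart mast1
}

return_to_primary drstandby disaster
```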