Skip Headers
Oracle® TimesTen In-Memory Database TimesTen to TimesTen Replication Guide
Release 11.2.1

Part Number E13072-06
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
View PDF

3 Defining an Active Standby Pair Replication Scheme

This chapter describes how to design a highly available system and define replication schemes. It includes the following topics:

To reduce the amount of bandwidth required for replication, see "Compressing replicated traffic".

Restrictions on active standby pairs

When you are planning an active standby pair, keep in mind the following restrictions:

Defining the DSNs for the data stores

Before you define the active standby pair, you need to define the DSNs for the active, standby and read-only subscriber data stores. On Linux or UNIX, create an odbc.ini file. On Windows, use the ODBC Administrator to name the data stores and set data store attributes. See "Step 1: Create the DSNs for the master and the subscriber data stores" for an example.

Each data store "name" specified in a replication scheme must match the prefix of the data store file name without the path given for the DataStore attribute in the DSN definition for the data store. A replication scheme that uses the names specified in the Data Source Name attributes does not work. To avoid confusion, use the same name for both your DataStore and Data Source Name attributes in each DSN definition. For example, if the data store path is directory/subdirectory/foo.ds0, then foo is the data store name that you should use.

Defining an active standby pair replication scheme

Use the CREATE ACTIVE STANDBY PAIR SQL statement to create an active standby pair replication scheme. The complete syntax for the CREATE ACTIVE STANDBY PAIR statement is provided in the Oracle TimesTen In-Memory Database SQL Reference.

You must have the ADMIN privilege to use the CREATE ACTIVE STANDBY PAIR statement and to perform other replication operations.

Table 3-1 shows the components of an active standby pair replication scheme and identifies the parameters associated with the topics in this chapter.

Table 3-1 Components of an active standby pair replication scheme

Component See...

CREATE ACTIVE STANDBY PAIR FullStoreName, FullStoreName

"Identifying the data stores in the active standby pair"

[ReturnServiceAttribute]

"Using a return service"

[SUBSCRIBER FullStoreName [,...]]

"Identifying the data stores in the active standby pair"

[STORE FullStoreName [StoreAttribute [...]]]

"Setting STORE attributes"

[NetworkOperation [...]]

"Configuring network operations"

[{INCLUDE|EXCLUDE}

{TABLE [[Owner.]TableName[,...]]|

CACHE GROUP [[Owner.]CacheGroupName[,...]|

SEQUENCE [[Owner.]SequenceName[,...]]}

[,...]]

"Including or excluding data store objects from replication"


Identifying the data stores in the active standby pair

Use the full store name described in "Defining the DSNs for the data stores". The first data store name designates the active data store. The second data store name designates the standby data store. Read-only subscriber data stores are indicated by the SUBSCRIBER clause.

You can also specify the hosts where the data stores reside by using an IP address or a literal host name surrounded by double quotes.

The active data store and the standby data store should be on separate hosts to achieve a highly available system. Read-only subscribers can be either local or remote. A remote subscriber provides protection from site-specific disasters.

You can provide an optional host ID as part of FullStoreName:

DataStoreName [ON Host]

Host can be either an IP address or a literal host name. It is good practice to surround a host name with double quotes.

Using a return service

You can configure your replication scheme with a return service to ensure a higher level of confidence that your replicated data is consistent on the active and standby data stores. See "Copying updates between data stores". This section describes how to configure and manage the return receipt and return twosafe services. NO RETURN (asynchronous replication) is the default and provides the fastest performance.

The following sections describe the following return service clauses:

RETURN RECEIPT

TimesTen provides an optional return receipt service to loosely couple or synchronize your application with the replication mechanism.

You can specify the RETURN RECEIPT clause to enable the return receipt service for the standby data store. With return receipt enabled, when your application commits a transaction for an element on the active data store, the application remains blocked until the standby acknowledges receipt of the transaction update.

If the standby is unable to acknowledge receipt of the transaction within a configurable timeout period, your application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. See "Setting the return service timeout period" for more information on the return service timeout period.

You can use the ttRepXactStatus procedure to check on the status of a return receipt transaction. See "Checking the status of return service transactions" for details.

You can also configure the replication agent to disable the return receipt service after a specific number of timeouts. See "Managing return service timeout errors" for details.

RETURN RECEIPT BY REQUEST

RETURN RECEIPT enables notification of receipt for all transactions. You can use RETURN RECEIPT with the BY REQUEST clause to enable receipt notification only for specific transactions identified by your application.

If you specify RETURN RECEIPT BY REQUEST, you must use the ttRepSyncSet procedure to enable the return receipt service for a transaction. The call to enable the return receipt service must be part of the transaction (autocommit must be off).

If the standby data store is unable to acknowledge receipt of the transaction update within a configurable timeout period, the application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. See "Setting the return service timeout period" for more information on the return service timeout period.

You can use ttRepSyncGet to check if a return service is enabled and obtain the timeout value. For example:

Command> CALL ttRepSyncGet();
< 01, 45, 1>
1 row found.

RETURN TWOSAFE

TimesTen provides a return twosafe service to fully synchronize your application with the replication mechanism. The return twosafe service ensures that each replicated transaction is committed on the standby data store before it is committed on the active data store. If replication is unable to verify the transaction has been committed on the standby, it returns notification of the error. Upon receiving an error, the application can either take a unique action or fall back on preconfigured actions, depending on the type of failure.

When replication is configured with RETURN TWOSAFE, you must disable the AutoCommit connection attribute.

If the standby is unable to acknowledge commit of the transaction update within a configurable timeout period, the application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. See "Setting the return service timeout period" for more information on the return service timeout period.

RETURN TWOSAFE BY REQUEST

RETURN TWOSAFE enables notification of commit on the standby data store for all transactions. You can use RETURN TWOSAFE with the BY REQUEST clause to enable notification of a commit on the standby only for specific transactions identified by your application.

If you specify RETURN TWOSAFE BY REQUEST for a standby data store, you must use the ttRepSyncSet procedure to enable the return twosafe service for a transaction. The call to enable the return twosafe service must be part of the transaction (autocommit must be off).

If the standby is unable to acknowledge commit of the transaction within the timeout period, the application receives a tt_ErrRepReturnFailed (8170) warning on its commit request. The application can then chose how to handle the timeout, in the same manner as described for "RETURN TWOSAFE".

The ALTER TABLE statement cannot be used to alter a replicated table that is part of a TWOSAFE BY REQUEST transaction. If DDLCommitBehavior=1, this operation results in error 8051. If DDLCommitBehavior=0, the operation succeeds because a commit is performed before the ALTER TABLE operation, resulting in the ALTER TABLE operation being in a new transaction which is not part of the TWOSAFE BY REQUEST transaction.

See "Setting the return service timeout period" for more information on setting the return service timeout period.

You can use ttRepSyncGet to check if a return service is enabled and obtain the timeout value. For example:

Command> CALL ttRepSyncGet();
< 01, 45, 1>
1 row found.

NO RETURN

You can use the NO RETURN clause to explicitly disable either the return receipt or return twosafe service, depending on which one you have enabled. NO RETURN is the default condition.

Setting STORE attributes

Table 3-2 lists the optional STORE attributes for the CREATE ACTIVE STANDBY PAIR statement.

Table 3-2 STORE attribute descriptions

Data store attribute Description

DISABLE RETURN {SUBSCRIBER|ALL} NumFailures

Set the return service policy so that return service blocking is disabled after the number of timeouts specified by NumFailures.

See "Establishing return service failure/recovery policies".

RETURN SERVICES {ON|OFF} WHEN [REPLICATION] STOPPED

Set return services on or off when replication is disabled.

See "Establishing return service failure/recovery policies".

RESUME RETURN Milliseconds

If DISABLE RETURN has disabled return service blocking, this attribute sets the policy for re-enabling the return service.

See "Establishing return service failure/recovery policies".

RETURN WAIT TIME Seconds

Specifies the number of seconds to wait for return service acknowledgement. A value of 0 means that there is no waiting. The default value is 10 seconds.

The application can override this timeout setting by using the returnWait parameter in the ttRepSyncSet built-in procedure.

See "Setting the return service timeout period".

DURABLE COMMIT {ON|OFF}

Overrides the DurableCommits attribute setting. Enables durable commit when return service blocking has been disabled by DISABLE RETURN.

See "DURABLE COMMIT".

LOCAL COMMIT ACTION {NO ACTION|COMMIT}

Specifies the default action to be taken for a return service transaction in the event of a timeout. The options are:

NO ACTION - On timeout, the commit function returns to the application, leaving the transaction in the same state it was in when it entered the commit call, with the exception that the application is not able to update any replicated tables. The application can reissue the commit. This is the default.

COMMIT- On timeout, the commit function attempts to perform a commit to end the transaction locally.

This default setting can be overridden for specific transactions by using the localAction parameter in the ttRepSyncSet procedure.

See "LOCAL COMMIT ACTION".

COMPRESS TRAFFIC {ON|OFF}

Compresses replicated traffic to reduce the amount of network bandwidth used.

See "Compressing replicated traffic".

PORT PortNumber

Sets the port number used by a data store to listen for updates from another data store.

In an active standby pair, the standby data store listens for updates from the active data store. Read-only subscribers listen for updates from the standby data store.

If no PORT attribute is specified, the TimesTen daemon dynamically selects the port. While static port assignment is allowed by TimesTen, dynamic port allocation is recommended.

See "Port assignments".

TIMEOUT Seconds

Set the maximum number of seconds a data store waits before re-sending a message to an unresponsive data store.

In an active standby pair, the active data store sends messages to the standby data store. The standby data store sends messages to the read-only subscribers.

See "Setting the return service timeout period".

FAILTHRESHOLD Value

Sets the log failure threshold.

See "Setting the log failure threshold".


The rest of this section includes these topics:

Setting the return service timeout period

If a replication scheme is configured with one of the return services described in "Using a return service", a timeout occurs if the standby data store is unable to send an acknowledgement back to the active within the time period specified by TIMEOUT.

The default return service timeout period is 10 seconds. You can specify a different return service timeout period by:

  • Specifying the RETURN WAIT TIME in the CREATE ACTIVE STANDBY PAIR statement or ALTER ACTIVE STANDBY PAIR statement. A RETURN WAIT TIME of '0' indicates no waiting.

  • Specifying a different return service timeout period programmatically by calling the ttRepSyncSet procedure with a new returnWait parameter.

Once set, the timeout period applies to all subsequent return service transactions until you either reset the timeout period or terminate the application session.

A return service may time out because of a replication failure or because replication is so far behind that the return service transaction times out before it is replicated. However, unless there is a simultaneous replication failure, failure to obtain a return service confirmation from the standby does not necessarily mean the transaction has not been or will not be replicated.

You can set other STORE attributes to establish policies that automatically disable return service blocking if there are excessive timeouts and re-enable return service blocking when conditions improve. See "Managing return service timeout errors" for details.

Managing return service timeout errors

If the standby data store is unable to acknowledge the transaction update from the active data store within the timeout period, the application receives an errRepReturnFailed warning on its commit request.

The default return service timeout period is 10 seconds. You can specify a different timeout period by:

  • Using the RETURN WAIT TIME clause in the STORE clause of the CREATE ACTIVE STANDBY PAIR statement

  • Calling the ttRepSyncSet procedure with a new returnWait parameter

A return service may time out or fail because of a replication failure or because replication is so far behind that the return service transaction times out before it is replicated. However, unless there is a simultaneous replication failure, failure to obtain a return service confirmation from the subscriber does not mean the transaction has not or will not be replicated.

You can respond to return service timeouts by:

Disabling return service blocking manually

You may want respond in some manner if replication is stopped or return service timeout failures begin to adversely impact the performance of your replicated system. Your "tolerance threshold" for return service timeouts may depend on the historical frequency of timeouts and the performance/availability equation for your particular application, both of which should be factored into your response to the problem.

When using the return receipt service, you can manually respond by:

  • Using the ALTER ACTIVE STANDBY PAIR statement to disable return receipt blocking

  • Calling the ttDurableCommit procedure to durably commit transactions on the active data storey that you can no longer verify as being received by the standby

If you decide to disable return receipt blocking, your decision to re-enable it depends on your confidence level that the return receipt transaction is no longer likely to time out.

Establishing return service failure/recovery policies

An alternative to manually responding to return service timeout failures is to establish return service failure and recovery policies in the replication scheme. These policies direct the replication agents to detect changes to the replication state and to keep track of return service timeouts and then automatically respond in a predefined manner.

The following attributes in the CREATE ACTIVE STANDBY PAIR statement set the failure and recovery policies when using a RETURN RECEIPT or RETURN TWOSAFE service:

The policies set by these attributes are applicable for the life of the data store or until changed. The replication agent must be running to enforce these policies.

RETURN SERVICES {ON | OFF} WHEN [REPLICATION] STOPPED

The RETURN SERVICES {ON | OFF} WHEN [REPLICATION] STOPPED attribute determines whether a return receipt or return twosafe service continues to be enabled or is disabled when replication is stopped. "Stopped" in this context means that either the active replication agent is stopped (for example, by ttAdmin -repStop active) or the replication state of the standby data store is set to Stop or Pause with respect to the active data store (for example, by ttRepAdmin -state stop standby). A failed standby that has exceeded the specified FAILTHRESHOLD value is set to the Failed state, but is eventually set to the Stop state by the master replication agent.

Note:

A standby data store may become unavailable for a period of time that exceeds the timeout period specified by RETURN WAIT TIME but still be considered by the master replication agent to be in the Start state. Failure policies related to timeouts are set by the DISABLE RETURN attribute.

RETURN SERVICES OFF WHEN REPLICATION STOPPED disables the return service when replication is stopped and is the default when using the RETURN RECEIPT service. RETURN SERVICES ON WHEN REPLICATION STOPPED allows the return service to continue to be enabled when replication is stopped and is the default when using the RETURN TWOSAFE service.

DISABLE RETURN

When a DISABLE RETURN value is set, the data store keeps track of the number of return receipt or return twosafe transactions that have exceeded the timeout period set by RETURN WAIT TIME. If the number of timeouts exceeds the maximum value set by DISABLE RETURN, the application reverts to a default replication cycle in which it no longer waits for the standby to acknowledge the replicated updates.

Specifying SUBSCRIBER is the same as specifying ALL. Both settings refer to the standby data store.

The DISABLE RETURN failure policy is only enabled when the replication agent is running. If DISABLE RETURN is specified but RESUME RETURN is not specified, the return services remain off until the replication agent for the data store has been restarted.You can cancel this failure policy by stopping the replication agent and specifying DISABLE RETURN with a zero value for NumFailures. The count of timeouts to trigger the failure policy is reset either when you restart the replication agent, when you set the DISABLE RETURN value to 0, or when return service blocking is re-enabled by RESUME RETURN.

RESUME RETURN

When we say return service blocking is "disabled," we mean that the applications on the master data store no longer block execution while waiting to receive acknowledgements from the subscribers that they received or committed the replicated updates. Note, however, that the master still listens for an acknowledgement of each batch of replicated updates from the standby data store.

You can establish a return service recovery policy by setting the RESUME RETURN attribute and specifying a resume latency value. When this attribute is set and return service blocking has been disabled for the standby data store, the return receipt or return twosafe service is re-enabled when the commit-to-acknowledge time for a transaction falls below the value set by RESUME RETURN. The commit-to-acknowledge time is the latency between when the application issues a commit and when the master receives acknowledgement of the update from the subscriber.

The RESUME RETURN policy is enabled only when the replication agent is running. You can cancel a return receipt resume policy by stopping the replication agent and then using ALTER ACTIVE STANDBY PAIR to set RESUME RETURN to zero.

DURABLE COMMIT

You can set the DURABLE COMMIT attribute to specify the durable commit policy for applications that have return service blocking disabled by DISABLE RETURN. When DURABLE COMMIT is set to ON, it overrides the DurableCommits attribute on the master data store and forces durable commits for those transactions that have had return service blocking disabled.

If the replication scheme is configured with RETURN SERVICES ON WHEN REPLICATION STOPPED, the replication agent must be running to enforce the DURABLE COMMIT policy.

LOCAL COMMIT ACTION

When you are using the return twosafe service, you can specify how the master replication agent responds to timeouts by setting LOCAL COMMIT ACTION. You can override the setting for specific transactions by calling the localAction parameter in the ttRepSyncSet procedure.

The possible actions upon receiving a timeout during replication of a twosafe transaction are:

  • COMMIT - Upon timeout, the replication agent on the active data store commits the transaction and no more operations are allowed in the transaction.

  • NO ACTION - Upon timeout, the replication agent on the active data store does not commit the transaction. The process recovery commits the transaction. This is equivalent to a forced commit.

Compressing replicated traffic

If you are replicating over a low-bandwidth network, or if you are replicating massive amounts of data, you can set the COMPRESS TRAFFIC attribute to reduce the amount of bandwidth required for replication. The COMPRESS TRAFFIC attribute compresses the replicated data from the data store specified by the STORE parameter in the CREATE ACTIVE STANDBY PAIR or ALTER ACTIVE STANDBY PAIR statement. TimesTen does not compress traffic from other data stores.

Though the compression algorithm is optimized for speed, enabling the COMPRESS TRAFFIC attribute has some impact on replication throughput and latency.

Example 3-1 Compressing traffic from active data store

For example, to compress replicated traffic from active data store dsn1 and leave the replicated traffic from standby data store dsn2 uncompressed, the CREATE ACTIVE STANDBY PAIR statement looks like:

CREATE ACTIVE STANDBY PAIR dsn1 ON "machine1", dsn2 ON "machine2"
  SUBSCRIBER dsn3 ON "machine3"
  STORE dsn1 ON "machine1" COMPRESS TRAFFIC ON;

Example 3-2 Compressing traffic from both master data stores

To compress the replicated traffic from the dsn1 and dsn2 data stores, use:

CREATE ACTIVE STANDBY PAIR dsn1 ON "machine1", dsn2 ON "machine2"
  SUBSCRIBER dsn3 ON "machine3"
STORE dsn1 ON "machine1" COMPRESS TRAFFIC ON
STORE dsn2 ON "machine2" COMPRESS TRAFFIC ON;

Port assignments

If you do not assign a PORT attribute, the TimesTen daemon dynamically selects the port. When ports are assigned dynamically in this manner for the replication agents, then the ports of the TimesTen daemons have to match as well.

When statically assigning ports, it is important to specify the full host name, DSN and PORT in the STORE attribute of the CREATE ACTIVE STANDBY PAIR statement.

Example 3-3 Assigning static ports

CREATE ACTIVE STANDBY PAIR dsn1 ON "machine1", dsn2 ON "machine2"
  SUBSCRIBER dsn3 ON "machine3"
STORE dsn1 ON "machine1" PORT 16080
STORE dsn2 ON "machine2" PORT 16083
STORE dsn3 ON "machine3" PORT 16084;

Setting the log failure threshold

You can establish a threshold value that, when exceeded, sets an unavailable standby data store or a read-only subscriber to the Failed state before the available log space is exhausted.

Set the log threshold by specifying the STORE clause with a FAILTHRESHOLD value in the CREATE ACTIVE STANDBY PAIR or ALTER ACTIVE STANDBY PAIR statement. The default threshold value is 0, which means "no limit."

If an active data store sets the standby data store or a read-only subscriber to the Failed state, it drops all of the data for the failed data store from its log and transmits a message to the failed data store. If the active replication agent can communicate with the replication agent of the failed data store, then the message is transmitted immediately. Otherwise, the message is transmitted when the connection is reestablished.

Any application that connects to the failed data store receives a tt_ErrReplicationInvalid (8025) warning indicating that the data store has been marked Failed by a replication peer. Once the data store has been informed of its failed status, its state on the active data store is changed from Failed to Stop.

An application can use the ODBC SQLGetInfo function to check if the data store the application is connected to has been set to the Failed state.

Configuring network operations

If a replication host has more than one network interface, you may wish to configure replication to use an interface other than the default interface. Although you must specify the host name returned by the operating system's hostname command when you specify the data store name, you can configure replication to send or receive traffic over a different interface using the ROUTE clause.

The syntax of the ROUTE clause is:

ROUTE MASTER FullStoreName SUBSCRIBER FullStoreName
  {{MASTERIP MasterHost | SUBSCRIBERIP SubscriberHost}
    PRIORITY Priority} [...]

In the context of the ROUTE clause, each master data store is a subscriber of the other master data store and each read-only subscriber is a subscriber of both master data stores.

Example 3-4 Configuring multiple network interfaces

If host machine1 is configured with a second interface accessible by the host name machine1fast, and machine2 is configured with a second interface at IP address 192.168.1.100, you may specify that the secondary interfaces are used with the replication scheme.

CREATE ACTIVE STANDBY PAIR dns1, dsn2
ROUTE MASTER dsn1 ON "machine1" SUBSCRIBER dsn2 ON "machine2"
    MASTERIP "machine1fast" PRIORITY 1
    SUBSCRIBERIP "192.168.1.100" PRIORITY 1
ROUTE MASTER dsn2 ON "machine2" SUBSCRIBER dsn1 ON "machine1"
    MASTERIP "192.168.1.100" PRIORITY 1
    SUBSCRIBERIP "machine1fast" PRIORITY 1;

Alternately, on a replication host with more than one interface, you may wish to configure replication to use one or more interfaces as backups, in case the primary interface fails or the connection from it to the receiving host is broken. You can use the ROUTE clause to specify two or more interfaces for each master or subscriber that are used by replication in order of priority.

If replication on the master host is unable to bind to the MASTERIP with the highest priority, it will try to connect using subsequent MASTERIP addresses in order of priority immediately. However, if the connection to the subscriber fails for any other reason, replication will try to connect using each of the SUBSCRIBERIP addresses in order of priority before it tries the MASTERIP address with the next highest priority.

Example 3-5 Configuring network priority

If host machine1 is configured with two network interfaces at IP addresses 192.168.1.100 and 192.168.1.101, and host machine2 is configured with two interfaces at IP addresses 192.168.1.200 and 192.168.1.201, you may specify that replication use IP addresses 192.168.1.100 and 192.168.200 to transmit and receive traffic first, and to try IP addresses 192.168.1.101 or 192.168.1.201 if the first connection fails.

CREATE ACTIVE STANDBY PAIR dns1, dns2
ROUTE MASTER dsn1 ON "machine1" SUBSCRIBER dsn2 ON "machine2"
  MASTERIP "192.168.1.100" PRIORITY 1
  MASTERIP "192.168.1.101" PRIORITY 2
  SUBSCRIBERIP "192.168.1.200" PRIORITY 1
  SUBSCRIBERIP "192.168.1.201" PRIORITY 2;

Including or excluding data store objects from replication

An active standby pair replicates an entire data store by default. Use the INCLUDE clause to replicate only the tables, cache groups and sequences that are listed in the INCLUDE clause. No other data store objects will be replicated in an active standby pair that is defined with an INCLUDE clause. For example, this INCLUDE clause specifies three tables to be replicated by the active standby pair:

INCLUDE TABLE employees, departments, jobs

You can choose to exclude specific tables, cache groups or sequences from replication by using the EXCLUDE clause of the CREATE ACTIVE STANDBY PAIR statement. Use one EXCLUDE clause for each object type. For example:

EXCLUDE TABLE ttuser.tab1, ttuser.tab2
EXCLUDE CACHE GROUP ttuser.cg1, ttuser.cg2
EXCLUDE SEQUENCE ttuser.seq1, ttuser.seq2