4 Troubleshooting Replication 

The following sections in this chapter describe how to troubleshoot problems you may encounter when replicating databases:

Note:

Error log messages from the replication agent daemon are designated by REP in the message, as shown in several examples in this chapter. These messages are enabled unless you disable the REPLICATION component of the ttDaemonLog utility. See "ttDaemonLog" in Oracle TimesTen In-Memory Database Reference for information about ttDaemonLog options for Windows and UNIX.

Unable to create a replication scheme

This section describes what to check if you are unable to use CREATE REPLICATION to create a replication scheme.

Possible cause What to do
User does not have ADMIN privilege. You must have ADMIN privilege to use the CREATE REPLICATION or DROP REPLICATION statements.
Database name, host name, or element name is incorrect.
  • Check the CREATE REPLICATION statement for typographical errors.
  • See "Check host names".

  • Use official host names instead of aliases.

  • The host name should match the value returned by the hostname command on your system and should be used consistently throughout the replication scheme.

The local host is not part of the replication scheme. Create the replication scheme on a host that will be part of the replication scheme.
Replication tables defined in the CREATE REPLICATION statement do not exist. The name, owner, and column definitions of the tables participating in the replication scheme must be identical on both the master and subscriber databases. Use CREATE TABLE to create tables on the database, or use the ttRepAdmin -duplicate utility or the ttRepDuplicateEx C function to duplicate the entire database to be replicated.
There are other problems. Review the procedures and requirements described in "Defining Classic Replication Schemes" in the Oracle TimesTen In-Memory Database Replication Guide.

Unable to alter a replication scheme

This section describes what to check if you are unable to use ALTER REPLICATION to alter a replication scheme.

Possible cause What to do
User does not have ADMIN privilege. You must have ADMIN privilege to use the ALTER REPLICATION statement.
Replication agent is in Start state. Most ALTER REPLICATION operations are supported only when the replication agent is stopped (ttAdmin -repStop). Stop the replication agents on both master and subscriber databases, alter the replication scheme on both master and subscriber databases, then restart both replication agents.
Database name, host name, or element name is inconsistent.
Replication table defined in the ALTER REPLICATION statement does not exist. Use CREATE TABLE to create a table on the database.
There are other problems. Review the procedures and requirements described in "Altering a Classic Replication Scheme" in the Oracle TimesTen In-Memory Database Replication Guide.

Unable to start or stop replication agent

This section describes what to check if you are unable to start or stop a replication agent.

Possible cause What to do
User does not have ADMIN privileges. You must have ADMIN privileges to use the ttAdmin utility or the ttRepStart or ttRepStop procedures to start or stop a replication agent.
TimesTen daemon is not started. Check the state of the TimesTen daemon, as described in "Check the TimesTen user error log". If necessary, start the TimesTen daemon as described in "Working with the TimesTen Data Manager Daemon" in the Oracle TimesTen In-Memory Database Operations Guide.
Database does not participate in a replication scheme. If a database does not participate in a replication scheme, attempts to start a replication agent for that database will fail. Use CREATE REPLICATION to create a replication scheme for the database.

Using SNMP traps for notification of replication events

TimesTen can send SNMP traps for certain replication events to enable network management software to take immediate action. TimesTen can send the following traps for replication events:

  • ttRepAgentExitingTrap

  • ttRepAgentDiedTrap

  • ttRepAgentStartingTrap

  • ttRepCatchupStartTrap

  • ttRepCatchupStopTrap

  • ttRepReturnTransitionTrap

  • ttRepSubscriberFailedTrap

  • ttRepUpdateFailedTrap

These traps are described in "Diagnostics through SNMP Traps" in the Oracle TimesTen In-Memory Database Error Messages and SNMP Traps.

Replication does not work

If you are unable to get replication working between a master and subscriber database, the problem may be one or more of the following:

Possible cause See...
TimesTen daemon or replication agents are not running. "Check status of TimesTen daemon and replication agents", below
Master and subscriber agents are not communicating. "Check that replication agents are communicating"
Replication is not in Start state. "Check replication state"
There is an error in the replication scheme. "Check replication scheme configuration"
Owner names for replication scheme and tables are inconsistent. "Check owner names"
Replication tables are inconsistent. "Check consistency between replicated tables"

Check status of TimesTen daemon and replication agents

Use the ttStatus utility to confirm the main TimesTen daemon is running and the replication agents are started for all of your master and subscriber databases. The output from a simple replication scheme using a single master and subscriber database should look like that shown in Example 4-1.

If the TimesTen daemon is running, but the replication agents are not, the output looks like that shown in Example 4-2. In this case, start the replication agents as described in "Starting and stopping the replication agents" in the Oracle TimesTen In-Memory Database Replication Guide.

If neither the TimesTen daemon or replication agents are running, the output looks like that shown in Example 4-3. In this case, confirm you have correctly installed TimesTen and the Data Manager service is started, as described in "TimesTen Installation" in the Oracle TimesTen In-Memory Database Installation Guide.

Example 4-1 ttStatus output for one master and one subscriber

C:\>ttStatus
TimesTen status report as of Mon Aug  6 22:07:53 2012
Daemon pid 5088 port 17000 instance MYINSTANCE
TimesTen server pid 4344 started on port 17002
------------------------------------------------------------------------
Data store c:\temp\subscriber1ds
There are 12 connections to the data store
Data store is in shared mode
Shared Memory KEY Global\DBI45b9471c.2.SHM.2 HANDLE 0x280
Type            PID     Context     Connection Name              ConnID
Process         1244    0x00d08fb0  subscriber1ds                     1
Replication     4548    0x00aed2f8  LOGFORCE                          4
Replication     4548    0x00b03470  TRANSMITTER                       5
Replication     4548    0x00b725a8  RECEIVER                          6
Replication     4548    0x00b82808  REPHOLD                           2
Replication     4548    0x00b98980  REPLISTENER                       3
Subdaemon       2752    0x00526768  Worker                         2042
Subdaemon       2752    0x0072a758  Flusher                        2043
Subdaemon       2752    0x007308c0  Checkpoint                     2044
Subdaemon       2752    0x00736a28  HistGC                         2046
Subdaemon       2752    0x067f02f8  Aging                          2045
Subdaemon       2752    0x068364a0  Monitor                        2047
Replication policy  : Manual
Replication agent is running.
Cache agent policy : Manual
------------------------------------------------------------------------
Data store c:\temp\masterds
There are 12 connections to the data store
Data store is in shared mode
Shared Memory KEY Global\DBI45b945d0.0.SHM.6 HANDLE 0x2bc
Type            PID     Context     Connection Name              ConnID
Process         5880    0x00d09008  masterds                          1
Replication     3728    0x00aed570  LOGFORCE                          4
Replication     3728    0x00b036e8  TRANSMITTER                       5
Replication     3728    0x00b168b8  REPHOLD                           3
Replication     3728    0x00b1ca20  REPLISTENER                       2
Replication     3728    0x00b22b88  RECEIVER                          6
Subdaemon       3220    0x00526768  Worker                         2042
Subdaemon       3220    0x0072e768  Flusher                        2043
Subdaemon       3220    0x007348d0  Checkpoint                     2044
Subdaemon       3220    0x067b0068  Aging                          2045
Subdaemon       3220    0x067c0040  Monitor                        2047
Subdaemon       3220    0x068404c8  HistGC                         2046
Replication policy  : Manual
Replication agent is running.
Cache agent policy : Manual
------------------------------------------------------------------------
Data store c:\temp\demo
There are no connections to the data store
Replication policy : Manual
Cache agent policy : Manual
------------------------------------------------------------------------
End of report

Example 4-2 Replication agent is not running

> ttStatus
TimesTen status report as of Mon Aug  6 22:07:53 2012

Daemon pid 3396 port 15000 instance MYINSTANCE
TimesTen server pid 3436 started on port 15002
-----------------------------------------------------------------
Data store c:\temp\subscriberds
There are no connections to the data store
cache agent restart policy: manual
-----------------------------------------------------------------
Data store c:\temp\masterds
There are no connections to the data store
cache agent restart policy: manual
-----------------------------------------------------------------
End of report

Example 4-3 TimesTen daemon and replication agent are not running

> ttStatus
ttStatus: Could not connect to TimesTen daemon: Connection refused

Check that replication agents are communicating

Use ttRepAdmin -receiver -list to see that the replication agents are communicating with each other. If the masterds database is replicating to subscriberds, the output should look similar to the following:

Example 4-4 Check that the replication agents are communicating

> ttRepAdmin -receiver -list masterDSN
Peer name         Host name                 Port    State  Proto
----------------  ------------------------ ------  ------- -----
SUBSCRIBERDS      MYHOST                    Auto   Start       10

Last Msg Sent Last Msg Recv Latency TPS     RecordsPS Logs
------------- ------------- ------- ------- --------- ----
0:01:12       -               19.41       5        52    2

Check replication state

Use the ttReplicationStatus procedure to check the state of the subscriber database with respect to its master. If the subscriber is in the Stop, Pause, or Failed state, use the ttReplicationStatus procedure to reset the subscriber state to Start, as described in "Setting the replication state of subscribers" in the Oracle TimesTen In-Memory Database Replication Guide.

Example 4-5 Obtain status of the subscriber database from the master database

Use ttReplicationStatus to obtain the status of the subscriberds database from its master database, masterDSN, enter:

> ttIsql masterDSN
Command> CALL ttReplicationStatus ('subscriberds');
< SUBSCRIBERDS, MYHOST, 0, pause, 1, 10, REPSCHEME, REPL >
1 row found.

To reset state to Start call the ttRepSubscriberStateSet procedure:

Command> CALL ttRepSubscriberStateSet('REPSCHEME', 'REPL', 'SUBSCRIBERDS', 'MYHOST', 0)
Command> CALL ttReplicationStatus ('subscriberds');
< SUBSCRIBERDS, MYHOST, 0, start, 1, 152959, REPSCHEME, REPL >
1 row found.

Check replication scheme configuration

This section describes some procedures you can use to confirm the correct configuration of the various components in your replicated system. The basic procedure categories are:

Check ttRepAdmin -showconfig

Use ttRepAdmin -showconfig to confirm the configuration of your replication scheme.

What to look for:

  • Are all of the subscriber agents started and reported to be in the Start state? If not, reset the agents to the Start state. See "Setting the replication state of subscribers" in the Oracle TimesTen In-Memory Database Replication Guide.

  • Do the reported Peer names match the names given in the DataStore attributes in the DSN definitions for the replicated databases? Replication does not work if you specified the names given for the Data Source Name attributes.

  • Is there anything under List of subscribers? If not, confirm the database names you specified in the DSN definition are consistent with those you specified in your replication scheme configuration file.

  • Are the Host names correct? If in doubt, see "Check host names".

  • Are the correct table names displayed under Table details? If not, correct the table names in your replication scheme configuration file.

Example 4-6 Confirm the configuration of the replication scheme

> ttRepAdmin -showconfig masterDSN
Self host "MYHOST", port auto, name "MASTERDS", LSN 4/2970276, timeout 120, threshold 0
List of subscribers
-----------------
Peer name         Host name                 Port    State  Proto
----------------  ------------------------ ------  ------- -----
SUBSCRIBERDS      MYHOST                    Auto   Start      10
Last Msg Sent Last Msg Recv Latency TPS     RecordsPS Logs
------------- ------------- ------- ------- --------- ----
0:01:12       -               19.41       5        52    2
List of tables and subscriptions
--------------------------------
Table details
-------------
Table : REPL.TAB
Master Name               Subscriber Name
-----------               -------------
MASTERDS                  SUBSCRIBERDS

Check the TTREP.TTSTORES table

Check the TTREP.TTSTORES table to confirm that replication associates the replication scheme with the local database.

Example 4-7 Confirm that the replication scheme is associated with the local database

Connect to the database and enter:

SELECT * FROM ttrep.ttstores WHERE is_local_store <> 0x0;
Command> select * from ttrep.ttstores where is_local_store <> 0x0;
< -5193371075573733683, MYHOST, MASTERDS, 01, 0, 0, 4, 0 >
1 row found.

There should be exactly one row returned. If more than one row is returned, contact TimesTen technical support. If no rows are returned, then none of the hosts returned by the following statement is perceived to be a local system by TimesTen replication:

SELECT DISTINCT host_name FROM ttrep.ttstores;

It may also be that none of the database names specified in your replication scheme match those specified in your DSN descriptions.

Check host names

Some hosts or IP addresses specified in a replication scheme cannot be resolved by the replication agent because:

  • Host names or IP addresses specified in the replication scheme are wrong or misspelled.

  • Host names or IP addresses cannot be resolved or found by DNS or in the /etc/hosts file

  • Entries in the /etc/hosts file are incorrectly ordered in appearance. This error is most common when multiple NICs are used. You must have root privilege to make changes to the /etc/hosts files.

See "Configuring the network" in the Oracle TimesTen In-Memory Database Replication Guide for details on how to configure DNS and /etc/hosts files for host systems used for replication.

To check if a host name in the replication scheme matches the host name of the local system, write an application to perform these tasks:

  1. Use a gethostname operating system function call to determine the host name of the running host.

  2. Call gethostbyname with the output from Step 1.

  3. Call gethostbyname with the host name specified in the replication scheme.

  4. Compare output of Step 2 and Step 3. If there is a match, then the running host is involved in replication. Otherwise, it is not involved in replication.

Check owner names

As described in "Table requirements and restrictions for classic replication schemes" and "Owner of the replication scheme and replicated objects" in the Oracle TimesTen In-Memory Database Replication Guide, the owner names of your replication scheme and your replicated tables must be consistent across all participating databases. This indicates that the replication scheme owner must match in all databases and each table must be owned by the same user in each database.

Checking replication owner

Check the owner name assigned to your replication scheme by calling the ttIsql repschemes command or by listing the contents of the TTREP.REPLICATIONS table.

Example 4-8 shows that the replication scheme name, REPSCHEME, has a consistent owner name (REPL) in the databases on both SYSTEM1 and SYSTEM2. Example 4-9 shows the scheme name with inconsistent owner names. This can occur if you omit the owner name from the replication scheme definition and the system uses the Id of the replication scheme creator.

Example 4-8 Consistent owner names for replication scheme

On SYSTEM1:

> ttIsql masterDSN
Command> select * from ttrep.replications;
< REPSCHEME            , REPL                 , C, 0, 0, -1 >
1 row found.

On SYSTEM2:

> ttIsql -connStr "dsn=subscriberDSN"
Command> select * from ttrep.replications;
< REPSCHEME            , REPL                 , C, 0, 0, -1 >
1 row found.

Example 4-9 Inconsistent owner names for replication scheme

On SYSTEM1:

> ttIsql masterDSN
Command> select * from ttrep.replications;
< REPSCHEME            , SYSTEM1               , C, 0, 0, -1 >
1 row found.

On SYSTEM2:

> ttIsql -connStr "dsn=subscriberDSN"
Command> select * from ttrep.replications;
< REPSCHEME            , SYSTEM2               , C, 0, 0, -1 >
1 row found.

Checking table owner

Check the owner names assigned to the tables in each database by using the ttIsql tables command.

Example 4-10 Consistent table owner names

This example shows that the TAB table has a consistent owner name (REPL) in the databases on both SYSTEM1 and SYSTEM2.

Output for SYSTEM1 Output for SYSTEM2
REPL.TAB
REPL.TAB

Example 4-11 Inconsistent table owner names

This example shows the TAB table with inconsistent owner names, which were automatically assigned for each host.

Output for SYSTEM1 Output for SYSTEM2
SYSTEM1.TAB
SYSTEM2.TAB

Check consistency between replicated tables

By default, TimesTen creates a replication scheme with the TABLE DEFINITION CHECKING RELAXED attribute. The TABLE DEFINITION CHECKING RELAXED attribute does not require that the column definition of replicated tables be identical. Rather, the RELAXED attribute only requires that replicated tables have the same key definition, number of columns, column names, and column data types. If you define the replication scheme with the TABLE DEFINITION CHECKING EXACT clause, then the column definitions of replicated tables participating in the replication scheme must be identical on the master and subscriber databases. TimesTen recommends that you use TABLE DEFINITION CHECKING RELAXED. For more information, see "Column definition options for replicated tables" in the Oracle TimesTen In-Memory Database Replication Guide.

Example 4-12 Check consistency between replicated tables

This output from the user error log shows a mismatch on the number of columns for the subscriber table TTUSER.MYDSN.

11:37:58.00 Info: REP:  9430: REP1:transmitter.c(4936): TT16136: Sending 
definition for table TTUSER.MYDSN (1 column)
11:37:58.00 Info: REP:  9412: REP2:receiver.c(5928): TT16193: Adding definition
for table: TTUSER.MYDSN
11:37:58.00 Info: REP:  9412: REP2:meta.c(5580):TTUSER.MYDSN ptn 0: srcoff 0, 
destoff 0, length 8
11:37:58.00 Info: REP:  9412: REP2:meta.c(5580):TTUSER.MYDSN ptn 1: srcoff 8, 
destoff 12, length 12
11:37:58.00 Err : REP:  9412: REP2:receiver.c(6203): TT16198: Table definition 
mismatch on number of columns for table TTUSER.MYDSN.  Local definition: 2; 
transmitting peer: 1
11:37:58.00 Err : REP:  9412: REP2:receiver.c(6380): TT16204: Table TTUSER.MYDSN 
marked invalid. Will not apply transactions received for it until a valid 
definition is received
11:37:58.00 Err : REP:  9412: REP2:receiver.c(7200): TT16078: Table definition 
for ID 637068 is invalid (Original failure 11:37:58
REP2:receiver.c(6203): TT16198: Table definition mismatch on number of columns 
for table TTUSER.MYDSN.  Local definition: 2; transmitting peer: 1)
11:37:58.00 Err : REP:  9412: REP2:receiver.c(5002): TT16187: Transaction 
1173958671/2; Error: transient 0, permanent 1

Replication unresponsive, appears hung

Table summary is in the first heading cell.

Possible cause See...
There is a failed subscriber. "Check replication state", below
Return-receipt timeout period is too long. "Check return receipt timeout setting", below

Check replication state

Use the ttReplicationStatus procedure to check state of the subscriber database with respect to its master. If the subscriber is in the Failed state, see "Managing Database Failover and Recovery" in the Oracle TimesTen In-Memory Database Replication Guide for information on how to deal with failed databases.

Example 4-13 Check replication state

Use ttReplicationStatus to obtain the status of the subscriberds database from its master database, masterDSN, enter:

> ttIsql masterDSN
Command> CALL ttReplicationStatus ('subscriberds');
< SUBSCRIBERDS, MYHOST, 0, failed, 1, 10, REPSCHEME, REPL >
1 row found.

Check return receipt timeout setting

Use the ttRepSyncGet procedure to check the return receipt timeout setting. A value of -1 indicates the application is to wait until it receives an acknowledgment from the subscriber. Network latency or other issues might delay receipt of the subscriber acknowledgment. You either address these issues or use the ttRepSyncGet procedure to reset the return receipt timeout period.

See "Check the status of return service transactions" in the Oracle TimesTen In-Memory Database Replication Guide for more information.

Poor replication or XLA performance

Most of this section addresses issues that may impact replication performance. Some issues, such as log buffer too small and reading from the transaction log files on disk, can impact the performance of both replication and XLA applications.

Possible cause See...
Network is slow. "Check network bandwidth", below
RETURN RECEIPT is used. "Check use of return receipt blocking", below
Replication scheme is inconsistent. "Check replication configuration"
Log buffer is too small. "Check size of log buffer"
There are frequent or inefficient disk writes. "Check durability settings"
Reading from transaction log files on disk rather than the log buffer. "Check for reads from transaction log files"
Rate of conflicts is high. "Conflict reporting slows down replication"

Check network bandwidth

Replication agents typically communicate over some type of network connection. If replicating over a network slower than 10 MB per second (such as common with a 100 Base-T Ethernet typical in a LAN), you must be careful to match the transaction load to the available bandwidth of the network. see "Network bandwidth requirements" in the Oracle TimesTen In-Memory Database Replication Guide for details.

Check use of return receipt blocking

Unless you need receipt confirmation for all your transactions, disable RETURN RECEIPT BLOCKING. If you require receipt confirmation for some transactions, then set RETURN RECEIPT BY REQUEST and call the ttRepSyncSet procedure to enable the return receipt service for specific transactions. See "RETURN RECEIPT BY REQUEST" in the Oracle TimesTen In-Memory Database Replication Guide for details.

Note:

The performance degradation caused by return-receipt becomes less of an issue when multiple applications (or threads) are updating the database. If you must use return-receipt in a transaction, you can improve the performance of your application by using multiple threads to update the database. Though each thread must block for receipt confirmation, the other threads are free to make updates.

Check replication configuration

In addition to return-receipt setting described above, other factors related to the configuration of your replication scheme could impact replication performance. As described in "Making decisions about performance and recovery tradeoffs" in the Oracle TimesTen In-Memory Database Replication Guide, you often have to weigh the ability to efficiently failover and recover a database against replication performance.

For more information about direct replication, see "Direct replication or propagation" in the Oracle TimesTen In-Memory Database Replication Guide.

Check size of log buffer

Setting your log buffer too small may impact replication performance. Instead, Set the LogBufMB DSN attribute to a larger size. For more information on this DSN attribute, see "Setting connection attributes for logging" in the Oracle TimesTen In-Memory Database Replication Guide

Check durability settings

You can improve replication performance by setting TRANSMIT NONDURABLE on the replication ELEMENT to eliminate the flush-log-to-disk operation from the replication cycle. See "Setting transmit durability on DATASTORE element" in the Oracle TimesTen In-Memory Database Replication Guide for details.

Enabling the DURABLE COMMIT option in your replication scheme also impacts performance. See "DURABLE COMMIT" in the Oracle TimesTen In-Memory Database Replication Guide for more information.

Check for reads from transaction log files

In some situations a "log reader," such as a master replication agent 'transmitter' thread or a ttXlaNextUpdate call in an XLA application, may not be able to keep up with the update rate of the applications writing to the database. Normally, replication and XLA readers get update records from the log buffer in memory. When the readers fall behind the application update rate, transaction log files can accumulate on the disk until the backlog can be cleared. This forces the readers to read transactions from the transaction log files on disk, which is much slower. Should you detect reads from the transaction log files, you may want to respond by decreasing the rate of application updates to that sustainable by the log readers.

Applications can monitor whether log readers are obtaining update records from transaction log files on disk rather than from the log buffer in memory by tracking the SYS.MONITOR table entry LOG_FS_READS. For example, you can check the value of LOG_FS_READS for the database, MASTERDSN, with the following ttIsql command.

% ttIsql -v1 -e "select log_fs_reads from monitor; quit;" -connStr dsn=MASTERDSN

If the LOG_FS_READS counter is increasing, the log readers are falling behind or clearing out a backlog in the transaction log files.

For more complete monitoring of replication progress, create a simple shell script like the following:

!/bin/sh
trap exit 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
DSN=$1

while [ 1 ] ; do
   date
   ttRepAdmin -receiver -list -connStr dsn=$DSN
   echo -n "Log reads from disk: "
   ttIsql -v1 -e "select log_fs_reads from monitor; quit;" -connStr dsn=$DSN
   echo
   ttRepAdmin -bookmark -connStr dsn=$DSN
sleep 15
done

Example 4-14 Check the status of the transaction log

For example, you name the above script monitorLog and your replication scheme replicates from the MASTERDSN database to the SUBSCRIBER1DSN database. You can then check the status of the transaction log by entering:

$ monitorLog masterdsn

This generates output similar to the following:

Mon Aug  2 10:44:40  2004
Peer name         Host name                 Port    State  Proto
----------------  ------------------------ ------  ------- -----
SUBSCRIBER1DSN    MYHOST                    Auto    Start   12

Last Msg Sent Last Msg Recv Latency TPS     RecordsPS Logs
------------- ------------- ------- ------- --------- ----
00:00:05      -               -1.00      -1        -1    1

Log reads from disk: < 0 >

Replication hold LSN ...... 10/2656136
Last written LSN .......... 10/4015824
Last LSN forced to disk ... 10/3970152

The output from the script displays an updated status every 15 seconds until you enter Ctrl-C to exit.

Following the date in the output in Example 4-14 is the name of the subscriber, its host, and so on. Next is latency and rate information, as well as the number of transaction log files being retained on behalf of this subscriber. The specific meaning of each value is described in "Display subscriber status with the ttRepAdmin utility" in the Oracle TimesTen In-Memory Database Replication Guide. The main interest here is the 'Last Msg Sent' and 'Logs' values. The 'Last Msg Sent' value indicates the elapsed time since the last message was sent by the master to the subscriber and 'Logs' indicates how many transaction log files behind the replication log reader is from the current log insertion point used by the writers (Last written LSN).

Normally the 'Logs' value should be '1', as shown in Example 4-14. A steadily increasing 'Logs' value indicates latency is increasing and eventually log reads are satisfied from disk.

Note:

If the LogBufMB is larger than the LogFileSize, an increase in the 'Logs' value does not necessarily mean the log readers are reading from the transaction log files. This is because the log manager does not allow more than one log file's worth of data to be outstanding before writing it to the file system. After the log manager writes the data, the data remains in the log buffer to be read directly by the log readers. So, when the LogBufMB is larger than the LogFileSize, the 'Logs" value alone may not be the best measure of whether log readers are reading from memory or from disk.

The output from the following command displays the number of the transaction log files and the location of the bookmarks set by the log manager.

ttRepAdmin -bookmark -connStr dsn=$DSN

For more information, see "Monitor replication with the ttRepAdmin utility" in the Oracle TimesTen In-Memory Database Replication Guide. The difference between the Replication hold LSN and the last written LSN indicates the number of records in the transaction log that have not yet been transmitted to the subscribers. A steady increase in the difference between these values is another indication that replication latency is increasing and log file reads are likely to occur.

Example 4-15 Log reader must read from transaction log files

In this example, assume the LogBufMB is 16MB and the LogFileSize is 8MB. The following output indicates the log reader is approximately 1.8 MB behind the capacity of the log buffer and must read from the transaction log files, 14 and 15.

Peer name         Host name                Port    State   Proto
----------------  ------------------------ ------  ------- -----
SUBSCRIBER1DSN    MYHOST                   Auto    Start   12

Last Msg Sent Last Msg Recv Latency TPS     RecordsPS Logs
------------- ------------- ------- ------- --------- ----
00:00:03      -               -1.00      -1        -1    4

Log reads from disk: <20>

Replication hold LSN ...... 14/7007464
Last written LSN .......... 17/465336
Last LSN forced to disk ... 17/456152

Problems using ttRepAdmin

This section includes the following topics:

Problems when changing the state of a replication receiver

If you try to change the state of a replication receiver when replication has been configured to use a secondary IP address, a misconfiguration of the /etc/hosts file may cause ttRepAdmin to print the following error:

Alter replication with 'ALTER REPLICATION...port 0' failed: 
TT0907: Unique constraint (REPSTORESIX) violated. 

This error is caused by replication not recognizing the local database, which can be confirmed by the following query:

SELECT * FROM ttrep.ttstores WHERE is_local_store <> 0x0;

If this query returns no rows, or returns a row with the main host name of the database set to the result of the hostname command rather than the host you specified, you have encountered a configuration problem with the /etc/hosts file.

To correct the problem, make sure that the special host name you are using is defined in the /etc/hosts file and that there is an IP address in common between your special host name and the result of the hostname command.

For example, if the hostname command returns softswitch and your system has two Ethernet cards with the addresses 10.10.15.136 and 192.168.15.136, the IP addresses defined for softswitch should include both IP addresses.

Problems using ttRepAdmin -duplicate

Make sure that Timesten is installed with the same instance administrator on both the source system and target system. If so, then consider other causes:

Database created before duplication

If you connected to your new subscriber DSN before running ttRepAdmin -duplicate, the database has already been created. In this situation, -duplicate returns:

Error : Restore not done : The datastore already exists.
Unable to restore datastore locally

Confirm the existence of the database by running ttStatus and checking to see if the database is in the returned list. If the new subscriber database exists, destroy it and try ttRepAdmin -duplicate again.

> ttDestroy /tmp/newstore
> ttRepAdmin -dsn newstoreDSN -duplicate -name newstore
-from masterds -host "server1"

Wrong database or host name

If you have made an error entering the subscriber database name or host name in the replication scheme, you may see something like the following:

Unable to swap datastore locally
No receiver NEWSTORE on SERVER2 found to swap with

If you provide an incorrect host name for a subscriber, you may receive this error:

Problem: ttRepAdmin -duplicate command returns TimesTen error 12080: No 
subscriber found to swap with.

You also receive information about the subscriber that TimesTen is trying to locate. One common cause is providing an incorrect host name, which must be the exact same name as the host name provided when issuing the CREATE ACTIVE STANDBY PAIR statement. For example, if you created the subscriber with myhost.oracle.com, but only provided myhost in the ttRepAdmin -duplicate, the subscriber will not be found.

Note:

If myhost is the local host, use the -localhost argument. You typically need to use the -localhost argument if the local host name does not exactly match the host name provided when creating the replication scheme.

Returns 'Must specify -scheme' error

If you have more than one scheme specified in your TTREP.REPLICATIONS table, some ttRepAdmin commands may return the error:

Must specify -scheme to identify which replication scheme to use

To check the names of the replication schemes used by your database, use the ttIsql utility to connect, and enter:

Command> SELECT * from TTREP.REPLICATIONS;

Example 4-16 Two replication schemes assigned to the database

This example shows that two replication schemes, REPSCHEME1 and REPSCHEME2, are assigned to the database associated with subDSN. In this case, it is necessary to use the ttRepAdmin -scheme option.

> ttIsql -connStr "dsn=subDSN"
Command> SELECT * from TTREP.REPLICATIONS;
< REPSCHEME1      , REPL                , C, 0, 0, -1 >
< REPSCHEME2      , REPL                , C, 0, 0, -1 >
2 rows found.
Command> exit
> ttRepAdmin -dsn subDSN -receiver -list -scheme REPSCHEME1
Peer name         Host name                 Port    State  Proto
----------------  ------------------------ ------  ------- -----
SUBSCRIBER1        MYHOST                   Auto   Start      10

Last Msg Sent Last Msg Recv Latency TPS     RecordsPS Logs
------------- ------------- ------- ------- --------- ----
0:01:12       -               19.41       5        52    2

Problems with conflict checking

This section includes the following topics:

Column cannot be used for replication timestamp

When attempting to set CHECK CONFLICTS for an element in a CREATE REPLICATION statement, you may encounter an error similar to the following.

8004: Column REPL.TABS.TS cannot be used for replication timestamp checking if 
in an index or added by ALTER TABLE; and must be binary(8) with NULL values allowed.

In this situation, check:

  • That the timestamp column in the specified table is a nullable column of type BINARY(8). In the above example, the TS column in the REPL.TAB table should have a type of BINARY(8).

  • The timestamp column is defined in the original CREATE TABLE statement, rather than added later using ALTER TABLE.

Timestamp does not exist

You may receive an error similar to the following:

2208: Column TS does not exist in table.

In this situation, confirm that you have specified the correct name for the timestamp column in the CHECK CONFLICTS clause and that it exists in the specified table.

Also, make sure the timestamp column is not part of a primary key or index.

Conflict reporting slows down replication

If you have configured replication to CHECK CONFLICTS, TimesTen sends reports to the local host. You can also configure a report file. See "Diagnostics through SNMP Traps" in the Oracle TimesTen In-Memory Database Error Messages and SNMP Traps.

If there are many conflicts in a short period of time, subscriber performance can slow down because of the reporting requirements. You can use store attributes in the CREATE REPLICATION or ALTER REPLICATION statements to suspend and resume conflict reporting at specified rates of conflict:

  • CONFLICT REPORTING SUSPEND AT rate

  • CONFLICT REPORTING RESUME AT rate

Information about conflicts that occur while reporting is suspended cannot be retrieved.

See "Reporting conflicts" in the Oracle TimesTen In-Memory Database Replication Guide.

Problems with parallel replication

If you have configured parallel replication, then you have additional TimesTen replication threads running on your system. Be sure that you have enough CPU cores to accommodate the increased number of replication threads.

If you have configured user-defined track based parallel replication (ReplicationApplyOrdering = 1 and ReplicationParallelism > = 2), and you are seeing permanent errors on the receiver side that include constraint violations or row lookup failures, then possible causes include:

  • Application defined dependencies are incorrect

  • Operations for the same keys or keys of tables in a constraint relationship are written to different tracks.

Check the value of the general connection attribute, ReplicationTrack. All transactions executed by the connection are assigned to the track as defined in the ReplicationTrack attribute setting. Make sure the workload for each track is suitable. Use the ALTER SESSION statement to change the replication track. For more information, see "ALTER SESSION" in the Oracle TimesTen In-Memory Database SQL Reference. For more information on configuring parallel replication, see "Configuring parallel replication" in the Oracle TimesTen In-Memory Database Replication Guide.