Parallel Redo Recovery in TimesTen Database

Parallel Redo speeds up recovery of large TimesTen databases by reapplying transaction log records on multiple CPU threads. As a result, databases recover more quickly from crashes or other disruptions, which shortens outages and improves availability.

The single-threaded database recovery process works well for small TimesTen databases with few transaction log files. However, as the database grows or the number of transaction log files to process increases, recovery can become slow. Frequent checkpoints limit the number of log files needed for recovery, but when many log files must still be processed, single-threaded recovery takes a long time and lengthens the service outage after a crash or during recovery from a backup.

Parallel Redo recovers one segment of the transaction log at a time, reapplying the log records within each segment concurrently using multiple threads. This parallel approach reduces recovery time while still guaranteeing that transactions are reapplied in the original order of record insertion, which maintains the database's consistency and integrity.
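The segment-by-segment idea described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the TimesTen implementation: log records are modeled as hypothetical (key, value) pairs, and each record is routed to a track by its key so that records touching the same key keep their log order. The function names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(records, segment_size):
    """Cut the log (records in log order) into fixed-size segments."""
    return [records[i:i + segment_size]
            for i in range(0, len(records), segment_size)]


def apply_track(state, track_records):
    """Apply one track's records in log order; tracks own disjoint keys."""
    for key, value in track_records:
        state[key] = value


def parallel_redo(records, segment_size, tracks):
    """Toy model: recover one segment at a time, tracks run concurrently."""
    state = {}
    with ThreadPoolExecutor(max_workers=tracks) as pool:
        for segment in split_into_segments(records, segment_size):
            # All records for one key go to the same track, so their
            # relative order within the segment is preserved.
            per_track = [[] for _ in range(tracks)]
            for key, value in segment:
                per_track[key % tracks].append((key, value))
            # list() waits until every track has finished this segment
            # before the next segment starts.
            list(pool.map(lambda recs: apply_track(state, recs), per_track))
    return state


def sequential_redo(records):
    """Reference result: replay every record on a single thread."""
    state = {}
    for key, value in records:
        state[key] = value
    return state


# Hypothetical log: 100 updates spread over 7 keys.
log = [(i % 7, i) for i in range(100)]
assert parallel_redo(log, segment_size=20, tracks=4) == sequential_redo(log)
```

The final assertion checks the property the paragraph describes: the concurrent replay ends in the same state as a single-threaded replay of the same log.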

Three first connection attributes control Parallel Redo recovery:
  • EnableParallelRedo
  • RedoSegmentSize
  • RedoParallelism

For details, see EnableParallelRedo, RedoSegmentSize, and RedoParallelism in the Oracle TimesTen In-Memory Database Reference.

To view the values of these attributes (the defaults are shown here), call the ttConfiguration built-in procedure (see ttConfiguration):
Command> CALL ttConfiguration ('EnableParallelRedo');
< EnableParallelRedo, 1 >
1 row found.

Command> CALL ttConfiguration ('RedoSegmentSize');
< RedoSegmentSize, 20 >
1 row found.

Command> CALL ttConfiguration ('RedoParallelism');
< RedoParallelism, 4 >
1 row found.

When EnableParallelRedo is set to 1, parallel redo recovers large transaction logs by dividing them into segments. The segment size is set by the RedoSegmentSize connection attribute and defaults to 20 MB. Each segment is processed concurrently on the number of tracks defined by the RedoParallelism value (four by default), which speeds up the recovery process.

Parallel Redo writes a report to the daemon log at timesten_home/diag/ttmesg.log (see Error, Warning, and Informational Messages). The report contains performance-related information about the database recovery. You can use this information to adjust the redo segment size or the parallelism level according to your requirements.

Here is a sample report:
parallelRedo.c:631 : 0x7f99699f6010: Conn[2047]: Number of transactions      3144574
parallelRedo.c:632 : 0x7f99699f6010: Conn[2047]: Number of log blocks        263209
parallelRedo.c:633 : 0x7f99699f6010: Conn[2047]: Number of log records       12579336
parallelRedo.c:634 : 0x7f99699f6010: Conn[2047]: Number of log segments      54
parallelRedo.c:636 : 0x7f99699f6010: Conn[2047]: Number of delayed LRs       488587 (3.88%)
parallelRedo.c:637 : 0x7f99699f6010: Conn[2047]: Max blk catalog entries     1192
parallelRedo.c:639 : 0x7f99699f6010: Conn[2047]: Log reading time           3474.38 msec (24.93%)
parallelRedo.c:641 : 0x7f99699f6010: Conn[2047]: Collections create time    12.77 msec (0.09%)
parallelRedo.c:643 : 0x7f99699f6010: Conn[2047]: LRs preparation time       1759.52 msec (12.62%)
parallelRedo.c:645 : 0x7f99699f6010: Conn[2047]: Sort + apply time          8657.64 msec (62.12%)
parallelRedo.c:647 : 0x7f99699f6010: Conn[2047]: Collections destroy time   11.58 msec (0.08%)
parallelRedo.c:648 : 0x7f99699f6010: Conn[2047]: Parallel Redo time         13937.30 msec
parallelRedo.c:653 : 0x7f99699f6010: Conn[2047]: Blk catalog create         2
parallelRedo.c:654 : 0x7f99699f6010: Conn[2047]: Blk catalog collisions     5
parallelRedo.c:656 : 0x7f99699f6010: Conn[2047]: Number of deps added       252212 (2.00%)
parallelRedo.c:657 : 0x7f99699f6010: Conn[2047]: Number of Ix LRs           9
parallelRedo.c:658 : 0x7f99699f6010: Conn[2047]: Dep table realloc cnt      0
parallelRedo.c:659 : 0x7f99699f6010: Conn[2047]: No progress deps iterations        236162
parallelRedo.c:660 : 0x7f99699f6010: Conn[2047]: Tracks sleep/wake up cnt           234372
parallelRedo.c:661 : 0x7f99699f6010: Conn[2047]: Reclaim cache create cnt           105
parallelRedo.c:663 : 0x7f99699f6010: Conn[2047]: Next LR ready cnt                  150 (0.00%)
parallelRedo.c:664 : 0x7f99699f6010: Conn[2047]: No redo track dependency cnt       801544
parallelRedo.c:669 : 0x7f99699f6010: Conn[2047]:          Avg. sort time             1178.51 msec
parallelRedo.c:670 : 0x7f99699f6010: Conn[2047]:          Avg. apply time            7454.12 msec
parallelRedo.c:672 : 0x7f99699f6010: Conn[2047]: ==============================================================
parallelRedo.c:674 : 0x7f99699f6010: Conn[2047]:   Track     Sort time (msec)    Apply time (msec)            Processed LRs           Deps added   MaxArrSz
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       0              1196.59              7443.97         3152496 (25.06%)       59431 (23.56%)          0
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       1              1208.68              7430.70         3188117 (25.34%)       61850 (24.52%)          0
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       2              1156.24              7480.98         3088989 (24.56%)       69820 (27.68%)          0
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       3              1152.55              7460.84         3149734 (25.04%)       61111 (24.23%)          0
parallelRedo.c:698 : 0x7f99699f6010: Conn[2047]: ==============================================================
parallelRedo.c:2307 : 0x7f99699f6010: Conn[2047]: Recovery: ParallelRedo: redo done.
parallelRedo.c:2319 : 0x7f99699f6010: Conn[2047]: Recovery: ParallelRedo: Parallel redo completed. 
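To compare reports across recovery runs, the timing lines and per-track rows can be extracted with a short script. The following is a minimal sketch that assumes the line layout shown in the sample report above; the function names and regular expressions are illustrative and are not part of TimesTen.

```python
import re

# "<label>   <number> msec" lines, for example "Log reading time  3474.38 msec"
PHASE = re.compile(r"Conn\[\d+\]:\s+(.+?)\s+(\d+\.\d+) msec")
# Per-track rows: track number, sort time, apply time, processed LRs
TRACK = re.compile(r"Conn\[\d+\]:\s+(\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+) \(")

SAMPLE = """\
parallelRedo.c:639 : 0x7f99699f6010: Conn[2047]: Log reading time           3474.38 msec (24.93%)
parallelRedo.c:645 : 0x7f99699f6010: Conn[2047]: Sort + apply time          8657.64 msec (62.12%)
parallelRedo.c:648 : 0x7f99699f6010: Conn[2047]: Parallel Redo time         13937.30 msec
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       0              1196.59              7443.97         3152496 (25.06%)       59431 (23.56%)          0
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       1              1208.68              7430.70         3188117 (25.34%)       61850 (24.52%)          0
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       2              1156.24              7480.98         3088989 (24.56%)       69820 (27.68%)          0
parallelRedo.c:694 : 0x7f99699f6010: Conn[2047]:       3              1152.55              7460.84         3149734 (25.04%)       61111 (24.23%)          0
"""


def parse_report(lines):
    """Return (phase timings in msec, per-track statistics)."""
    phases, tracks = {}, {}
    for line in lines:
        m = TRACK.search(line)
        if m:
            tracks[int(m.group(1))] = {
                "sort_ms": float(m.group(2)),
                "apply_ms": float(m.group(3)),
                "records": int(m.group(4)),
            }
            continue
        m = PHASE.search(line)
        if m:
            phases[m.group(1)] = float(m.group(2))
    return phases, tracks


def track_imbalance(tracks):
    """Ratio of the busiest track's record count to the mean (1.0 = even)."""
    counts = [t["records"] for t in tracks.values()]
    return max(counts) / (sum(counts) / len(counts))


phases, tracks = parse_report(SAMPLE.splitlines())
apply_share = phases["Sort + apply time"] / phases["Parallel Redo time"]
print(f"sort + apply share: {apply_share:.2%}")
print(f"track imbalance:    {track_imbalance(tracks):.3f}")
```

A sort and apply share that dominates the total, combined with evenly loaded tracks, would suggest experimenting with RedoParallelism. That reading is an interpretation of the report, not documented tuning guidance.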

You can configure parallel redo by setting the values of these first connection attributes when you connect to the database. Here is an example of a database named database1 that is configured for parallel redo:
[database1]
DataStore=/disk1/databases/database1
LogDir=/disk1/logs
DatabaseCharacterSet=AL32UTF8
EnableParallelRedo=1
RedoSegmentSize=40
RedoParallelism=8

In this example, parallel redo is enabled and uses 8 parallel threads (RedoParallelism=8) to process redo log segments of 40 MB each (RedoSegmentSize=40), which speeds up recovery after a crash or restart. When database recovery encounters an issue using parallel redo, it reverts to legacy (single-threaded) redo. If such failures occur consistently with parallel redo, which is the default recovery method, and they extend recovery time, you can disable it by setting the connection attribute EnableParallelRedo=0.
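As a rough planning aid, the settings in a DSN stanza like the one above can be read with a script and used to estimate how many segments a given log volume would produce. This is a minimal sketch: the helper names are hypothetical, and the estimate assumes that segments are cut purely by size, which is a simplification rather than documented behavior.

```python
import configparser
import math

DSN_TEXT = """
[database1]
DataStore=/disk1/databases/database1
LogDir=/disk1/logs
DatabaseCharacterSet=AL32UTF8
EnableParallelRedo=1
RedoSegmentSize=40
RedoParallelism=8
"""


def redo_settings(ini_text, dsn):
    """Read the three parallel redo attributes from an ini-style DSN stanza."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # preserve the attribute names' case as written
    parser.read_string(ini_text)
    section = parser[dsn]
    return {
        "enabled": section.getint("EnableParallelRedo") == 1,
        "segment_mb": section.getint("RedoSegmentSize"),
        "tracks": section.getint("RedoParallelism"),
    }


def estimated_segments(total_log_mb, segment_mb):
    """Assumption: segments are cut purely by size, so round up the quotient."""
    return math.ceil(total_log_mb / segment_mb)


settings = redo_settings(DSN_TEXT, "database1")
# Hypothetical volume: 1000 MB of transaction log to replay.
print(settings, estimated_segments(1000, settings["segment_mb"]))
```

The 1000 MB figure is only an illustration; substitute the actual size of the transaction log files that recovery must process.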

Parallel redo uses temporary memory, which is allocated before the parallel redo process starts. If there is insufficient memory, recovery falls back to legacy redo. All allocated memory is released once parallel redo completes.