1.2 Oracle GoldenGate Architecture

Oracle GoldenGate processes data by capturing records from a data source, housing them temporarily, then delivering them to a data target. Each of these steps is handled by a modular component of Oracle GoldenGate.

1.2.1 Oracle GoldenGate Components

Oracle GoldenGate for HP NonStop consists of the following components:

  • Extract, for extracting and processing records from Transaction Monitoring Facility (TMF)-enabled applications

  • Logger, for extracting and processing records from non-TMF-enabled applications

  • Collector, for facilitating the transmittal of records between local and remote systems

  • Oracle GoldenGate trails, for transmitting change records from the source to the target

  • Replicat, for processing and replicating records to a target

  • Oracle GoldenGate Manager, for controlling, monitoring, and reporting on Oracle GoldenGate processing

  • Checkpoint groups, for helping maintain data integrity by tracking where, on the source, processing starts and stops

  • Parameters, for compiling instructions for Extract, Replicat, Manager, and utilities

  • Reader, for monitoring Oracle GoldenGate trails for distributed network transactions and communicating status information to the Coordinator

  • Coordinator, for tracking distributed network transactions to coordinate processing across multiple nodes

1.2.1.1 Extract

Extract extracts source data from the TMF audit trail and writes it to one or more files, called Oracle GoldenGate trails. Multiple Extracts can operate on different sources at the same time. For example, one Extract could continuously extract data changes from a database and stream them to an up-to-date decision-support database, while another Extract performs batch extracts from other tables for periodic reporting. Or, two Extract processes can extract and transmit in parallel to two Replicat processes to minimize target latency when the databases are large.

Use Extract for:

  • Initial load

  • Change synchronization for TMF-enabled applications

  • Transmitting change records from Logger (non-TMF) over TCP/IP to a remote target

  • Data distribution

1.2.1.2 Logger

Logger performs data extracts when a NonStop source is non-TMF. Logger requires GGSLIB, an intercept library that binds the Oracle GoldenGate application to the user's NonStop application. When the application performs an Enscribe operation (such as WRITE), GGSLIB intercepts it and sends the record to Logger. Logger writes the records to a log trail, which is read by Replicat.

1.2.1.3 Collector

When data is transmitted over a TCP/IP connection, the Collector resides on the target system and receives incoming records. Each Extract group has a dedicated Collector process that terminates when the group's Extract process terminates.

1.2.1.4 Trails

Extract and Logger create trails to transmit data from the source to the target. An Oracle GoldenGate trail can contain a sequence of files or a single flat file. Generally, an Oracle GoldenGate trail is used during online change synchronization and an Oracle GoldenGate file is used for one-time tasks, such as initial data load or certain batch processes.

All trail file names begin with the same two characters, which you specify when you create the trail. As files are created, the name is appended with a six-digit number that increments sequentially from 000000 to 999999. When the sequence number reaches 999999, the numbering starts over at 000000.
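
The naming rule above can be sketched in Python as follows. This is an illustration only, not Oracle GoldenGate code: the function names and the AA prefix used in the usage note are hypothetical.

```python
MAX_SEQ = 999_999  # highest six-digit sequence number


def trail_file_name(prefix, seq):
    """Build a trail file name from a two-character prefix and a sequence number."""
    if len(prefix) != 2:
        raise ValueError("trail prefix must be exactly two characters")
    if not 0 <= seq <= MAX_SEQ:
        raise ValueError("sequence number must be between 0 and 999999")
    # Zero-pad the sequence number to six digits.
    return f"{prefix}{seq:06d}"


def next_sequence(seq):
    """Advance the sequence number, starting over at 0 after 999999."""
    return 0 if seq == MAX_SEQ else seq + 1
```

With a hypothetical prefix of AA, trail_file_name("AA", 42) yields AA000042, and next_sequence(999999) returns 0.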

There are two kinds of Oracle GoldenGate trails:

  • Local trails. Local trails are transmitted to another NonStop system over Expand and read by Replicat. Local trails can also reside on the source and be used as a data source for Extract.

  • Remote trails. Remote trails are transmitted to the target over TCP/IP and read by Replicat on the remote target.

Each record in an Oracle GoldenGate trail includes the data, a header with transaction information, an identifier of the record source, and other items. Trails are unstructured for best performance. Records are collected into transactions, which are processed in a continuous stream. Transaction identifiers indicate the first and last records in each transaction. Transactions are written in commit order, which guarantees that:

  • Each record has been committed in the original database

  • All records in the original transaction are together in the output

  • Inserts, updates, and deletes are presented, per record key, in the order in which they were applied
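
The commit-order guarantees listed above can be illustrated with a minimal Python sketch. It assumes a simplified event stream of operations, commits, and aborts; the event layout and the function name are hypothetical and do not describe how Extract is implemented.

```python
from collections import defaultdict


def emit_in_commit_order(events):
    """Group change records by transaction and emit each group at commit time.

    `events` is an iterable of tuples:
      ("op", txn_id, record)  - a change made inside a transaction
      ("commit", txn_id)      - the transaction committed
      ("abort", txn_id)       - the transaction was backed out
    """
    pending = defaultdict(list)  # txn_id -> records seen so far, in applied order
    output = []
    for event in events:
        kind, txn_id = event[0], event[1]
        if kind == "op":
            pending[txn_id].append(event[2])
        elif kind == "commit":
            # All records of the transaction are written together.
            output.append((txn_id, pending.pop(txn_id, [])))
        elif kind == "abort":
            # Uncommitted work never reaches the output.
            pending.pop(txn_id, None)
    return output
```

In this sketch, a transaction that commits first appears first in the output even if it started later, and the operations inside each transaction keep the order in which they were applied.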

To maximize throughput and minimize I/O load, extracted data is written to, and read from, the trail in large blocks. By default, Oracle GoldenGate writes data to the trail in a proprietary format which allows data to be exchanged rapidly and accurately among heterogeneous databases. However, data can also be written in external ASCII, XML, or other formats compatible with different applications.

1.2.1.5 Replicat

Replicat reads data from Oracle GoldenGate trails that were created by Extract or Logger. You can run multiple instances of Replicat to read multiple Oracle GoldenGate trails. Replicat supports a high volume of replication activity on the target platform, transferring data in blocks rather than a single record at a time. SQL operations are compiled once and performed many times, and small transactions can be grouped into larger transactions to improve performance.
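
The idea of grouping small transactions into larger ones can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Replicat's actual algorithm: the function name and the min_ops threshold are hypothetical.

```python
def group_transactions(transactions, min_ops):
    """Combine consecutive transactions until each group holds at least min_ops operations.

    `transactions` is a list of lists; each inner list holds the operations
    of one source transaction, in order.
    """
    groups, current = [], []
    for txn in transactions:
        # A source transaction is never split across groups.
        current.extend(txn)
        if len(current) >= min_ops:
            groups.append(current)
            current = []
    if current:
        # Flush whatever remains as a final, smaller group.
        groups.append(current)
    return groups
```

Fewer, larger target transactions mean fewer commits on the target, which is the performance benefit the paragraph above describes.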

1.2.1.6 Manager

Oracle GoldenGate is managed by the Manager. Manager is responsible for starting and stopping Extract, Replicat, and their dependent subprocesses, such as Collector. Extract and Replicat checkpoints give Manager information about what resources are required at a particular time. No other Oracle GoldenGate processes will run if Manager is stopped.

1.2.1.7 Syncfile

Syncfile lets you schedule and manage file duplication when you want to copy files in their entirety. This is a common requirement for maintaining a secondary Oracle GoldenGate instance that may see frequent database changes, but infrequent configuration file changes. However, Syncfile can copy any type of file, database or not, according to a schedule that you set. This makes it suitable for other off-hours, small-scale copying tasks.

1.2.1.8 Processing Groups

To differentiate among multiple Extract or Replicat processes on a system, you define processing groups. A processing group consists of a process (either Extract, Replicat, or Syncfile), its parameter file, its checkpoint file (if applicable), and any other files associated with the process. For example, to replicate two sets of data in parallel, you would create two Replicat groups. You might name one group GGSDEL1 and the other GGSDEL2. Groups are defined by the ADD EXTRACT and ADD REPLICAT commands in GGSCI.

A group name can contain up to eight characters and is not case-sensitive. All files and checkpoints relating to a group share that name. Any time you issue a command to control or view processing, you supply a group name or multiple group names with a wildcard.

You can use numbers in group names, but it is best to avoid placing numbers at the end. Oracle GoldenGate appends a numeric value of 0 to 9 to group names to create report file names. In an instance with Replicats REP1 and REP11, for example, a report file will be created for REP1 with the name REP11. This can cause confusion.
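
The naming clash described above can be checked with a short Python sketch. It assumes only what the paragraph states, that a report file name is the group name followed by a single digit from 0 to 9; the function names are hypothetical.

```python
def report_file_names(group):
    """Names formed by appending one digit, 0 through 9, to a group name."""
    return [f"{group.upper()}{digit}" for digit in range(10)]


def find_collisions(groups):
    """Return (group, report_name) pairs where a report file name of one
    group is identical to the name of another group."""
    names = {g.upper() for g in groups}  # group names are not case-sensitive
    collisions = []
    for group in sorted(names):
        for report in report_file_names(group):
            if report in names:
                collisions.append((group, report))
    return collisions
```

For the groups REP1 and REP11, find_collisions reports the pair (REP1, REP11), because one of the report file names for REP1 is REP11. Group names that end in a letter produce no such pair.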

1.2.1.9 Checkpoints

Checkpoints are used to store the current read and write position of a process. They ensure that data changes marked for synchronization are extracted, and they prevent redundant extracts. Checkpoints provide fault tolerance by preventing the loss of data if the system, the network, or an Oracle GoldenGate process must be restarted. For advanced synchronization configurations, checkpoints enable multiple Extract or Replicat processes to read from the same set of trails.

Checkpoints work with inter-process acknowledgments to prevent messages from being lost in the network. Oracle GoldenGate has a proprietary guaranteed-message delivery technology.

The Extract process checkpoints its position in the data source and in the trail. The Replicat process checkpoints its position in the trail. The checkpoint position is a combination of the sequence number of the trail file and the Relative Byte Address (RBA) within that file.

The read checkpoint is always synchronized with the write checkpoint. Thus, if Oracle GoldenGate must re-read data that was already sent to the target system (for example, after a process failure), checkpoints enable accurate overwriting of the old data to the point where new transactions start and Oracle GoldenGate can resume processing.
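
A checkpoint position of this kind can be modeled as a pair of values that is compared first by sequence number and then by RBA. The following Python sketch is a simplified illustration, not the Oracle GoldenGate checkpoint format: the Checkpoint type and the records_to_process function are hypothetical, and wraparound of the sequence number is ignored.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    seqno: int  # sequence number of the trail file
    rba: int    # relative byte address within that file


def records_to_process(trail, checkpoint):
    """Yield payloads of records positioned at or after the checkpoint.

    `trail` is an iterable of (seqno, rba, payload) tuples in trail order.
    """
    for seqno, rba, payload in trail:
        # Tuples compare field by field, so seqno is checked before rba.
        if Checkpoint(seqno, rba) >= checkpoint:
            yield payload
```

After a restart, a reader in this sketch that resumes from Checkpoint(0, 100) skips the record at RBA 0 of file 0 and continues with every later record, including those in file 1.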

1.2.1.10 Parameters

Parameters manage all Oracle GoldenGate components and utilities, allowing you to customize your data management environment to suit your needs. For example:

  • The Manager parameter file contains instructions for controlling all other Oracle GoldenGate processes.

  • The Logger parameter file contains instructions for capturing data from non-TMF applications.

  • The Extract parameter file contains instructions for selecting, mapping, and transforming TMF data, and sending trails to Replicat.

  • The Replicat parameter file contains instructions for selecting, mapping, and transforming data to the target.

  • The Global parameter file contains instructions that can be applied globally to Oracle GoldenGate processing.

1.2.1.11 Reader

A Reader on each node scans the local Oracle GoldenGate trail for distributed transactions. When one is found, the Reader gathers local transaction information and sends it to the Coordinator process.

1.2.1.12 Coordinator

The Coordinator process receives information from each replicating node on the status of the distributed network transactions that are being processed. The transaction is not committed until the Coordinator has been notified that all of the updates have been received on their destination nodes. If any node has a failure, the changes are not applied.
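
The all-or-nothing rule described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the Coordinator's actual protocol: the class and method names are hypothetical, and every node is assumed to participate in every transaction.

```python
class Coordinator:
    """Tracks which destination nodes have received a distributed transaction."""

    def __init__(self, nodes):
        self.nodes = frozenset(nodes)
        self.received = {}   # txn_id -> nodes that reported the updates received
        self.failed = set()  # txn_ids for which some node reported a failure

    def report_received(self, txn_id, node):
        """Record that `node` has received all updates for `txn_id`."""
        self.received.setdefault(txn_id, set()).add(node)

    def report_failure(self, txn_id):
        """Record that some node failed while handling `txn_id`."""
        self.failed.add(txn_id)

    def can_commit(self, txn_id):
        """True only when every node has reported and none has failed."""
        if txn_id in self.failed:
            return False
        return self.received.get(txn_id, set()) >= self.nodes
```

With two nodes A and B, can_commit returns False for a transaction after only A has reported, and True once B has reported as well. A reported failure keeps the result False regardless of later reports.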