1 Understanding Oracle GoldenGate for HP NonStop

This topic describes the capabilities of Oracle GoldenGate for HP NonStop to manipulate data at the transaction level and to replicate selected data to a variety of heterogeneous applications and platforms. It introduces both the configuration and the features of Oracle GoldenGate for HP NonStop.

Oracle GoldenGate Overview

Oracle GoldenGate for HP NonStop has a modular architecture that gives you the flexibility to extract and replicate selected data records and transactional changes across a variety of heterogeneous applications and platforms.

You can configure Oracle GoldenGate for HP NonStop to manage data from multiple, heterogeneous sources and targets. Oracle GoldenGate for HP NonStop contains features that enable your business to manage data at the transaction level across the enterprise.

Oracle GoldenGate Configuration

Oracle GoldenGate offers flexibility in configuring your transaction management system, supporting both homogeneous and heterogeneous data replication. This lets you configure Oracle GoldenGate to capture and deliver data to best suit your operating environment. Options include:

  1. One-to-one, from a single source to a single target

  2. One-to-many, from a single source to multiple targets

  3. Many-to-one, from multiple sources to a single target

  4. Bi-directional, between a single source and a single target

In doing so, the following business needs can be met:

  • Change synchronization, supporting both online and batch change synchronization for Transaction Management Facility (TMF)-enabled and non-TMF-enabled applications.

    • Online change synchronization continuously processes incremental data changes.

    • Batch change synchronization processes change records that are generated during specific periods of time.

  • Initial load, extracting complete records directly from a source file or table, then loading them into the target. You can use initial load to populate the target and to synchronize the source and target for change synchronization.

  • Data distribution, sending extracted records to more than one target.

Oracle GoldenGate Features

Oracle GoldenGate features let you select, map, and transform data so it can be used for a variety of applications. You can configure Oracle GoldenGate to integrate and convert data by selecting data based on filtering criteria. You can implement custom logic so Oracle GoldenGate works seamlessly with your own applications.

For example, you can configure the activities of Oracle GoldenGate by:

  • Configuring data selection to deliver only required records, filter records to extract specific column data, and control which types of operations are extracted

  • Mapping named source files and tables to the target when the target has similar formats but different file or table names

  • Splitting single rows into multiple rows and combining rows from different tables to a single table

  • Implementing data transformations to:

    • Convert dates from one format to another

    • Perform arithmetic calculations

    • Transform DML operations, such as converting delete operations into insert operations on the target table
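The three kinds of transformation listed above can be pictured with a small Python sketch. It is an illustration only and does not use Oracle GoldenGate syntax; the function name `transform` and the column names `order_date`, `quantity`, `unit_price`, and `total` are hypothetical.

```python
from datetime import datetime


def transform(operation, row):
    """Return the (operation, row) pair to apply on the target."""
    target_row = dict(row)
    # Convert the date from MM/DD/YYYY to YYYY-MM-DD.
    parsed = datetime.strptime(row["order_date"], "%m/%d/%Y")
    target_row["order_date"] = parsed.strftime("%Y-%m-%d")
    # Derive a new column with an arithmetic calculation.
    target_row["total"] = row["quantity"] * row["unit_price"]
    # Turn a delete into an insert, for example to keep history on the target.
    target_operation = "INSERT" if operation == "DELETE" else operation
    return target_operation, target_row


source_row = {"order_date": "03/15/2024", "quantity": 3, "unit_price": 2.5}
print(transform("DELETE", source_row))
```

In a real configuration these rules would be expressed in the Extract or Replicat parameter file rather than in application code.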

You can also configure Oracle GoldenGate to run your custom programs and frequently used routines with user exits, macros, and obey files. These features increase the flexibility of Oracle GoldenGate by:

  • Inserting user exits to call your applications and/or custom data management logic

  • Using macros to implement multiple uses of a statement, consolidate multiple commands, and call other macros

  • Automating frequently used routines by using the OBEY command, which instructs Oracle GoldenGate to process parameters specified in another parameter file

The modular architecture of Oracle GoldenGate lets you implement just the components you need. These components are introduced in the next section.

Oracle GoldenGate Architecture

Oracle GoldenGate processes data by capturing records from a data source, housing them temporarily, then delivering them to a data target. Each of these steps is handled by a modular component of Oracle GoldenGate.

Oracle GoldenGate Components

Oracle GoldenGate for HP NonStop consists of the following components:

  • Extract, for extracting and processing records from Transaction Management Facility (TMF)-enabled applications

  • Logger, for extracting and processing records from non-TMF-enabled applications

  • Collector, for facilitating the transmittal of records between local and remote systems

  • Oracle GoldenGate trails, for transmitting change records from the source to the target

  • Replicat, for processing and replicating records to a target

  • Oracle GoldenGate Manager, for controlling, monitoring, and reporting on Oracle GoldenGate processing

  • Checkpoint groups, for helping maintain data integrity by tracking where, on the source, processing starts and stops

  • Parameters, for compiling instructions for Extract, Replicat, Manager, and utilities

  • Reader, for monitoring Oracle GoldenGate trails for distributed network transactions and communicating status information to the Coordinator

  • Coordinator, for tracking distributed network transactions to coordinate processing across multiple nodes

Extract

Extract extracts source data from the TMF audit trail and writes it to one or more files, called Oracle GoldenGate trails. Multiple Extracts can operate on different sources at the same time. For example, one Extract could continuously extract data changes from a database and stream them to an up-to-date decision-support database, while another Extract performs batch extracts from other tables for periodic reporting. Or, two Extract processes can extract and transmit in parallel to two Replicat processes to minimize target latency when the databases are large.

Use Extract for:

  • Initial load

  • Change synchronization for TMF-enabled applications

  • Transmitting change records from Logger (non-TMF) over TCP/IP to a remote target

  • Data distribution

Logger

Logger performs data extracts when a NonStop source is non-TMF. Logger requires GGSLIB, an intercept library that binds the Oracle GoldenGate application to the user's NonStop application. When the application performs an Enscribe operation (such as WRITE), GGSLIB intercepts it and sends the record to Logger. Logger writes the records to a log trail, which is read by Replicat.

Collector

When data is transmitted over a TCP/IP connection, the Collector resides on the target system and receives incoming records. Each Replicat group has a dedicated Collector process that terminates when the group's Extract process terminates.

Trails

Extract and Logger create trails to transmit data from the source to the target. An Oracle GoldenGate trail can contain a sequence of files or a single flat file. Generally, an Oracle GoldenGate trail is used during online change synchronization and an Oracle GoldenGate file is used for one-time tasks, such as initial data load or certain batch processes.

All trail file names begin with the same two characters, which you specify when you create the trail. As files are created, the name is appended with a six-digit number that increments sequentially from 000000 to 999999. When the sequence number reaches 999999, the numbering starts over at 000000.
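The naming rule can be sketched in a few lines of Python. This is an illustration of the rule as described here, not Oracle GoldenGate code; the function names `trail_file_name` and `next_sequence` and the prefix `ET` are hypothetical.

```python
MAX_SEQUENCE = 999_999  # highest six-digit sequence number


def trail_file_name(prefix, sequence):
    """Build a trail file name: two-character prefix plus six-digit sequence."""
    if len(prefix) != 2:
        raise ValueError("trail prefix must be exactly two characters")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence must be between 000000 and 999999")
    return f"{prefix}{sequence:06d}"


def next_sequence(sequence):
    """Return the following sequence number, wrapping 999999 back to 000000."""
    return (sequence + 1) % (MAX_SEQUENCE + 1)


print(trail_file_name("ET", 0))                            # ET000000
print(trail_file_name("ET", next_sequence(41)))            # ET000042
print(trail_file_name("ET", next_sequence(MAX_SEQUENCE)))  # ET000000
```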

There are two kinds of Oracle GoldenGate trails:

  • Local trails. Local trails are transmitted to another NonStop system over Expand and read by Replicat. Local trails can also reside on the source and be used as a data source for Extract.

  • Remote trails. Remote trails are transmitted to the target over TCP/IP and read by Replicat on the remote target.

Each record in an Oracle GoldenGate trail includes the data, a header with transaction information, an identifier of the record source, and other items. Trails are unstructured for best performance and are collected into transactions, which process in a continuous stream. Transaction identifiers indicate the first and last records in each transaction. Transactions are written in commit order, which guarantees that:

  • Each record has been committed in the original database

  • All records in the original transaction are together in the output

  • Inserts, updates, and deletes are presented, per record key, in the order in which they were applied

To maximize throughput and minimize I/O load, extracted data is written to, and read from, the trail in large blocks. By default, Oracle GoldenGate writes data to the trail in a proprietary format which allows data to be exchanged rapidly and accurately among heterogeneous databases. However, data can also be written in external ASCII, XML, or other formats compatible with different applications.

Enabling Trail Recovery (FAR)

By default, Extract operates in append mode. If a process fails, a recovery marker is written to the trail and Extract appends recovery data to the file so that a history of all prior data is retained for recovery purposes. In append mode, Extract initialization determines, at startup, the identity of the last complete transaction that was written to the trail.

With that information, Extract ends recovery when the commit record for that transaction is encountered in the data source; then it begins a new data capture with the next committed transaction that qualifies for extraction and then begins appending the new data to the trail. A data pump or Replicat starts reading again from that recovery point.

Overwrite mode is another form of Extract recovery that was used in versions of Oracle GoldenGate prior to version 10.0. In these versions, Extract overwrites the existing transaction data in the trail after the last write-checkpoint position, instead of appending the new data. The first transaction that is written is the first one that qualifies for extraction after the last read-checkpoint position in the data source. This behavior can be controlled manually with the FORMAT RELEASE option of the EXTTRAIL or RMTTRAIL parameter.

Replicat

Replicat reads data from Oracle GoldenGate trails that were created by Extract or Logger. You can run multiple instances of Replicat to read multiple Oracle GoldenGate trails. Replicat supports a high volume of replication activity on the target platform, transferring data in blocks rather than a single record at a time. SQL operations are compiled once and performed many times, and small transactions can be grouped into larger transactions to improve performance.

Manager

Oracle GoldenGate is managed by the Manager. Manager is responsible for starting and stopping Extract, Replicat, and their dependent subprocesses, such as Collector. Extract and Replicat checkpoints give Manager information about what resources are required at a particular time. No other Oracle GoldenGate processes will run if Manager is stopped.

Syncfile

Syncfile lets you schedule and manage file duplication when you want to copy files in their entirety. This is a common requirement for maintaining a secondary Oracle GoldenGate instance that may see frequent database changes, but infrequent configuration file changes. However, Syncfile can copy any type of file, database or not, according to a schedule set by you. This makes it suitable for other off-hours, small scale copying tasks.

Processing Groups

To differentiate among multiple Extract or Replicat processes on a system, you define processing groups. A processing group consists of a process (either Extract, Replicat, or Syncfile), its parameter file, its checkpoint file (if applicable), and any other files associated with the process. For example, to replicate two sets of data in parallel, you would create two Replicat groups. You might name one group GGSDEL1 and the other GGSDEL2. Groups are defined by the ADD EXTRACT and ADD REPLICAT commands in GGSCI.

A group name can contain up to eight characters and is not case-sensitive. All files and checkpoints relating to a group share that name. Any time you issue a command to control or view processing, you supply a group name or multiple group names with a wildcard.

You can use numbers in group names, but it is best to avoid placing numbers at the end. Oracle GoldenGate appends a numeric value of 0 to 9 to group names to create report file names. In an instance with Replicats REP1 and REP11, for example, a report file will be created for REP1 with the name REP11. This can cause confusion.
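A quick way to spot this kind of clash before creating groups is sketched below in Python. It assumes only what is stated above, that a single digit from 0 to 9 is appended to the group name; the helper names `report_file_names` and `find_collisions` are hypothetical and are not part of Oracle GoldenGate.

```python
def report_file_names(group):
    """Names produced by appending a single digit 0-9 to the group name."""
    return {f"{group.upper()}{digit}" for digit in range(10)}


def find_collisions(groups):
    """Return (group, other_group) pairs where a report file name of
    `group` is identical to the name of `other_group`."""
    names = {g.upper() for g in groups}  # group names are not case-sensitive
    collisions = []
    for group in sorted(names):
        for other in sorted(names & report_file_names(group)):
            collisions.append((group, other))
    return collisions


print(find_collisions(["REP1", "REP11", "EXTORD"]))  # [('REP1', 'REP11')]
```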

Checkpoints

Checkpoints are used to store the current read and write position of a process. They ensure that data changes marked for synchronization are extracted, and they prevent redundant extracts. Checkpoints provide fault tolerance by preventing the loss of data if the system, the network, or an Oracle GoldenGate process must be restarted. For advanced synchronization configurations, checkpoints enable multiple Extract or Replicat processes to read from the same set of trails.

Checkpoints work with inter-process acknowledgments to prevent messages from being lost in the network. Oracle GoldenGate has a proprietary guaranteed-message delivery technology.

The Extract process checkpoints its position in the data source and in the trail. The Replicat process checkpoints its position in the trail. The checkpoint position is a combination of the sequence number of the trail file and the Relative Byte Address (RBA) of the trail.

The read checkpoint is always synchronized with the write checkpoint. Thus, if Oracle GoldenGate must re-read data that was already sent to the target system (for example, after a process failure), checkpoints enable accurate overwriting of the old data to the point where new transactions start and Oracle GoldenGate can resume processing.
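Because a trail position combines a file sequence number with an RBA, positions can be compared by sequence number first and RBA second. The Python sketch below models that ordering. It is not the Oracle GoldenGate checkpoint format; the names `TrailPosition` and `needs_processing` are hypothetical, and the sketch assumes the checkpoint marks the next record to be processed.

```python
from typing import NamedTuple


class TrailPosition(NamedTuple):
    """A position in a trail: file sequence number, then RBA within that file."""
    sequence: int
    rba: int


def needs_processing(record_at, checkpoint):
    """True if the record lies at or after the checkpoint position.

    Named tuples compare field by field, so the sequence number is
    compared first and the RBA only breaks ties within one file.
    """
    return record_at >= checkpoint


checkpoint = TrailPosition(sequence=2, rba=500)
print(needs_processing(TrailPosition(2, 100), checkpoint))  # False: already processed
print(needs_processing(TrailPosition(3, 0), checkpoint))    # True: later trail file
```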

Parameters

Parameters manage all Oracle GoldenGate components and utilities, allowing you to customize your data management environment to suit your needs. For example:

  • The Manager parameter file contains instructions for controlling all other Oracle GoldenGate processes.

  • The Logger parameter file contains instructions for capturing data from non-TMF applications.

  • The Extract parameter file contains instructions for selecting, mapping, and transforming TMF data, and sending trails to Replicat.

  • The Replicat parameter file contains instructions for selecting, mapping, and transforming data to the target.

  • The Global parameter file contains instructions that can be applied globally to Oracle GoldenGate processing.

Reader

A Reader on each node scans the local Oracle GoldenGate trail for distributed transactions. When one is found, the Reader gathers local transaction information and sends it to the Coordinator process.

Coordinator

The Coordinator process receives information from each replicating node on the status of the distributed network transactions that are being processed. The transaction is not committed until the Coordinator has been notified that all of the updates have been received on their destination nodes. If any node has a failure, the changes are not applied.

Oracle GoldenGate Processing

Oracle GoldenGate for NonStop processes data in a variety of ways, depending on your organization's needs and operating environment. This section introduces the primary ways Oracle GoldenGate captures and delivers data.

Initial Data Synchronization

Run initial data synchronization to synchronize the source and target databases. This process can be run while your transaction system is operational because Oracle GoldenGate will not lock data when it captures and delivers records. With Oracle GoldenGate, your options for loading data include:

  • Extracting data to a file and sending it to Replicat to apply to the target

  • Using Oracle GoldenGate direct load

  • Using Oracle GoldenGate direct bulk load when the target is Oracle

File to Replicat

You can queue your data in one, or many, Oracle GoldenGate files before loading your target for the first time. This lets you perform initial data synchronization while your transaction system remains online.

Figure 1-1 File to Replicat Processing Flow

Direct Load

Using Oracle GoldenGate direct load lets you extract data directly from source tables and send it in large blocks directly to Replicat, which writes data to its final target. This method is particularly effective for source data that does not require transformation (such as initial data loads).

Figure 1-2 Direct Load Processing Flow

Direct Bulk Load

If you are replicating from NonStop to Oracle, you can use Oracle GoldenGate direct bulk load. Using direct bulk load lets you extract data directly from source tables and send it in large blocks to the delivery process. Replicat then communicates directly with SQL*Loader. Using Replicat lets you perform additional data transformation before loading the data. The direct bulk load method is the fastest technique available when using Oracle GoldenGate.

Figure 1-3 Direct Bulk Load Processing Flow

Capturing Data Changes from TMF Applications

TMF audit trails provide the central resource for retrieving database changes in TMF-enabled applications. Changes to TMF-enabled Enscribe files and SQL tables are recorded in TMF audit trails for transaction integrity and recoverability. The following figure shows the processing flow for TMF-audited applications.

Note:

Because Oracle GoldenGate uses these audit trails for extract processing, plan and manage TMF-related activities carefully. The Oracle GoldenGate GGSCI and Manager tools provide optional audit management capabilities.

Extract and Audserv work together to retrieve and process database changes for TMF applications. When started, Extract starts an Audserv process, which returns database changes from TMF audit trails. Audserv reads audit trails from their original location on disk, from a disk or tape dump, or from a user-specified alternative location. Audserv also determines the location of all required audit. Audserv can return changes to tables or files only if the user has read access.

Figure 1-4 TMF Audited Process Flow

Database changes include insert, update or delete operations, along with transaction information. Insert and update records are after-images, or the format of the database record after the operation completes; delete records returned are before-images. Before-images for updates can also be returned.

Extract saves each image in memory until an associated transaction commit record is received. If the transaction aborts, the associated records are discarded. Committed records can be sent to one or more user-designated extract files.
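The buffering behavior described above can be pictured with the following Python sketch. It is a simplified model and not Extract's implementation; the class name `TransactionBuffer` and its methods are hypothetical, and a plain list stands in for the extract file.

```python
from collections import defaultdict


class TransactionBuffer:
    """Hold record images in memory until their transaction commits or aborts."""

    def __init__(self):
        self._pending = defaultdict(list)  # transaction id -> buffered images
        self.output = []                   # stands in for an extract file

    def on_change(self, tx_id, image):
        """Buffer an insert, update, or delete image for an open transaction."""
        self._pending[tx_id].append(image)

    def on_commit(self, tx_id):
        """Release all images of a committed transaction, kept together."""
        self.output.extend(self._pending.pop(tx_id, []))

    def on_abort(self, tx_id):
        """Discard all images of an aborted transaction."""
        self._pending.pop(tx_id, None)


buffer = TransactionBuffer()
buffer.on_change("T1", "insert A")
buffer.on_change("T2", "update B")
buffer.on_change("T1", "delete C")
buffer.on_abort("T2")
buffer.on_commit("T1")
print(buffer.output)  # ['insert A', 'delete C']
```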

Audserv automatically excludes audit associated with SQL/MP and SQL/MX catalogs (file codes 563, 564, 565, 572, and 585).

Capturing Changes for Distributed Network Transactions

In a multi-node environment, a single transaction can update files across different nodes. Oracle GoldenGate includes processes to coordinate these network transactions so that an outage on one of the nodes will not result in a partially updated transaction.

The central process is called the Coordinator. It receives information from each replicating node on the status of the changes being processed. The transaction is not committed until the Coordinator has been notified that all of the updates have been received on their destination nodes. If any node has a failure, the changes are not applied.
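The commit rule can be modeled in a few lines of Python. This is a conceptual sketch only, not the Coordinator's actual protocol; the class name `Coordinator`, its methods, and the node names `NODE_A` and `NODE_B` are hypothetical.

```python
class Coordinator:
    """Allow a commit only after every participating node reports receipt."""

    def __init__(self, nodes):
        self._nodes = set(nodes)
        self._received = {}   # transaction id -> nodes that reported receipt
        self._failed = set()  # transaction ids with a reported node failure

    def report_received(self, tx_id, node):
        self._received.setdefault(tx_id, set()).add(node)

    def report_failure(self, tx_id, node):
        self._failed.add(tx_id)

    def can_commit(self, tx_id):
        """True only if no node failed and all nodes reported receipt."""
        if tx_id in self._failed:
            return False
        return self._received.get(tx_id, set()) >= self._nodes


coordinator = Coordinator(["NODE_A", "NODE_B"])
coordinator.report_received("T1", "NODE_A")
print(coordinator.can_commit("T1"))  # False: NODE_B has not reported yet
coordinator.report_received("T1", "NODE_B")
print(coordinator.can_commit("T1"))  # True: all updates received
```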

Figure 1-5 illustrates the processing flow for a transaction across two nodes. An order from a customer, for example, can add information to the customer account file on \A and the customer order file on \B, and these files can each be replicated to backup nodes.

Figure 1-5 Processing Flow for Distributed Network Transactions

Capturing Data Changes from Non-TMF applications

Many Enscribe applications do not use the NonStop TMF audit facility. Oracle GoldenGate provides an alternative method for capturing non-TMF-audited database changes. Figure 1-6 displays the processing flow.

Figure 1-6 Oracle GoldenGate Processing Flow—Non-TMF Applications

The Oracle GoldenGate Software intercept library (GGSLIB) is a group of functions with the same names as Guardian operating system procedures. GGSLIB is bound to Guardian, acting as an interface between application programs and NonStop.

For example, when an application calls a Guardian function such as WRITE, GGSLIB performs it instead of Guardian. The application is unaware of the substitution and performs, from an application programming standpoint, exactly as it did before.

If the function succeeds, it sends its data to Logger, which writes it to an Oracle GoldenGate log trail. Log trails are available for Extract and Replicat processing, which perform formatting, distribution, and delivery steps.
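The interception pattern can be sketched in Python as a wrapper around a write function. This is a conceptual model and not how GGSLIB is built or bound; `make_intercepted`, `guardian_write`, and the file name `ACCTFILE` are hypothetical stand-ins.

```python
def make_intercepted(original_write, send_to_logger):
    """Wrap a write function so that successful writes are also logged."""

    def intercepted_write(file_name, record):
        succeeded = original_write(file_name, record)
        if succeeded:
            # Only operations that succeeded are forwarded to the logger.
            send_to_logger(file_name, record)
        return succeeded  # the caller sees the same result as before

    return intercepted_write


storage = {}    # stands in for the application's files
log_trail = []  # stands in for the log trail written by Logger


def guardian_write(file_name, record):
    """Hypothetical stand-in for the original write; rejects empty records."""
    if not record:
        return False
    storage.setdefault(file_name, []).append(record)
    return True


write = make_intercepted(guardian_write, lambda f, r: log_trail.append((f, r)))
write("ACCTFILE", b"rec1")
write("ACCTFILE", b"")
print(log_trail)  # [('ACCTFILE', b'rec1')]
```

The application calls `write` exactly as it would call the original function, which mirrors the point that the substitution is invisible to the application.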

Using Extract for Data Distribution

Extract can retrieve data from custom-created files or from trails created by Extract or Logger—in this sense, distributing data. Applications can take advantage of the data movement, formatting, conversion, and other features of Oracle GoldenGate without reading the data directly from TMF audit trails or the database.

Figure 1-7 Extract as Data Distributor

Batch Processing

When capturing incremental data changes in real time is not appropriate, you can run batch processing. Batch runs process data generated during a specific time frame, defined by a begin and end time. Many Oracle GoldenGate operations, such as record selection, mapping, and field conversions, can be performed during batch processing.
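Selecting records for a batch run amounts to a time-window filter, as in the Python sketch below. The function name `select_batch` and the record layout are hypothetical, and treating the end time as exclusive is an assumption made for the sketch, not documented Oracle GoldenGate behavior.

```python
from datetime import datetime


def select_batch(records, begin, end):
    """Keep records whose timestamp falls in the window [begin, end)."""
    return [r for r in records if begin <= r["timestamp"] < end]


records = [
    {"id": 1, "timestamp": datetime(2024, 3, 15, 8, 30)},
    {"id": 2, "timestamp": datetime(2024, 3, 15, 12, 0)},
    {"id": 3, "timestamp": datetime(2024, 3, 15, 18, 45)},
]
batch = select_batch(records, datetime(2024, 3, 15, 9, 0), datetime(2024, 3, 15, 18, 0))
print([r["id"] for r in batch])  # [2]
```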

Capturing Directly from Files

In some situations, Extract can read directly from a file rather than from the log trail; however, the following conditions must be true:

  • The file or sequence of files is entry-sequenced.

  • Only inserts occur against the files—no updates or deletes.

  • Records are inserted only at the end of the file.

Use this feature when:

  • The method of logging is non-TMF-enabled.

  • The files are BASE24 TLFX or PTLFX.

  • The input files meet the conditions described above.

  • You want to "trickle" the batch file contents throughout the day, rather than all at once at the end of the day.

Custom Event Processing

Oracle GoldenGate user exits make it possible to incorporate custom processing needs. A common application of user exits is database event triggering. For example, a user exit might contain code to page a supervisor when an account balance falls below a certain threshold. User exits reside outside the mainstream application—you can add, change, or remove them with almost no impact on the application.
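The low-balance example can be outlined as follows. This Python sketch shows only the decision logic such an exit might contain; it is not the Oracle GoldenGate user exit interface, and the names `low_balance_exit`, `LOW_BALANCE_THRESHOLD`, and the record fields are hypothetical.

```python
LOW_BALANCE_THRESHOLD = 100.00  # hypothetical threshold for the illustration


def low_balance_exit(record, notify):
    """Call `notify` when the record's balance is below the threshold.

    Returns True if a notification was sent, False otherwise.
    """
    if record["balance"] < LOW_BALANCE_THRESHOLD:
        notify(f"Account {record['account']} balance is {record['balance']:.2f}")
        return True
    return False


pages = []  # stands in for a paging system
low_balance_exit({"account": "12345", "balance": 42.5}, pages.append)
low_balance_exit({"account": "67890", "balance": 950.0}, pages.append)
print(pages)  # ['Account 12345 balance is 42.50']
```

Keeping the notification behind a callback reflects the idea that the exit sits outside the mainstream application and can be changed without touching it.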

Oracle GoldenGate Commands

GGSCI is the command-line interface that lets you interface with all Oracle GoldenGate components. Throughout this guide, you will see GGSCI commands described; they are your primary tools for configuring, operating, managing, and troubleshooting your data management environment.

Output from the Oracle GoldenGate GGSCI interface supports up to 1024 processing groups, including Extract, Coordinator, Syncfile, and Replicat groups. At the supported level, all groups can be controlled and viewed in full with GGSCI commands such as INFO and STATUS. Beyond that supported level, group information is not displayed and errors can occur. Oracle GoldenGate recommends keeping the combined number of processing groups at 1024 or below to manage your environment effectively.

To Start GGSCI

Before you can start GGSCI, you must navigate to the Oracle GoldenGate installation location. When you are in the correct subvolume, enter RUN GGSCI. Your prompt will change to a GGSCI prompt. For example:

Example 1-1 Starting GGSCI

TACL> VOLUME $DATA.GGS 
TACL> RUN GGSCI 
GGSCI> 

After you have the GGSCI command prompt, enter GGSCI commands on the command line as needed. With GGSCI commands, you can edit parameter files, add groups, view reports, and communicate with running processes. For more information, see GGSCI Commands.