This chapter describes the capabilities of Oracle GoldenGate for HP NonStop to manipulate data at the transaction level and to replicate selected data to a variety of heterogeneous applications and platforms. The chapter introduces both the configuration and the features of Oracle GoldenGate for HP NonStop.
Oracle GoldenGate for HP NonStop has a modular architecture that gives you the flexibility to extract and replicate selected data records and transactional changes across a variety of heterogeneous applications and platforms.
You can configure Oracle GoldenGate for HP NonStop to manage data from multiple, heterogeneous sources and targets. Oracle GoldenGate for HP NonStop contains features that enable your business to manage data at the transaction level across the enterprise.
Oracle GoldenGate offers flexibility in configuring your transaction management system, supporting both homogeneous and heterogeneous data replication. This lets you configure Oracle GoldenGate to capture and deliver data to best suit your operating environment. Options include:
One-to-one, from a single source to a single target
One-to-many, from a single source to multiple targets
Many-to-one, from multiple sources to a single target
Bi-directional, between a single source and a single target
In doing so, the following business needs can be met:
Change synchronization, supporting both online and batch change synchronization for Transaction Management Facility (TMF)-enabled and non-TMF-enabled applications.
Online change synchronization continuously processes incremental data changes.
Batch change synchronization processes change records that are generated during specific periods of time.
Initial load, extracting complete records directly from a source file or table, then loading them into the target. You can use initial load to populate the target and to synchronize the source and target for change synchronization.
Data distribution, sending extracted records to more than one target.
Oracle GoldenGate features let you select, map, and transform data so it can be used for a variety of applications. You can configure Oracle GoldenGate to integrate and convert data by selecting data based on filtering criteria. You can implement custom logic so Oracle GoldenGate works seamlessly with your own applications.
For example, you can configure the activities of Oracle GoldenGate by:
Configuring data selection to deliver only required records, filter records to extract specific column data, and control which types of operations are extracted
Mapping named source files and tables to the target when the target has similar formats but different file or table names
Splitting single rows into multiple rows and combining rows from different tables into a single table
Implementing data transformations to:
Convert dates from one format to another
Perform arithmetic calculations
Transform DML operations, such as converting delete operations into insert operations on the target table
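The transformations listed above can be pictured with a small Python model. This is only an illustrative sketch and is not Oracle GoldenGate parameter syntax: the record layout, the field names (`order_date`, `quantity`, `unit_price`, `op`), and the function names are all hypothetical.

```python
from datetime import datetime


def convert_date(value, src_fmt="%Y%m%d", dst_fmt="%Y-%m-%d"):
    """Convert a date string from one format to another."""
    return datetime.strptime(value, src_fmt).strftime(dst_fmt)


def transform(record):
    """Apply three example transformations to a hypothetical change record."""
    out = dict(record)
    # Date conversion: compact YYYYMMDD to ISO-style YYYY-MM-DD.
    out["order_date"] = convert_date(record["order_date"])
    # Arithmetic calculation: derive a total from two source columns.
    out["total"] = record["quantity"] * record["unit_price"]
    # DML transformation: apply a source delete as a target insert.
    if record["op"] == "DELETE":
        out["op"] = "INSERT"
    return out


source = {"op": "DELETE", "order_date": "20240131", "quantity": 3, "unit_price": 5}
result = transform(source)
```

In a real configuration these rules would be expressed in the Extract or Replicat parameter file rather than in application code.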
You can also configure Oracle GoldenGate to run your custom programs and frequently-used routines with user exits, macros, and obey files. These features increase the flexibility of Oracle GoldenGate by:
Inserting user exits to call your applications and/or custom data management logic
Using macros to implement multiple uses of a statement, consolidate multiple commands, and call other macros
Automating frequently-used routines by using the OBEY command, which instructs Oracle GoldenGate to process parameters specified in another parameter file.
The modular architecture of Oracle GoldenGate lets you implement just the components you need. These components are introduced in the next section.
Oracle GoldenGate processes data by capturing records from a data source, housing them temporarily, then delivering them to a data target. Each of these steps is handled by a modular component of Oracle GoldenGate.
Oracle GoldenGate for HP NonStop consists of the following components:
Extract, for extracting and processing records from Transaction Management Facility (TMF)-enabled applications
Logger, for extracting and processing records from non-TMF-enabled applications
Collector, for facilitating the transmittal of records between local and remote systems
Oracle GoldenGate trails, for transmitting change records from the source to the target
Replicat, for processing and replicating records to a target
Oracle GoldenGate Manager, for controlling, monitoring, and reporting on Oracle GoldenGate processing
Checkpoint groups, for helping maintain data integrity by tracking where, on the source, processing starts and stops
Parameters, for compiling instructions for Extract, Replicat, Manager, and utilities
Reader, for monitoring Oracle GoldenGate trails for distributed network transactions and communicating status information to the Coordinator
Coordinator, for tracking distributed network transactions to coordinate processing across multiple nodes
Extract extracts source data from the TMF audit trail and writes it to one or more files, called Oracle GoldenGate trails. Multiple Extracts can operate on different sources at the same time. For example, one Extract could continuously extract data changes from a database and stream them to an up-to-date decision-support database, while another Extract performs batch extracts from other tables for periodic reporting. Or, two Extract processes can extract and transmit in parallel to two Replicat processes to minimize target latency when the databases are large.
Use Extract for:
Change synchronization for TMF-enabled applications
Transmitting change records from Logger (non-TMF) over TCP/IP to a remote target
Logger performs data extracts when a NonStop source is non-TMF. Logger requires GGSLIB, an intercept library, that binds the Oracle GoldenGate application to the user's NonStop application. When the application performs an Enscribe operation (such as WRITE), GGSLIB intercepts it and sends the record to Logger. Logger writes the records to a log trail which is read by Replicat.
When data is transmitted over a TCP/IP connection, the Collector resides on the target system and receives incoming records. Each Extract group has a dedicated Collector process that terminates when the group's Extract process terminates.
Extract and Logger create trails to transmit data from the source to the target. An Oracle GoldenGate trail can contain a sequence of files or a single flat file. Generally, an Oracle GoldenGate trail is used during online change synchronization and an Oracle GoldenGate file is used for one-time tasks, such as initial data load or certain batch processes.
All trail file names begin with the same two characters, which you specify when you create the trail. As files are created, the name is appended with a six-digit number that increments sequentially.
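The naming rule can be sketched in a few lines of Python. This is a minimal illustration of the rule as stated above, not GoldenGate code: the function name, the example prefix `AA`, and the assumption that numbering starts at zero are all hypothetical.

```python
def trail_file_name(prefix, sequence):
    """Build a trail file name: two-character prefix plus six-digit sequence."""
    if len(prefix) != 2:
        raise ValueError("trail prefix must be exactly two characters")
    if not 0 <= sequence <= 999999:
        raise ValueError("sequence number must fit in six digits")
    return f"{prefix}{sequence:06d}"


# Assumed starting point of 0; each new file increments the sequence by one.
names = [trail_file_name("AA", n) for n in range(3)]
```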
There are two kinds of Oracle GoldenGate trails:
Local trails. Local trails are transmitted to another NonStop system over Expand and read by Replicat. Local trails can also reside on the source and be used as a data source for Extract.
Remote trails. Remote trails are transmitted to the target over TCP/IP and read by Replicat on the remote target.
Each record in an Oracle GoldenGate trail includes the data, a header with transaction information, an identifier of the record source, and other items. Trails are unstructured for best performance and are collected into transactions, which process in a continuous stream. Transaction identifiers indicate the first and last records in each transaction. Transactions are written in commit order, which guarantees that:
Each record has been committed in the original database
All records in the original transaction are together in the output
Inserts, updates, and deletes are presented, per record key, in the order in which they were applied
To maximize throughput and minimize I/O load, extracted data is written to, and read from, the trail in large blocks. By default, Oracle GoldenGate writes data to the trail in a proprietary format which allows data to be exchanged rapidly and accurately among heterogeneous databases. However, data can also be written in external ASCII, XML, or other formats compatible with different applications.
Replicat reads data from Oracle GoldenGate trails that were created by Extract or Logger. You can run multiple instances of Replicat to read multiple Oracle GoldenGate trails. Replicat supports a high volume of replication activity on the target platform, transferring data in blocks rather than a single record at a time. SQL operations are compiled once and performed many times, and small transactions can be grouped into larger transactions to improve performance.
Oracle GoldenGate is managed by the Manager. Manager is responsible for starting and stopping Extract, Replicat, and their dependent subprocesses, such as Collector. Extract and Replicat checkpoints give Manager information about what resources are required at a particular time. No other Oracle GoldenGate processes will run if Manager is stopped.
Syncfile lets you schedule and manage file duplication when you want to copy files in their entirety. This is a common requirement for maintaining a secondary Oracle GoldenGate instance that may see frequent database changes, but infrequent configuration file changes. However, Syncfile can copy any type of file, database or not, according to a schedule that you set. This makes it suitable for other off-hours, small-scale copying tasks.
To differentiate among multiple Extract or Replicat processes on a system, you define processing groups. A processing group consists of a process (either Extract, Replicat, or Syncfile), its parameter file, its checkpoint file (if applicable), and any other files associated with the process. For example, to replicate two sets of data in parallel, you would create two Replicat groups. You might name one group GGSDEL1 and the other GGSDEL2. Groups are defined by the ADD EXTRACT and ADD REPLICAT commands in GGSCI.
A group name can contain up to eight characters and is not case-sensitive. All files and checkpoints relating to a group share that name. Any time you issue a command to control or view processing, you supply a group name or multiple group names with a wildcard.
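The naming rules in this paragraph can be modeled as follows. This is a hedged sketch only: it uses Python's `fnmatch` as a stand-in for GGSCI wildcard handling, which may differ in detail, and the function names and the group `EXTORD` are hypothetical.

```python
from fnmatch import fnmatchcase


def normalize_group(name):
    """Validate the eight-character limit and fold case, since names are not case-sensitive."""
    if not 1 <= len(name) <= 8:
        raise ValueError("group name must contain one to eight characters")
    return name.upper()


def match_groups(pattern, groups):
    """Return the normalized group names that match a wildcard pattern."""
    folded_pattern = pattern.upper()
    return [g for g in map(normalize_group, groups) if fnmatchcase(g, folded_pattern)]


matched = match_groups("ggsdel*", ["GGSDEL1", "ggsdel2", "EXTORD"])
```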
You can use numbers in group names, but it is best to avoid placing numbers at the end. Oracle GoldenGate appends a numeric value of 0 to 9 to group names to create report file names. In an instance with Replicats REP1 and REP11, for example, a report file will be created for REP1 with the name REP11. This can cause confusion.
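The collision is easy to see in a short sketch. The function below only models the rule stated above (a digit from 0 to 9 appended to the group name); it is a hypothetical helper, not part of Oracle GoldenGate.

```python
def report_file_names(group):
    """Model report file names as the group name followed by a digit 0-9."""
    return [f"{group}{n}" for n in range(10)]


rep1_reports = report_file_names("REP1")
# A report file for group REP1 carries the same name as a group called REP11.
collides = "REP11" in rep1_reports
```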
Checkpoints are used to store the current read and write position of a process. They ensure that data changes marked for synchronization are extracted, and they prevent redundant extracts. Checkpoints provide fault tolerance by preventing the loss of data if the system, the network, or an Oracle GoldenGate process must be restarted. For advanced synchronization configurations, checkpoints enable multiple Extract or Replicat processes to read from the same set of trails.
Checkpoints work with inter-process acknowledgments to prevent messages from being lost in the network. Oracle GoldenGate has a proprietary guaranteed-message delivery technology.
The Extract process checkpoints its position in the data source and in the trail. The Replicat process checkpoints its position in the trail. The checkpoint position is a combination of the sequence number of the trail file and the Relative Byte Address (RBA) of the trail.
The read checkpoint is always synchronized with the write checkpoint. Thus, if Oracle GoldenGate must re-read data that was already sent to the target system (for example, after a process failure), checkpoints enable accurate overwriting of the old data to the point where new transactions start and Oracle GoldenGate can resume processing.
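A checkpoint position, as described above, pairs a trail file sequence number with an RBA. The following sketch shows how such a pair orders records and lets a restarted reader skip what it has already handled. It is a simplified model under stated assumptions: the class and function names are hypothetical, and the checkpoint is assumed to mark the last record that was fully processed.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class TrailPosition:
    sequence: int  # trail file sequence number
    rba: int       # relative byte address within that trail file


def records_to_process(records, checkpoint):
    """Return the payloads of records positioned after the checkpoint."""
    return [data for position, data in records if position > checkpoint]


records = [
    (TrailPosition(0, 100), "a"),
    (TrailPosition(0, 250), "b"),
    (TrailPosition(1, 0), "c"),
]
# After a restart, resume from the saved checkpoint instead of from the beginning.
pending = records_to_process(records, TrailPosition(0, 100))
```

Because the dataclass compares fields in order, a position in a later trail file always sorts after any position in an earlier file, regardless of RBA.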
Parameters manage all Oracle GoldenGate components and utilities, allowing you to customize your data management environment to suit your needs. For example:
The Manager parameter file contains instructions for controlling all other Oracle GoldenGate processes.
The Logger parameter file contains instructions for capturing data from non-TMF applications.
The Extract parameter file contains instructions for selecting, mapping, and transforming TMF data, and sending trails to Replicat.
The Replicat parameter file contains instructions for selecting, mapping, and transforming data to the target.
The Global parameter file contains instructions that can be applied globally to Oracle GoldenGate processing.
A Reader on each node scans the local Oracle GoldenGate trail for distributed transactions. When one is found, the Reader gathers local transaction information and sends it to the Coordinator process.
The Coordinator process receives information from each replicating node on the status of the distributed network transactions that are being processed. The transaction is not committed until the Coordinator has been notified that all of the updates have been received on their destination nodes. If any node has a failure, the changes are not applied.
Oracle GoldenGate for NonStop processes data in a variety of ways, depending on your organization's needs and operating environment. This section introduces the primary ways Oracle GoldenGate captures and delivers data, including:
Initial data synchronization
Capturing data changes from TMF applications
Using Extract for data distribution
Capturing directly from files
Custom event processing
Run initial data synchronization to synchronize the source and target databases. This process can be run while your transaction system is operational because Oracle GoldenGate will not lock data when it captures and delivers records. With Oracle GoldenGate, your options for loading data include:
Extracting data to a file and sending it to Replicat to apply to the target
Using Oracle GoldenGate direct load
Using Oracle GoldenGate direct bulk load when the target is Oracle
You can queue your data in one, or many, Oracle GoldenGate files before loading your target for the first time. This lets you perform initial data synchronization while your transaction system remains online.
Using Oracle GoldenGate direct load lets you extract data directly from source tables and send it in large blocks directly to Replicat, which writes data to its final target. This method is particularly effective for source data that does not require transformation (such as initial data loads).
If you are replicating from NonStop to Oracle, you can use Oracle GoldenGate direct bulk load. Using direct bulk load lets you extract data directly from source tables and send it in a large block to the delivery process. Replicat then communicates directly to SQL*Loader. Using Replicat lets you perform additional data transformation before loading the data. The direct bulk load method is the fastest technique available when using Oracle GoldenGate.
TMF audit trails provide the central resource for retrieving database changes in TMF-enabled applications. Changes to TMF-enabled Enscribe files and SQL tables are recorded in TMF audit trails for transaction integrity and recoverability. The following figure shows the processing flow for TMF-audited applications.
Because Oracle GoldenGate uses these audit trails for extract processing, plan and manage TMF-related activities carefully. The Oracle GoldenGate GGSCI and Manager tools provide optional audit management capabilities.
Extract and Audserv work together to retrieve and process database changes for TMF applications. When started, Extract starts an Audserv process, which returns database changes from TMF audit trails. Audserv reads audit trails from their original location on disk, from a disk or tape dump, or from a user-specified alternative location. Audserv also determines the location of all required audit. Audserv can only return changes to tables or files if the user has read access.
Database changes include insert, update, or delete operations, along with transaction information. Insert and update records are returned as after-images, which show the database record as it exists after the operation completes; delete records are returned as before-images. Before-images for updates can also be returned.
Extract saves each image in memory until an associated transaction commit record is received. If the transaction aborts, the associated records are discarded. Committed records can be sent to one or more user-designated extract files.
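The buffering behavior described above can be sketched as follows. This is a conceptual model only, not the Extract implementation: the tuple layout `(transaction id, record kind, image)` and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def extract_committed(audit_records):
    """Hold images per transaction; emit them on commit, discard them on abort."""
    pending = defaultdict(list)
    output = []
    for txn_id, kind, image in audit_records:
        if kind == "COMMIT":
            # Release all buffered images for this transaction together.
            output.extend(pending.pop(txn_id, []))
        elif kind == "ABORT":
            # Drop everything buffered for the aborted transaction.
            pending.pop(txn_id, None)
        else:
            pending[txn_id].append((kind, image))
    return output


audit = [
    (1, "INSERT", {"id": 10}),
    (2, "UPDATE", {"id": 20}),
    (1, "COMMIT", None),
    (2, "ABORT", None),
]
committed = extract_committed(audit)
```

Only the records of transaction 1 reach the output; the update from the aborted transaction 2 is never written.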
Audserv automatically excludes audit associated with SQL/MP and SQL/MX catalogs (file codes 563, 564, 565, 572, and 585).
In a multi-node environment, a single transaction can update files across different nodes. Oracle GoldenGate includes processes to coordinate these network transactions so that an outage on one of the nodes will not result in a partially updated transaction.
The central process is called the Coordinator. It receives information from each replicating node on the status of the changes being processed. The transaction is not committed until the Coordinator has been notified that all of the updates have been received on their destination nodes. If any node has a failure, the changes are not applied.
Figure 1-5 illustrates the processing flow for a transaction across two nodes. An order from a customer, for example, can add information to the customer account file on node \A and the customer order file on node \B, and these files can each be replicated to backup nodes.
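The Coordinator rule described above (commit only when every destination node has received its updates, and apply nothing if any node fails) can be modeled with a small decision function. This is a simplified sketch, not the Coordinator's actual protocol: the function name and the three return values are hypothetical.

```python
def coordinator_decision(expected, received, failed=()):
    """Decide the fate of a distributed transaction from per-node status."""
    expected, received, failed = set(expected), set(received), set(failed)
    if expected & failed:
        # A participating node failed, so the changes are not applied.
        return "DISCARD"
    if expected <= received:
        # Every destination node has confirmed receipt of its updates.
        return "COMMIT"
    return "WAIT"


nodes = [r"\A", r"\B"]
all_received = coordinator_decision(nodes, [r"\A", r"\B"])
still_waiting = coordinator_decision(nodes, [r"\A"])
node_failed = coordinator_decision(nodes, [r"\A"], failed=[r"\B"])
```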
Many Enscribe applications do not use the NonStop TMF audit facility. Oracle GoldenGate provides an alternative method for capturing non-TMF-audited database changes. Figure 5 displays the processing flow.
The Oracle GoldenGate Software intercept library (GGSLIB) is a group of functions with the same names as Guardian operating system procedures. GGSLIB is bound to Guardian, acting as an interface between application programs and NonStop.
For example, when an application calls a Guardian function such as WRITE, GGSLIB performs it instead of Guardian. The application is unaware of the substitution and performs, from an application programming standpoint, exactly as it did before.
If the function succeeds, GGSLIB sends the data to Logger, which writes it to an Oracle GoldenGate log trail. Log trails are available for Extract and Replicat processing, which perform formatting, distribution, and delivery steps.
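The interception pattern can be illustrated with a wrapper that takes the place of the original write call. This is a conceptual sketch in Python, not how GGSLIB is implemented on NonStop: `fake_guardian_write`, `make_intercepted_write`, and the in-memory `log_trail` list are hypothetical stand-ins.

```python
def make_intercepted_write(guardian_write, logger_queue):
    """Return a write function that forwards successful writes to a logger queue."""
    def write(file_name, record):
        succeeded = guardian_write(file_name, record)
        if succeeded:
            # Only operations that succeeded are passed on for replication.
            logger_queue.append((file_name, record))
        return succeeded
    return write


storage = {}


def fake_guardian_write(file_name, record):
    """Stand-in for the real system procedure: rejects empty records."""
    if not record:
        return False
    storage.setdefault(file_name, []).append(record)
    return True


log_trail = []
write = make_intercepted_write(fake_guardian_write, log_trail)
ok = write("ACCOUNTS", b"balance=100")
rejected = write("ACCOUNTS", b"")
```

The caller sees the same return value it would have received from the original function, which mirrors the point that the application is unaware of the substitution.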
Extract can retrieve data from custom-created files or from trails created by Extract or Logger—in this sense, distributing data. Applications can take advantage of the data movement, formatting, conversion, and other features of Oracle GoldenGate without reading the data directly from TMF audit trails or the database.
When capturing incremental data changes in real time is not appropriate, you can run batch processing. Batch runs process data generated during a specific time frame, defined by a begin and end time. Many Oracle GoldenGate operations, such as record selection, mapping, and field conversions, can be performed during batch processing.
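A batch window of this kind amounts to selecting change records by timestamp. The sketch below assumes a hypothetical record layout with a `changed_at` field and treats the end time as exclusive; the actual boundary handling in Oracle GoldenGate is not stated here.

```python
from datetime import datetime


def select_batch(records, begin, end):
    """Return records whose change time falls in the window [begin, end)."""
    return [r for r in records if begin <= r["changed_at"] < end]


records = [
    {"id": 1, "changed_at": datetime(2024, 1, 1, 8, 0)},
    {"id": 2, "changed_at": datetime(2024, 1, 1, 12, 30)},
    {"id": 3, "changed_at": datetime(2024, 1, 1, 18, 0)},
]
batch = select_batch(records, datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 18, 0))
```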
In some situations, Extract can read directly from a file rather than from the log trail; however, the following conditions must be true:
The file or sequence of files is entry-sequenced.
Only inserts occur against the files—no updates or deletes.
Records are inserted only at the end of the file.
Use this feature when:
The method of logging is non-TMF-enabled.
The files are BASE24
The input files meet the conditions described above.
You want to "trickle" the batch file contents throughout the day, rather than all at once at the end of the day.
Oracle GoldenGate user exits make it possible to incorporate custom processing needs. A common application of user exits is database event triggering. For example, a user exit might contain code to page a supervisor when an account balance falls below a certain threshold. User exits reside outside the mainstream application—you can add, change, or remove them with almost no impact on the application.
GGSCI is the command-line interface that lets you interface with all Oracle GoldenGate components. Throughout this guide, you will see GGSCI commands described; they are your primary tools for configuring, operating, managing, and troubleshooting your data management environment.
The Oracle GoldenGate GGSCI interface supports up to 1024 processing groups, including Extract, Coordinator, Syncfile, and Replicat groups. At the supported level, all groups can be controlled and viewed in full with GGSCI commands, such as the STATUS commands. Beyond that supported level, group information is not displayed and errors can occur. Oracle GoldenGate recommends keeping the combined number of processing groups at 1024 or below in order to manage your environment effectively.
Before you can start GGSCI, you must navigate to the Oracle GoldenGate installation location. When you are in the correct subvolume, enter RUN GGSCI. Your prompt will change to a GGSCI prompt. For example:
TACL> VOLUME $DATA.GGS
TACL> RUN GGSCI
GGSCI>
After you have the GGSCI command prompt, enter GGSCI commands on the command line as needed. With GGSCI commands, you can edit parameter files, add groups, view reports, and communicate with running processes. For more information, see Reference for Oracle GoldenGate on HP NonStop Guardian.