Change data capture (CDC) is an integration technology that enables efficient near real-time integration by capturing changes made to enterprise data sources and providing them for processing in integration middleware, such as Oracle Fusion Middleware. The CDC Adapter can be used by Oracle SOA Suite (BPEL Process Manager, Oracle Enterprise Service Bus) and Oracle Data Integrator (ODI).
The Oracle CDC Adapter for DB2 captures changes that are written to DB2 log and archive files.
The DB2 CDC adapter uses the DSNJU004 module internally to find an archive or log file that contains the first LRBA corresponding to the provided time stamp, based on the provided bootstrap data set.
The DB2 CDC adapter uses the IFI interface to capture the records from the log. The monitoring process is started using the IFI command,
-STA TRA(MON) CLASS(1) IFCID(306).
The DB2 agent does not support DB2 Data Sharing (also known as DB2plex).
The DB2 agent for mainframe systems does not support DB2 instances that use the coupling facility.
This chapter provides an overview of the features and architecture of the OracleAS CDC Adapter for DB2. It includes the following topics:
The OracleAS CDC Adapter for DB2 captures and delivers changes (such as insert, update, and delete operations) made to data in DB2 data sources. By using the CDC Adapter, you can build efficient data integration solutions that use DB2 data, synchronize systems to ensure data integrity, update data warehouses and operational data stores in near real-time, and enable event-driven business activity monitoring and processing.
The OracleAS CDC Adapter for DB2 has the following capabilities:
Non-intrusive change capture: The OracleAS CDC Adapter for DB2 is a log-based CDC technology. The CDC agent accesses a logstream that is polled to access the changes. The agent begins polling from a specific point called the stream position (or agent context). The last position read is saved as the stream position. This next time the agent polls the logstream for information, it starts at the saved stream position.
Real-time data capture: The OracleAS CDC Adapter for DB2 captures changes to data in near-real-time for applications that demand zero latency and require the most up-to-date data. Unlike scheduled CDC solutions, near real-time data capture guarantees that a change event is can be made available in a matter of seconds at the consumer. The term near-real-time data capture is used because after a period of inactivity changes may be available after a few seconds.
Change storage: The Oracle CDC Adapter for DB2 provides a staging area that stores the captured changes in a format that is easy to access and process.The staging area ensures that changes are persisted until they are processed, and provides facilities to clean up old change records.
Change access and delivery: The Oracle CDC Adapter for DB2 includes a JCA Resource Adapter that can be configured in BPEL and ESB as an inbound endpoint, delivering changes in XML format.
Filtering: The OracleAS CDC Adapter for DB2 enables you to filter captured data based on the type of operation (INSERT, UPDATE, DELETE), based on column selection and even based on specific column values.
Reliable delivery and recovery: The Oracle CDC Adapter for DB2 stores changes in a reliable persistent storage and uses the concept of a 'stream position' that indicates the location in the change stream that was successfully moved and processed. By storing stream positions, the CDC Adapter can always restart after failure, recover, and continue moving and delivering changes from wherever it was stopped
Ease of use: The OracleAS CDC Adapter for DB2 provides an easy and intuitive way to confugure CDC solutions using Oracle Studio.
Using the OracleAS CDC Adapter for DB2 makes ETL (extract, transform, and load) processes more efficient and less disruptive to the operation of the source database systems, eliminating the need for the traditional ETL downtime window.
The following figure illustrates the system components used for change data capture using the OracleAS CDC Adapter for DB2.
The OracleAS CDC Adapter for DB2 component architecture includes the following components:
Database Platform: The database platform is the data source that contains the data to be captured.
Database Log: The database log is a log stream that contains the raw change data.
Oracle Connect: Oracle Connect runs on the back-end system and handles the special J2CA adapter that runs on the Oracle Application Server.
CDC agent: This is an OracleAS CDC Adapter for DB2 component that runs under the control of the daemon (a listener process) on the computer where the back-end data is located. The capture agent provides access to change events in the transaction log for the change router.
The agent is located on the same computer as the changes to be captured. It maintains the last position read in the journal (the stream position or context) and starts at this point the next time it polls the journal. The context is stored in the repository where the adapter definition is stored. The adapter definition includes a single primary interaction, which is used to access the appropriate journal and includes the list of tables to monitor for changes.
BPEL/ESB: This is where the change events are consumed by a JCA provider that sends change records to an end point configured in either ESB or BPEL.
JC2A Resource Adapter: This is a special version of the J2CA 1.5 DB2 adapter that can read change records from the staging area. It implements inbound endpoints for BPEL and ESB that pulls change events from the staging area and deliver them as XML change documents. Each endpoint handles change records for a single captured table at the source database.
The J2CA adapter uses multiple connections to the staging area to process changes from multiple tables in parallel.
JDBC Driver: This enables access to changes in the staging area using SQL queries.
WebLogic / BPEL Components: Any OracleAS component that hosts and gets messages from the JC2A adapter.
Staging Area: This is a set of change tables where change records are stored before being sent to BPEL/ESB. Old records from the Staging Area are deleted based upon the specified aging policy.
The Staging Area contains a DISAM data source for SQL access and a Database Events adapter for the XML access. For more information, see The Staging Area for more information.
The staging area includes the following components:
Change Router: This is an OracleAS CDC Adapter for DB2 service that runs under the control of the daemon (a listener process) on the Oracle platform. The change router reads the change stream from the CDC agent, sorts them in transaction/commit order eliminating rolled-back changes, and distrubutes the changes into the various change tables.
Only one instance of the change router is active at a time against a captured database. The change router also deletes old changes from the change tables according to a configurable aging policy. policy (48 hours is the default, which means that records that have been in the staging area for more than 48 hours are deleted).
Change Tables: These are tables in the staging area that contain changes to one captured table in the source database. These change tables maintain change events in transaction order and in occurrence order within a transaction for each captured table. The change tables are DISAM files that are indexed to support fast access to the change data, that can be accessed directly using the stream position.
Change Access Service. This is an instance Oracle Connect that is installed on a separate computer or the same computer as BPEL/ESB that allows access to the change tables. This instance of Oracle Connect is configure as the staging area.
Oracle Studio: Provides the developer or administrator with a graphical user interface for setting up the CDC Agent and Staging Area,using intuitive wizards. Oracle Studio can be installed on Linux, UNIX, or Windows.
The staging area is an area used by Oracle Connect to store captured data from a journal. The journal is scanned once and changes for every required table read during that scan are passed to the staging area, where they are stored. Therefore, the journal is scanned once each time it is polled. After the changes are written to the staging area, processing of these changes is carried out independently of the journal.
The staging area is a benefit when transactions are used. The changed data is not written to the change queue until the transaction is committed. Thus, if the transaction fails, there is no overhead of having to back out any processing done with the steps in the failed transaction.
The staging area can be on any computer with a Windows, Linux, or UNIX operating system that is running the Oracle Application Server and does not need to be run on the same server as the DB2 CDC adapter. Once the information is extracted from the journal and written to the staging area, processing of changes is carried out on the staging area only. Thus, the staging area should be set up to consider the network configuration and where the consumer application runs.
The staging area maintains the last position read by the consumer application (the staging area context) and starts at this point the next time a request from the consumer application is received. The context is stored in the repository where the staging area is maintained.
The staging area is indexed so that access to the staging area for a specific stream is quick.
The staging area is automatically cleared every 48 hours. All events that have been in the staging area for more than 48 hours are deleted.