This topic includes the following sections:
You can use Oracle GoldenGate to load data in any of the following ways.
Using database utilities. The utility program performs the initial load. Examples include loading the target using FUP or SQL and using back up and restore.
Loading from a file to a database utility. Extract writes records to an extract file in external ASCII format. The files are used as input to a bulk load utility that writes to target. Replicat creates the run and control files.
Loading from a file to Replicat. Extract writes records to an extract file and Replicat applies them to the target tables.
Using an Oracle GoldenGate direct load. Extract communicates with Replicat directly without using a Collector process or files. Replicat applies the data to the target.
Using direct bulk load. Extract writes records in external ASCII format and delivers them directly to Replicat, which delivers them to the Oracle SQL*Loader bulk load utility. This is the fastest method of loading Oracle data with Oracle GoldenGate.
The method you choose depends, in part, on the database types of your source and target. For example, if you are replicating from a NonStop source to an Oracle target, your choices include direct bulk load, which interacts with SQL*Loader.
Regardless of the method chosen, the initial data synchronization should be run after initiating change capture and before configuring delivery. The steps are:
Configure and run Extract to capture all active database changes to an Oracle GoldenGate trail.
Perform initial load.
Configure and run Replicat.
Parent topic: Configuring Initial Data Synchronization
This example shows how you might follow these three steps to configure change capture and delivery and perform an initial data load.
Parent topic: Initial Data Synchronization
You can configure this Extract process to read change records from a TMF audit trail, log trail, or flat file, and process them to an Oracle GoldenGate trail. This step is necessary before doing the initial load because it extracts and delivers ongoing changes to the source database, preserving data integrity for your business operations.
Instructions for configuring and running Extract can be found in "Planning the Configuration".
Parent topic: Example Steps for Initial Data Load
You can perform your initial load using any of the methods; however this example addresses how to use Oracle GoldenGate to do the initial load by queuing data to an Oracle GoldenGate file that will be picked up by Replicat.
Parent topic: Example Steps for Initial Data Load
When your initial data load finishes writing to its trails, configure Replicat on the target system. This can be the same parameter file you use for ongoing Replicat work, however, you will need to add HANDLECOLLISIONS
and END
parameters, run the batch, then remove those parameters before beginning ongoing change extract.
HANDLECOLLISIONS
and END
in the parameter file. The END
parameter is the time recorded for the completion of the Extract process.START REPLICAT
command.HANDLECOLLISIONS
and END
parameters.Parent topic: Example Steps for Initial Data Load
Using direct load, you can extract data directly from the source tables and send it, in a large block, to Replicat. You may do this on any operating system and database combination Oracle GoldenGate supports (such as NonStop to NonStop, NonStop to Oracle, Oracle to NonStop).
Parent topic: Initial Data Synchronization
Wildcards can be used for the FILE
and TABLE
statements in direct load parameter files, but not for views.
Refer back to the example of an Extract group added with the SOURCEISFILE
parameter in "To run direct load:". If the ACCOUNT
, PRODUCT
and CUSTOMER
files are the only files on $DATA.MASTER
, the Extract parameters could be changed to use wildcards. This use of wildcards is shown in the following direct load parameter file:
EXTRACT INITEXT RMTHOST targethost, MGRPORT 7809 RMTTASK REPLICAT, GROUP INITREP TABLE $DATA.MASTER.*, AUTOTRUNCATE;
Parent topic: Direct Load
If you are loading to an Oracle target, you may choose to use direct bulk load. Direct bulk load is the fastest technique for capturing and delivering data to SQL*Loaders. Extract sends the data, in a large block, to Replicat. Manager dynamically starts Replicat, which communicates directly with SQL*Loader using an API.
Note:
You can only use this direct bulk load from NonStop to Oracle.
Parent topic: Initial Data Synchronization
You can synchronize two NonStop tables or files on different systems over a TCP/IP connection using the trail to database utility method. Use the following steps:
Parent topic: Initial Data Synchronization
Although you can configure multiple initial-load Extracts and Replicats, by default the Replicats will inherit the IP process of the Manager running on the target. This results in a single IP channel that does not spread the load across the processors.
To configure the Extract and Replicat pairs to use different channels, you can use static Replicats as shown in the next example.
Parent topic: Configuring Initial Data Synchronization
NonStop tables and files can be synchronized with Oracle or SQL Server tables in very much the same way as NonStop-to-NonStop synchronization.
Parent topic: Configuring Initial Data Synchronization
To load to Oracle or SQL Server:
Run DEFGEN
to export source data definitions to the target system. Be sure to satisfy any other prerequisites.
Start the Collector on the target system:
For UNIX:
$server –d /ggs/mydb.def 2> server.log &
For Windows:
server –d \ggs\mydb.def 2> server.log
Use the –d option to specify the definitions file created by DEFGEN
(mydb.def
).
Create an Extract parameter file to perform initial table extract over TCP/IP.
Run the Extract program to extract the data into a flat file on the target system:
TACL> RUN GGS.EXTRACT /IN GGSPARM.TRANINI, OUT $S.#TRAN/
This command creates a flat file on the target. If you specified the FORMATASCII, SQLLOADER
in a parameter file for Oracle, Oracle GoldenGate generates the flat file in a format that SQL*Loader can read. If you specified FORMATASCII
, BCP
in the parameter file for SQL Server, Oracle GoldenGate generates a flat file that is compatible with the BCP utility.
Create a Replicat parameter file using the MAP
parameter to map between source and target tables.
Run Replicat to create files called TRANSTAB.ctl
and TRANSTAB.run
for Oracle, and TRANSTAB.bat
and TRANSTAB.fmt
for SQL Server. These files contain mapping definitions and run commands required to load.
For UNIX:
replicat paramfile/ggs/dirprm/tranini.prm
For Windows:
C:\replicat paramfile\ggs\dirprm\tranini
Load the data.
For UNIX:
$TRANSTAB.run
For Windows:
TRANSTAB.bat
Parent topic: Loading Oracle, Microsoft, or Sybase SQL Server Tables
This sections contains these examples:
An Extract parameter file example for Oracle running on UNIX
A Replicat parameter file example for Oracle running on UNIX
An Extract parameter file example for SQL Server running on Windows
A Replicat parameter file example for SQL Server running on Windows
Parent topic: Loading Oracle, Microsoft, or Sybase SQL Server Tables
Following are examples of NonStop to Oracle Extract parameter files.
Extract Parameter File GGSPARM.ORAINI:
SOURCEISFILE, FASTUNLOAD FORMATASCII, SQLLOADER RMTHOST ntbox12, MGRPORT 7809, PARAMS "-d c:\ggs\dirdef\source.def" RMTFILE TRANSTAB.dat, PURGE FILE \SRC.$D2.MYDB.TRANSTAB;
FORMATASCII, SQLLOADER
specifies the data format is compatible with Oracle's SQL*Loader utility.
RMTFILE
identifies TRANSTAB.dat
as the source table.
Replicat Parameter File /ggs/dirprm/tranini.prm:
GENLOADFILES USERID me, PASSWORD orapw SOURCEDEFS /ggs/mydb.def MAP $D2.MYDB.TRANSTAB, TARGET ORATRANS; RMTFILE /ggsdat/tranini, PURGE FILE \SRC.$D2.MYDB.TRANSTAB;
GENLOADFILES
generates load control files and then quits. These control files generate maps, even between dissimilar tables.
USERID
and PASSWORD
specify the database log on.
SOURCEDEFS
specifies the location of the NonStop definitions exported by DEFGEN
(These are required to generate a load map.)
MAP
specifies the source to target relationship of the NonStop to Oracle table.
Errors are displayed to the screen and detailed messages are written to the TRANSTAB.err
and TRANSTAB.log
files
Parent topic: Initial Sync Parameter File Examples
Following are examples of parameter files for SQL Server.
Extract parameter file GGSPARM.SQLINI:
SOURCEISFILE, FASTUNLOAD FORMATASCII, BCP RMTHOST ntbox12, MGRPORT 7809, PARAMS "-d c:\ggs\dirdef\source.def" RMTFILE C:\GGS\TRANSTAB.dat, PURGE TABLE $DATA.MASTER.TRANS
FORMATASCII, BCP
specifies the data format is compatible with the Microsoft BCP
utility.
RMTFILE
identifies TRANSTAB.dat
as the source table. Using the dat
extension makes it compatible with the load functions.
To load data to SQL Server, you must use the BCP
template provided by Oracle GoldenGate. You can call BCP
from your Replicat parameter file or run it interactively from the operating system shell. The template tells Replicat how data is laid out in the SQL Server target.
Replicat parameter file for GGSPARM.TRANINI
GENLOADFILES BCPFMT.TPLTARGETDB TARGETDB MYAPP, USERID MYNAME, PASSWORD MSPW SOURCEDEFS c:\ggs\mydb.def MAP $D2.MYDB.TRANSTAB, TARGET SCHEMA.ORATRANS;
Parent topic: Initial Sync Parameter File Examples
If your parameters meet the requirements, the FILE
parameter options STARTKEY
and ENDKEY
can be used to limit the range of Enscribe records selected for a SOURCEISFILE
initial-load process. This allows you to load subsets of the data for different purposes or to break up the a large initial data load. Refer to Reference for Oracle GoldenGate on HP NonStop Guardian FILE | TABLE
parameter for specifics on the requirements and how to use STARTKEY
and ENDKEY
.
Parent topic: Configuring Initial Data Synchronization
You can restart initial loads using the RESTARTCHECKPOINTS
option of the SOURCEISFILE
or SOURCEISTABLE
parameter if your Extract is added from GGSCI.
You can use RESTARTCHECKPOINTS
for:
SQL/MP source tables with or without the SQLPREDICATE
option
Enscribe whether or not you use the FILE
STARTKEY
and ENDKEY
options
Both SQL/MP and Enscribe with or without FASTUNLOAD
.
Refer to Reference for Oracle GoldenGate on HP NonStop Guardian for additional conditions and restrictions for using the SOURCEISFILE
RESTARTCHECKPOINTS
option.
The messages generated when the SOURCEISFILE
Extract restarts vary based on the type of database and the parameters and options that are used. Some different types of examples are shown next.
A message similar to the following is produced for SQL/MP source tables without FASTUNLOAD
. In this example the option SQLPREDICATE
is being used and WHERE (STATE = "CA")
is the user's predicate. AC_KEY
is the multi-column key for the restart.
Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126 Extract SourceIsFile process is restartable Processing File \NY.$DATA02.ACDAT.ACCT Using this SQL statement to retrieve data: SELECT * FROM \NY.$DATA02.ACDAT.ACCT WHERE (STATE = "CA") AND AC_KEY1, AC_KEY2, AC_KEY3 > 13 ,4781 ,27 BROWSE ACCESS
A message similar to the following is produced for SQL/MP or Enscribe source tables using FASTUNLOAD
. The restart key is RBA 9555968
of partition $DATA03
.
Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126 Extract SourceIsFile process is restartable Processing File \NY.$DATA02.ACDAT.ACCT2 Processing Partition \NY.$DATA03.ACDAT.ACCT2 Positioning Restart at RBA 9555968
A message similar to the following is produced for Enscribe that is not using FASTUNLOAD
and without STARTKEY
. The CUST-KEY
used for the restart is 1234
.
Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126 ExtractSourceIsFile process is restartable Processing File \NY.$DATA02.ACDAT.ALTPART Processing using restart values ( CUST-KEY = 1234 )
A message similar to the following is produced for Enscribe without FASTUNLOAD
and with STARTKEY
. The CUST-KEY
used for the restart is 1234
.
file $data02.acdat.altpart, startkey (CUST-key = 0000), def ens-rec, endkey (CUST-key = 5555); file $data02.acdat.altpart, startkey (CUST-key = 5556), def ens-rec, endkey (CUST-key = 999999); Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126 Extract SourceIsFile process is restartable Processing File \NY.$DATA02.ACDAT.ALTPART Processing using restart values ( CUST-KEY = 1234 ) Finished to EndKey ( CUST-KEY = 5555 ) Processing from StartKey ( CUST-KEY = 5556 ) Finished to EndKey ( CUST-KEY = 999999 )
Parent topic: Configuring Initial Data Synchronization
Use Replicat to load data from a Windows or UNIX system into a NonStop target database. See Getting Started with the Oracle GoldenGate Process Interfaces for details.
Parent topic: Configuring Initial Data Synchronization
When only a subset of source rows or columns are needed in the target, you can use one of the following techniques to integrate selected data into your target:
Selecting on the source with WHERE
or FILTER
Mapping columns on the target with COLMAP
When the data source is a SQL table, you can specify SQL Views. This allows automatic filtering of columns before transmission.
Data transformation (such as six-to-eight digit date conversion) takes a little extra effort during the load process. There are a couple of ways to achieve initial loads in this situation.
The first solution involves extracting the entire table into a flat file. In this case, do not specify FORMATASCII
. Next use Replicat to load the table using the SPECIALRUN
parameter. This method, while slower than native loads, is often sufficient and allows field conversion functions to be used during replication.
The second solution is to perform all data mapping on the NonStop before transmission on the target side. This means that all conversion work is performed by Extract. Using this strategy can result in less network traffic, since filtering can be performed before data reaches the pipe. However, this can also require the creation of a dummy table or DDL definition on the NonStop side that mimics the structure of the real target table.
Parent topic: Configuring Initial Data Synchronization
In addition to extracting and replicating database changes, Extract can forward and distribute changes that have already been extracted. This process is known as data pumping.
Use data pumping in the following scenarios:
A network or target system may be down for an extended time, but extraction or logging activities must occur constantly.
Data extracted by Logger must be forwarded over TCP/IP to non-NonStop systems.
Running Extract for these purposes is nearly identical to capturing data from TMF audit trails. To run Extract in this manner, perform the following tasks.
EXTTRAILSOURCE
or LOGTRAILSOURCE
option, create an initial Extract checkpoint with the GGSCI ADD EXTRACT
command.ADD EXTTRAIL
or ADD RMTTRAIL
command. By adding the trails, you direct Extract where to write the data you need.START EXTRACT
command.Parent topic: Configuring Initial Data Synchronization
Rather than capturing from trails, you can extract directly from a file or a sequence of files. You can read a file directly only when the following conditions are true:
Only inserts occur against the files (no updates).
Records are inserted only at the end of the file.
Use this feature when:
To extract directly from a file:
Enter a GGSCI ADD EXTRACT
command, specifying the FILETYPE
parameter. FILETYPE
indicates the type of file from which you are reading.
If more than one file in a sequence might be open at a time, start Extract for each file in use simultaneously. Enter an ALTINPUT
parameter in each process's parameter file with a RANGE
option to distribute the files among the processes. For further details, see Controlling Extract and Replicat.
Parent topic: Configuring Initial Data Synchronization
You can configure Extract and Replicat to run in batch when capturing and delivering incremental changes is not appropriate for the application. You can configure ongoing batch runs for a specific time daily, or special, one-time batch runs.
Parent topic: Configuring Initial Data Synchronization
You can run Extract against a specified period of audit trail or Oracle GoldenGate trail data a single time. Do this, for example, to extract changes to a particular account in a database over the course of a day.
To extract changes for a specific period, perform the following steps.
Parent topic: Batch Processing
When you are extracting batch files using RMTBATCH
, you may need to perform the following steps:
SYSKEYCONVERT
parameter in the Extract parameter file if the input record's length is variable. This specifies the format of the SYSKEY
in the output.POSITIONFIRSTRECORD
parameter to reread an input file when you have used SYSKEYCONVERT
. POSITIONFIRSTRECORD
resets Extract to read from the beginning of the input file.Use ALTINPUT
for direct file extraction. With ACI files, multiple files can be in use at one time. For example, processing can continue for Monday's file after midnight, while Tuesday's file is opened for new data. To handle a multiple file situation, run more than one Extract process for the file sequence. Use the ALTINPUT RANGE
option to distribute the files across the processes so that Extract never processes two files in sequence. You can also use ALTINPUT
to specify the access mode of the file open, and to move Extract to the next sequential file if an application has a file open that it is not updating.
By default, Extract looks for the next alphabetical file name. The file name must conform to a template for the file type, which defaults to predefined characteristics. You can also specify the template by parameter.
If the file type is ACITLF
or ACIPTLF
, the template is in the form $VOL.SUBVOL.XXYYMMDD
, where XX
is a two character prefix, followed by a six digit date.
If the file type is ACITLFX
or ACIPTLFX
, the template is in the form $VOL.SUBVOL.XMMDDNNN
, where X
is a one character prefix, followed by a month, day and three digit sequence number.
When specifying any of the ACI file types in the FILETYPE
option, do not include the date or sequence number. The next file is the one following the current file in name order, and must also satisfy any RANGE
criteria in the ALTINPUT
parameter.
If the file type is ENTRY
, you specify the template in the ALTINPUT
parameter TEMPLATE
option. NonStop wildcards are acceptable. For example, the template $DATA*.MYDAT.FL*
processes files starting with FL
residing on different $Data
volumes.
When using FILETYPE
ENTRY
, specify the first file to process, not the file prefix. By default, the next file is the next file name to fit the template. As an alternative, you can use FILETYPE
USENEXTMODIFIED
. This option selects the next file modified after the current file that also fits the template.
Parent topic: Trickle Batch Processing
Before moving to the next file in a sequence, Extract must process the entire contents of the current file. By default, Extract uses the following rules to determine that the current file has been exhausted and the next file is ready for processing.
End-of-file was reached in current file at least five seconds earlier, and no new data has appeared since.
No processes have the current file open for write access.
The next file exists and has at least one record in it.
The next file was modified after the current file.
You can modify these rules with the NOWAITNEXTMODIFIED
, WAITNEXTRBA
, and OPENTIMEOUT
options for the ALTINPUT
parameter.
Parent topic: Trickle Batch Processing