3 Configuring Initial Data Synchronization

Before running Oracle GoldenGate for NonStop for the first time, synchronize the source and target databases. This chapter addresses initial data synchronization when the source is TMF-enabled.

Initial Data Synchronization

You can use Oracle GoldenGate to load data in any of the following ways.

  • Using database utilities. The utility program performs the initial load. Examples include loading the target using FUP or SQL, and using backup and restore.

  • Loading from a file to a database utility. Extract writes records to an extract file in external ASCII format. The files are used as input to a bulk load utility that writes to the target. Replicat creates the run and control files.

  • Loading from a file to Replicat. Extract writes records to an extract file and Replicat applies them to the target tables.

  • Using an Oracle GoldenGate direct load. Extract communicates with Replicat directly without using a Collector process or files. Replicat applies the data to the target.

  • Using direct bulk load. Extract writes records in external ASCII format and delivers them directly to Replicat, which delivers them to the Oracle SQL*Loader bulk load utility. This is the fastest method of loading Oracle data with Oracle GoldenGate.

The method you choose depends, in part, on the database types of your source and target. For example, if you are replicating from a NonStop source to an Oracle target, your choices include direct bulk load, which interacts with SQL*Loader.

Regardless of the method chosen, the initial data synchronization should be run after initiating change capture and before configuring delivery. The steps are:

  1. Configure and run Extract to capture all active database changes to an Oracle GoldenGate trail.

  2. Perform initial load.

  3. Configure and run Replicat.
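
These steps might be sketched as the following GGSCI command sequence. The group and trail names (EXTCHG, REPINI, $DATA.GGSDAT.ET) are placeholders rather than names used elsewhere in this guide; see "Planning the Configuration" for the exact ADD EXTRACT options your installation requires.

GGSCI> ADD EXTRACT EXTCHG, BEGIN NOW
GGSCI> ADD EXTTRAIL $DATA.GGSDAT.ET, EXTRACT EXTCHG
GGSCI> START EXTRACT EXTCHG
GGSCI> ADD REPLICAT REPINI, EXTTRAIL $DATA.GGSDAT.ET

Perform the initial load between starting Extract and starting Replicat; once the load completes, start Replicat with START REPLICAT REPINI.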

Example Steps for Initial Data Load

This example shows how you might follow these three steps to configure change capture and delivery and perform an initial data load.

Configure and Run Extract

You can configure this Extract process to read change records from a TMF audit trail, log trail, or flat file, and process them to an Oracle GoldenGate trail. This step is necessary before doing the initial load because it extracts and delivers ongoing changes to the source database, preserving data integrity for your business operations.

Instructions for configuring and running Extract can be found in "Planning the Configuration".

Perform Initial Load Using the File to Replicat Method

You can perform your initial load using any of these methods; however, this example addresses how to use Oracle GoldenGate to do the initial load by queuing data to an Oracle GoldenGate file that is then picked up by Replicat.

  1. Create one Extract parameter file to read directly from each source database.
  2. Use the NonStop text editor to set up an Extract parameter file to include the following information.
    • The SOURCEISFILE parameter to indicate that data should be retrieved directly from the table.

    • The format for the target file, usually FORMATASCII (for example, FORMATASCII, SQLLOADER or FORMATASCII, BCP).

    • If you are transmitting data to a system other than NonStop, or to a NonStop system over TCP/IP, include the name of the remote TCP/IP host and port number of the remote Collector.

    • The name of the local output file (EXTFILE) or the remote file (RMTFILE) to which the Extract program writes extract information. If you need to write to a series of trails, add MAXFILES 2 to the remote trail's parameter file. MAXFILES appends a six-digit sequence number to the remote trail's file name.

    • The name of the file or table to extract (FILE or TABLE parameter).

    • Other optional parameters, including clauses for selecting records, column mapping, or data conversion.

  3. Configure Replicat as a batch task, specifying the SPECIALRUN and BEGIN and END parameters.
  4. Start the initial data load:
    TACL> RUN EXTRACT /IN parameter_file/
    TACL> RUN REPLICAT /IN parameter_file/
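
When all of these pieces are put together, the two parameter files might look like the following sketch. The file names, host address, and time window are placeholder assumptions, and because Replicat (rather than a bulk load utility) reads the extract file in this method, the default Oracle GoldenGate record format is assumed instead of FORMATASCII.

Extract parameter file (a hypothetical GGSPARM.INITEXT):

SOURCEISFILE
RMTHOST 192.0.2.12, PORT 7829
RMTFILE \TGT.$DATA.GGSDAT.INITDAT, PURGE
FILE $DATA.MASTER.ACCOUNT;

Replicat parameter file (a hypothetical GGSPARM.INITREP):

SPECIALRUN
BEGIN 2024-01-01 00:00
END 2024-01-01 23:59
EXTFILE $DATA.GGSDAT.INITDAT
MAP $DATA.MASTER.ACCOUNT, TARGET $DATA3.MASTER.ACCOUNT;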
    

Configure and Run Replicat

When your initial data load finishes writing to its trails, configure Replicat on the target system. This can be the same parameter file you use for ongoing Replicat work; however, you must add the HANDLECOLLISIONS and END parameters, run the batch, and then remove those parameters before beginning ongoing change delivery.

  1. Configure Replicat on the target system, including HANDLECOLLISIONS and END in the parameter file. The END parameter is the time recorded for the completion of the Extract process.
  2. Start Replicat with the START REPLICAT command.
  3. When Replicat stops, remove the HANDLECOLLISIONS and END parameters.
  4. Start Replicat for incremental data synchronization.
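
For example, the interim parameter file might differ from the ongoing one only in lines like these (the group name, time, and table names are placeholders):

REPLICAT REPINI
HANDLECOLLISIONS
END 2024-01-01 17:30
MAP $DATA.MASTER.ACCOUNT, TARGET $DATA3.MASTER.ACCOUNT;

Here the END value is the completion time reported by the initial-load Extract.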

Direct Load

Using direct load, you can extract data directly from the source tables and send it, in a large block, to Replicat. You may do this on any operating system and database combination Oracle GoldenGate supports (such as NonStop to NonStop, NonStop to Oracle, Oracle to NonStop).

To run direct load:

  1. Define an Extract group:
    GGSCI> ADD EXTRACT group_name, SOURCEISFILE
    
  2. Define a Replicat group:
    GGSCI> ADD REPLICAT group_name, SPECIALRUN
    

    Replicat is automatically started by Manager, at Extract's request.

  3. Create the parameter files.

    For the Extract parameter file:

    EXTRACT INITEXT
    RMTHOST targethost, MGRPORT 7809
    RMTTASK REPLICAT, GROUP INITREP
    TABLE $DATA.MASTER.ACCOUNT, AUTOTRUNCATE;
    TABLE $DATA.MASTER.PRODUCT, AUTOTRUNCATE;
    TABLE $DATA.MASTER.CUSTOMER, AUTOTRUNCATE;
    
    • AUTOTRUNCATE sends a PURGEDATA command to Replicat before any data is processed. This ensures the target is clean and ready to receive data.

      Note:

      Use AUTOTRUNCATE with extreme caution, as it causes all existing data to be purged from the target file. Refer to Reference for Oracle GoldenGate on HP NonStop Guardian for more information.

    • RMTHOST establishes the remote TCP/IP host and port number.

    • RMTTASK instructs Manager on the target system to start Replicat with the specified GROUP name.

    • The TABLE parameters identify the source tables.

    • Specify SOURCEISFILE options as needed. The following options are available with SOURCEISFILE:

      • SELECTVIEW: Selects data from a specified SQL view in the FILE parameter. Without SELECTVIEW, Extract selects data from the base table of the view, then maps the base table columns to the view columns. (This also occurs when a view is specified while processing audit trails.)

      • FASTUNLOAD: Processes the file or table several times faster than the default method. Records are written in random order, rather than primary key order. FASTUNLOAD has no effect when a SQL view is specified. The FILE parameter option PARTITIONS can restrict the data retrieved to a certain subset of the file or table.

      • FASTUNLOADSHARED: Allows a shared open of the source file or table. Use this only on files that are not receiving updates at the same time data is being extracted.

    For the Replicat parameter file:

    REPLICAT INITREP
    USERID GoldenUser, PASSWORD pass
    SOURCEDEFS $DATA.DIRDEF.SRCDEF
    MAP $DATA.MASTER.ACCOUNT, TARGET $DATA3.MASTER.ACCOUNT;
    MAP $DATA.MASTER.PRODUCT, TARGET $DATA3.MASTER.PRODUCT;
    MAP $DATA.MASTER.CUSTOMER, TARGET $DATA3.MASTER.CUSTOMER;

    In the above example:

    • USERID and PASSWORD are required to access the target database.

    • SOURCEDEFS identifies the file containing the source data definitions.

    • The MAP parameters map the source tables to the target tables, based on the data definitions in SOURCEDEFS.

  4. Start Extract:
    GGSCI> START EXTRACT INITEXT
    

Using Wildcards

Wildcards can be used for the FILE and TABLE statements in direct load parameter files, but not for views.

Refer back to the example of an Extract group added with the SOURCEISFILE parameter in "To run direct load:". If the ACCOUNT, PRODUCT and CUSTOMER files are the only files on $DATA.MASTER, the Extract parameters could be changed to use wildcards. This use of wildcards is shown in the following direct load parameter file:

EXTRACT INITEXT
RMTHOST targethost, MGRPORT 7809
RMTTASK REPLICAT, GROUP INITREP
TABLE $DATA.MASTER.*, AUTOTRUNCATE;

Direct Bulk Load

If you are loading to an Oracle target, you may choose to use direct bulk load. Direct bulk load is the fastest technique for capturing and delivering data to SQL*Loader. Extract sends the data, in a large block, to Replicat. Manager dynamically starts Replicat, which communicates directly with SQL*Loader using an API.

Note:

You can use direct bulk load only from NonStop to Oracle.

To run direct bulk load:

  1. Define an Extract group:
    GGSCI> ADD EXTRACT group_name, SOURCEISFILE
    
  2. Define a Replicat group:
    GGSCI> ADD REPLICAT group_name, SPECIALRUN
    

    Replicat is automatically started by Manager, at Extract's request.

  3. Create the Extract and Replicat parameter files.

    Following are sample direct bulk load parameter files.

    Sample Extract parameter file:

    EXTRACT INITEXT
    RMTHOST targethost, MGRPORT 7809
    RMTTASK REPLICAT, GROUP INITREP
    TABLE $DATA.MASTER.ACCOUNT;
    TABLE $DATA.MASTER.PRODUCT;
    TABLE $DATA.MASTER.CUSTOMER;
    
    • RMTHOST establishes the remote TCP/IP host and port number.

    • RMTTASK instructs Manager on the target system to start Replicat with the specified group name.

    • The TABLE parameters identify the source tables.

    Sample Replicat parameter file:

    REPLICAT INITREP
    USERID GoldenUser, PASSWORD pass
    BULKLOAD
    SOURCEDEFS /GGS/DIRDEF/SRCDEF
    MAP $DATA.MASTER.ACCOUNT, TARGET master.account;
    MAP $DATA.MASTER.PRODUCT, TARGET master.product;
    MAP $DATA.MASTER.CUSTOMER, TARGET master.customer;
    
    • USERID and PASSWORD are required to access the target database.

    • BULKLOAD tells Replicat that SQL*Loader will load the target tables.

    • SOURCEDEFS identifies the file containing the source data definitions.

    • The MAP parameters map the source tables to the target tables, based on the data definitions in SOURCEDEFS.

  4. Start Extract:
    GGSCI> START EXTRACT group_name
    

Synchronizing NonStop Databases Using Database Utilities Through TCP/IP

You can synchronize two NonStop tables or files on different systems over a TCP/IP connection using the trail to database utility method. Use the following steps:

  1. Start the Collector on the target system.
  2. Create a parameter file to perform initial file extraction over TCP/IP.

    Sample Extract parameter file:

    SOURCEISFILE, FASTUNLOAD
    FORMATLOAD
    RMTHOST 192.0.2.12, PORT 7829
    RMTFILE $D3.INIDAT.TRANSTAB, PURGE
    FILE \SRC.$D2.MYDB.TRANSTAB;
    

    For this example parameter file, named GGSPARM.TRANINI, you are identifying the Collector on the remote system.

    • SOURCEISFILE, FASTUNLOAD directs Extract to retrieve data directly from the blocks of the table.

    • FORMATLOAD directs Extract to format the data compatible with FUP or SQLCI LOAD.

    • RMTHOST identifies the IP address and port of the Collector process. When NonStop is the receiver, a separate Collector is required for each simultaneously active session.

    • RMTFILE identifies a flat file that holds the extracted data until the load is finished.

    • FILE identifies the source file or table from which to extract the data.

  3. Run Extract to extract the data into a flat file on the target system.
  4. Use FUP or SQLCI to insert the data into the target system, similar to:
    SQLCI> LOAD $D3.INIDAT.TRANSTAB, $D4.MYDB.TRANSTAB, RECIN 236, RECOUT 236;
    

    The figures for RECIN and RECOUT are derived from Extract's report in $S.#TRAN, which includes the physical length of the records in the target. For Enscribe, this is the same as the record length returned by FUP INFO. For SQL, the size varies and can be obtained from the RECORDSIZE column of the FILE table in the source table's catalog.

Controlling the IP Process for Replicat

Although you can configure multiple initial-load Extracts and Replicats, by default the Replicats will inherit the IP process of the Manager running on the target. This results in a single IP channel that does not spread the load across the processors.

To configure the Extract and Replicat pairs to use different channels, you can use static Replicats as shown in the next example.

  1. Configure multiple Extracts to use the PORT parameter, rather than MGRPORT. Assign a different port to each.
    EXTRACT extl1
    RMTHOST 192.0.2.1, PORT 12345
    RMTTASK REPLICAT, GROUP repl1
    EXTRACT extl2
    RMTHOST 192.0.2.2, PORT 12346
    RMTTASK REPLICAT, GROUP repl2
    
  2. Start static Replicats using the run-time parameter INITIALDATALOAD with the -p option to assign the port from the parameter file.
    TACL> ASSIGN STDERR, $0
    TACL> ADD DEFINE =TCPIP^PROCESS^NAME, FILES $ZTC1
    TACL> RUN REPLICAT/IN GGSPARM.REPL1,OUT GGSRPT.REPL1/INITIALDATALOAD -p 12345
    TACL> ASSIGN STDERR, $0
    TACL> ADD DEFINE =TCPIP^PROCESS^NAME, FILES $ZTC2
    TACL> RUN REPLICAT/IN GGSPARM.REPL2,OUT GGSRPT.REPL2/INITIALDATALOAD -p 12346
    

    Note:

    Since the Replicats are started statically, they will not be restarted by Manager if there is a system problem.

Loading Oracle, Microsoft, or Sybase SQL Server Tables

NonStop tables and files can be synchronized with Oracle or SQL Server tables in very much the same way as NonStop-to-NonStop synchronization.

Loading to Oracle or SQL Server

To load to Oracle or SQL Server:

  1. Run DEFGEN to export source data definitions to the target system. Be sure to satisfy any other prerequisites.

  2. Start the Collector on the target system:

    For UNIX:

    $server -d /ggs/mydb.def 2> server.log &
    

    For Windows:

    server -d \ggs\mydb.def 2> server.log
    

    Use the -d option to specify the definitions file created by DEFGEN (mydb.def).

  3. Create an Extract parameter file to perform initial table extract over TCP/IP.

  4. Run the Extract program to extract the data into a flat file on the target system:

    TACL> RUN GGS.EXTRACT /IN GGSPARM.TRANINI, OUT $S.#TRAN/
    

    This command creates a flat file on the target. If you specified the FORMATASCII, SQLLOADER in a parameter file for Oracle, Oracle GoldenGate generates the flat file in a format that SQL*Loader can read. If you specified FORMATASCII, BCP in the parameter file for SQL Server, Oracle GoldenGate generates a flat file that is compatible with the BCP utility.

  5. Create a Replicat parameter file using the MAP parameter to map between source and target tables.

  6. Run Replicat to create files called TRANSTAB.ctl and TRANSTAB.run for Oracle, and TRANSTAB.bat and TRANSTAB.fmt for SQL Server. These files contain mapping definitions and run commands required to load.

    For UNIX:

    replicat paramfile /ggs/dirprm/tranini.prm
    

    For Windows:

    replicat paramfile \ggs\dirprm\tranini
    
  7. Load the data.

    For UNIX:

    $TRANSTAB.run
    

    For Windows:

    TRANSTAB.bat
    

Initial Sync Parameter File Examples

This section contains these examples:

  • An Extract parameter file example for Oracle running on UNIX

  • A Replicat parameter file example for Oracle running on UNIX

  • An Extract parameter file example for SQL Server running on Windows

  • A Replicat parameter file example for SQL Server running on Windows

Sample NonStop to Oracle Parameter Files

Following are examples of NonStop to Oracle parameter files.

Extract Parameter File GGSPARM.ORAINI:

SOURCEISFILE, FASTUNLOAD
FORMATASCII, SQLLOADER
RMTHOST ntbox12, MGRPORT 7809, PARAMS "-d c:\ggs\dirdef\source.def"
RMTFILE TRANSTAB.dat, PURGE
FILE \SRC.$D2.MYDB.TRANSTAB;
  • FORMATASCII, SQLLOADER specifies the data format is compatible with Oracle's SQL*Loader utility.

  • RMTFILE identifies TRANSTAB.dat as the remote file where the extracted data is written.

Replicat Parameter File /ggs/dirprm/tranini.prm:

GENLOADFILES
USERID me, PASSWORD orapw
SOURCEDEFS /ggs/mydb.def
MAP $D2.MYDB.TRANSTAB, TARGET ORATRANS;
  • GENLOADFILES generates load control files and then quits. These control files generate maps, even between dissimilar tables.

  • USERID and PASSWORD specify the database log on.

  • SOURCEDEFS specifies the location of the NonStop definitions exported by DEFGEN (These are required to generate a load map.)

  • MAP specifies the source to target relationship of the NonStop to Oracle table.

  • Errors are displayed to the screen, and detailed messages are written to the TRANSTAB.err and TRANSTAB.log files.

Sample SQL Server Parameter Files

Following are examples of parameter files for SQL Server.

Extract parameter file GGSPARM.SQLINI:

SOURCEISFILE, FASTUNLOAD
FORMATASCII, BCP
RMTHOST ntbox12, MGRPORT 7809, PARAMS "-d c:\ggs\dirdef\source.def"
RMTFILE C:\GGS\TRANSTAB.dat, PURGE
TABLE $DATA.MASTER.TRANS;
  • FORMATASCII, BCP specifies the data format is compatible with the Microsoft BCP utility.

  • RMTFILE identifies TRANSTAB.dat as the remote file where the extracted data is written. Using the .dat extension makes it compatible with the load functions.

To load data to SQL Server, you must use the BCP template provided by Oracle GoldenGate. You can call BCP from your Replicat parameter file or run it interactively from the operating system shell. The template tells Replicat how data is laid out in the SQL Server target.

Replicat parameter file GGSPARM.TRANINI:

GENLOADFILES BCPFMT.TPL
TARGETDB MYAPP, USERID MYNAME, PASSWORD MSPW
SOURCEDEFS c:\ggs\mydb.def
MAP $D2.MYDB.TRANSTAB, TARGET SCHEMA.ORATRANS;

Limiting the Enscribe Source Key Range for Initial Load

If your parameters meet the requirements, the FILE parameter options STARTKEY and ENDKEY can be used to limit the range of Enscribe records selected for a SOURCEISFILE initial-load process. This allows you to load subsets of the data for different purposes or to break up a large initial data load. Refer to the FILE | TABLE parameter in Reference for Oracle GoldenGate on HP NonStop Guardian for specifics on the requirements and how to use STARTKEY and ENDKEY.
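
For instance, a large Enscribe file could be split between two initial-load processes with FILE entries like the following sketch; the file name, DDL definition, and key values are hypothetical:

file $data.master.acct, startkey (acct-key = 000000), def acct-rec,
                        endkey   (acct-key = 499999);
file $data.master.acct, startkey (acct-key = 500000), def acct-rec,
                        endkey   (acct-key = 999999);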

Restarting an Initial Load

You can restart initial loads using the RESTARTCHECKPOINTS option of the SOURCEISFILE or SOURCEISTABLE parameter if your Extract is added from GGSCI.

You can use RESTARTCHECKPOINTS for:

  • SQL/MP source tables with or without the SQLPREDICATE option

  • Enscribe whether or not you use the FILE STARTKEY and ENDKEY options

  • Both SQL/MP and Enscribe with or without FASTUNLOAD.

Refer to Reference for Oracle GoldenGate on HP NonStop Guardian for additional conditions and restrictions for using the SOURCEISFILE RESTARTCHECKPOINTS option.

The messages generated when the SOURCEISFILE Extract restarts vary based on the type of database and the parameters and options used. Several examples are shown next.

Example 1   SQL/MP tables produced without using FASTUNLOAD

A message similar to the following is produced for SQL/MP source tables without FASTUNLOAD. In this example the option SQLPREDICATE is being used and WHERE (STATE = "CA") is the user's predicate. AC_KEY is the multi-column key for the restart.

Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126
Extract SourceIsFile process is restartable
Processing File \NY.$DATA02.ACDAT.ACCT
Using this SQL statement to retrieve data:
SELECT * FROM \NY.$DATA02.ACDAT.ACCT WHERE (STATE = "CA") AND AC_KEY1, AC_KEY2, AC_KEY3 > 13, 4781, 27 BROWSE ACCESS
Example 2   SQL/MP or Enscribe tables produced using FASTUNLOAD

A message similar to the following is produced for SQL/MP or Enscribe source tables using FASTUNLOAD. The restart key is RBA 9555968 of partition $DATA03.

Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126
Extract SourceIsFile process is restartable
Processing File \NY.$DATA02.ACDAT.ACCT2
Processing Partition \NY.$DATA03.ACDAT.ACCT2
Positioning Restart at RBA 9555968 
Example 3   Enscribe tables produced without FASTUNLOAD or STARTKEY

A message similar to the following is produced for Enscribe without FASTUNLOAD and without STARTKEY. The CUST-KEY used for the restart is 1234.

Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126
Extract SourceIsFile process is restartable
Processing File \NY.$DATA02.ACDAT.ALTPART
Processing using restart values ( CUST-KEY = 1234 )
Example 4   Enscribe tables produced without FASTUNLOAD and with a STARTKEY

A message similar to the following is produced for Enscribe without FASTUNLOAD and with STARTKEY. The CUST-KEY used for the restart is 1234.

file $data02.acdat.altpart, startkey (CUST-key = 0000), def ens-rec,
                          endkey   (CUST-key = 5555);
file $data02.acdat.altpart, startkey (CUST-key = 5556), def ens-rec,
                          endkey   (CUST-key = 999999);
Output extract file \NY.$DATA02.ACDAT.PA000009 Write Position: RBA 19126
Extract SourceIsFile process is restartable
Processing File \NY.$DATA02.ACDAT.ALTPART
Processing using restart values ( CUST-KEY = 1234 )
Finished to EndKey ( CUST-KEY = 5555 )
Processing from StartKey ( CUST-KEY = 5556 )
Finished to EndKey ( CUST-KEY = 999999 )

Loading Initial Data from Windows and UNIX

Use Replicat to load data from a Windows or UNIX system into a NonStop target database. See Getting Started with the Oracle GoldenGate Process Interfaces for details.

Integrating Source and Target Data

When only a subset of source rows or columns is needed in the target, you can use one of the following techniques to integrate selected data into your target:

  • Selecting on the source with WHERE or FILTER

  • Mapping columns on the target with COLMAP

When the data source is a SQL table, you can specify SQL views. This allows automatic filtering of columns before transmission.
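
As a sketch, the selection and mapping techniques look like this in Extract TABLE and Replicat MAP statements (the table names, column names, and WHERE condition are hypothetical):

TABLE $DATA.MASTER.ACCOUNT, WHERE (STATE = "CA");
MAP $DATA.MASTER.ACCOUNT, TARGET $DATA3.MASTER.ACCOUNT,
COLMAP (USEDEFAULTS, REGION = STATE);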

Data transformation (such as six-to-eight digit date conversion) takes a little extra effort during the load process. There are a couple of ways to achieve initial loads in this situation.

The first solution involves extracting the entire table into a flat file. In this case, do not specify FORMATASCII. Next use Replicat to load the table using the SPECIALRUN parameter. This method, while slower than native loads, is often sufficient and allows field conversion functions to be used during replication.

The second solution is to perform all data mapping on the NonStop system before transmission, rather than on the target side. This means that all conversion work is performed by Extract. Using this strategy can result in less network traffic, since filtering can be performed before data reaches the pipe. However, this can also require the creation of a dummy table or DDL definition on the NonStop side that mimics the structure of the real target table.

Distributing Extracted Data

In addition to extracting and replicating database changes, Extract can forward and distribute changes that have already been extracted. This process is known as data pumping.

Use data pumping in the following scenarios:

  • A network or target system may be down for an extended time, but extraction or logging activities must occur constantly.

  • Data extracted by Logger must be forwarded over TCP/IP to non-NonStop systems.

Running Extract for these purposes is nearly identical to capturing data from TMF audit trails. To run Extract in this manner, perform the following tasks.

  1. Using the EXTTRAILSOURCE or LOGTRAILSOURCE option, create an initial Extract checkpoint with the GGSCI ADD EXTRACT command.
  2. Add a local or remote Oracle GoldenGate trail with the GGSCI ADD EXTTRAIL or ADD RMTTRAIL command. By adding the trails, you direct Extract where to write the data you need.
  3. Set up an Extract parameter file.
  4. Start Extract using the GGSCI START EXTRACT command.
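
Assuming placeholder group and trail names, the four tasks for a data pump to a remote system might be sketched as:

GGSCI> ADD EXTRACT GGSPMP, EXTTRAILSOURCE $DATA.GGSDAT.ET
GGSCI> ADD RMTTRAIL /ggs/dirdat/rt, EXTRACT GGSPMP
GGSCI> START EXTRACT GGSPMP

with a parameter file along these lines:

EXTRACT GGSPMP
RMTHOST 192.0.2.12, MGRPORT 7809
RMTTRAIL /ggs/dirdat/rt
TABLE $DATA.MASTER.*;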

Direct File Extraction

Rather than capturing from trails, you can extract directly from a file or a sequence of files. You can read a file directly only when the following conditions are true:

  • The file or sequence of files is entry-sequenced.

  • Only inserts occur against the files (no updates).

  • Records are inserted only at the end of the file.

Use this feature when:

  • The method of logging is non-TMF.

  • The files are BASE24 TLF or PTLF.

  • The input files meet the conditions described above.

  • You want to transfer the batch file contents a little at a time throughout the day ("trickle" transfer), rather than all at once at the end of the day.

To extract directly from a file:

  1. Enter a GGSCI ADD EXTRACT command, specifying the FILETYPE parameter. FILETYPE indicates the type of file from which you are reading.

  2. If more than one file in a sequence might be open at a time, start Extract for each file in use simultaneously. Enter an ALTINPUT parameter in each process's parameter file with a RANGE option to distribute the files among the processes. For further details, see Controlling Extract and Replicat.
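
Schematically, one of a pair of Extract parameter files distributing a file sequence across two processes might look like the following sketch; the names are hypothetical, and the exact ALTINPUT and FILETYPE syntax is given in Controlling Extract and Replicat:

EXTRACT TLFEXT1
RMTHOST 192.0.2.12, PORT 7829
RMTFILE \TGT.$DATA.GGSDAT.TLFDAT1, PURGE
ALTINPUT RANGE 1 OF 2
FILE $DATA1.TLF.*;

A second process, TLFEXT2, would specify RANGE 2 OF 2 so that no single process handles two files in sequence.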

Batch Processing

You can configure Extract and Replicat to run in batch when capturing and delivering incremental changes is not appropriate for the application. You can configure ongoing batch runs for a specific time daily, or special, one-time batch runs.

One-Time Database Extraction

You can run Extract against a specified period of audit trail or Oracle GoldenGate trail data a single time. Do this, for example, to extract changes to a particular account in a database over the course of a day.

To extract changes for a specific period, perform the following steps.

  1. Set up a parameter file using the NonStop editor.
  2. Use SPECIALRUN to capture data from TMF audit trails. SPECIALRUN indicates that no checkpoints are maintained.
  3. To extract data from an Oracle GoldenGate trail, use the SPECIALRUN, EXTTRAILSOURCE or LOGTRAILSOURCE option.
  4. Set up BEGIN and END parameters to designate the period of activity to extract.
  5. Designate an EXTFILE or RMTFILE rather than an extract trail. If you require multiple trails, add the MAXFILES argument to the EXTFILE or RMTFILE parameter.
  6. Specify additional parameters as needed.
  7. Start Extract from TACL, as in this example.
    TACL> RUN EXTRACT /IN GGSPARM.SPECEXT, OUT GGSRPT.SPECEXT/
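
As a sketch, GGSPARM.SPECEXT might contain the following; the time window, output file, table, and account number are placeholders:

SPECIALRUN
BEGIN 2024-03-01 00:00
END 2024-03-02 00:00
EXTFILE $DATA.GGSDAT.DAYEXT, PURGE
TABLE $DATA.MASTER.TXNS, WHERE (ACCT-NO = 12345);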
    

Trickle Batch Processing

When you are extracting batch files using RMTBATCH, you may need to perform the following steps:

  1. Use the SYSKEYCONVERT parameter in the Extract parameter file if the input record's length is variable. This specifies the format of the SYSKEY in the output.
  2. Use the POSITIONFIRSTRECORD parameter to reread an input file when you have used SYSKEYCONVERT. POSITIONFIRSTRECORD resets Extract to read from the beginning of the input file.

Determining the Next File

Use ALTINPUT for direct file extraction. With ACI files, multiple files can be in use at one time. For example, processing can continue for Monday's file after midnight, while Tuesday's file is opened for new data. To handle a multiple file situation, run more than one Extract process for the file sequence. Use the ALTINPUT RANGE option to distribute the files across the processes so that Extract never processes two files in sequence. You can also use ALTINPUT to specify the access mode of the file open, and to move Extract to the next sequential file if an application has a file open that it is not updating.

By default, Extract looks for the next alphabetical file name. The file name must conform to a template for the file type, which defaults to predefined characteristics. You can also specify the template by parameter.

If the file type is ACITLF or ACIPTLF, the template is in the form $VOL.SUBVOL.XXYYMMDD, where XX is a two-character prefix followed by a six-digit date.

If the file type is ACITLFX or ACIPTLFX, the template is in the form $VOL.SUBVOL.XMMDDNNN, where X is a one-character prefix followed by a month, day, and three-digit sequence number.

When specifying any of the ACI file types in the FILETYPE option, do not include the date or sequence number. The next file is the one following the current file in name order, and must also satisfy any RANGE criteria in the ALTINPUT parameter.

If the file type is ENTRY, you specify the template in the ALTINPUT parameter TEMPLATE option. NonStop wildcards are acceptable. For example, the template $DATA*.MYDAT.FL* processes files starting with FL residing on different $DATA volumes.

When using FILETYPE ENTRY, specify the first file to process, not the file prefix. By default, the next file is the next file name to fit the template. As an alternative, you can use FILETYPE USENEXTMODIFIED. This option selects the next file modified after the current file that also fits the template.
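
For example, an ENTRY file sequence might be read with parameters like this sketch (file names hypothetical; see the reference guide for the exact ALTINPUT syntax):

FILE $DATA1.MYDAT.FL0101;
ALTINPUT TEMPLATE $DATA*.MYDAT.FL*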

When the Next File is Processed

Before moving to the next file in a sequence, Extract must process the entire contents of the current file. By default, Extract uses the following rules to determine that the current file has been exhausted and the next file is ready for processing.

  • End-of-file was reached in current file at least five seconds earlier, and no new data has appeared since.

  • No processes have the current file open for write access.

  • The next file exists and has at least one record in it.

  • The next file was modified after the current file.

You can modify these rules with the NOWAITNEXTMODIFIED, WAITNEXTRBA, and OPENTIMEOUT options for the ALTINPUT parameter.
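
As a sketch, the defaults might be relaxed with ALTINPUT options such as the following (the template and timeout value are hypothetical; check the reference guide for exact spellings and units):

ALTINPUT TEMPLATE $DATA*.MYDAT.FL*, NOWAITNEXTMODIFIED, OPENTIMEOUT 10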