1.7 ADD EXTRACT

Use ADD EXTRACT to create an Extract group. Unless a SOURCEISTABLE task or an alias Extract is specified, ADD EXTRACT creates an online group that uses checkpoints so that processing continuity is maintained from run to run.

For DB2 for i, this command establishes a global start point for all journals and is a required first step. After issuing the ADD EXTRACT command, you can then optionally position any given journal at a specific journal sequence number by using the ALTER EXTRACT command with an appropriate journal option.

For Informix, this command initializes the position in the Informix logical log to capture the CDC record. When the first DML is processed, the Informix CDC record is contained in the logical log. Each record in the log is associated with a position, which is called LSN (Log Sequence Number). Only when initialization is complete will it be able to honor the positioning based on the LSN. The very first time you add an Extract, you must ensure that the has initialization completed (the duration depends on number of tables in your Extract parameter file) using the INFO EXTRACT command. Then ensure that the LSN number displayed matches the LSN displayed as the first record processed in the Extract report file. For example, if the LSN number returned by INFO EXTRACT is LSN: 892:0X1235018, then the message in the report file must be Position of first record processed LSN: 892:0X1235018, Apr 16, 2014 2:56:58 AM. When the initial Extract log positioning is complete, you can issue a stop or kill command though not before; doing so before results in any capture restart always starting from the EOF position of the database logs.

Oracle GoldenGate supports up to 5,000 concurrent Extract and Replicat groups per instance of Oracle GoldenGate Manager. At the supported level, all groups can be controlled and viewed in full with GGSCI commands such as the INFO and STATUS commands. Oracle GoldenGate recommends keeping the combined number of Extract and Replicat groups at the default level of 300 or below in order to manage your environment effectively.

This command cannot exceed 500 bytes in size for all keywords and input, including any text that you enter for the DESC option.

Syntax for a Regular, Passive, or Data Pump Extract

ADD EXTRACT group_name
{, SOURCEISTABLE |
    , TRANLOG [bsds_name            | LRI_number] |
    , INTEGRATED TRANLOG |
    , VAM |
    , EXTFILESOURCE file_name |
    , EXTTRAILSOURCE trail_name |
    , VAMTRAILSOURCE VAM_trail_name}
[, BEGIN {NOW | yyyy-mm-dd[ hh:mi:[ss[.cccccc]]]} |
[, EXTSEQNO sequence_number, EXTRBA relative_byte_address |
[, EOF |
[, LSN value |
[, EXTRBA relative_byte_address |
[, EOF | LSN value |
[, PAGE data_page, ROW row_ID |
[, SEQNO sequence_number
[, SCN value]
[, THREADS n]
[, PASSIVE]
[, PARAMS file_name]
[, REPORT file_name]
[, DESC 'description']
[, CPU number]
[, PRI number]
[, HOMETERM device_name]
[, PROCESSNAME process_name]
[, SOCKSPROXY {host_name | IP_address}[:port] [PROXYCSALIAS credential_store_alias [PROXYCSDOMAIN credential_store_domain]]]
[, RMTNAME passive_Extract_name]
[, DESC 'description']
group_name

The name of the Extract group. The name of an Extract group can contain up to eight characters. See Administering Oracle GoldenGate for Windows and UNIX for group naming conventions.

SOURCEISTABLE

Creates an Extract task that extracts entire records from the database for an initial load using the Oracle GoldenGate direct load method or the direct bulk load to SQL*Loader method. If SOURCEISTABLE is not specified, ADD EXTRACT creates an online change-synchronization process, and one of the other data source options must be specified. When using SOURCEISTABLE, do not specify any service options. Task parameters must be specified in the parameter file.

For more information about initial load methods, see Administering Oracle GoldenGate for Windows and UNIX.

TRANLOG [bsds_name | LRI_NUMBER]

Specifies the transaction log as the data source. Use this option for all databases except Informix and Teradata. TRANLOG requires the BEGIN option.

(DB2 on z/OS) You can use the bsds_name option for DB2 on a z/OS system to specify the Bootstrap Data Set file name of the transaction log, though it is not required and is not used. You do not need to change existing TRANLOG parameters.

(DB2 LUW) You can use the LRI_NUMBER option for DB2 LUW systems to specify the LRI record value for the checkpoint transaction log.

(Oracle) As of Oracle Standard or Enterprise Edition 11.2.0.3, this mode is known as classic capture mode. Extract reads the Oracle redo logs directly. See INTEGRATED TRANLOG for an alternate configuration.

INTEGRATED TRANLOG

(Oracle) Adds this Extract in integrated capture mode. In this mode, Extract integrates with the database logmining server, which passes logical change records (LCRs) directly to Extract. Extract does not read the redo log. Before using INTEGRATED TRANLOG, use the REGISTER EXTRACT command. For information about integrated capture, see Installing and Configuring Oracle GoldenGate for Oracle Database.

VAM

(Informix, MySQL, and Teradata) Specifies that the Extract API known as the Vendor Access Module (VAM) will be used to transfer change data to Extract.

EXTFILESOURCE file_name

Specifies an extract file as the data source. Use this option with a secondary Extract group (data pump) that acts as an intermediary between a primary Extract group and the target system.

For file_name, specify the relative or fully qualified path name of the file, for example dirdat\extfile or c:\ggs\dirdat\extfile.

EXTTRAILSOURCE trail_name

Specifies a trail as the data source. Use this option with a secondary Extract group (data pump) that acts as an intermediary between a primary Extract group and the target system.

For trail_name, specify the relative or fully qualified path name of the trail, for example dirdat\aa or c:\ggs\dirdat\aa.

VAMTRAILSOURCE VAM_trail_name

(Teradata) Specifies a VAM trail. Use this option when using Teradata maximum protection mode.

For VAM_trail_name, specify the relative or fully qualified path name of the VAM trail to which the primary Extract group is writing. Use a VAM-sort Extract group to read the VAM trail and send the data to the target system.

BEGIN {NOW | yyyy-mm-dd[ hh:mi:[ss[.cccccc]]]}

Specifies a timestamp in the data source at which to begin processing.

NOW

For all databases except DB2 LUW, NOW specifies the time at which the ADD EXTRACT command is issued.

For DB2 LUW, NOW specifies the time at which START EXTRACT takes effect. It positions to the first record that approximately matches the date and time. This is because the only log records that contain timestamps are the commit and abort transaction records, so the starting position can only be calculated relative to those timestamps. This is a limitation of the API that is used by Oracle GoldenGate.

Do not use NOW for a data pump Extract except to bypass data that was captured to the trail prior to the ADD EXTRACT statement.

yyyy-mm-dd[ hh:mi:[ss[.cccccc]]]

A date and time (timestamp) in the given form. For an Oracle Extract in integrated mode, the timestamp value must be greater than the timestamp at which the Extract was registered with the database.

  • Positioning by timestamp in a SQL Server transaction log is affected by the following characteristics of SQL Server:

    • The timestamps recorded in the SQL Server transaction log use a 3.3333 microsecond (ms) granularity. This level of granularity may not allow positioning by time between two transactions, if the transactions began in the same 3.3333 ms time interval.

    • Timestamps are not recorded in every SQL Server log record, but only in the records that begin and commit the transaction, as well as some others that do not contain data.

    • SQL Server timestamps are not from the system clock, but instead are from an internal clock that is specific to the individual processors in use. This clock updates several times a second, but between updates it could get out of sync with the system clock. This further reduces the precision of positioning by time.

    • Timestamps recorded for log backup files may not precisely correspond to times recorded inside the backup (however this imprecision is less than a second).

    Positioning to an LSN is precise.

  • Positioning by timestamp in a Sybase transaction log is affected by the following characteristics of Sybase:

    Sybase only records timestamps in BEGIN and COMMIT records. Regardless of the actual timestamp that is specified, the start position will be the first record of the transaction that starts closest to, or at, the specified timestamp. The Extract report will display the following positions:

    Positioning To: This is the specified begin time, for example:

    Positioning to begin time Jan 1, 2011 12:13:33 PM.
    

    Positioned To: If the specified timestamp is less than, or equal to, the timestamp of the transaction log that contains the BEGIN or COMMIT record, Positioned To Page is displayed as in this example:

    2011-01-01 12:13:39  INFO    OGG-01516  Positioned to
    Page #: 0004460243
    Row #: 00111, Jan 1, 2011 12:13:38 PM.
    

    First Record Position: This is the position of the first valid record at, or after, the Positioned To position, as in this example:

    2011-01-01 12:13:39  INFO OGG-01517  Position of first record processed
    Page #: 0004460243
    Row #: 00111, Jan 1, 2011 12:13:38 PM.
    
EXTSEQNO sequence_number, EXTRBA relative_byte_address

Valid for a primary Extract in classic capture mode for Oracle, a primary Extract for NonStop SQL/MX, and a data pump Extract. Not supported for an Oracle Extract in integrated mode. Specifies either of the following:

  • sequence number of an Oracle redo log and RBA within that log at which to begin capturing data.

  • the NonStop SQL/MX TMF audit trail sequence number and relative byte address within that file at which to begin capturing data. Together these specify the location in the TMF Master Audit Trail (MAT).

  • the file in a trail in which to begin capturing data (for a data pump). Specify the sequence number, but not any zeroes used for padding. For example, if the trail file is c:\ggs\dirdat\aa000026, you would specify EXTSEQNO 26. By default, processing begins at the beginning of a trail unless this option is used.

Contact Oracle Support before using this option. For more information, go to http://support.oracle.com.

EXTRBA relative_byte_address

Valid for DB2 on z/OS. Specifies the relative byte address within a transaction log at which to begin capturing data. The required format is 0Xnnn, where nnn is a 1 to 20 digit hexadecimal number (the first character is the digit zero, and the second character can be upper or lower case letter x).

EOF

Valid for SQL Server and DB2 for i. Configures processing to start at the end of the log files (or journals) that the next record will be written to. Any active transactions will not be captured.

LSN value

Valid for Informix and SQL Server. Specifies the LSN in a transaction log at which to start capturing data. The specified LSN should exist in a log backup or the online log. An alias for this option is EXTLSN.

For SQL Server, an LSN is composed of one of these, depending on how the database returns it:

  • Colon separated hex string (8:8:4) padded with leading zeroes and 0X prefix, as in 0X00000d7e:0000036b:01bd

  • Colon separated decimal string (10:10:5) padded with leading zeroes, as in 0000003454:0000000875:00445

  • Colon separated hex string with 0X prefix and without leading zeroes, as in 0Xd7e:36b:1bd

  • Colon separated decimal string without leading zeroes, as in 3454:875:445

  • Decimal string, as in 3454000000087500445

In the preceding, the first value is the virtual log file number, the second is the segment number within the virtual log, and the third is the entry number.

You can find the LSN for named transactions by using a query like:

select [Current LSN], [Transaction Name], [Begin Time]
  from fn_dblog(null, null)
 where Operation = 'LOP_BEGIN_XACT'
   and [Begin Time] >= 'time'

The time format that you should use in the query should be similar to '2015/01/30 12:00:00.000' and not '2015-01-30 12:00:00.000'.

You can determine the time that a particular transaction started, then find the relevant LSN, and then position between two transactions with the same begin time.

EOF | LSN value

Valid for DB2 LUW. Specifies a start position in the transaction logs when Extract starts.

EOF

Configures processing to start at the active LSN in the log files. The active LSN is the position at the end of the log files that the next record will be written to. Any active transactions will not be captured.

LSN value

Configures processing to start at an exact LSN if a valid log record exists there. If one does not exist, Extract will abend. Note that, although Extract might position to a given LSN, that LSN might not necessarily be the first one that Extract will process. There are numerous record types in the log files that Extract ignores, such as DB2 internal log records. Extract will report the actual starting LSN to the Extract report file.

PAGE data_page, ROW row_ID

Valid for Sybase. Specifies a data page and row that together define a start position in a Sybase transaction log. Because the start position must be the first record of the transaction that starts closest to, or at, the specified PAGE and ROW, the Extract report will display the following positions:

  • Positioning To is the position of the record that is specified with PAGE and ROW.

  • Positioned To is the position where the first BEGIN record is found at, or after, the Positioning To position.

  • First Record Position is the position of the first valid record at, or after, the Positioned To position.

SEQNO sequence_number

Valid for DB2 for i. Starts capture at, or just after, a system sequence number, which is a decimal number up to 20 digits in length.

SCN value

Valid for Oracle. Starts Extract at the transaction in the redo log that has the specified Oracle system change number (SCN). This option is valid for Extract both in classic capture and integrated modes. For Extract in integrated mode, the SCN value must be greater than the SCN at which the Extract was registered with the database. For more information, see REGISTER EXTRACT.

PARAMS file_name

Specifies the full path name of an Extract parameter file in a location other than the default of dirprm within the Oracle GoldenGate directory.

REPORT file_name

Specifies the full path name of an Extract report file in a location other than the default of dirrpt within the Oracle GoldenGate directory.

THREADS n

Valid for Oracle classic capture mode. Specifies the number of producer threads that Extract maintains to read redo logs.

Required in an Oracle RAC configuration to specify the number of producer threads. These are the Extract threads that read the different redo logs on the various RAC nodes. The value must be the same as the number of nodes from which you want to capture redo data.

PASSIVE

Specifies that this Extract group runs in passive mode and can only be started and stopped by starting or stopping an alias Extract group on the target system. Source-target connections will be established not by this group, but by the alias Extract from the target.

This option can be used for a regular Extract group or a data-pump Extract group. It should only be used by whichever Extract on the source system is the one that will be sending the data across the network to a remote trail on the target.

For instructions on how to configure passive and alias Extract groups, see Administering Oracle GoldenGate for Windows and UNIX.

DESC 'description'

Specifies a description of the group, such as 'Extracts account_tab on Serv1'. Enclose the description within single quotes. You may use the abbreviated keyword DESC or the full word DESCRIPTION.

CPU number

Valid for SQL/MX. Specifies the number of the CPU to be used for the process. Valid values are numbers 0 - 15 and -1 is default, which is assigned 1 higher than the last Manager started.

PRI number

Valid for SQL/MX. Specifies the Extract process priority. Valid values are numbers are 1 - 199 and -1 is the default, and is the same as the manager process priority.

HOMETERM device_name

Valid for SQL/MX. Specifies the name of the device to be used and must be a terminal or process. It can be entered in either Guardian $ or OSS /G/xxxxx form. The default is $zhome or the current session HOMETERM when $zhome is not defined.

PROCESSNAME process_name

Valid for SQL/MX. Specifies the name of the process as alphanumeric string up to five characters and can be entered in either Guardian $ or OSS /G/xxxxx form. The default is a system generated process name.

SOCKSPROXY {host_name | IP_address}[:port] [PROXYCSALIAS credential_store_alias [PROXYCSDOMAIN credential_store_domain]

Use for an alias Extract. Specifies the DNS host name or IP address of the proxy server. You can use either one to define the host though you must use the IP address if your DNS server is unreachable. If you are using an IP address, use either an IPv6 or IPv4 mapped address, depending on the stack of the destination system. You must specify the PROXYCSALIAS. In addition, you can specify the port to use, and the credential store domain.

RMTNAME passive_extract_name

Use for an alias Extract. Specifies the passive Extract name, if different from that of the alias Extract.

Examples

Example 1   

The following creates an Extract group named finance that extracts database changes from the transaction logs. Extraction starts with records generated at the time when the group was created with ADD EXTRACT.

ADD EXTRACT finance, TRANLOG, BEGIN NOW
Example 2   

The following creates an Extract group named finance that extracts database changes from Oracle RAC logs. Extraction starts with records generated at the time when the group was created. There are four RAC instances, meaning there will be four Extract threads.

ADD EXTRACT finance, TRANLOG, BEGIN NOW, THREADS 4
Example 3   

The following creates an Extract group named finance that extracts database changes from the transaction logs. Extraction starts with records generated at 8:00 on January 21, 2011.

ADD EXTRACT finance, TRANLOG, BEGIN 2011-01-21 08:00
Example 4   

The following creates an integrated capture Extract group.

ADD EXTRACT finance, INTEGRATED TRANLOG, BEGIN NOW
Example 5   

The following creates an Extract group named finance that interfaces with a Teradata TAM in either maximum performance or maximum protection mode. No BEGIN point is used for Teradata sources.

ADD EXTRACT finance, VAM
Example 6   

The following creates a VAM-sort Extract group named finance. The process reads from the VAM trail /ggs/dirdat/vt.

ADD EXTRACT finance, VAMTRAILSOURCE dirdat/vt
Example 7   

The following creates a data-pump Extract group named finance. It reads from the Oracle GoldenGate trail c:\ggs\dirdat\lt.

ADD EXTRACT finance, EXTTRAILSOURCE dirdat\lt
Example 8   

The following creates an initial-load Extract named load.

ADD EXTRACT load, SOURCEISTABLE
Example 9   

The following creates a passive Extract group named finance that extracts database changes from the transaction logs.

ADD EXTRACT finance, TRANLOG, BEGIN NOW, PASSIVE
Example 10   

The following creates an alias Extract group named financeA. The alias Extract is associated with a passive extract named finance on source system sysA. The Manager on that system is using port 7800.

ADD EXTRACT financeA, RMTHOST sysA, MGRPORT 7800, RMTNAME finance
Example 11   

The following examples create and position Extract at a specific Oracle system change number (SCN) in the redo log.

ADD EXTRACT finance TRANLOG SCN 123456
ADD EXTRACT finance INTEGRATED TRANLOG SCN 123456
Example 12   

The following example creates an alias Extract specifying the host to use.

ADD EXTRACT apmp desc "alias extract" RMTHOST lc01abc MGRPORT 7813 RMTNAME 
ppmp SOCKSPROXY lc02def:3128 PROXYCSALIAS proxyAlias 
Example 13   

The following example creates an Extract on a SQL/MX system.

ADD EXTRACT ext exttcp, CPU 3, PRI 148, HOMETERM $ZTN0.#PTHBP32,  PROCESSNAME $ext1
Example 14   

The following example creates an Extract on a DB2 LUW system.

ADD EXTRACT extcust, TRANLOG LRI 8066.322711