17 Configuring Extract

This chapter contains instructions for configuring the Oracle GoldenGate capture process to capture initial load and transactional data from a PostgreSQL database.

Topics:

About Extract

For Oracle GoldenGate for PostgreSQL, there are two types of Extracts that can be created.

Initial Load Extract

An Initial Load Extract is used to read all records from a table and write them to an EXTFILE or RMTFILE. Initial load Extracts are created with the SOURCEISTABLE option of the ADD EXTRACT command and do not maintain checkpointing for recovery.

For more information on the Initial Load process, see Instantiating Oracle GoldenGate with an Initial Load in Administering Oracle GoldenGate.

Change Data Capture Extract

A Change Data Capture Extract is used to capture transactional data changes from that point in time at which it is created or positioned into the write-ahead log.

The Oracle GoldenGate Extract process for PostgreSQL receives logical records from the PostgreSQL test_decoding database plugin and writes them in commit order into trail files for downstream consumption by a Replicat.

Extract Deployment Options

  • Local deployment: For a local deployment, the source database and Oracle GoldenGate are installed on the same server. No extra consideration is needed for local deployments.

  • Remote deployment: For a remote deployment, the source database and Oracle GoldenGate are installed on separate servers. Remote deployments are the only option available for supporting cloud databases, such as Azure for PostgreSQL or Amazon Aurora PostgreSQL.

    For remote deployments, operating system endianness between the database server and Oracle GoldenGate server need to be the same.

    Server time and time zones of the Oracle GoldenGate server should be synchronized with that of the database server. If this is not possible, then positioning of an Extract when creating or altering one will need to be done by LSN.

    In remote capture use cases, using SQLEXEC may introduce additional latency, as the SQLEXEC operation must be done serially for each record that the Extract processes. If special filtering that would require a SQLEXEC is done by a remote hub Extract and the performance impact is too severe, it may become necessary to move the Extract process closer to the source database.

    With remote deployments, low network latency is important, and it is recommended that the network latency between the Oracle GoldenGate server and the source database server be less than 1 millisecond.

Prerequisites for Creating an Extract

Review the Installing Oracle GoldenGate for PostgreSQL and ensure that the DataDirect drivers are installed correctly, which varies depending on the operating system.

Ensure that the PostgreSQL Client Authentication Configuration file, $PGDATA/pg_hba.conf, on the database server is configured to allow connections from the Oracle GoldenGate server, if installed remotely. See https://www.postgresql.org/docs/13/auth-pg-hba-conf.html for more information.

Registering an Extract for PostgreSQL

An Extract for PostgreSQL must be registered with the database and be granted a reserved replication slot. Replication slots are allocated through the database configuration setting max_replication_slots and can be configured as discussed in the Database Configuration topic of this document.

Follow these instructions to register an Extract. Extract registration must be done prior to creating an Extract. See REGISTER EXTRACT in the Command Line Interface Reference for Oracle GoldenGate guide for more information.

  1. Using GGSCI, connect to the DSN for the source database.
    GGSCI> DBLOGIN SOURCEDB dsn USERIDALIAS alias
  2. Register the Extract using the GGSCI command. This command internally creates the replication slot. Extract names cannot be more than 8 alpha-numeric characters.
    GGSCI> REGISTER EXTRACT extname

Creating a Change Data Capture Extract

These steps configure a CDC Extract to capture transactional data from a source PostgreSQL database.

Note:

One Extract per database is generally sufficient, but multiple Extracts are allowed if the replication slots are available.
  1. In GGSCI, create the Extract parameter file.
    EDIT PARAMS extname

    In this sample, extname is the name of the primary Extract and matches the name of the Extract that was registered with the database in the previous steps.

  2. Enter the Extract parameters in the order shown, starting a new line for each parameter statement.

    Sample basic parameters for Extract:

    EXTRACT extname
    SOURCEDB dsn_name USERIDALIAS alias
    EXTTRAIL ./dirdat/ep
    GETTRUNCATES
    TABLE schema.object;
    Parameter Description

    EXTRACT extname

    extname is the name of the Extract and cannot be more than 8 alpha-numeric characters in length. For more information, see extract in Reference for Oracle GoldenGate.

    SOURCEDB dsn_name

    Specifies the name of the database connection DSN.

    USERIDALIAS alias

    Specifies the alias of the database login credential of the user that is assigned to Extract. This credential must exist in the Oracle GoldenGate credential store. For more information, see Establishing Oracle GoldenGate Credentials.

    EXTTRAIL trailname

    Specifies a two character, local trail to which the primary Extract writes captured data.

    GETTRUNCATES

    Optional parameter but needed in order to capture truncation operations.

    TABLE schema.object;

    or

    TABLE schema.*;

    Specifies the database object for which to capture data.
    • TABLE specifies a table or a wildcarded set of tables.

    • schema is the schema name or a wildcarded set of schemas.

    • object is the table or sequence name, or a wildcarded set of those objects.

    • * is a wildcard for all tables in the schema.

    Terminate the parameter statement with a semi-colon.

    To exclude a name from a wildcard specification, use the SCHEMAEXCLUDE, TABLEEXCLUDE, and EXCLUDEWILDCARDOBJECTSONLY parameters as appropriate.

    Note:

    If the schema of tables to be captured from is the same as the schema in GGSCHEMA of the GLOBALS file, which is not recommended, then you cannot use schema.* in the TABLE statement.
  3. Enter any optional Extract parameters that are recommended for your configuration. You can edit this file at any point before starting processing by using the EDIT PARAMS command.

  4. Save and close the file.

  5. Add the Extract and its associated trail file.

    GGSCI> ADD EXTRACT extname, TRANLOG, BEGIN NOW
    GGSCI> ADD EXTTRAIL ./dirdat/ep, EXTRACT extname
  6. Start the Extract.

Modifying DDL for Extract

Follow these steps to modify DDL for Extract for PostgreSQL:
  1. Pause or stop the application data to the table(s), which need to be modified.

  2. Ensure that Extract processes all the transactions prior to making any DDL changes. An Event Marker table helps ensure full completion of transactions.

  3. Stop the Extract.

  4. At source, execute DELTE TRANDATA for the table(s) on which DDL modification has to be performed.

  5. Run the ALTER TABLE statement to add or drop the column in or from the table(s).

  6. Enable trandata (ADD TRANDATA) for the same table(s) at source.

  7. Start Extract.