13 Associating Replicated Data with Metadata

This chapter describes the uses of metadata and how to associate replicated data with metadata.

This chapter includes the following sections:

13.1 Overview

When replicating data from one table to another, an important consideration is whether the column structures (metadata) of the source and target tables are identical. Oracle GoldenGate looks up metadata for the following purposes:

  • On the source, to supply complete information about captured operations to the Replicat process.

  • On the target, to determine the structures of the target tables, so that the replicated data is correctly mapped and converted (if needed) by Replicat.

In each of the following scenarios, you must use a different parameter or set of parameters to describe the metadata properly to the Oracle GoldenGate process that is processing it:

  • You are replicating a source table to a target table that has identical metadata definitions (homogeneous replication).

  • You are replicating a source table to a target table that has different metadata definitions.

  • You are replicating a source table to two target tables, one with identical definitions and one that has different definitions.

13.2 Configuring Oracle GoldenGate to Assume Identical Metadata

When source and target tables have identical metadata definitions, use the ASSUMETARGETDEFS parameter in the Replicat parameter file. This parameter directs Replicat to assume that the target definitions are the same as those of the source, and to apply those definitions to the replicated data when constructing SQL statements. The source and target tables must be identical in every way, thus needing no conversion processing, although their catalogs or containers, owners and/or names can be different.

13.2.1 Rules for Tables to be Considered Identical

For source and target structures to be identical, they must:

  • be of the same database type, that is, all Oracle.

  • have the same character set and locale, for example american_AMERICA.

  • contain the same number of columns.

  • have identical column names (including case, white spaces, and quotes if applicable).

  • have identical data types.

  • have identical column lengths.

  • have the same column length semantics for character columns (bytes versus characters).

  • define all of the columns in the same order.

The following is a simple Replicat parameter file that illustrates the use of ASSUMETARGETDEFS. For more information, see ASSUMETARGETDEFS in Reference for Oracle GoldenGate for Windows and UNIX.

-- Specifies the group name.
REPLICAT acctrep
-- Specifies database login with an alias to a credential in the credential store.
USERIDALIAS ogg
-- Specifies a file for discard output.
DISCARDFILE ./dirrpt/backup/r_prod.dsc, APPEND
-- States that source and target definitions are identical.
ASSUMETARGETDEFS
-- Maps source tables to target tables.
MAP hq.product, TARGET region1.product;
MAP hq.price, TARGET region1.price;

When source and target structures are different, use the SOURCEDEFS parameter. See Section 13.3, "Configuring Oracle GoldenGate to Assume Dissimilar Metadata." ASSUMETARGETDEFS and SOURCEDEFS cannot be used in the same parameter file.

13.3 Configuring Oracle GoldenGate to Assume Dissimilar Metadata

Source and target metadata definitions are not considered identical if they do not meet the rules in Section 13.2.1, "Rules for Tables to be Considered Identical". When source and target table definitions are dissimilar, Oracle GoldenGate must perform a conversion from one format to the other. To perform conversions, both sets of definitions must be known to Oracle GoldenGate. Oracle GoldenGate can query the local database to get one set of definitions, but it must rely on a data-definitions file to get definitions from the remote database. The data-definitions file contains information about the metadata of the data that is being replicated. There are two types of definitions files:

  • A source-definitions file contains the definitions of source tables.

  • A target-definitions file contains the definitions of the target tables.

You can use multiple data-definitions files in a parameter file. For example, each one can contain the definitions for a distinct application.

13.3.1 Contents of the Definitions File

The format of a data-definitions file is for internal use and should not be edited by an Oracle GoldenGate user unless instructed to do so in documented procedures or by a support representative. The file begins with a file header that shows the version of DEFGEN, information about character sets, the database type, the locale, and internal metadata that indicates other data properties. Following the header are the table-definition sections. Each table-definition section contains a table name, record length, number of columns, and one or more column definitions.

13.3.2 Which Definitions File Type to Use, and Where

The type of definitions file to use depends on where column mapping and conversion will be performed.

  • When replicating from any type of Windows or UNIX-based database system to any other Windows or UNIX-based system, the mapping and conversion can be performed by Extract, a data-pump Extract, or Replicat, but is usually performed by Replicat on the target system. However, if Oracle GoldenGate must convert between different character sets, the mapping and conversion must be performed by Replicat on the target. See Section 13.3.3, "Understanding the Effect of Character Sets on Definitions Files."

  • When replicating from any Windows, UNIX, or Linux-based database system to an Enscribe target on a NonStop system, the mapping and conversion must be performed on the Windows, UNIX, or Linux system: Only Extract can convert two- and three-part SQL names and data types to the three-part file names that are used on the NonStop platform. In this scenario, Oracle GoldenGate cannot convert between source and target character sets. See Section 13.3.3, "Understanding the Effect of Character Sets on Definitions Files."

Therefore:

  • To perform column mapping and conversion on the target, use a source-definitions file that was generated on the source to supply the source definitions to Replicat.

  • To perform column mapping and conversion on the source, use a target-definitions file that was generated on the target to supply target definitions to the primary Extract or a data-pump Extract, depending on which process does the conversion.

  • To perform column mapping or transformation on an intermediary system, you may need to use multiple definition file types. See Section 5.4, "Creating a Reporting Configuration with a Data Pump on an Intermediary System" and Section 5.5, "Creating a Cascading Reporting Configuration." Note that if there is not a Replicat on the intermediary system, conversion between character sets cannot be performed.

13.3.3 Understanding the Effect of Character Sets on Definitions Files

Oracle GoldenGate takes into consideration the character set encoding of the database when performing data conversion, and it takes into consideration the character set of the local operating system when creating a definitions file. Take the following guidelines into account when the source and target data have different character sets.

13.3.3.1 Confining Data Mapping and Conversion to the Replicat Process

Replicat is the only process that converts replicated data between different character sets. It converts data from the source database character set to the target database character set (or to the character set of the database on an intermediary system in a cascading configuration). As a result, data mapping and conversion must be performed by Replicat if source and target character sets are different. It cannot be performed on a source system, nor on an intermediary system that only contains a data pump. A target-definitions file is invalid in these cases.

13.3.3.2 Avoiding File Corruptions Due to Operating System Character Sets

By default, DEFGEN writes the definitions file itself in the character set of the local operating system. A definitions file can be created on the local system and transferred to the remote system without any encoding-related problems if the following is true:

  • The remote system to which you are transfering the definitions file has the same or equivalent operating-system character set as the local system

  • The operating-system character set of the remote system is a subset of the operating-system character set of the local system, For example, if the source and target character sets both are ASCII-compatible or EBCDIC-compatible and all table and column names use only 7-bit US-ASCII or equivalent characters, you can move the definition file between those systems.

Many operating-system character sets have little or no compatibility between them. To write the definitions file in a character set that is compatible with, or the same as, the one used by the remote system, use the CHARSET option of the DEFSFILE parameter when you configure DEFGEN.

13.3.3.3 Changing the Character Set of Existing Definitions Files

In the case of an existing definitions file that is transfered to an operating system with an incompatible character set, you can run the DEFGEN utility on that system to convert the character set of the file to the required one. This procedure takes two input arguments: the name of the definitions file and the UPDATECS character_set parameter. For example:

defgen ./dirdef/source.def UPDATECS UTF-8

UPDATECS helps in situations such as when a Japanese table name on Japanese Windows is written in Windows CP932 to the data-definitions file, and then the definitions file is transferred to Japanese UNIX. The file cannot be used unless the UNIX is configured in PCK locale. Thus, you must use UPDATECS to convert the encoding of the definitions file to the correct format.

13.3.3.4 Downloading from a z/OS system to another platform

Definitions files generated on an IBM z/OS platform must be downloaded in BINARY mode when transferring them to a non-z/OS platform.

13.3.4 Using a Definitions Template

When you create a definitions file, you can specify a definitions template that reduces the need to create new definitions files when tables are added to the Oracle GoldenGate configuration after the initial startup. To use a template, all of the new tables must have identical structures, such as in a customer database where there are separate but identical tables for each customer (see Section 13.2.1, "Rules for Tables to be Considered Identical").

If you do not use a template and new tables are added after startup, you must generate a definitions file for each new table that is added to the Oracle GoldenGate configuration, then copy their contents to the existing master definitions file, and then restart the process.

13.3.5 Configuring Oracle GoldenGate to Capture Data-definitions

To configure Oracle GoldenGate to use a data-definitions file and template (if needed), you will:

13.3.5.1 Configure DEFGEN

Perform these steps on the system from which you want to obtain metadata definitions.

  1. From the Oracle GoldenGate directory, run GGSCI.

  2. In GGSCI, issue the following command to create a DEFGEN parameter file.

    EDIT PARAMS DEFGEN
    
  3. Enter the parameters listed in Table 13-1 in the order shown, starting a new line for each parameter statement.

    Table 13-1 DEFGEN Parameters

    Parameter Description
    CHARSET character_set
    

    Use this parameter to specify a character set that DEFGEN will use to read the parameter file. By default, the character set of the parameter file is that of the local operating system. If used, CHARSET must be the first line of the parameter file.

    DEFSFILE file_name [APPEND | PURGE] [CHARSET character_set] [FORMAT RELEASE major.minor]
    
    • APPEND directs DEFGEN to write new content (from the current run) at the end of any existing content, if the specified file already exists.

    • PURGE directs DEFGEN to purge the specified file before writing new content from the current run. This is the default.

    • CHARSET generates the definitions file in the specified character set instead of the default character set of the operating system.

    • FORMAT RELEASE specifies the Oracle GoldenGate release version of the definitions file. Use when the definitions file will be read by a process that is in an earlier version of Oracle GoldenGate than the DEFGEN process.

    Specifies the relative or fully qualified name of the data-definitions file that is to be the output of DEFGEN.

    See Reference for Oracle GoldenGate for Windows and UNIX for important information about these parameter options and their effect on character sets.

    See Section 13.3.3, "Understanding the Effect of Character Sets on Definitions Files" for more information.

    [{SOURCEDB | TARGETDB} datasource]
    {USERIDALIAS alias | USERID user, PASSWORD password [encryption_options]}
    
    • SOURCEDB | TARGETDB specifies a data source name, if required as part of the connection information. Not required for Oracle.

    • USERID user, PASSWORD password [encryption_options] specifies a user name and password, with optional encryption options.

    • USERIDALIAS supplies database authentication through credentials stored in the Oracle GoldenGate credential store.

    Specifies database connection information.

    The datasource can be a DSN (Datasource Name), a SQL/MX catalog, or a container of an Oracle container database (CDB). If connecting to an Oracle container database, connect to the root container as the common user if you need to generate definitions for objects in more than one container. Otherwise, you can connect to a specific container to generate definitions only for that container.

    For more information about SOURCEDB, USERID, and USERIDALIAS, including the databases they support, see Reference for Oracle GoldenGate for Windows and UNIX.

    NOCATALOG

    Removes the container name (Oracle) or the catalog name (SQL/MX) from table names before their definitions are written to the definitions file. Use this parameter if the definitions file is to be used for mapping to a database that only supports two-part names (owner.object).

    TABLE [container. | catalog.]owner.table
    [, {DEF | TARGETDEF} template];
    

    Where:

    • container is a container in an Oracle container database.

    • catalog is a catalog in a SQL/MX database.

    • owner is the name of the schema that contains the table to be defined.

    • table is the table that is to be defined.

    • [, {DEF | TARGETDEF} template] additionally creates a definitions template based on the metadata of this table. This option is not supported for initial loads. See Reference for Oracle GoldenGate for Windows and UNIX for information about this option.

    Specifies the fully qualified name of a table or tables for which definitions will be defined and optionally uses the metadata of the table as a basis for a definitions template. Case-sensitivity of both table name and template name is preserved for case-sensitive databases. See Section 4.5, "Specifying Object Names in Oracle GoldenGate Input" for instructions on wildcarding and case-sensitivity.

    Specify a source table(s) if generating a source-definitions file or a target table(s) if generating a target-definitions file.

    To exclude tables from a wildcard specification, use the TABLEEXCLUDE parameter.

    Note that DEFGEN does not support UDTs.


  4. Save and close the file.

  5. Exit GGSCI.

13.3.5.2 Run DEFGEN

  1. From the directory where Oracle GoldenGate is installed, run DEFGEN using the following arguments. This example shows a UNIX file system structure.

    defgen paramfile dirprm/defgen.prm [reportfile dirrpt/defgen.rpt]
    [NOEXTATTR]
    

    Where:

    • defgen is the name of the program.

    • paramfile is a required keyword. dirprm/defgen.prm is the relative or full path name of the DEFGEN parameter file. (The typical location is shown in the example.)

    • reportfile is a required keyword. dirrpt/defgen.rpt sends output to the screen and to the designated report file. (The typical location is shown in the example.) You can omit the reportfile argument to print to the screen only.

    • NOEXTATTR can be used to support backward compatibility with Oracle GoldenGate versions that are older than Release 11.2.1 and do not support character sets other than ASCII, nor case-sensitivity or object names that are quoted with spaces. NOEXTATTR prevents DEFGEN from including the database locale and character set that support the globalization features that were introduced in Oracle GoldenGate Release 11.2.1. If the table or column name has multi-byte or special characters such as white spaces, DEFGEN does not include the table definition when NOEXTATTR is specified. If APPEND mode is used in the parameter file, NOEXTATTR is ignored, and the new table definition is appended in the existing file format, whether with the extra attributes or not.

  2. Repeat these steps for any additional definitions files that you want to create.

  3. Using ASCII mode, FTP the definitions file (or files) from the local Oracle GoldenGate dirdef sub-directory to the remote dirdef sub-directory.

13.3.5.3 Transfer the Definitions File to the Remote System

Use BINARY mode to FTP the data definitions file to the remote system if the local and remote operating systems are different and the definitions file is created for the remote operating system character set. This avoids unexpected characters to be placed in the file by the FTP program, such as new-line and line-feed characters. Always use BINARY mode when transferring definitions files from z/OS to a non-z/OS platform.

13.3.5.4 Specify the Definitions File

Associate a data-definitions file with the correct Oracle GoldenGate process in the following ways:

  • Associate a target-definitions file with an Extract group or data pump by using the TARGETDEFS parameter in the Extract parameter file.

  • Associate a source-definitions file with the Replicat group by using the SOURCEDEFS parameter in the Replicat parameter file.

  • If Oracle GoldenGate is to perform mapping or conversion on an intermediary system that contains neither the source nor target database, associate a source-definitions file and a target-definitions file with the data pump Extract by using SOURCEDEFS and TARGETDEFS in the parameter file. For Oracle databases, the Oracle libraries also must be present on the intermediary system.

See Section 13.3.7, "Examples of Using a Definitions File" for the correct way to specify multiple definitions files.

Do not use SOURCEDEFS and ASSUMETARGETDEFS in the same parameter file. See Section 13.2, "Configuring Oracle GoldenGate to Assume Identical Metadata" for more information about ASSUMETARGETDEFS.

13.3.6 Adding Tables that Satisfy a Definitions Template

To map a new table in the Oracle GoldenGate configuration to a definitions template, use the following options of the TABLE and MAP parameters, as appropriate:

  • DEF to specify the name of a source-definitions template.

  • TARGETDEF to specify the name of a target-definitions template.

Because these options direct the Extract or Replicat process to use the same definitions as the specified template, you need not create a new definitions file for the new table, nor restart the process.

13.3.7 Examples of Using a Definitions File

This topic contains some basic use cases that include a definitions file.

Creating a Source-definitions file for Use on a Target System

Creating Target-definitions Files for Use on a Source System

Creating Multiple Source-definitions Files for Use on a Target System

13.3.7.1 Creating a Source-definitions file for Use on a Target System

The following configuration uses a DEFGEN parameter file that creates a source-definitions file as output. This example is for tables from an Oracle database.

DEFSFILE C:\ggs\dirdef\record.def
USERIDALIAS ogg
TABLE acct.cust100, DEF custdef;
TABLE ord.*;
TABLE hr.*;

The results of this DEFGEN configuration are:

  • Individual definitions by name are created for all tables in the ord and hr schemas.

  • A custdef template is created based on table acct.cust100. In the database, there are other acct.cust* tables, each with identical definitions to acct.cust100.

The tables are mapped in the Replicat parameter file as follows:

-- This is a simplified parameter file. Your requirements may vary.
REPLICAT acctrep
USERIDALIAS ogg
SOURCEDEFS c:\ggs\dirdef\record.def
MAP acct.cust*, TARGET acct.cust*, DEF custdef;
MAP ord.prod, TARGET ord.prod;
MAP ord.parts, TARGET ord.parts;
MAP hr.emp, TARGET hr.emp;
MAP hr.salary, TARGET hr.salary;

Note that definitions for tables that satisfy the wildcard specification acct.cust* are obtained from the custdef template, as directed by the DEF option of the first MAP statement.

13.3.7.2 Creating Target-definitions Files for Use on a Source System

If target definitions are required for the same tables, those tables can be mapped for a primary Extract or a data pump.

  • Target definitions are required instead of source definitions if the target is an Enscribe database.

  • Target definitions are required in addition to source definitions if mapping and conversion are to be done on an intermediary system.

The DEFGEN configuration to make the target-definitions file looks similar to the following:

DEFSFILE C:\ggs\dirdef\trecord.def
USERIDALIAS ogg
TABLE acct.cust100, DEF tcustdef;
TABLE ord.*;
TABLE hr.*;

Note:

See the previous example for the DEFGEN configuration that makes the source-definitions file.

The Extract configuration looks similar to the following:

-- This is a simplified parameter file. Your requirements may vary.
EXTRACT acctex
USERIDALIAS ogg
RMTHOSTOPTIONS sysb, MGRPORT 7890, ENCRYPT AES192 KEYNAME mykey1
ENCRYPTTRAIL AES192
RMTTRAIL $data.ggsdat.rt
SOURCEDEFS c:\ggs\dirdef\record.def
TARGETDEFS c:\ggs\dirdef\trecord.def
TABLE acct.cust*, TARGET acct.cust*, DEF custdef, TARGETDEF tcustdef;
TABLE ord.prod, TARGET ord.prod;
TABLE ord.parts, TARGET ord.parts;
TABLE hr.emp, TARGET hr.emp;
TABLE hr.salary, TARGET hr.salary;

In this example, the source template named custdef (from the record.def file) and a target template named tcustdef (from the trecord.def file) are used for the acc.cust* tables. Definitions for the tables from the ord and hr schemas are obtained from explicit definitions based on the table names (but a wildcard specification could have been used here, instead)

13.3.7.3 Creating Multiple Source-definitions Files for Use on a Target System

This is a simple example of how to use multiple definitions files. Your parameter requirements may vary, based on the Oracle GoldenGate topology and database type.

The following is the DEFGEN parameter file that creates the first data-definitions file.

DEFSFILE C:\ggs\dirdef\sales.def
USERIDALIAS ogg
TABLE ord.*;

The following is the DEFGEN parameter file that creates the second data-definitions file. Note the file name and table specification are different from the first one.

DEFSFILE C:\ggs\dirdef\admin.def
USERIDALIAS ogg
TABLE hr.*;

The tables for the first and second definitions file are mapped in the same Replicat parameter file as follows:

REPLICAT acctrep
USERIDALIAS ogg
SOURCEDEFS c:\ggs\dirdef\sales.def
MAP ord.*, TARGET ord.*;
SOURCEDEFS c:\ggs\dirdef\admin.def
MAP hr.*, TARGET hr.*;

13.4 Configuring Oracle GoldenGate to Use a Combination of Similar and Dissimilar Definitions

ASSUMETARGETDEFS and SOURCEDEFS can be used in the same parameter file. This can be done when column mapping or conversion must be performed between some of the source-target table pairs, but not for other table pairs that are identical.

The following is an example of how to use SOURCEDEFS and ASSUMETARGETDEFS in the same parameter file. This example builds on the previous examples where tables in the acct, ord, and hr schemas require SOURCEDEFS, but it adds a rpt schema with tables that are dynamically created with the name stock appended with a random numerical value. For Oracle GoldenGate to replicate the DDL as well as the DML, the target tables must be identical. In that case, ASSUMETARGETDEFS is required.

REPLICAT acctrep
USERIDALIAS ogg
SOURCEDEFS c:\ggs\dirdef\record.def
MAP acct.cust*, TARGET acct.cust*, DEF custdef;
MAP ord.prod, TARGET ord.prod;
MAP ord.parts, TARGET ord.parts;
MAP hr.emp, TARGET hr.emp;
MAP hr.salary, TARGET hr.salary;
ASSUMETARGETDEFS
MAP rpt.stock, TARGET rpt.stock;