11 Associating Replicated Data with Metadata
Topics:
- Understanding Metadata Combinations
- Understanding Data Definition Files
- Using Automatic Trail File Recovery
- Configuring Oracle GoldenGate to Use Self-Describing Trail Files
- Configuring Oracle GoldenGate to Assume Identical Metadata
- Configuring Oracle GoldenGate to Assume Dissimilar Metadata
- Configuring Oracle GoldenGate to Use a Combination of Similar and Dissimilar Definitions
11.1 Understanding Metadata Combinations
When replicating data from one table to another, an important consideration is whether the column structures (metadata) of the source and target tables are identical. Oracle GoldenGate looks up metadata for the following purposes:
-
On the source, to supply complete information about captured operations to the Replicat process.
-
On the target, to determine the structures of the target tables, so that the replicated data is correctly mapped and converted (if needed) by Replicat.
In each of the following scenarios, you must use a different parameter or set of parameters to describe the metadata properly to the Oracle GoldenGate process that is processing it:
-
You are replicating a source table to a target table that has identical metadata definitions (homogeneous replication).
-
You are replicating a source table to a target table that has different metadata definitions.
-
You are replicating a source table to two target tables, one with identical definitions and one that has different definitions.
Topics:
Parent topic: Associating Replicated Data with Metadata
11.2 Understanding Data Definition Files
Oracle GoldenGate can query the local database to get one set of definitions, but it must rely on a data-definitions file to get definitions from the remote database. The data-definitions file contains information about the metadata of the data that is being replicated. There are two types of definitions files:
-
A source-definitions file contains the definitions of source tables.
-
A target-definitions file contains the definitions of the target tables.
You can use multiple data-definitions files in a parameter file. For example, each one can contain the definitions for a distinct application.
Topics:
- Contents of the Definitions File
- Which Definitions File Type to Use, and Where
- Understanding the Effect of Character Sets on Definitions Files
- Using a Definitions Template
- Configuring Oracle GoldenGate to Capture Data-definitions
- Adding Tables that Satisfy a Definitions Template
- Examples of Using a Definitions File
Parent topic: Associating Replicated Data with Metadata
11.2.1 Contents of the Definitions File
The format of a data-definitions file is for internal use and should not be edited by an Oracle GoldenGate user unless instructed to do so in documented procedures or by a support representative. The file begins with a file header that shows the version of DEFGEN, information about character sets, the database type, the locale, and internal metadata that indicates other data properties. Following the header are the table-definition sections. Each table-definition section contains a table name, record length, number of columns, and one or more column definitions.
Parent topic: Understanding Data Definition Files
11.2.2 Which Definitions File Type to Use, and Where
The type of definitions file to use depends on where column mapping and conversion will be performed.
-
When replicating from any type of Windows or UNIX-based database system to any other Windows or UNIX-based system, the mapping and conversion can be performed by Extract, a data-pump Extract, or Replicat, but is usually performed by Replicat on the target system. However, if Oracle GoldenGate must convert between different character sets, the mapping and conversion must be performed by Replicat on the target. See Understanding the Effect of Character Sets on Definitions Files.
-
When replicating from any Windows, UNIX, or Linux-based database system to an Enscribe target on a NonStop system, the mapping and conversion must be performed on the Windows, UNIX, or Linux system: Only Extract can convert two- and three-part SQL names and data types to the three-part file names that are used on the NonStop platform. In this scenario, Oracle GoldenGate cannot convert between source and target character sets. See Understanding the Effect of Character Sets on Definitions Files.
Therefore:
-
To perform column mapping and conversion on the target, use a source-definitions file that was generated on the source to supply the source definitions to Replicat.
-
To perform column mapping and conversion on the source, use a target-definitions file that was generated on the target to supply target definitions to the primary Extract or a data-pump Extract, depending on which process does the conversion.
-
To perform column mapping or transformation on an intermediary system, you may need to use multiple definition file types. See Creating a Reporting Configuration with a Data Pump on an Intermediary System and Creating a Cascading Reporting Configuration. Note that if there is not a Replicat on the intermediary system, conversion between character sets cannot be performed.
Parent topic: Understanding Data Definition Files
11.2.3 Understanding the Effect of Character Sets on Definitions Files
Oracle GoldenGate takes into consideration the character set encoding of the database when performing data conversion, and it takes into consideration the character set of the local operating system when creating a definitions file. Take the following guidelines into account when the source and target data have different character sets.
- Confining Data Mapping and Conversion to the Replicat Process
- Avoiding File Corruptions Due to Operating System Character Sets
- Changing the Character Set of Existing Definitions Files
- Downloading from a z/OS system to another platform
Parent topic: Understanding Data Definition Files
11.2.3.1 Confining Data Mapping and Conversion to the Replicat Process
Replicat is the only process that converts replicated data between different character sets. It converts data from the source database character set to the target database character set (or to the character set of the database on an intermediary system in a cascading configuration). As a result, data mapping and conversion must be performed by Replicat if source and target character sets are different. It cannot be performed on a source system, nor on an intermediary system that only contains a data pump. A target-definitions file is invalid in these cases.
11.2.3.2 Avoiding File Corruptions Due to Operating System Character Sets
By default, DEFGEN writes the definitions file itself in the character set of the local operating system. A definitions file can be created on the local system and transferred to the remote system without any encoding-related problems if the following is true:
-
The remote system to which you are transferring the definitions file has the same or equivalent operating-system character set as the local system
-
The operating-system character set of the remote system is a subset of the operating-system character set of the local system, For example, if the source and target character sets both are ASCII-compatible or EBCDIC-compatible and all table and column names use only 7-bit US-ASCII or equivalent characters, you can move the definition file between those systems.
Many operating-system character sets have little or no compatibility between them. To write the definitions file in a character set that is compatible with, or the same as, the one used by the remote system, use the CHARSET
option of the DEFSFILE
parameter when you configure DEFGEN.
11.2.3.3 Changing the Character Set of Existing Definitions Files
In the case of an existing definitions file that is transferred to an operating system with an incompatible character set, you can run the DEFGEN utility on that system to convert the character set of the file to the required one. This procedure takes two input arguments: the name of the definitions file and the UPDATECS
character_set
parameter. For example:
defgen ./dirdef/source.def UPDATECS UTF-8
UPDATECS
helps in situations such as when a Japanese table name on Japanese Windows is written in Windows CP932 to the data-definitions file, and then the definitions file is transferred to Japanese UNIX. The file cannot be used unless the UNIX is configured in PCK locale. Thus, you must use UPDATECS
to convert the encoding of the definitions file to the correct format.
11.2.4 Using a Definitions Template
When you create a definitions file, you can specify a definitions template that reduces the need to create new definitions files when tables are added to the Oracle GoldenGate configuration after the initial startup. To use a template, all of the new tables must have identical structures, such as in a customer database where there are separate but identical tables for each customer.
If you do not use a template and new tables are added after startup, you must generate a definitions file for each new table that is added to the Oracle GoldenGate configuration, then copy their contents to the existing master definitions file, and then restart the process.
Parent topic: Understanding Data Definition Files
11.2.5 Configuring Oracle GoldenGate to Capture Data-definitions
To configure Oracle GoldenGate to use a data-definitions file and template (if needed), you will:
Topics:
- Configure DEFGEN
- Run DEFGEN
- Transfer the Definitions File to the Remote System
- Specify the Definitions File
Parent topic: Understanding Data Definition Files
11.2.5.1 Configure DEFGEN
Perform these steps on the system from which you want to obtain metadata definitions.
Note:
Do not create a data-definitions file for Oracle sequences. It is
not needed and DEFGEN
does not support it.
11.2.5.3 Transfer the Definitions File to the Remote System
Use BINARY
mode to FTP the data definitions file to the remote system if the local and remote operating systems are different and the definitions file is created for the remote operating system character set. This avoids unexpected characters to be placed in the file by the FTP program, such as new-line and line-feed characters. Always use BINARY
mode when transferring definitions files from z/OS to a non-z/OS platform.
11.2.5.4 Specify the Definitions File
Associate a data-definitions file with the correct Oracle GoldenGate process in the following ways:
-
Associate a target-definitions file with an Extract group or data pump by using the
TARGETDEFS
parameter in the Extract parameter file. -
Associate a source-definitions file with the Replicat group by using the
SOURCEDEFS
parameter in the Replicat parameter file. -
If Oracle GoldenGate is to perform mapping or conversion on an intermediary system that contains neither the source nor target database, associate a source-definitions file and a target-definitions file with the data pump Extract by using
SOURCEDEFS
andTARGETDEFS
in the parameter file. For Oracle databases, the Oracle libraries also must be present on the intermediary system.
See Examples of Using a Definitions File for the correct way to specify multiple definitions files.
Do not use SOURCEDEFS
and ASSUMETARGETDEFS
in the same parameter file. See Configuring Oracle GoldenGate to Assume Identical Metadata for more information about ASSUMETARGETDEFS
.
11.2.6 Adding Tables that Satisfy a Definitions Template
To map a new table in the Oracle GoldenGate configuration to a definitions template, use the following options of the TABLE
and MAP
parameters, as appropriate:
-
DEF
to specify the name of a source-definitions template. -
TARGETDEF
to specify the name of a target-definitions template.
Because these options direct the Extract or Replicat process to use the same definitions as the specified template, you need not create a new definitions file for the new table, nor restart the process.
Parent topic: Understanding Data Definition Files
11.2.7 Examples of Using a Definitions File
This topic contains some basic use cases that include a definitions file.
Topics:
- Creating a Source-definitions file for Use on a Target System
- Creating Target-definitions Files for Use on a Source System
- Creating Multiple Source Definition Files for Use on a Target System
Parent topic: Understanding Data Definition Files
11.2.7.1 Creating a Source-definitions file for Use on a Target System
The following configuration uses a DEFGEN parameter file that creates a source-definitions file as output. This example is for tables from an Oracle database.
DEFSFILE C:\ggs\dirdef\record.def USERIDALIAS ogg TABLE acct.cust100, DEF custdef; TABLE ord.*; TABLE hr.*;
The results of this DEFGEN configuration are:
-
Individual definitions by name are created for all tables in the
ord
andhr
schemas. -
A
custdef
template is created based on tableacct.cust100
. In the database, there are otheracct.cust*
tables, each with identical definitions toacct.cust100
.
The tables are mapped in the Replicat parameter file as follows:
-- This is a simplified parameter file. Your requirements may vary. REPLICAT acctrep USERIDALIAS ogg SOURCEDEFS c:\ggs\dirdef\record.def MAP acct.cust*, TARGET acct.cust*, DEF custdef; MAP ord.prod, TARGET ord.prod; MAP ord.parts, TARGET ord.parts; MAP hr.emp, TARGET hr.emp; MAP hr.salary, TARGET hr.salary;
Note that definitions for tables that satisfy the wildcard specification acct.cust*
are obtained from the custdef
template, as directed by the DEF
option of the first MAP
statement.
Parent topic: Examples of Using a Definitions File
11.2.7.2 Creating Target-definitions Files for Use on a Source System
If target definitions are required for the same tables, those tables can be mapped for a primary Extract or a data pump.
-
Target definitions are required instead of source definitions if the target is an Enscribe database.
-
Target definitions are required in addition to source definitions if mapping and conversion are to be done on an intermediary system.
The DEFGEN configuration to make the target-definitions file looks similar to the following:
DEFSFILE C:\ggs\dirdef\trecord.def USERIDALIAS ogg TABLE acct.cust100, DEF tcustdef; TABLE ord.*; TABLE hr.*;
Note:
See the previous example for the DEFGEN configuration that makes the source-definitions file.
The Extract configuration looks similar to the following:
-- This is a simplified parameter file. Your requirements may vary. EXTRACT acctex USERIDALIAS ogg RMTHOSTOPTIONS sysb, MGRPORT 7890, ENCRYPT AES192 KEYNAME mykey1 ENCRYPTTRAIL AES192 RMTTRAIL $data.ggsdat.rt SOURCEDEFS c:\ggs\dirdef\record.def TARGETDEFS c:\ggs\dirdef\trecord.def TABLE acct.cust*, TARGET acct.cust*, DEF custdef, TARGETDEF tcustdef; TABLE ord.prod, TARGET ord.prod; TABLE ord.parts, TARGET ord.parts; TABLE hr.emp, TARGET hr.emp; TABLE hr.salary, TARGET hr.salary;
In this example, the source template named custdef
(from the record.def
file) and a target template named tcustdef
(from the trecord.def
file) are used for the acc.cust*
tables. Definitions for the tables from the ord
and hr
schemas are obtained from explicit definitions based on the table names (but a wildcard specification could have been used here, instead)
Parent topic: Examples of Using a Definitions File
11.2.7.3 Creating Multiple Source Definition Files for Use on a Target System
This is a simple example of how to use multiple definitions files. Your parameter requirements may vary, based on the Oracle GoldenGate topology and database type.
The following is the DEFGEN parameter file that creates the first data-definitions file.
DEFSFILE C:\ggs\dirdef\sales.def USERIDALIAS ogg TABLE ord.*;
The following is the DEFGEN parameter file that creates the second data-definitions file. Note the file name and table specification are different from the first one.
DEFSFILE C:\ggs\dirdef\admin.def USERIDALIAS ogg TABLE hr.*;
The tables for the first and second definitions file are mapped in the same Replicat parameter file as follows:
REPLICAT acctrep USERIDALIAS ogg SOURCEDEFS c:\ggs\dirdef\sales.def MAP ord.*, TARGET ord.*; SOURCEDEFS c:\ggs\dirdef\admin.def MAP hr.*, TARGET hr.*;
Parent topic: Examples of Using a Definitions File
11.3 Using Automatic Trail File Recovery
The trail recovery process has the ability to, in some cases, automatically rebuild trail files that are corrupt or missing by Oracle GoldenGate. When an Extract pump restarts, if the last trail that the pump was writing to is missing, then the Extract pump attempts to rebuild the missing trail file on the target system. This is done automatically using the checkpoint information for the process and the last valid trail file. The Replicat process automatically skips over any duplicate data in the trail files that have been rebuilt by the new trail recovery feature. This recovery will occur as long as there is at least 1 target trail from this sequence and that the trail files still exist on the source where the Extract pump is reading them.
This process can also be used to rebuild corrupt or invalid trail files on the target. Simply delete the corrupt trail file, and any trail files after that, and then restart the Extract pump. With this new behavior, Oracle recommends that PURGEOLDEXTRACTS MINKEEP
rules are properly configured to ensure that there are trail files from the source that can be used to rebuild the target environment. This feature requires that Oracle GoldenGate release 12.1.2.1.8 or greater is used on both the source and target servers. Do not attempt to start the Replicat with NOFILTERDUPTRANSACTIONS
because it will override Replicat's default behavior and may cause transactions that have already been applied to the target database to be applied again.
Parent topic: Associating Replicated Data with Metadata
11.4 Configuring Oracle GoldenGate to Use Self-Describing Trail Files
The default behavior in this release is to store and forward metadata from the source to the target and encapsulates it in each of the trail files. In other words, a self-describing Extract trail or file is created by adding the metadata records in each file. There are two types of metadata records:
-
Database Definition Record (DDR)
A DDR provides information about a specific database, such as character set and time zone. A Database Definition Record is added to the trail to store the database metadata for each pluggable database being captured. All the row change records from a pluggable database will have character and timestamp with local time zone data based on the corresponding DDR for that pluggable database. DDRs are generated for both consolidated and non-consolidated databases.
-
Table Definition Record (TDR)
A TDR provides details about the definition about a table and the columns that it contains. The content of this record is similar to what is provided in a definitions file, which is a subset of the information found in the
file_def
andcol_def
classes. Each database can embed its own database specific information to each TDR. A TDR contains a complete table definition and is used to describe many row change records for the same table. A new TDR is written when the output trail rolls over to a new file or the source table definition has changed.It is important to note that a TDR describes the definition of a table object represented by the row change records. It will be similar though may not be identical to the table definition in the source. For example, if a column-conversion function is applied to a source column, the metadata for that value in the database will be different from the metadata that shows up in a trail file.
The metadata records in a self-describing trail file format operate as follows:
Using self-described trail files eliminates the need for SOURCEDEFS
and ASSUMETARGETDEFS
so parameter files are simpler and it is simpler to configure heterogeneous replication and provides:
-
A reduction in trail file size due to object name compression.
-
The ability to extract data from multiple catalogs with different character sets and time zones into one trail.
-
The ability to configure DDL replication among more than two Oracle databases. There is no need to use the
GETREPLICATS
,UPDATEMETADATA
, andNOTAG
parameters. You can replicate DDLs when source and target tables are not alike and without having to synchronize Oracle GoldenGate processes . -
No necessity to create and maintain source definitions files.
Understanding the Self-Describing Trail Behavior
When you are modifying the Source Table Definition the following criteria must be met to update the new TDR into the Extract's memory, as well as the trail file.
- Oracle Database Sources
-
Integrated Extract (with Oracle Database 11.2.0.4 or higher and compatible = 11.2.0.4 or higher): No manual steps are needed because integrated Extract seamlessly generates updated metadata records after a DDL is performed on the source table. This is true irrespective of whether DDL replication is enabled or not. Any classic Extract would need to be switched to integrated Extract.
- Heterogeneous Database Sources
-
As in releases earlier than 12.2, DDL should be performed only after Extract has completely output all the relevant database changes to the trail and is stopped. After DDL is executed, you must restart the Extract should be restarted. Unlike previous releases, there is no need to stop Replicat and regenerate
SOURCEDEFS
usingDEFGEN
.
Topics:
Parent topic: Associating Replicated Data with Metadata
11.4.1 Support Considerations
Review the following support information:
-
Trail File Formats:
-
Must be Oracle GoldenGate release 12c (12.2.0.1) or greater to contain metadata records.
-
Cannot generate a 12c (12.2.0.1) trail format with the older trail format in a multi-trail configuration.
-
FORMATASCII
,FORMATSQL
andFORMATXML
trails will not contain metadata records.
-
-
For existing trail file configurations, you can easily switch between the previous and self-describing extract trail methods of resolving the table metadata by:
-
Use the
USE_TRAILDEFS
GLOBALS parameter to control all pumps and Replicats. -
Use the
OVERRIDE
option ofSOURCEDEFS
andASSUMETARGETDEFS
to control an individual pump or Replicat. Oracle does not recommend this.
-
-
Logdump displays the metadata records similar to DEFGEN output.
-
Reverse is not supported in the 12c (12.2.0.1) trail format.
-
If a table is mapped, the generated TDR is based on the definition of the mapped table not the source table.
-
Metadata in the trail is supported for all databases except HP NonStop (Guardian).
11.4.2 Using Self-Describing Trail Files
Use the USE_TRAILDEFS
GLOBALS parameter to enable or disable all pumps and Replicats. This command usage in relation to the SOURCDEFS
and ASSUMETARGETDEF
, and its source table definitions are described as follows.
Figure 11-1 USE_TRAILDEFS
| NOUSE_TRAILDEFS
USAGE
You must use the OVERRIDE
option with the ASSUMETARGETDEFS
and SOURCEDEFS
parameters when using self-describing trail files.
11.4.3 Examples of Parameter Files
The following is an example of an Extract parameter file:
EXTRACT ext1 USERID tkggadmin@inst1, password tkggadmin DDL include objname hr.*, include objname st_hr.* RMTTRAIL $data/ggs12.2/a1 TABLE hr.*; TABLE st_hr.salary, TARGET hr.salary, COLMAP (USEDEFAULTS, ts = @GETENV('GGHEADER' , 'COMMITTIMESTAMP')); RMTTRAIL $data/ggs12.2/a2, NO_OBJECTDEFS TABLE orders.*;
The following is an example of an Replicat parameter file:
REPLCAT rep1 USERID tkggadmin@inst2, password tkggadmin DDLERROR default discard DDL include all DISCARDFILE ./dirrpt/rep1.dsc purge MAP hr.*, TARGET hr.*;
11.5 Configuring Oracle GoldenGate to Assume Identical Metadata
Note:
This section does not apply to self-describing trail files.
When source and target tables have identical metadata definitions, use the ASSUMETARGETDEFS
parameter in the Replicat parameter file. This parameter directs Replicat to assume that the target definitions are the same as those of the source, and to apply those definitions to the replicated data when constructing SQL statements. The source and target tables must be identical in every way, thus needing no conversion processing, although their catalogs or containers, owners and/or names can be different.
Parent topic: Associating Replicated Data with Metadata
11.6 Configuring Oracle GoldenGate to Assume Dissimilar Metadata
When source and target table definitions are dissimilar, Oracle GoldenGate must perform a conversion from one format to the other. To perform conversions, both sets of definitions must be known to Oracle GoldenGate.
Parent topic: Associating Replicated Data with Metadata
11.7 Configuring Oracle GoldenGate to Use a Combination of Similar and Dissimilar Definitions
Note:
This section does not apply to self-describing trail files.
ASSUMETARGETDEFS
and SOURCEDEFS
can be used in the same parameter file. This can be done when column mapping or conversion must be performed between some of the source-target table pairs, but not for other table pairs that are identical.
The following is an example of how to use SOURCEDEFS
and ASSUMETARGETDEFS
in the same parameter file. This example builds on the previous examples where tables in the acct
, ord
, and hr
schemas require SOURCEDEFS
, but it adds a rpt
schema with tables that are dynamically created with the name stock
appended with a random numerical value. For Oracle GoldenGate to replicate the DDL as well as the DML, the target tables must be identical. In that case, ASSUMETARGETDEFS
is required.
REPLICAT acctrep USERIDALIAS ogg SOURCEDEFS c:\ggs\dirdef\record.def MAP acct.cust*, TARGET acct.cust*, DEF custdef; MAP ord.prod, TARGET ord.prod; MAP ord.parts, TARGET ord.parts; MAP hr.emp, TARGET hr.emp; MAP hr.salary, TARGET hr.salary; ASSUMETARGETDEFS MAP rpt.stock, TARGET rpt.stock;
Parent topic: Associating Replicated Data with Metadata