10.2.39 Oracle GoldenGate Java Delivery

This part of the book contains information on using Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) to process transaction records and apply it to various targets by means of Java module.

For more information, see Understanding Oracle GoldenGate for Distributed Applications and Analytics.

10.2.39.1 Configuring Java Delivery

10.2.39.1.1 Configuring the JRE in the Properties File

The current release of Oracle GoldenGate Java Delivery requires Java 8. Refer to the section on configuring Java for how to correctly access Java and the required Java shared libraries. Modify the Adapter Properties file to point to the location of the Oracle GoldenGate for Java main JAR (ggjava.jar) and set any additional JVM runtime boot options as required (these are passed directly to the JVM at startup):

jvm.bootoptions=-Djava.class.path=.:ggjava/ggjava.jar -Xmx512m -Xmx64m

Note the following options in particular:

  • java.class.path must include pathing to the core application (ggjava/ggjava.jar). The current directory (.) should be included as well in the classpath. Logging initializes when the JVM is loaded therefore the java.class.path variable should including any pathing to logging properties files (such as log4j properties files). The dependency JARs required for logging functionality are included in ggjava.jar and do not need to be explicitly included. Pathing can reference files and directories relative to the Oracle GoldenGate install directory, to allow storing Java property files, Velocity templates and other classpath resources in the dirprm subdirectory. It is also possible to append to the classpath in the Java application properties file. Pathing to handler dependency JARs can be added here as well. However, it is considered to be a better practice to use the gg.classpath variable to include any handler dependencies.

  • The jvm.bootoptions property also allows you to control the initial heap size of the JVM (Xms) and the maximum heap size of the JVM (Xmx). Increasing the maximum heap size can improve performance by requiring less frequent garbage collections. Additionally, you may need to increase the maximum heap size if a Java out of memory exception occurs.

Once the properties file is correctly configured for your system, it usually remains unchanged. See Common Properties, for additional configuration options.

10.2.39.1.2 Configuring Oracle GoldenGate for Java Delivery

Java Delivery is compatible with the Oracle GoldenGate Replicat process. Transaction data is read from the Oracle GoldenGate trail files and delivered to the Oracle GoldenGate Java Delivery module across JNI interface. The data is transferred to the Oracle GoldenGate Java Delivery module using the JNI interface. The Java Delivery module is configurable to allow data to be streamed into various targets. The supported targets for the Oracle GoldenGate Java Adapter product include JMS, file writing, and custom integrations. The Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) includes all of those integrations and streaming capabilities to its targets.

10.2.39.1.2.1 Configuring a Replicat for Java Delivery

The Oracle GoldenGate Replicat process can be configured to send transaction data to the Oracle GoldenGate for Java module. Replicat consumes a local trail (for example dirdat/aa) and sends the data to the Java Delivery module. The Java module is responsible for processing all the data and applying it to the desired target.

Following is an example of adding a Replicat process:

ADD REPLICAT javarep, EXTTRAIL ./dirdat/aa

The process names and trail names used in the preceding example can be replaced with any valid name. Process names must be 8 characters or less, trail names must to be two characters. In the Replicat parameter file (javarep.prm), specify the location of the user exit library.

The Replicat process has transaction grouping built into the application. Transaction grouping can significantly improve performance when streaming data to a target database. Transaction grouping can also significantly improve performance when streaming data to Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA). The Replicat parameter to control transaction grouping is the GROUPTRANSOPS variable in the Replicat configuration file. The default value of this variable is 1000 which means the Replicat process will attempt to group 1000 operations into single target transaction. Performance testing has generally shown that the higher the GROUPTRANSOPS the better the performance when streaming data to GG for DAA. Setting the GROUPTRANSOPS variable to 1 means that the original transaction boundaries from the source trail file (source database) will be maintained.

Table 10-63 User Exit Replicat Parameters

Parameter Explanation
REPLICAT javarep

All Replicat parameter files start with the Replicat name

SOURCEDEFS ./dirdef/tcust.def

(Optional) If the input trail files do not contain the metadata records, the Replicat process requires metadata describing the trail data. This can come from a database or a source definitions file. This metadata defines the column names and data types in the trail being read (./dirdat/aa).

TARGETDB LIBFILE libggjava.so SET properties= dirprm/javarep.properties

The TARGETDB LIBFILE libggjava.so parameter serves as a trigger to initialize the Java module. The SET clause to specify the Java properties file is optional. If specified, it should contain an absolute or relative path (relative to the Replicat executable) to the properties file for the Java module. The default value is replicat_name.properties in the dirprm directory.

MAP schema.*, TARGET *.*;

The tables to pass to the Java module; tables not included will be skipped. If mapping from source to target tables is required, you can use the MAP source_specification TARGET target_specification.

GROUPTRANSOPS 1000

Group source transactions into a single larger target transaction for improved performance. GROUPTRANSOPS of 1000 is the default setting. GROUPTRANSOPS sets a minimum value rather than an absolute value, to avoid splitting apart source transactions. Replicat waits until it receives all operations from the last source transaction in the group before applying the target transaction.

For example, if transaction 1 contains 200 operations, and transaction 2 contains 400 operations, and transaction 3 contains 500 operations, then Replicat transaction contains all 1,100 operations even though GROUPTRANSOPS is set to the default of 1,000. Conversely, Replicat might apply a transaction before reaching the value set by GROUPTRANSOPS if there is no more data in the trail to process.

10.2.39.1.3 Configuring the Java Handlers

The Handlers are integrations with target applications which plug into the Oracle GoldenGate Java Delivery module. It is the Java Handlers which provide the functionality to push data to integration targets such as JMS or Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA). The Java Adapter properties file is used to configure Java Delivery and Java handlers. To test the configuration, users may use the built-in file handler. Here are some example properties, followed by explanations of the properties (comment lines start with #):

# the list of active handlers
gg.handlerlist=myhandler
# set properties on 'myhandler'
gg.handler.myhandler.type=file
gg.handler.myhandler.format=tx2xml.vm
gg.handler.myhandler.file=output.xml

This property file declares the following:

  • Active event handlers. In the example a single event handler is active, called myhandler. Multiple handlers may be specified, separated by commas. For example: gg.handlerlist=myhandler, yourhandler

    Note:

    Starting Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) 23ai release, you will be able to specify only a single handler.
  • Configuration of the handlers. In the example myhandler is declared to be a file type of handler: gg.handler.myhandler.type=file

    Note:

    See the documentation for each type of handler (for example, the JMS handler or the file writer handler) for the list of valid properties that can be set.

  • The format of the output is defined by the Velocity template tx2xml.vm. You may specify your own custom template to define the message format; just specify the path to your template relative to the Java classpath.

This property file is actually a complete example that will write captured transactions to the output file output.xml. Other handler types can be specified using the keywords: jms_text (or jms), jms_map, singlefile (a file that does not roll), and others. Custom handlers can be implemented, in which case the type would be the fully qualified name of the Java class for the handler. GG for DAA package also contains built in the DAA target types.

Note:

See the documentation for each type of handler (for example, the JMS handler or the file writer handler) for the list of valid properties that can be set.

10.2.39.2 Running Java Delivery

10.2.39.2.1 Starting the Application

To run the Java Delivery and execute the Java application, you only need an existing Oracle GoldenGate trail file. If the trail file does not contain metadata records, a source definitions file is also required to describe the schema for operations in the trail file. For the examples that follow, a simple TCUSTMER and TCUSTORD trail is used (matching the demo SQL provided with the Oracle GoldenGate software download).

10.2.39.2.1.1 Starting Using Replicat

To run Java Delivery using Replicat, simply start the Replicat process from GGSCI:

GGSCI> START REPLICAT javarep
GGSCI> INFO REPLICAT javarep

The INFO command returns information similar to the following:

REPLICAT JAVAREP            Last Started 2015-09-10 17:25 Status RUNNING
Checkpoint Lag              00:00:00 (updated 00:00:00 ago)
Log Read Checkpoint File    ./dirdat/aa0000002015-09-10 17:50:41.000000 
                             RBA 2702

10.2.39.2.2 Restarting the Java Delivery

There are two possible checkpoint files when running with Replicat, the Replicat process checkpoint file and the Java Delivery checkpoint file. Both files are located in the dirchk directory and created using the following naming conventions.

Replicat checkpoint file

group_name.cpr

Java delivery checkpoint file:

group_name.cpj

To suppress the creation and use of the Java Delivery checkpoint the Replicat process should be created using the following syntax:

ADD REPLICAT myrep EXTTRAIL ./dirdat/tr NODBCHECKPOINT

It is the NODBCHECKPOINT syntax that disables the creation and use of the Java Delivery checkpoint file.

10.2.39.2.2.1 Restarting Java Delivery in Replicat

The checkpoint handling in Replicat is more straightforward as it includes logic to pick which one out of the two checkpoint information is of higher priority. The logic is as follows:

  • If the Java Delivery is started after user manually performed an ADD or ALTER REPLICAT, then the checkpoint information held by Replicat process will be used as the starting position.

  • If the Java Delivery is started without prior manual intervention to alter checkpoint (for example, upon graceful stop or an abend), then the checkpoint information held by Java module will be used as the starting position.

    For example, restarting a Java Delivery using Replicat at the beginning of a trail looks like the following:

    1. Reset the Replicat to the beginning of the trail data:

      GGSCI> ALTER REPLICAT JAVAREP, EXTSEQNO 0, EXTRBA 0

    2. Reset the Replicat

      GGSCI> START JAVAREP
      GGSCI> INFO JAVAREP
      REPLICAT   JAVAREP    Last Started 2015-09-10 17:25   Status RUNNING
      Checkpoint Lag       00:00:00 (updated 00:00:00 ago)
      Log Read Checkpoint  File ./dirdat/aa000000
      2015-09-10 17:50:41.000000  RBA 2702
      

      It may take a few seconds for the Replicat process status to report itself as running. Check the report file to see if it abended or is still in the process of starting:

      GGSCI> VIEW REPORT JAVAREP

      In the case where the Java Delivery is restarted after a crash or an abend, the last position kept by the Java module will be used when the application restarts.

10.2.39.3 Configuring Event Handlers

This chapter discusses types of event handlers explaining how to specify the event handler to use and what your options are. It explains how to format the output and what you can expect from the Oracle GoldenGate Report file.

10.2.39.3.1 Specifying Event Handlers

Processing transaction, operation and metadata events in Java works as follows:

  • The Oracle GoldenGate Replicat or Extract process reads local trail data and passes the transactions, operations and database metadata to the Java Delivery Module. Metadata can come from the trail itself, a source definitions file.

  • Events are fired by the Java framework, optionally filtered by custom Event Filters.

  • Handlers (event listeners) process these events, and process the transactions, operations and metadata. Custom formatters may be applied for certain types of targets.

There are several existing handlers:

  • Various built in Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) handlers to apply records to supported GG for DAA targets, see Replicate Data to configure various handlers supported in GG for DAA.

  • JMS message handlers to send to a JMS provider using either a MapMessage, or using a TextMessage with customized formatters.

  • A specialized message handler to send JMS messages to Oracle Advanced Queuing (AQ).

  • A file writer handler, for writing to a single file, or a rolling file.

    Note:

    The file writer handler is particularly useful as development utility, since the file writer can take the exact same formatter as the JMS TextMessage handler. Using the file writer provides a simple way to test and tune the formatters for JMS without actually sending the messages to JMS

Event handlers can be configured using the main Java property file or they may optionally read in their own properties directly from yet another property file (depending on the handler implementation). Handler properties are set using the following syntax:

gg.handler.{name}.someproperty=somevalue

This will cause the property someproperty to be set to the value somevalue for the handler instance identified in the property file by name. This name is used in the property file to define active handlers and set their properties; it is user-defined.

Implementation note (for Java developers): Following the preceding example: when the handler is instantiated, the method void setSomeProperty(String value) will be called on the handler instance, passing in the String value somevalue. A JavaBean PropertyEditor may also be defined for the handler, in which case the string can be automatically converted to the appropriate type for the setter method. For example, in the Java application properties file, we may have the following:

# the list of active handlers: only two are active
gg.handlerlist=one, two
# set properties on 'one'
gg.handler.one.type=file
gg.handler.one.format=com.mycompany.MyFormatter
gg.handler.one.file=output.xml
# properties for handler 'two'
gg.handler.two.type=jms_text
gg.handler.two.format=com.mycompany.MyFormatter
gg.handler.two.properties=jboss.properties
# set properties for handler 'foo'; this handler is ignored
gg.handler.foo.type=com.mycompany.MyHandler
gg.handler.foo.someproperty=somevalue

The type identifies the handler class; the other properties depend on the type of handler created. If a separate properties file is used to initialize the handler (such as the JMS handlers), the properties file is found in the classpath. For example, if properties file is at: {gg_install_dir}/dirprm/foo.properties, then specify in the properties file as follows: gg.handler.name.properties=foo.properties.

10.2.39.3.2 JMS Handler

The main Java property file identifies active handlers. The JMS handler may optionally use a separate property file for JMS-specific configuration. This allows more than one JMS handler to be configured to run at the same time.

There are examples included for several JMS providers (JBoss, TIBCO, Solace, ActiveMQ, WebLogic). For a specific JMS provider, you can choose the appropriate properties files as a starting point for your environment. Each JMS provider has slightly different settings, and your environment will have unique settings as well.

The installation directory for the Java JARs (ggjava) contains the core application JARs (ggjava.jar) and its dependencies in resources/lib/*.jar. The resources directory contains all dependencies and configuration, and is in the classpath.

If the JMS client JARs already exist somewhere on the system, they can be referenced directly and added to the classpath without copying them.

The following types of JMS handlers can be specified:

  • jms – sends text messages to a topic or queue. The messages may be formatted using Velocity templates or by writing a formatter in Java. The same formatters can be used for a jms_text message as for writing to files. (jms_text is a synonym for jms.)

  • aq – sends text messages to Oracle Advanced Queuing (AQ). The aq handler is a jms handler configured for delivery to AQ. The messages can be formatted using Velocity templates or a custom formatter.

  • jms_map – sends a JMS MapMessage to a topic or queue. The JMSType of the message is set to the name of the table. The body of the message consists of the following metadata, followed by column name and column value pairs:

    • GG_ID – position of the record, uniquely identifies this operation

    • GG_OPTYPE – type of SQL (insert/update/delete),

    • GG_TABLE – table name on which the operation occurred

    • GG_TX_TIMESTAMP – timestamp of the operation

10.2.39.3.3 File Handler

The file handler is often used to verify the message format when the actual target is JMS, and the message format is being developed using custom Java or Velocity templates. Here is a property file using a file handler:

# one file handler active, using Velocity template formatting
gg.handlerlist=myfile
gg.handler.myfile.type=file
gg.handler.myfile.rollover.size=5M
gg.handler.myfile.format=sample2xml.vm
gg.handler.myfile.file=output.xml

This example uses a single handler (though, a JMS handler and the file handler could be used at the same time), writing to a file called output.xml, using a Velocity template called sample2xml.vm. The template is found using the classpath.

10.2.39.3.4 Custom Handlers

For information on coding a custom handler, see Coding a Custom Handler in Java.

10.2.39.3.5 Formatting the Output

As previously described, the existing JMS and file output handlers can be configured through the properties file. Each handler has its own specific properties that can be set: for example, the output file can be set for the file handler, and the JMS destination can be set for the JMS handler. Both of these handlers may also specify a custom formatter. The same formatter may be used for both handlers. As an alternative to writing Java code for custom formatting, a Velocity template may be specified For further information, see Filtering Events.

10.2.39.3.6 Reporting

Summary statistics about the throughput and amount of data processed are generated when the Replicat or Extract process stops. Additionally, statistics can be written periodically either after a specified amount of time or after a specified number of records have been processed. If both time and number of records are specified, then the report is generated for whichever event happens first. These statistical summaries are written to the Oracle GoldenGate report file and the log files.

10.2.39.4 Java Delivery Properties

10.2.39.4.1 Common Properties

The following properties are common to Java Delivery using either Replicat or Extract.

10.2.39.4.1.1 Logging Properties

Logging is controlled by the following properties.

10.2.39.4.1.1.1 gg.log

Specifies the type of logging that is to be used. The default implementation for the Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) is the jdk option. This is the built-in Java logging called java.util.logging (JUL). The other logging options are log4j or logback.

For example, to set the type of logging to log4j:

gg.log=log4j 

The recommended setting is log4j. The log file is created in the dirrpt subdirectory of the installation. The default log file name includes the group name of the associated Extract and the file extension is .log.

<process name>_<log level>_log4j.log

Therefore if the Oracle GoldenGate Replicat process is called javaue, and the gg.log.level is set to debug, the resulting log file name will be:

javaue_debug_log4j.log

10.2.39.4.1.1.2 gg.log.level

Specifies the overall log level for all modules. The syntax is:

gg.log.level={ERROR|WARN|INFO|DEBUG|TRACE}

The log levels are defined as follows:

  • ERROR – Only write messages if errors occur

  • WARN – Write error and warning messages

  • INFO – Write error, warning and informational messages

  • DEBUG – Write all messages, including debug ones.

  • TRACE - Highest level of logging, includes all messages.

The default logging level is INFO. The messages in this case will be produced on startup, shutdown and periodically during operation. If the level is switched to DEBUG, large volumes of messages may occur which could impact performance. For example, the following sets the global logging level to INFO:

# global logging level
gg.log.level=INFO
10.2.39.4.1.1.3 gg.log.file

Specifies the path to the log file. The syntax is:

gg.log.file=path_to_file

Where the path_to_file is the fully defined location of the log file. This allows a change to the name of the log, but you must include the Replicat name if you have more than one Replicat to avoid one overwriting the log of the other.

10.2.39.4.1.1.4 gg.log.classpath

Specifies the classpath to the JARs used to implement logging. This configuration property is not typically used as the ggjava.jar library includes the required logging dependency libraries.

gg.log.classpath=path_to_jars
10.2.39.4.1.2 JVM Boot Options

The following options configure the Java Runtime Environment. Java classpath and memory options are configurable.

10.2.39.4.1.2.1 jvm.bootoptions

Specifies the initial Java classpath and other boot options that will be applied when the JVM starts. The java.class.path needs colon (:) separators for UNIX/Linux and semicolons (;) for Windows. This is where to specify various options for the JVM, such as initial and maximum heap size and classpath; for example:

  • -Xms: initial java heap size

  • -Xmx: maximum java heap size

  • -Djava.class.path: classpath specifying location of at least the main application JAR, ggjava.jar. Other JARs, such as JMS provider JARs, may also be specified here as well; alternatively, these may be specified in the Java application properties file. If using a separate log4j properties file then the location of the properties file must be included in the bootoptions java.class.path included in the bootoptions variable.

  • -verbose:jni: run in verbose mode (for JNI)

For example (all on a single line):

jvm.bootoptions= -Djava.class.path=ggjava/ggjava.jar 
-Dlog4j.configuration=my-log4j.properties -Xmx512m

The log4j.configuration property identifies a log4j properties file that is resolved by searching the classpath. You may use your own log4j configuration, or one of the preconfigured log4j settings: log4j.properties (default level of logging), debug-log4j.properties (debug logging) or trace-log4j.properties (very verbose logging). To use log4j logging with the Replicat process gg.log=log4j must be set.

Use of the one of the preconfigured log4j settings does not require any change to the classpath since those files are already included in the classpath. The -Djava.class.path variable must include the path to the directory containing a custom log4j configuration file without the * wild card appended.

10.2.39.4.2 Delivery Properties

The following properties are available to Java Delivery:

10.2.39.4.2.1 General Properties

The following properties apply to all writer configurations:

10.2.39.4.2.1.1 goldengate.userexit.writers

Specifies the name of the writer. This is always jvm and should not be modified.

For example:

goldengate.userexit.writers=jvm

All other properties in the file should be prefixed by the writer name, jvm.

10.2.39.4.2.1.2 goldengate.userexit.chkptprefix

Specifies a string value for the prefix added to the Java checkpoint file name. For example:

goldengate.userexit.chkptprefix=javaue_

10.2.39.4.2.1.3 goldengate.userexit.nochkpt

Disables or enables the checkpoint file. The default is false, the checkpoint file is enabled. Set this property to true if transactions are supported and enabled on the target.

For example, Java Adapter Properties if JMS is the target and JMS local transactions are enabled (the default), set goldengate.userexit.nochkpt=true to disable the user exit checkpoint file. If JMS transactions are disabled by setting localTx=false on the handler, the checkpoint file should be enabled by setting goldengate.userexit.nochkpt=false.

goldengate.userexit.nochkpt=true|false

10.2.39.4.2.1.4 goldengate.userexit.usetargetcols

Specifies whether or not mapping to target columns is allowed. The default is false, no target mapping.

goldengate.userexit.usetargetcols=true|false

10.2.39.4.2.2 Statistics and Reporting

Disables or enables the checkpoint file handling. This causes the standard Oracle GoldenGate reporting to be incomplete. Oracle GoldenGate for Java adds its own reporting to handle this issue.

Statistics can be reported every t seconds or every n records - or if both are specified, whichever criteria is met first.

There are two sets of statistics recorded: those maintained by the Replicat module and those obtained from the Java module. The reports received from the Java side are formatted and returned by the individual handlers.

The statistics include the total number of operations, transactions and corresponding rates.

10.2.39.4.2.2.1 jvm.stats.display

Controls the output of statistics to the Oracle GoldenGate report file and to the user exit log files.

The following example outputs these statistics.

jvm.stats.display=true
10.2.39.4.2.2.2 jvm.stats.full

Controls the output of statistics from the Java side, in addition to the statistics from the C side.

Java side statistics are more detailed but also involve some additional overhead, so if statistics are reported often and a less detailed summary is adequate, it is recommended that stats.full property is set to false.

The following example will output Java statistics in addition to C.

jvm.stats.full=true
10.2.39.4.2.2.3 jvm.stats.time | jvm.stats.numrecs

Specifies a time interval, in seconds or a number of records, after which statistics will be reported. The default is to report statistics every hour or every 10000 records (which ever occurs first).

For example, to report ever 10 minutes or every 1000 records, specify:

jvm.stats.time=600
jvm.stats.numrecs=1000

The Java application statistics are handler-dependent:

  • For the all handlers, there is at least the total elapsed time, processing time, number of operations, transactions;

  • For the JMS handler, there is additionally the total number of bytes received and sent.

  • The report can be customized using a template.

10.2.39.4.3 Java Application Properties

The following defines the properties which may be set in the Java application property file.

10.2.39.4.3.1 Properties for All Handlers

The following properties apply to all handlers:

10.2.39.4.3.1.1 gg.handlerlist

The handler list is a list of active handlers separated by commas. These values are used in the rest of the property file to configure the individual handlers. For example:

gg.handlerlist=name1, name2
gg.handler.name1.propertyA=value1
gg.handler.name1.propertyB=value2
gg.handler.name1.propertyC=value3
gg.handler.name2.propertyA=value1
gg.handler.name2.propertyB=value2
gg.handler.name2.propertyC=value3

Using the handlerlist property, you may include completely configured handlers in the property file and just disable them by removing them from the handlerlist.

10.2.39.4.3.1.2 gg.handler.name.type

This type of handler. This is either a predefined value for built-in handlers, or a fully qualified Java class name. The syntax is:

gg.handler.name.type={jms|jms_map|aq|singlefile|rollingfile|custom_java_class}

Where:

All but the last are pre-defined handlers:

  • jms – Sends transactions, operations, and metadata as formatted messages to a JMS provider

  • aq – Sends transactions, operations, and metadata as formatted messages to Oracle Advanced Queuing (AQ)

  • jms_map – Sends JMS map messages

  • singlefile – Writes to a single file on disk, but does not roll the file

  • rollingfile – Writes transactions, operations, and metadata to a file on disk, rolling the file over after a certain size, amount of time, or both. For example:

    gg.handler.name1.rolloverSize=5000000
    gg.handler.name1.rolloverTime=1m
    
    
  • custom_java_class – Any class that extends the Oracle GoldenGate for Java AbstractHandler class and can handle transaction, operation, or metadata events

    The Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) package also contains more predefined handlers to write to various GG for DAA targets.

10.2.39.4.3.2 Properties for Formatted Output

The following properties apply to all handlers capable of producing formatted output; this includes:

  • The jms_text handler (but not the jms_map handler)

  • The aq handler

  • The singlefile and rolling handlers, for writing formatted output to files

  • The predefined Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) handlers

10.2.39.4.3.2.1 gg.handler.name.format

Specifies the format used to transform operations and transactions into messages sent to JMS, to the Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) target or to a file. The format is specified uniquely for each handler. The value may be:

  • Velocity template

  • Java class name (fully qualified - the class specified must be a type of formatter)

  • csv for delimited values (such as comma separated values; the delimiter can be customized)

  • fixed for fixed-length fields

  • Built-in formatter, such as:

    • xml – demo XML format

    • xml2 – internal XML format

For example, to specify a custom Java class:

gg.handlerlist=abc
gg.handler.abc.format=com.mycompany.MyFormat

Or, for a Velocity template:

gg.handlerlist=xyz
gg.handler.xyz.format=path/to/sample.vm

If using templates, the file is found relative to some directory or JAR that is in the classpath. By default, the Oracle GoldenGate installation directory is in the classpath, so the preceding template could be placed in the dirprm directory of the Oracle GoldenGate installation location.

The default format is to use the built-in XML formatter.

10.2.39.4.3.2.2 gg.handler.name.includeTables

Specifies a list of tables this handler will include.

If the schema (or owner) of the table is specified, then only that schema matches the table name; otherwise, the table name matches any schema. A comma separated list of tables can be specified. For example, to have the handler only process tables foo.customer and bar.orders:

gg.handler.myhandler.includeTables=foo.customer, bar.orders

If the catalog and schema (or owner) of the table are specified, then only that catalog and schema matches the table name; otherwise, the table name matches any catalog and schema. A comma separated list of tables can be specified. For example, to have the handler only process tables dbo.foo.customer and dbo.bar.orders:

gg.handler.myhandler.includeTables=dbo.foo.customer, dbo.bar.orders

Note:

In order to selectively process operations on a table by table basis, the handler must be processing in operation mode. If the handler is processing in transaction mode, then when a single transaction contains several operations spanning several tables, if any table matches the include list of tables, the transaction will be included.

10.2.39.4.3.2.3 gg.handler.name.excludeTables

Specifies a list of tables this handler will exclude.

If the schema (or owner) of the table is specified, then only that schema matches the table name; otherwise, the table name matches any schema. A list of tables may be specified, comma-separated. For example, to have the handler process all operations on all tables except table date_modified in all schemas:

gg.handler.myhandler.excludeTables=date_modified

If the catalog and schema (or owner) of the table are specified, then only that catalog and schema matches the table name; otherwise, the table name matches any catalog and schema. A list of tables may be specified, comma-separated. For example, to have the handler process all operations on all tables except table date_modified in catalog dbo and schema bar:

gg.handler.myhandler.excludeTables=dbo.bar.date_modified
10.2.39.4.3.2.4 gg.handler.name.mode, gg.handler.name.format.mode

Specifies whether to output one operation per message (op) or one transaction per message (tx). The default is op. Use gg.handler.name.format.mode when you have a custom formatter.

10.2.39.4.3.3 Properties for CSV and Fixed Format Output

If the handler is set to use either comma separated values (CSV) CSV or fixed format output, the following properties may also be set.

10.2.39.4.3.3.1 gg.handler.name.format.delim

Specifies the delimiter to use between fields. Set this to no value to have no delimiter used. For example:

gg.handler.handler1.format.delim=,
10.2.39.4.3.3.2 gg.handler.name.format.quote

Specifies the quote character to be used if column values are quoted. For example:

gg.handler.handler1.format.quote='
10.2.39.4.3.3.3 gg.handler.name.format.metacols

Specifies the metadata column values to appear at the beginning of the record, before any column data. Specify any of the following, in the order they should appear:

  • position – unique position indicator of records in a trail

  • opcodeI, U, or D for insert, update, or delete records (see: insertChar, updateChar, deleteChar)

  • txind – transaction indicator – such as 0=begin, 1=middle, 2=end, 3=whole tx (see beginTxChar, middleTxChar, endTxChar, wholeTxChar)

  • opcount – position of a record in a transaction, starting from 0

  • catalog – catalog of the schema for the record

  • schema – schema/owner of the table for the record

  • tableonly – just table (no schema/owner)

  • table – full name of table, catalog.schema.table

  • timestamp – commit timestamp of record

For example:

gg.handler.handler1.format.metacols=opcode, table, txind, position
10.2.39.4.3.3.4 gg.handler.name.format.missingColumnChar

Specifies a special column prefix for a column value that was not captured from the source database transaction log. The column value is not in trail and it is unknown if it has a value or is NULL

The character used to represent the missing state of the column value can be customized. For example:

gg.handler.handler1.format.missingColumnChar=M

By default, the missing column value is set to an empty string and does not show.

10.2.39.4.3.3.5 gg.handler.name.format.presentColumnChar

Specifies a special column prefix for a column value that exists in the trail and is not NULL.

The character used to represent the state of the column can be customized. For example:

gg.handler.handler1.format.presentColumnChar=P

By default, the present column value is set to an empty string and does not show.

10.2.39.4.3.3.6 gg.handler.name.format.nullColumnChar

Specifies a special column prefix for a column value that exists in the trail and is set to NULL.

The character used to represent the state of the column can be customized. For example:

gg.handler.handler1.format.nullColumnChar=N

By default, the null column value is set to an empty string and does not show.

10.2.39.4.3.3.7 gg.handler.name.format.beginTxChar

Specifies the header metadata character (see metacols) used to identify a record as the begin of a transaction. For example:

gg.handler.handler1.format.beginTxChar=B
10.2.39.4.3.3.8 gg.handler.name.format.middleTxChar

Specifies the header metadata characters (see metacols) used to identify a record as the middle of a transaction. For example:

gg.handler.handler1.format.middleTxChar=M
10.2.39.4.3.3.9 gg.handler.name.format.endTxChar

Specifies the header metadata characters (see metacols) used to identify a record as the end of a transaction. For example:

gg.handler.handler1.format.endTxChar=E
10.2.39.4.3.3.10 gg.handler.name.format.wholeTxChar

Specifies the header metadata characters (see metacols) used to identify a record as a complete transaction; referred to as a whole transaction. For example:

gg.handler.handler1.format.wholeTxChar=W
10.2.39.4.3.3.11 gg.handler.name.format.insertChar

Specifies the character to identify an insert operation. The default I.

For example, to use INS instead of I for insert operations:

gg.handler.handler1.format.insertChar=INS
10.2.39.4.3.3.12 gg.handler.name.format.updateChar

Specifies the character to identify an update operation. The default is U.

For example, to use UPD instead of U for update operations:

gg.handler.handler1.format.updateChar=UPD
10.2.39.4.3.3.13 gg.handler.name.format.deleteChar

Specifies the character to identify a delete operation. The default is D.

For example, to use DEL instead of D for delete operations:

gg.handler.handler1.format.deleteChar=DEL
10.2.39.4.3.3.14 gg.handler.name.format.truncateChar

Specifies the character to identify a truncate operation. The default is T.

For example, to use TRUNC instead of T for truncate operations:

gg.handler.handler1.format.truncateChar=TRUNC
10.2.39.4.3.3.15 gg.handler.name.format.endOfLine

Specifies the end-of-line character as:

  • EOL - Native platform

  • CR - Neutral (UNIX-style \n)

  • CRLF - Windows (\r\n)

For example:

gg.handler.handler1.format.endOfLine=CR
10.2.39.4.3.3.16 gg.handler.name.format.justify

Specifies the left or right justification of fixed fields. For example:

gg.handler.handler1.format.justify=left
10.2.39.4.3.3.17 gg.handler.name.format.includeBefores

Controls whether before images should be included in the output. There must be before images in the trail. For example:

gg.handler.handler1.format.includeBefores=false
10.2.39.4.3.4 File Writer Properties

The following properties only apply to handlers that write their output to files: the file handler and the singlefile handler.

10.2.39.4.3.4.1 gg.handler.name.file

Specifies the name of the output file for the given handler. If the handler is a rolling file, this name is used to derive the rolled file names. The default file name is output.xml.

10.2.39.4.3.4.2 gg.handler.name.append

Controls whether the file should be appended to (true) or overwritten upon restart (false).

10.2.39.4.3.4.3 gg.handler.name.rolloverSize

If using the file handler, this specifies the size of the file before a rollover should be attempted. The file size will be at least this size, but will most likely be larger. Operations and transactions are not broken across files. The size is specified in bytes, but a suffix may be given to identify MB or KB. For example:

gg.handler.myfile.rolloverSize=5MB

The default rollover size is 10MB.

10.2.39.4.3.5 JMS Handler Properties

The following properties apply to the JMS handlers. Some of these values may be defined in the Java application properties file using the name of the handler. Other properties may be placed into a separate JMS properties file, which is useful if using more than one JMS handler at a time. For example:

gg.handler.myjms.type=jms_text
gg.handler.myjms.format=xml
gg.handler.myjms.properties=weblogic.properties

Just as with Velocity templates and formatting property files, this additional JMS properties file is found in the classpath. The preceding properties file weblogic.properties would be found in {gg_install_dir}/dirprm/weblogic.properties, since the dirprm directory is included by default in the classpath.

Settings that can be made in the Java application properties file will override the corresponding value set in the supplemental JMS properties file (weblogic.properties in the preceding example). In the following example, the destination property is specified in the Java application properties file. This allows the same default connection information for the two handlers myjms1 and myjms2, but customizes the target destination queue.

gg.handlerlist=myjms1,myjms2
gg.handler.myjms1.type=jms_text
gg.handler.myjms1.destination=queue.sampleA
gg.handler.myjms1.format=sample.vm
gg.handler.myjms1.properties=tibco-default.properties
gg.handler.myjms2.type=jms_map
gg.handler.myjms2.destination=queue.sampleB
gg.handler.myjms2.properties=tibco-default.properties

To set a property, specify the handler name as a prefix; for example:

gg.handlerlist=sample
gg.handler.sample.type=jms_text
gg.handler.sample.format=my_template.vm
gg.handler.sample.destination=gg.myqueue
gg.handler.sample.queueortopic=queue
gg.handler.sample.connectionUrl=tcp://host:61616?jms.useAsyncSend=true
gg.handler.sample.useJndi=false
gg.handler.sample.connectionFactory=ConnectionFactory
gg.handler.sample.connectionFactoryClass=\
    org.apache.activemq.ActiveMQConnectionFactory
gg.handler.sample.timeToLive=50000
10.2.39.4.3.5.1 Standard JMS Settings

The following outlines the JMS properties which may be set, and the accepted values. These apply for both JMS handler types: jms_text (TextMessage) and jms_map (MapMessage).

10.2.39.4.3.5.1.1 gg.handler.name.destination

The queue or topic to which the message is sent. This must be correctly configured on the JMS server. Typical values may be: queue/A, queue.Test, example.MyTopic, etc.

gg.handler.name.destination=queue_or_topic
10.2.39.4.3.5.1.2 gg.handler.name.user

(Optional) User name required to send messages to the JMS server.

gg.handler.name.user=user_name
10.2.39.4.3.5.1.3 gg.handler.name.password

(Optional) Password required to send messages to the JMS server

gg.handler.name.password=password
10.2.39.4.3.5.1.4 gg.handler.name.queueOrTopic

Whether the handler is sending to a queue (a single receiver) or a topic (publish / subscribe). This must be correctly configured in the JMS provider. This property is an alias of gg.handler.name.destination. The syntax is:

gg.handler.name.queueOrTopic=queue|topic

Where:

  • queue – a message is removed from the queue once it has been read. This is the default.

  • topic – messages are published and may be delivered to multiple subscribers.

10.2.39.4.3.5.1.5 gg.handler.name.persistent

If the delivery mode is set to persistent or not. If the messages are to be persistent, the JMS provider must be configured to log the message to stable storage as part of the client's send operation. The syntax is:

gg.handler.name.persistent={true|false}
10.2.39.4.3.5.1.6 gg.handler.name.priority

JMS defines a 10 level priority value, with 0 as the lowest and 9 as the highest. Priority is set to 4 by default. The syntax is:

gg.handler.name.priority=integer

For example:

gg.handler.name.priority=5
10.2.39.4.3.5.1.7 gg.handler.name.timeToLive

The length of time in milliseconds from its dispatch time that a produced message should be retained by the message system. A value of zero specifies the time is unlimited. The default is zero. The syntax is:

gg.handler.name.timeToLive=milliseconds

For example:

gg.handler.name.timeToLive= 36000
10.2.39.4.3.5.1.8 gg.handler.name.connectionFactory

Name of the connection factory to lookup using JNDI. ConnectionFactoryJNDIName is an alias. The syntax is:

gg.handler.name.connectionFactory=JNDI_name
10.2.39.4.3.5.1.9 gg.handler.name.useJndi

If gg.handler.name.usejndi is false, then JNDI is not used to configure the JMS client. Instead, factories and connections are explicitly constructed. The syntax is:

gg.handler.name.useJndi=true|false
10.2.39.4.3.5.1.10 gg.handler.name.connectionUrl

Connection URL is used only when not using JNDI to explicitly create the connection. The syntax is:

gg.handler.name.connectionUrl=url
10.2.39.4.3.5.1.11 gg.handler.name.connectionFactoryClass

The Connection Factory Class is used to access a factory only when not using JNDI. The value of this property is the Java class name to instantiate; constructing a factory object explicitly.

gg.handler.name.connectionFactoryClass=java_class_name
10.2.39.4.3.5.1.12 gg.handler.name.localTX

Specifies whether or not local transactions are used. The default is true, local transactions are used. The syntax is:

gg.handler.name.localTX=true|false
10.2.39.4.3.5.1.13 gg.handlerlist.nop

Disables the sending of JMS messages to allow testing of message generation. This is a global property used only for testing. The events are still generated and handled and the message is constructed. The default is false; do not disable message send. The syntax is:

gg.handlerlist.nop=true|false

Users can take advantage of this option to measure the performance of trail records processing without involving the handler module. This approach can narrow down the possible culprits of a suspected performance issue while applying trail records to the target system.

10.2.39.4.3.5.1.14 gg.handler.name.physicalDestination

Name of the queue or topic object, obtained through the ConnectionFactory API instead of the JNDI provider.

gg.handler.name.physicalDestination=queue_name

10.2.39.4.3.5.2 Group Transaction Properties

These properties set limits for grouping transactions.

10.2.39.4.3.6 JNDI Properties

These JNDI properties are required for connection to an Initial Context to look up the connection factory and initial destination.

java.naming.provider.url=url
java.naming.factory.initial=java-class-name

If JNDI security is enabled, the following properties may be set:

java.naming.security.principal=user-name
java.naming.security.credentials=password-or-other-authenticator

For example:

java.naming.provider.url= t3://localhost:7001
java.naming.factory.initial=weblogic.jndi.WLInitialContextFactory
java.naming.security.principal=jndiuser
java.naming.security.credentials=jndipw
10.2.39.4.3.7 General Properties

The following are general properties that are used for the Java framework:

10.2.39.4.3.7.1 gg.classpath

Specifies a comma delimited list of additional paths to directories or JARs to add to the classpath. Optionally, the list can be delimited by semicolons for Windows systems or by colons for UNIX. For example:

gg.classpath=C:\Program Files\MyProgram\bin;C:\Program Files\ProgramB\app\bin;

This Adapter properties file configuration property should be used to configure pathing to custom Java JARs or to the external dependencies of Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA).

10.2.39.4.3.7.2 gg.report.time

Specifies how often statistics are calculated and sent to Extract for the processing report. If Extract is configured to print a report, these statistics are included. The syntax is:

gg.report.time=report_interval{s|m|h}

Where:

  • report_interval is an integer

  • The valid time units are:

    • s - seconds

    • m - minutes

    • h - hours

If no value is entered, the default is to calculate and send every 24 hours.

10.2.39.4.3.7.3 gg.binaryencoding

Specifies the binary encoding type. The desired output encoding for binary data can be configured using this property. For example:

gg.binaryencoding=base64|hex

The default value is base64. The valid values to represent binary data are:

  • base64 - a base64 string 

  • hex - a hexadecimal string

10.2.39.4.3.8 Java Delivery Transaction Grouping

Transaction grouping can significantly improve the performance of Java integrations especially Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) integrations. Java Delivery provides functionality to perform transaction grouping. When Java Delivery is hosted by a Replicat process then the GROUPTRANSOPS Replicat configuration should be used to perform transaction grouping.

10.2.39.5 Developing Custom Filters, Formatters, and Handlers

10.2.39.5.1 Filtering Events

By default, all transactions, operations and metadata events are passed to the DataSourceListener event handlers. An event filter can be implemented to filter the events sent to the handlers. The filter could select certain operations on certain tables containing certain column values, for example

Filters are additive: if more than one filter is set for a handler, then all filters must return true in order for the event to be passed to the handler.

You can configure filters using the Java application properties file:

# handler "foo" only receives certain events
gg.handler.one.type=jms
gg.handler.one.format=mytemplate.vm
gg.handler.one.filter=com.mycompany.MyFilter

To activate the filter, you write the filter and set it on the handler; no additional logic needs to be added to specific handlers.

10.2.39.5.2 Custom Formatting

You can customize the output format of a built-in handler by:

  • Writing a custom formatter in Java or

  • Using a velocity template

10.2.39.5.2.1 Coding a Custom Formatter in Java

The preceding examples show a JMS handler and a file output handler using the same formatter (com.mycompany.MyFormatter). The following is an example of how this formatter may be implemented.

Example 10-2 Custom Formatting Implementation

package com.mycompany.MyFormatter;
import oracle.goldengate.datasource.DsOperation;
import oracle.goldengate.datasource.DsTransaction;
import oracle.goldengate.datasource.format.DsFormatterAdapter;
import oracle.goldengate.datasource.meta.ColumnMetaData;
import oracle.goldengate.datasource.meta.DsMetaData;
import oracle.goldengate.datasource.meta.TableMetaData;
import java.io.PrintWriter;
public class MyFormatter extends DsFormatterAdapter {
        public MyFormatter() { }
        @Override
        public void formatTx(DsTransaction tx,
DsMetaData meta,
PrintWriter out)
        {
            out.print("Transaction: " );
            out.print("numOps=\'" + tx.getSize() + "\' " );
            out.println("ts=\'" + tx.getStartTxTimeAsString() + "\'");
            for(DsOperation op: tx.getOperations()) {
TableName currTable = op.getTableName();
TableMetaData tMeta = dbMeta.getTableMetaData(currTable);
String opType = op.getOperationType().toString();
String table = tMeta.getTableName().getFullName();
out.println(opType + " on table \"" + table + "\":" );
int colNum = 0;
for(DsColumn col: op.getColumns())
{
ColumnMetaData cMeta = tMeta.getColumnMetaData( colNum++ );
out.println(
cMeta.getColumnName() + " = " + col.getAfterValue() );
}
        }
        @Override
        public void formatOp(DsTransaction tx,
DsOperation op,
TableMetaData tMeta,
PrintWriter out)
        {
            // not used...
        }
}

The formatter defines methods for either formatting complete transactions (after they are committed) or individual operations (as they are received, before the commit). If the formatter is in operation mode, then formatOp(...) is called; otherwise, formatTx(...) is called at transaction commit.

To compile and use this custom formatter, include the Oracle GoldenGate for Java JARs in the classpath and place the compiled .class files in gg_install_dir/dirprm:

javac -d gg_install_dir/dirprm
-classpath ggjava/ggjava.jar MyFormatter.java

The resulting class files are located in resources/classes (in correct package structure):

gg_install_dir/dirprm/com/mycompany/MyFormatter.class

Alternatively, the custom classes can be put into a JAR; in this case, either include the JAR file in the JVM classpath using the user exit properties (using java.class.path in the jvm.bootoptions property), or by setting the Java application properties file to include your custom JAR:

# set properties on 'one'
gg.handler.one.type=file
gg.handler.one.format=com.mycompany.MyFormatter
gg.handler.one.file=output.xml
gg.classpath=/path/to/my.jar,/path/to/directory/of/jars/*
10.2.39.5.2.2 Using a Velocity Template

As an alternative to writing Java code for custom formatting, Velocity templates can be a good alternative to quickly prototype formatters. For example, the following template could be specified as the format of a JMS or file handler:

Transaction: numOps='$tx.size' ts='$tx.timestamp'
#for each( $op in $tx )
operation: $op.sqlType, on table "$op.tableName":
#for each( $col in $op )
$op.tableName, $col.meta.columnName = $col.value
#end
#end

If the template were named sample.vm, it could be placed in the classpath, for example:

gg_install_dir/dirprm/sample.vm
	

Update the Java application properties file to use the template:

# set properties on 'one'
gg.handler.one.type=file
gg.handler.one.format=sample.vm
gg.handler.one.file=output.xml

When modifying templates, there is no need to recompile any Java source; simply save the template and re-run the Java application. When the application is run, the following output would be generated (assuming a table named SCHEMA.SOMETABLE, with columns TESTCOLA and TESTCOLB):

Transaction: numOps='3' ts='2008-12-31 12:34:56.000'
operation: UPDATE, on table "SCHEMA.SOMETABLE":
SCHEMA.SOMETABLE, TESTCOLA = value 123
SCHEMA.SOMETABLE, TESTCOLB = value abc
operation: UPDATE, on table "SCHEMA.SOMETABLE":
SCHEMA.SOMETABLE, TESTCOLA = value 456
SCHEMA.SOMETABLE, TESTCOLB = value def
operation: UPDATE, on table "SCHEMA.SOMETABLE":
SCHEMA.SOMETABLE, TESTCOLA = value 789
SCHEMA.SOMETABLE, TESTCOLB = value ghi

10.2.39.5.3 Coding a Custom Handler in Java

A custom handler can be implemented by extending AbstractHandler as in the following example:

import oracle.goldengate.datasource.*;
import static oracle.goldengate.datasource.GGDataSource.Status;
public class SampleHandler extends AbstractHandler {
        @Override
        public void init(DsConfiguration conf, DsMetaData metaData) {
            super.init(conf, metaData);
            // ... do additional config...
        }
        @Override
        public Status operationAdded(DsEvent e, DsTransaction tx, DsOperation op) { ... }
        @Override
        public Status transactionCommit(DsEvent e, DsTransaction tx) { ... }
        @Override
        public Status metaDataChanged(DsEvent e, DsMetaData meta) { .... }
        @Override
        public void destroy() { /* ... do cleanup ... */ }
        @Override
        public String reportStatus() { return "status report..."; }
        @Override
        public Status ddlOperation(OpType opType, ObjectType objectType, String objectName, String ddlText) }

The method in AbstractHandler is not abstract rather it has a body. In the body it performs cached metadata invalidation by marking the metadata object as dirty. It also provides TRACE-level logging of DDL events when the ddlOperation method is specified. You can override this method in your custom handler implementations. You should always call the super method before any custom handling to ensure the functionality in AbstractHandler is executed

When a transaction is processed from the Extract, the order of calls into the handler is as follows:

  1. Initialization:

    • First, the handler is constructed.

    • Next, all the "setters" are called on the instance with values from the property file.

    • Finally, the handler is initialized; the init(...) method is called before any transactions are received. It is important that the init(...) method call super.init(...) to properly initialize the base class.

  2. Metadata is then received. If the Java module is processing an operation on a table not yet seen during this run, a metadata event is fired, and the metadataChanged(...) method is called. Typically, there is no need to implement this method. The DsMetaData is automatically updated with new data source metadata as it is received.

  3. A transaction is started. A transaction event is fired, causing the transactionBegin(...) method on the handler to be invoked (this is not shown). This is typically not used, since the transaction has zero operations at this point.

  4. Operations are added to the transaction, one after another. This causes the operationAdded(...) method to be called on the handler for each operation added. The containing transaction is also passed into the method, along with the data source metadata that contains all processed table metadata. The transaction has not yet been committed, and could be aborted before the commit is received.

    Each operation contains the column values from the transaction (possibly just the changed values when Extract is processing with compressed updates.) The column values may contain both before and after values.

    For the ddlOperation method, the options are:

    • opType - Is an enumeration that identifies the DDL operation type that is occurring (CREATE, ALTER, and so on).

    • objectType - Is an enumeration that identifies the type of the target of the DDL (TABLE, VIEW, and so on).

    • objectName - Is the fully qualified source object name; typically a fully qualified table name.

    • ddlText - Is the raw DDL text executed on the source relational database.

  5. The transaction is committed. This causes the transactionCommit(...) method to be called.

  6. Periodically, reportStatus may be called; it is also called at process shutdown. Typically, this displays the statistics from processing (the number of operations andtransactions processed and other details).

An example of a simple printer handler, which just prints out very basic event information for transactions, operations and metadata follows. The handler also has a property myoutput for setting the output file name; this can be set in the Java application properties file as follows:

gg.handlerlist=sample
# set properties on 'sample'
gg.handler.sample.type=sample.SampleHandler
gg.handler.sample.myoutput=out.txt

To use the custom handler,

  1. Compile the class

  2. Include the class in the application classpath,

  3. Add the handler to the list of active handlers in the Java application properties file.

To compile the handler, include the Oracle GoldenGate for Java JARs in the classpath and place the compiled .class files in gg_install_dir/javaue/resources/classes:

javac -d gg_install_dir/dirprm
-classpath ggjava/ggjava.jar SampleHandler.java

The resulting class files would be located in resources/classes, in correct package structure, such as:

gg_install_dir/dirprm/sample/SampleHandler.class

Note:

For any Java application development beyond hello world examples, either Ant or Maven would be used to compile, test and package the application. The examples showing javac are for illustration purposes only.

Alternatively, custom classes can be put into a JAR and included in the classpath. Either include the custom JAR files in the JVM classpath using the Java properties (using java.class.path in the jvm.bootoptions property), or by setting the Java application properties file to include your custom JAR:

# set properties on 'one'
gg.handler.one.type=sample.SampleHandler
gg.handler.one.myoutput=out.txt
gg.classpath=/path/to/my.jar,/path/to/directory/of/jars/*

The classpath property can be set on any handler to include additional individual JARs, a directory (which would contain resources or extracted class files) or a whole directory of JARs. To include a whole directory of JARs, use the Java 6 style syntax:

c:/path/to/directory/* (or on UNIX: /path/to/directory/* )

Only the wildcard * can be specified; a file pattern cannot be used. This automatically matches all files in the directory ending with the .jar suffix. To include multiple JARs or multiple directories, you can use the system-specific path separator (on UNIX, the colon and on Windows the semicolon) or you can use platform-independent commas, as shown in the preceding example.

If the handler requires many properties to be set, just include the property in the parameter file, and your handler's corresponding "setter" will be called. For example:

gg.handler.one.type=com.mycompany.MyHandler
gg.handler.one.myOutput=out.txt
gg.handler.one.myCustomProperty=12345

The preceding example would invoke the following methods in the custom handler:

public void setMyOutput(String s) {
        // use the string...
} public void setMyCustomProperty(int j) {
        // use the int...
}

Any standard Java type may be used, such as int, long, String, boolean. For custom types, you may create a custom property editor to convert the String to your custom type.

10.2.39.5.4 Additional Resources

There is Javadoc available for the Java API. The Javadoc has been intentionally reduced to a set of core packages, classes and interfaces in order to only distribute the relevant interfaces and classes useful for customizing and extension.

In each package, some classes have been intentionally omitted for clarity. The important classes are:

  • oracle.goldengate.datasource.DsTransaction: represents a database transaction. A transaction contains zero or more operations.

  • oracle.goldengate.datasource.DsOperation: represents a database operation (insert, update, delete). An operation contains zero or more column values representing the data-change event. Columns indexes are offset by zero in the Java API.

  • oracle.goldengate.datasource.DsColumn: represents a column value. A column value is a composite of a before and an after value. A column value may be 'present' (having a value or be null) or 'missing' (is not included in the source trail).

    • oracle.goldengate.datasource.DsColumnComposite is the composite

    • oracle.goldengate.datasource.DsColumnBeforeValue is the column value before the operation (this is optional, and may not be included in the operation)

    • oracle.goldengate.datasource.DsColumnAfterValue is the value after the operation

  • oracle.goldengate.datasource.meta.DsMetaData: represents all database metadata seen; initially, the object is empty. DsMetaData contains a hash map of zero or more instances of TableMetaData, using the TableName as a key.

  • oracle.goldengate.datasource.meta.TableMetaData: represents all metadata for a single table; contains zero or more ColumnMetaData.

  • oracle.goldengate.datasource.meta.ColumnMetaData: contains column names and data types, as defined in the database and/or in the Oracle GoldenGate source definitions file.

See the Javadoc for additional details.

10.2.39.6 Configuring Data Transforms

Data Transforms is the Oracle GoldenGate module for Distributed Applications and Analytics, which can help with column level data transformations during the replicat process.

It's a 2 step process:

  1. Configuring a Matcher:

    Matcher configuration helps in identifying target columns, which you want to apply the Data Transforms on.

  2. Configuring a Converter:

    Converter defines the logic to be used to convert the matched target columns prior to writing it to the target.

10.2.39.6.1 Built-in Regex Based Data Transforms

By default, Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) provides a default regex based implementation for both matcher and the converter.

Data Transform Configuration

# Transform name (To be referred in the subsequent configs)
gg.transforms=t1

# Configure the matcher implementation (using the built-in regex type in this ex)
gg.transform.t1.matcher=regex

# Configure the converter implementation (using the built-in regex type in this ex)
gg.transform.t1.converter=regex

# These matcher configs correspond to the buit-in regex matcher

# Target catalogs to match. Default value is *
gg.transform.t1.matcher.catalogRegex={}

# Target schema to match. Default value is *
gg.transform.t1.matcher.schemaRegex={}

# Target tables to match (*Required field)
gg.transform.t1.matcher.tableRegex={}

# Target columns to match (*Required field)
gg.transform.t1.matcher.columnRegex={}



# These converter configs correspond to the buit-in regex converter

# Content search regex (from the columns selected, filter only specific values matching this regex)
gg.transform.t1.converter.replaceRegex={}

# Content replacement value
gg.transform.t1.converter.replaceString={}

Note:

tableRegex and columnRegex params do not have any default value. No tables or columns will be matched if either tableRegex or columnRegex is not defined.

Example on how to use the built-in regex based data transform

The following configuration creates a data transform which identifies all the target objects with:

Matcher

  1. Table name starting with tab.
  2. Column name ending with col.

Converter

  1. Converts the above matched column values to a fixed value, for example: TestValue.
gg.transforms=t1  

gg.transform.t1.matcher=regex
gg.transform.t1.converter=regex


gg.transform.t1.matcher.catalogRegex=.*
gg.transform.t1.matcher.schemaRegex=.*

# Table name staring with 'tab'
gg.transform.t1.matcher.tableRegex=^tab.*

# Column name ending with 'col'
gg.transform.t1.matcher.columnRegex=.*col$


gg.transform.t1.converter.replaceRegex=.*

# Replacement value
gg.transform.t1.converter.replaceString=TestVal

10.2.39.6.2 Developing Custom Data Transforms

A custom data transform implementation can be achieved by implementing the matcher and converter interfaces as shown in the example below.

Consider a scenario where you want to mask a sensitive field's value during replicat process.
  1. Configure the target column which matches the following criteria:
    1. Catalog name: Cat1
    2. Schema name: Sch1
    3. Table name: Sample_Table
    4. Column name: Sample_Column
  2. Configure a converter with some conversion implementation.
    1. Replace the column values for the above matched column with a masked value
@Matcher(id = "matcher1", description = "Custom target column matcher.")
public class CustomTargetMatcher implements TargetMatcher {
    @Override
    public boolean matches(final TableMetaData tableMetaData) {
        return tableMetaData.getCatalogName().equals("Cat1") && tableMetaData.getSchemaName().equals("Sch1") && tableMetaData.getTableName().equals("Sample_Table");
    }
    @Override
    public boolean matches(final ColumnMetaData columnMetaData) {
                return columnMetaData.getColumnName().equals("Sample_Column");
    }
}

@Converter(id = "converter1", description = "Custom data converter.")
public class CustomConverter implements DataConverter {
    
    public String convert(String originalData, final TableMetaData tableMetaData, final ColumnMetaData columnMetaData) {
         return "********"; // Masked Value
    }
}

Adapter properties for this implementation

gg.transforms=t1

# This config corresponds to the @Matcher => id param
gg.transform.t1.matcher=matcher1
# This config corresponds to the @Converter => id param
gg.transform.t1.converter=converter1

To use the custom classes:

Place the custom classes into a JAR and include them in the classpath. Include the custom JAR files in the JVM classpath using the Java properties (using java.class.path in the jvm.bootoptions property) or under gg.classpath

10.2.39.6.3 Additional Configurations

  1. When using data transform for timestamp-based columns, either remove or comment the replicat property for default timestamp format gg.format.timestamp.
  2. By default, column data transform are disabled for Key columns. You can use the following property to enable transform for key columns too: gg.transform.includeKeyColumn=true.

Note:

In case, there's no Primary Key for a source table, all the table columns comprised together are considered as primary keys. Ensure that the includeKeyColumn property is enabled for such cases as well.

10.2.39.6.4 Troubleshooting and Diagnostics

  1. Ensure that all the required transform parameters are declared under the replicat properties file.

    When the data transform is not configured appropriately and the replicat properties file has missing/invalid gg.transform properties, replicat will just skip this transform and continue.

    Replicat will also throw the following Warning messages for these scenarios.

    Transform property is not set [gg.transform.{name}.matcher.tableRegex].

    Transform property is not set [gg.transform.{name}.matcher.columnRegex].

  2. Ensure that the regex specified under each of the matcher/converter properties are valid regex strings.

    Replicat will throw the following error message and exception in case there’s an invalid regex configured: PatternSyntaxException – If the regular expression's syntax is invalid.

    Fix the regex errors in order to continue with the replicat process.

  3. For the custom transform, ensure the implemented custom class has been correctly added to the classpath.

    Replicat throws the following error message in this case and it skips this transform and continues:

    Could not find transform class instance for type {type}.

    Ensure to add the custom class to the gg.classpath property.
  4. By default, the column data transform is disabled for Key columns. Use the following property to enable transform for key columns: gg.transform.includeKeyColumn=true.

    Note:

    In case, there's no Primary Key for a source table, all the table columns comprised together are considered as primary keys. Ensure that the includeKeyColumn property is enabled for such cases as well.