11 Using the MongoDB Handler

This chapter explains the MongoDB Handler and includes examples so that you can understand this functionality.

11.1 Overview

MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling.

See the MongoDB website for more information:

https://www.mongodb.com/

You can use the MongoDB Handler to replicate transactional data from an Oracle GoldenGate trail to a target MongoDB database.

11.2 Detailed Functionality

The MongoDB Handler takes operations from the source trail file and creates corresponding documents in the target MongoDB database.

A record in MongoDB is a Binary JSON (BSON) document, which is a data structure composed of field and value pairs. A BSON data structure is a binary representation of JSON documents. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.

A collection is a grouping of MongoDB documents and is the equivalent of an RDBMS table. In MongoDB, databases hold collections of documents. Collections do not enforce a schema. MongoDB documents within a collection can have different fields.

11.2.1 Document Key Column

MongoDB databases require every document (row) to have a column named _id whose value should be unique in a collection (table). This is similar to a primary key for RDBMS tables. If a document does not contain a top-level _id column during an insert, the MongoDB driver adds this column.

The MongoDB Handler builds custom _id field values for every document based on the primary key column values in the trail record. This custom _id is built using all the key column values concatenated by a : (colon) separator. For example:

KeyColValue1:KeyColValue2:KeyColValue3

The MongoDB Handler enforces uniqueness based on these custom _id values. This means that every record in the trail must be unique based on its primary key column values. If the trail contains non-unique records for the same table, the MongoDB Handler fails and Replicat abends with a duplicate key error.
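
As a rough illustration of the key construction described above, the following Java sketch joins key column values with a colon and uses a set to show why a repeated key is rejected. The class and method names are hypothetical and are not part of the MongoDB Handler; the handler's actual implementation may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: mimics the colon-joined _id described in this section. */
final class DocumentKeySketch {

    /** Joins the primary key column values with ':' in key-column order. */
    static String buildId(List<String> keyColumnValues) {
        return String.join(":", keyColumnValues);
    }

    /** Returns true if the _id is new, false if it would be a duplicate key. */
    static boolean register(Set<String> seenIds, List<String> keyColumnValues) {
        return seenIds.add(buildId(keyColumnValues));
    }

    public static void main(String[] args) {
        Set<String> seenIds = new HashSet<>();
        System.out.println(buildId(Arrays.asList("1001", "US", "2016")));
        // The second registration of the same key values is reported as a duplicate.
        System.out.println(register(seenIds, Arrays.asList("1001", "US")));
        System.out.println(register(seenIds, Arrays.asList("1001", "US")));
    }
}
```

In this sketch the duplicate is only reported; in the handler, the equivalent condition surfaces as a duplicate key error from MongoDB.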

The behavior of the _id field is:

  • By default, MongoDB creates a unique index on the column during the creation of a collection.

  • It is always the first column in a document.

  • It may contain values of any BSON data type except an array.

11.2.2 Primary Key Update Operation

MongoDB databases do not allow the _id column to be modified, so a primary key update operation record in the trail needs special handling. The MongoDB Handler converts a primary key update operation into a combination of a DELETE (with the old key) and an INSERT (with the new key). To perform the INSERT, a complete before image of the update operation in the trail is recommended. You can generate a trail that contains a complete before image for update operations by enabling the Oracle GoldenGate GETUPDATEBEFORES and NOCOMPRESSUPDATES parameters. For more information, see Reference for Oracle GoldenGate for Windows and UNIX.
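
The following Java sketch shows one way to picture the DELETE plus INSERT conversion and why a complete before image helps: the re-inserted document starts from the before image so that unchanged columns are not lost. A plain Map stands in for a MongoDB collection, and all names are hypothetical; this is not the handler's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a Map keyed by _id stands in for a MongoDB collection. */
final class PrimaryKeyUpdateSketch {

    static void applyPrimaryKeyUpdate(Map<String, Map<String, Object>> collection,
                                      String oldId,
                                      String newId,
                                      Map<String, Object> beforeImage,
                                      Map<String, Object> changedColumns) {
        // Start from the before image so that unchanged columns are carried over.
        Map<String, Object> document = new LinkedHashMap<>(beforeImage);
        // Overlay the columns that the update changed.
        document.putAll(changedColumns);
        document.put("_id", newId);

        collection.remove(oldId);        // DELETE with the old key
        collection.put(newId, document); // INSERT with the new key
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> collection = new LinkedHashMap<>();
        Map<String, Object> before = new LinkedHashMap<>();
        before.put("_id", "100");
        before.put("name", "Ann");
        before.put("city", "Oslo");
        collection.put("100", before);

        Map<String, Object> changed = new LinkedHashMap<>();
        changed.put("city", "Bergen");

        applyPrimaryKeyUpdate(collection, "100", "200", before, changed);
        System.out.println(collection);
    }
}
```

If the before image were incomplete, the columns missing from it would be absent from the re-inserted document.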

11.2.3 MongoDB Trail Data Types

The MongoDB Handler supports delivery to the BSON data types as follows:

  • 32-bit integer

  • 64-bit integer

  • Double

  • Date

  • String

  • Binary data

11.3 Setting Up and Running the MongoDB Handler

Instructions for configuring the MongoDB Handler components and running the handler are described in the following sections.

11.3.1 Classpath Configuration

The MongoDB Java Driver is required for Oracle GoldenGate for Big Data to connect and stream data to MongoDB. The recommended version of MongoDB Java Driver is 3.2.2. The MongoDB Java Driver is not included in the packaging of Oracle GoldenGate for Big Data so you must download the driver from:

https://docs.mongodb.com/ecosystem/drivers/java/#download-upgrade

Select "mongo-java-driver" and the "3.2.2" version to download the recommended driver JAR file.

You must configure the gg.classpath variable to load the MongoDB Java Driver JAR at runtime. For example: gg.classpath=/home/mongodb/mongo-java-driver-3.2.2.jar

11.3.2 MongoDB Handler Configuration

The following are the configurable values for the MongoDB Handler. These properties are located in the Java Adapter properties file (not in the Replicat properties file).

Table 11-1 MongoDB Handler Configuration Properties

gg.handler.name.type
    Required/Optional: Required
    Legal Values: mongodb
    Default: None
    Explanation: Selects the MongoDB Handler for use with Replicat.

gg.handler.name.bulkWrite
    Required/Optional: Optional
    Legal Values: true | false
    Default: true
    Explanation: When set to true, the handler caches operations until a transaction commit event is received. On the commit event, all the cached operations are written out to the target MongoDB database, which provides improved throughput. When set to false, there is no caching within the handler and operations are immediately written to the MongoDB database.

gg.handler.name.WriteConcern
    Required/Optional: Optional
    Legal Values: {"w": "value", "wtimeout": "number"}
    Default: None
    Explanation: Sets the required write concern for all the operations performed by the MongoDB Handler. The property value is in JSON format and accepts only the keys "w" and "wtimeout". For more information about write concerns, see https://docs.mongodb.com/manual/reference/write-concern/.

gg.handler.name.username
    Required/Optional: Optional
    Legal Values: A legal username string.
    Default: None
    Explanation: Sets the authentication username to be used. Use with the AuthenticationMechanism property.

gg.handler.name.password
    Required/Optional: Optional
    Legal Values: A legal password string.
    Default: None
    Explanation: Sets the authentication password to be used. Use with the AuthenticationMechanism property.

gg.handler.name.ServerAddressList
    Required/Optional: Optional
    Legal Values: hostname:port entries, with multiple entries delimited by commas
    Default: None
    Explanation: Enables the connection to a list of replica set members or a list of MongoDB databases. This property accepts a comma-separated list of hostname:port entries. For example, localhost1:27017,localhost2:27018,localhost3:27019. For more information, see http://api.mongodb.com/java/3.0/com/mongodb/MongoClient.html#MongoClient-java.util.List-java.util.List-com.mongodb.MongoClientOptions-.

gg.handler.name.AuthenticationMechanism
    Required/Optional: Optional
    Legal Values: Comma-separated list of authentication mechanisms
    Default: None
    Explanation: Sets the authentication mechanism, which is the process of verifying the identity of a client. The input is a comma-separated list of authentication options. For example, GSSAPI,MONGODB_CR,MONGODB_X509,PLAIN,SCRAM_SHA_1. For more information about authentication options, see http://api.mongodb.com/java/3.0/com/mongodb/MongoCredential.html.

gg.handler.name.source
    Required/Optional: Optional
    Legal Values: Valid authentication source
    Default: None
    Explanation: Sets the source of the user name, typically the name of the database where the user is defined. Use with the AuthenticationMechanism property.

gg.handler.name.clientURI
    Required/Optional: Optional
    Legal Values: Valid MongoDB client URI
    Default: None
    Explanation: Sets the MongoDB client URI. A client URI can also be used to set other MongoDB connection properties, such as authentication and WriteConcern. For example, mongodb://localhost:27017/
    For more details about the format of the client URI, see http://api.mongodb.com/java/3.0/com/mongodb/MongoClientURI.html

gg.handler.name.Host
    Required/Optional: Optional
    Legal Values: Valid MongoDB server name or IP address
    Default: None
    Explanation: Sets the hostname of the MongoDB database to connect to, for a connection to a single MongoDB node. For more information, see http://api.mongodb.com/java/3.0/com/mongodb/MongoClient.html#MongoClient-java.lang.String-.

gg.handler.name.Port
    Required/Optional: Optional
    Legal Values: Valid MongoDB port
    Default: None
    Explanation: Sets the MongoDB database instance port number. Use with the Host property.

gg.handler.name.CheckMaxRowSizeLimit
    Required/Optional: Optional
    Legal Values: true | false
    Default: false
    Explanation: When set to true, the handler always checks that the size of the BSON document being inserted or modified is within the limits defined by the MongoDB database. Calculating the size involves the use of a default codec to generate a RawBsonDocument, which leads to a small degradation in the throughput of the MongoDB Handler. If the size of the document exceeds the MongoDB limit, an exception occurs and Replicat abends.

11.3.3 Connecting and Authenticating

You can configure various connection and authentication properties in the handler properties file. When multiple connection properties are specified, the MongoDB Handler chooses the properties based on the following priority order:

Priority 1:
AuthenticationMechanism
UserName
Password
Source
WriteConcern
Priority 2:
ServerAddressList
AuthenticationMechanism
UserName
Password
Source
Priority 3:
clientURI
Priority 4:
Host
Port
Priority 5:
Host

If none of the connection and authentication properties are specified, the handler tries to connect to localhost on port 27017.

11.3.4 Using Bulk Write

The MongoDB Handler uses the GROUPTRANSOPS parameter to retrieve the batch size. A batch of trail records is converted to a batch of MongoDB documents, which is then written to the database in one request.

You can enable bulk write for better apply throughput using the BulkWrite handler property. By default, bulk write is enabled, and this is the recommended setting for the best performance of the handler.

You use the gg.handler.name.BulkWrite=true | false property to enable or disable bulk write. The Oracle GoldenGate for Big Data default property, gg.handler.name.mode=op | tx, is not used in the MongoDB Handler.

Oracle recommends that you use bulk write.
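
To make the difference between the two BulkWrite settings concrete, the following Java sketch counts how many write requests would be issued in each mode. It is a simplified model under stated assumptions: operations are plain strings, a list stands in for the database, and the class and method names are hypothetical rather than handler APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: models caching until commit versus immediate writes. */
final class BulkWriteSketch {
    private final boolean bulkWrite;
    private final List<String> cached = new ArrayList<>();
    // Each inner list represents one write request sent to the database.
    private final List<List<String>> requests = new ArrayList<>();

    BulkWriteSketch(boolean bulkWrite) {
        this.bulkWrite = bulkWrite;
    }

    void onOperation(String operation) {
        if (bulkWrite) {
            cached.add(operation); // hold the operation until the commit event
        } else {
            requests.add(Collections.singletonList(operation)); // write immediately
        }
    }

    void onTransactionCommit() {
        if (!cached.isEmpty()) {
            requests.add(new ArrayList<>(cached)); // one request for the whole batch
            cached.clear();
        }
    }

    List<List<String>> requests() {
        return requests;
    }

    public static void main(String[] args) {
        for (boolean mode : new boolean[] {true, false}) {
            BulkWriteSketch sketch = new BulkWriteSketch(mode);
            sketch.onOperation("insert A");
            sketch.onOperation("update A");
            sketch.onOperation("delete B");
            sketch.onTransactionCommit();
            System.out.println("bulkWrite=" + mode + " requests=" + sketch.requests().size());
        }
    }
}
```

With three operations in one transaction, the bulk mode in this model issues a single request, while the non-bulk mode issues three.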

11.3.5 Using Write Concern

Write concern describes the level of acknowledgement requested from MongoDB for write operations to a standalone MongoDB instance, replica sets, and sharded clusters. With sharded clusters, mongos instances pass the write concern on to the shards.

Use the following configuration:

w: value
wtimeout: number

For more information about write concerns, see https://docs.mongodb.com/manual/reference/write-concern/.

11.3.6 Using Three-Part Table Names

An Oracle GoldenGate trail may have data for sources that support three-part table names, such as Catalog.Schema.Table. MongoDB only supports two-part names, such as DBName.Collection. To support the mapping of source three-part names to MongoDB two-part names, the source Catalog and Schema are concatenated with an underscore delimiter to construct the MongoDB DBName.

For example, catalog1.schema1.table1 would become catalog1_schema1.table1.
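
The mapping can be sketched in Java as follows. This is a minimal illustration of the underscore concatenation described above, with hypothetical class and method names; it assumes dot-separated names and does not attempt to reproduce any case conversion or quoting rules the handler may apply.

```java
/** Illustrative only: maps a source table name to a MongoDB DBName.Collection name. */
final class TableNameSketch {

    static String toMongoNamespace(String sourceTableName) {
        String[] parts = sourceTableName.split("\\.");
        switch (parts.length) {
            case 3:
                // Catalog and Schema are joined with an underscore to form the DBName.
                return parts[0] + "_" + parts[1] + "." + parts[2];
            case 2:
                // Two-part names already match the DBName.Collection shape.
                return sourceTableName;
            default:
                throw new IllegalArgumentException(
                        "Expected a two-part or three-part name: " + sourceTableName);
        }
    }

    public static void main(String[] args) {
        System.out.println(toMongoNamespace("catalog1.schema1.table1"));
        System.out.println(toMongoNamespace("schema1.table1"));
    }
}
```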

11.3.7 Using Undo Handling

The MongoDB Handler can recover from bulk write errors using a lightweight undo engine. This engine does not provide the functionality of typical RDBMS undo engines; rather, it makes a best effort to assist you in error recovery. The error recovery works well when there are primary key violations or any other bulk write error where the MongoDB database is able to provide information about the point of failure through the BulkWriteException.

Table 11-2 lists the requirements to make the best use of this functionality.

Table 11-2 Undo Handling Requirements

Operation to Undo    Require Full Before Image in the Trail?
INSERT               No
DELETE               Yes
UPDATE               No (only the before image of the fields in the SET clause)

If there are errors during undo operations, it may not be possible to return the MongoDB collections to a consistent state, and you would have to reconcile the data manually.
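
The before-image requirements in Table 11-2 follow from what each inverse operation needs. The Java sketch below is one plausible reading of that table, not the handler's undo engine: a Map keyed by _id stands in for a collection, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: inverse operations over a Map that stands in for a collection. */
final class UndoSketch {

    /** Undoing an INSERT only needs the _id, so no before image is required. */
    static void undoInsert(Map<String, Map<String, Object>> collection, String id) {
        collection.remove(id);
    }

    /** Undoing a DELETE must re-create the whole document, so the full before image is required. */
    static void undoDelete(Map<String, Map<String, Object>> collection,
                           String id,
                           Map<String, Object> fullBeforeImage) {
        collection.put(id, new HashMap<>(fullBeforeImage));
    }

    /** Undoing an UPDATE only restores the fields that the SET clause changed. */
    static void undoUpdate(Map<String, Map<String, Object>> collection,
                           String id,
                           Map<String, Object> beforeImageOfSetFields) {
        Map<String, Object> document = collection.get(id);
        if (document != null) {
            document.putAll(beforeImageOfSetFields);
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> collection = new HashMap<>();
        Map<String, Object> original = new HashMap<>();
        original.put("_id", "1");
        original.put("city", "Oslo");

        undoDelete(collection, "1", original);       // restore a deleted document
        collection.get("1").put("city", "Bergen");   // simulate an applied UPDATE
        Map<String, Object> setFieldsBefore = new HashMap<>();
        setFieldsBefore.put("city", "Oslo");
        undoUpdate(collection, "1", setFieldsBefore); // roll the UPDATE back
        System.out.println(collection.get("1").get("city"));
    }
}
```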

11.4 Sample Configuration

The following is a sample configuration for the MongoDB Handler from the Java Adapter properties file:

gg.handlerlist=mongodb
gg.handler.mongodb.type=mongodb

#The following handler properties are optional.
#Please refer to the Oracle GoldenGate for BigData documentation
#for details about the configuration.
#gg.handler.mongodb.clientURI=mongodb://localhost:27017/
#gg.handler.mongodb.Host=<MongoDBServer address>
#gg.handler.mongodb.Port=<MongoDBServer port>
#gg.handler.mongodb.WriteConcern={ w: <value>, wtimeout: <number> }
#gg.handler.mongodb.AuthenticationMechanism=GSSAPI,MONGODB_CR,MONGODB_X509,PLAIN,SCRAM_SHA_1
#gg.handler.mongodb.UserName=<Authentication username>
#gg.handler.mongodb.Password=<Authentication password>
#gg.handler.mongodb.Source=<Authentication source>
#gg.handler.mongodb.ServerAddressList=localhost1:27017,localhost2:27018,localhost3:27019,...
#gg.handler.mongodb.BulkWrite=<false|true>
#gg.handler.mongodb.CheckMaxRowSizeLimit=<true|false>

goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE
gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec

#Path to MongoDB Java driver.
# maven co-ordinates
# <dependency>
# <groupId>org.mongodb</groupId>
# <artifactId>mongo-java-driver</artifactId>
# <version>3.2.2</version>
# </dependency>
gg.classpath=/path/to/mongodb/java/driver/mongo-java-driver-3.2.2.jar
javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar:./dirprm