1 Introduction to Oracle GoldenGate Veridata

Oracle GoldenGate Veridata compares one set of data to another and identifies data that is out-of-sync, and allows you to repair that data.

This chapter includes the following sections:

1.1 Oracle GoldenGate Veridata Architecture

Oracle GoldenGate Veridata compares one set of data to another, identifies data that is out-of-sync, and enables you to repair any out-of-sync data. Oracle GoldenGate Veridata supports high-volume, 24x7 heterogeneous replication environments where downtime to compare data sets is not an option. By accounting for data that is being replicated while a comparison takes place, Oracle GoldenGate Veridata can run concurrently with data transactions and replication and still produce an accurate comparison report.

Oracle GoldenGate Veridata maps column data types across different types of databases automatically. For more information about this feature in Oracle GoldenGate Veridata web user interface, see the online help. Alternatively, you can map columns manually if automatic mapping does not accommodate format differences in a heterogeneous environment. To manually upload an XML file, use the veridata_import utility. For more information, see Using the Veridata Import and Export Utilities.

Figure 1-1 Oracle GoldenGate Veridata Architecture

Description of Figure 1-1 follows
Description of "Figure 1-1 Oracle GoldenGate Veridata Architecture"

Oracle GoldenGate Veridata Server

The Oracle GoldenGate Veridata Server performs the following functions:

  • Coordinate the execution of Oracle GoldenGate Veridata tasks

  • Sort rows

  • Compare data

  • Confirm out-of-sync data

  • Produce a report for review

Oracle GoldenGate Veridata Web User Interface

The Oracle GoldenGate Veridata web user interface (UI) is a browser-based graphical user interface for these activities:

  • Configure comparison objects and rules

  • Initiate comparisons

  • Review the status and output of comparisons

  • Repair out-of-sync data

  • Review out-of-sync data

Oracle GoldenGate Veridata Repository

The Oracle GoldenGate Veridata repository is a collection of database objects that persists configuration information to disk, saving it permanently as a user environment.

Note:

Out-of-sync data is not stored in the repository. This data is stored on the server’s file system.

Oracle GoldenGate Veridata Agent

The Oracle GoldenGate Veridata Agent (agent) executes the following database-related requests on behalf of the Oracle GoldenGate Veridata server:

  • Hash rows for initial comparison

  • Fetch and update rows to repair out-of-sync data

  • Return column-level details for out-of-sync rows

Oracle GoldenGate Veridata Manager

The Oracle GoldenGate Veridata Manager is part of the C-code agent that is required for the NonStop platform. It controls the agent process.

The Manager is not used in a Java agent, which is used for the other databases that are supported by Oracle GoldenGate Veridata.

Oracle GoldenGate Veridata Command Line Utilities

Oracle GoldenGate Veridata includes the following command-line utilities:

Table 1-1 Command-Line Utilities

Name Description

vericom

Enables you to run comparisons by using automated programs. See Managing Identities and Comparing Data.

veridata_import

Maps comparison objects and rules in an XML file and imports it into the repository. See Using the Veridata Import and Export Utilities.

veridata_export

Maps comparison objects and rules in the repository and exports them to an XML file. See Using the Veridata Import and Export Utilities.

veridata_param_process

Helps to use the Oracle GoldenGate parameter files. See Introduction to Oracle GoldenGate Veridata.

reportutil

Supports viewing encrypted report files and out-of-sync data.

1.2 Configuring Single Sign-on for Oracle GoldenGate Veridata

Oracle GoldenGate Veridata 12c (12.2.1) supports single sign-on (SSO) for authentication. When you configure SSO, set the SSO properties of the server and configure the sign-out URL for the SSO session.

To configure SSO for Oracle GoldenGate Veridata server:

  • Run the configureVeridata script: DOMAIN_HOME/veridata/bin/configureVeridata.sh -pUweb.singleSignOutUrl=Single sign out URL.

If your domain no longer uses SSO, then remove the SSO logout configuration: DOMAIN_HOME/veridata/bin/configureVeridata.sh -pUweb.singleSignOutUrl=default.

The parameter usage is explained in the veridata.cfg file:

# (web.singleSignOutUrl) as
# web.singleSignOutUrl - Specify the Single Sign Out URL here:
# Formats: /oamsso/logout.html?end_url=/veridata
# http://myoamserverhost:port/oam/server/logout?end_url=http://my.veridata.site.com:veridata-port/veridata
# http://myoamserverhost:port/oamsso/logout.html?end_url==http://my.veridata.site.com:veridata-port/veridata
 
# This URL must conform to the grammar in RFC 2396, except the few deviations mentioned in the java documentation for construction of a URI by parsing the given string.
 
web.singleSignOutUrl default 

1.3 Comparing Data with Oracle GoldenGate Veridata

This section explains how to configure the objects that are to be compared and how Oracle GoldenGate Veridata processes comparisons.

1.3.1 Oracle GoldenGate Veridata Comparison Objects

To begin using Oracle GoldenGate Veridata, you create the following objects that help you manage your work and identify the data that you want to compare:

  • Data source connections

  • Groups

  • Compare Pairs

  • Profiles

  • Jobs

To order rows for comparison, Oracle GoldenGate Veridata relies on a unique identifier (primary key, unique key, user-defined key).

1.3.1.1 Supported Datatypes - Oracle

The Oracle GoldenGate Veridata supports the listed datatypes for the Oracle database:

  • CHAR
  • NCHAR
  • VARCHAR2
  • VARCHAR
  • NVARCHAR2
  • NUMBER
  • DATE
  • BLOB
  • CLOB
  • NCLOB
  • LONG
  • RAW
  • LONG RAW
  • ROWID
  • BINARY_FLOAT
  • BINARY_DOUBLE
  • DATE
  • TIMESTAMP
  • TIMESTAMP WITH TIMEZONE
  • TIMESTAMP WITH LOCAL TIME ZONE
  • UROWID
  • XMLTYPE
  • INTERVAL YEAR [(year_precision)] TO MONTH
  • INTERVAL DAY [(day_precision)] TO SECOND [(fractional_seconds_precision)]
  • FLOAT
  • UDTs

Limitations and Clarifications:

Both Object types and Collection types UDTs are supported

1.3.1.2 Supported Datatypes - SQL Server

The Oracle GoldenGate Veridata supports the listed datatypes for the SQL Server database:

  • BIGINT
  • BIT
  • INT
  • SMALLINT
  • TINYINT
  • DECIMAL
  • MONEY
  • SMALLMONEY
  • NUMERIC
  • FLOAT
  • REAL
  • CHAR
  • NCHAR
  • VARCHAR
  • NVARCHAR
  • TEXT
  • NTEXT
  • BINARY
  • VARBINARY
  • IMAGE
  • DATE
  • DATETIME
  • DATETIME2
  • SMALLDATETIME
  • DATTIMEOFFSET
  • TIME
  • GEOGRAPHY
  • GEOMETRY
  • UNIQUEIDENTIFIER
  • XML

For more information about conversion of SQL Server data types to Oracle data types, see Data Type Conversion in Database Gateway for SQL Server User's Guide.

1.3.1.3 Supported Datatypes - PostgreSQL

The Oracle GoldenGate Veridata supports the listed datatypes for PostgreSQL:

  • Bit(n)
  • Bit Varying(n)
  • Boolean
  • Char
  • citext
  • Varchar(n)
  • Time with/without timezone
  • Date
  • Interval
  • Bigint
  • Serial
  • Smallserial
  • Bigserial
  • Numeric
  • Decimal
  • Money
  • Real
  • Double precision
  • cidr
  • inet
  • macaddr
  • macaddr8
  • uuid
  • text
  • bytea (binary)
  • xml
  • smallint
  • integer
  • json
  • jsonb

Non-Supported PostgreSQL Data Types

  • arrays
  • box
  • circle
  • composite types
  • line
  • lseq
  • object identifiers
  • OID
  • pg_lsn
  • pseduo types
  • Point
  • path
  • polygon
  • range types
  • tsvector
  • tsquery
  • enum
  • domain

Limitations of Support

  • Bit(n)/Bit Varying(n):
    • Heterogeneous: As the source side, Bit(n)/Bit Varying(n) can only be mapped with a character type of the non-PostgreSQL target database. This is because, the leading "0" of the source data gets truncated in the target db during repair, for example, the target is a number type.
    • Homogeneous: None.
  • Network datatype (inet,cidr,uuid):
    • Heterogeneous: All string datatypes like char/nchar/varchar in source side and network datatypes like inet/cidr/uuid in PostgreSQL target side can only considered as hash column.
    • Homogeneous: None.
  • MAC/MAC8 datatype:
    • Heterogeneous: All string datatypes like char/nchar/varchar in source side and datatypes like mac/mac8 in PostgreSQL target side, while repair operation, it is successful but it always save in the one formatxx:xx:xx:xx:xx:xx even though the allowed format is multiple, that is xxx:xxx:xxx:xxx or xx-xx-xx-xx-xx-xx. Therefore, it always shows the OOS after repair whenever there is a different format from source side.
    • Homogeneous: None.
  • JSONB datatype:
    • Heterogeneous: All string datatypes like char/nchar/varchar in source side and jsonb in PostgreSQL target side, after repair the order of json key/values are not stored in original order. Therefore, it will be OOS for next compare pair run.
    • Homogeneous: None.
  • Timestamp with timezone:
    • Heterogeneous: PostgreSQL is not storing the timezone value in the database. Whenever a timestamp with timezone value is inserted, PostgreSQL converts the timestamp into UTC and inserts into it. While retrieving, the actual timezone is not known. Comparing it with the DBs like oracle, sqlserver which stores complete timestamp along with timezone result in OOS all the time.
  • Time/Time with Timezone:
    • Homogeneous: PostgreSQL- PostgreSQL Time/Time with Timezone datatype comparison and repair has inconsistencies when the time column is primary key. Some of the rows may not be picked for compare and repair.
  • Real:
    • Heterogeneous: Oracle GoldenGate for Veridata cannot support the compare pair of FLOAT (Oracle) to REAL (PostgreSQL). Some data is always OOS after repair due to the internal representation of the data type in db. Substitute pairs are binary float (Oracle) to real (PostgreSQL) and float (Oracle) to numeric (PostgreSQL).
    • Real does not store the value as exact number in database. For example, 0.8 is stored as 0.800000011920929. On compare it always shows extra values, but it does not impact the repair functionality.
  • Interval:
    • Heterogeneous: During repair Veridata may insert/update 00 to column.
      Oracle GoldenGate Veridata currently uses CAST (? AS INTERVAL) as part of its insert and update queries. This cast defaults to INTERVAL SECOND. Therefore, when values like '05', '57', or '98' are used as cast, they result in values, such as:
      • 5 --> 00:00:05
      • 57 --> 00:00:57
      • 98 --> 00:01:38
      Therefore, if these values are inserted/updated to a column with INTERVAL HOUR, the result is 00. The same is for other type of INTERVAL.
  • citext

    Repair function is not fully supported when citext is mapped to a string-like data type on either source or target side. When citext is on source side,following change is needed for repair to work: go to Profile Configuration > Edit Existing Profile > Repair, and unselect the Check Changed Values setting.

1.3.1.4 Supported Datatypes - MySQL, MariaDB

The Oracle GoldenGate Veridata supports the listed datatypes for the MySQL, MariaDB databases:

  • BIGINT
  • BINARY
  • BIT
  • BLOB
  • CHAR
  • DATE
  • DATETIME
  • DECIMAL
  • DOUBLE
  • ENUM
  • FLOAT
  • INT
  • INTEGER
  • LONGBLOB
  • LONGTEXT
  • MEDIUMBLOB
  • MEDIUMINT
  • NUMERIC
  • REAL
  • SET
  • SMALLINT
  • TEXT
  • TIME
  • TIMESTAMP
  • TINYBLOB
  • TINYINT
  • TINYTEXT
  • VARCHAR
  • VARBINARY
  • YEAR
Limitations and Clarifications
  • For Oracle GoldenGate Veridata repair to properly throw errors when invalid or missing values are detected, user should set SQL Mode to "strict". This can be set on global level when MySQL starts, or session level via connection string in agent.properties, for example: database.url=jdbc:mysql://host:3306?sessionVariables=sql_mode=(select concat(@@SESSION.sql_mode,',STRICT_TRANS_TABLES'))
  • When using REAL as primary key, MySQL client and Veridata may not able to retrieve certain rows (for example, WHERE `PKCOL` = '-99.9999'), Oracle GoldenGate Veridata skips these rows during compare and repair.
  • When using nonstandard FLOAT(M,D) and DOUBLE(M,D), the (M,D) in DDL need to match the actual precision and scale stored, and both Source and Target need to be MySQL.
1.3.1.5 Supported Datatypes - Teradata

The Oracle GoldenGate Veridata supports the listed datatypes for the Teradata database:

  • ARRAY
  • BIGINT
  • BLOB
  • BYTE
  • BYTEINT
  • CHAR
  • CLOB
  • DATE
  • DECIMAL
  • DISTINCT
  • FLOAT
  • GRAPHIC
  • INTEGER
  • INTERVAL DAY
  • INTERVAL DAY TO HOUR
  • INTERVAL DAY TO MINUTE
  • INTERVAL DAY TO SECOND
  • INTERVAL HOUR
  • INTERVAL HOUR TO MINUTE
  • INTERVAL HOUR TO SECOND
  • INTERVAL MINUTE
  • INTERVAL MINUTE TO SECOND
  • INTERVAL MONTH
  • INTERVAL SECOND
  • INTERVAL YEAR
  • INTERVAL YEAR TO MONTH
  • LONG VARCHAR
  • LONG VARGRAPHIC
  • NUMBER
  • SMALLINT
  • STRUCT
  • TIME
  • TIME WITH TIME ZONE
  • TIMESTAMP
  • TIMESTAMP WITH TIME ZONE
  • VARBYTE
  • VARCHAR
  • VARGRAPHIC

1.3.1.6 Supported Datatypes - DB2 LUW

The Oracle GoldenGate Veridata supports the listed datatypes for the DB2 LUW database:

  • BIGINT
  • BLOB
  • CHAR
  • CHAR FOR BIT DATA
  • CLOB
  • DATE
  • DBCLOB
  • DECFLOAT
  • DECIMAL
  • DOUBLE
  • FLOAT
  • GRAPHIC
  • INTEGER
  • LONG VARCHAR
  • LONG VARCHAR FOR BIT DATA
  • LONG VARGRAPHIC
  • NUMERIC
  • REAL
  • SMALLINT
  • TIME
  • TIMESTAMP
  • VARCHAR
  • VARCHAR FOR BIT DATA
  • VARGRAPHIC
  • XML
1.3.1.7 Supported Datatypes - DB2 for i

The Oracle GoldenGate Veridata supports the listed datatypes for DB2 for i:

  • BIGINT
  • BLOB
  • CHAR
  • CHAR FOR BIT DATA
  • CLOB
  • DATE
  • DBCLOB
  • DECFLOAT
  • DECIMAL
  • DOUBLE
  • FLOAT
  • GRAPHIC
  • INTEGER
  • LONG VARCHAR
  • LONG VARCHAR FOR BIT DATA
  • LONG VARGRAPHIC
  • NUMERIC
  • REAL
  • ROWID
  • SMALLINT
  • TIME
  • TIMESTAMP
  • VARCHAR
  • VARCHAR FOR BIT DATA
  • VARGRAPHIC
  • XML
Limitations and Clarifications
  • The decimal part of REAL value from the wldb2 jdbc driver always store in the 16 digits precision, for example, 0.8 converts 0.800000011920929 and compare will always fail if the decimal values are present.
1.3.1.8 Supported Datatypes - DB2 z/OS

The Oracle GoldenGate Veridata supports the listed datatypes for the DB2 z/OS database:

  • BIGINT
  • BINARY
  • BLOB
  • CHAR
  • CHAR FOR BIT DATA
  • CLOB
  • DATE
  • DBCLOB
  • DECFLOAT
  • DECIMAL
  • DOUBLE
  • FLOAT
  • GRAPHIC
  • INTEGER
  • LONG VARCHAR
  • LONG VARCHAR FOR BIT DATA
  • LONG VARGRAPHIC
  • NUMERIC
  • REAL
  • ROWID
  • SMALLINT
  • TIME
  • TIMESTAMP
  • TIMESTAMP WITH TIME ZONE
  • VARBINARY
  • VARCHAR
  • VARCHAR FOR BIT DATA
  • VARGRAPHIC
  • XML
1.3.1.9 Supported Datatypes - Sybase

The Oracle GoldenGate Veridata supports the listed datatypes for the Sybase database:

  • bigdatetime
  • bigint
  • bigtime
  • binary
  • bit
  • char
  • date
  • datetime
  • decimal
  • float
  • image
  • int
  • money
  • numeric
  • real
  • smalldatetime
  • smallint
  • smallmoney
  • sysname
  • text
  • time
  • timestamp
  • tinyint
  • unichar
  • unitext
  • univarchar
  • unsigned bigint
  • unsigned int
  • unsigned smallint
  • varbinary
  • varchar
1.3.1.10 Supported Datatypes - Informix

The Oracle GoldenGate Veridata supports the listed datatypes for the Informix database:

  • bigint
  • bigserial
  • blob
  • boolean
  • byte
  • char
  • clob
  • date
  • datetime hour to second
  • datetime year to day
  • datetime year to fraction(5)
  • datetime year to second
  • decimal
  • float
  • html
  • int8
  • integer
  • interval
  • lvarchar
  • money
  • nchar
  • nvarchar
  • serial
  • serial8
  • smallfloat
  • smallint
  • text
  • varchar
1.3.1.11 Generic Limitations and Clarifications

This topic lists a few generic limitations and clarifications:

  • As the floating-point data types are approximate values by definition, Oracle GoldenGate Veridata UI may display slightly different values than expected (for example, extra decimal values)
  • For database with TIME data type that supports value outside range of 00:00:00 to 23:59.59, the compare will only work when Source and Target are the same database.
  • Whenever there is a precision for any datatype, such as REAL/FLOAT or Double on the database side, Oracle GoldenGate Veridata rounds off the lower precision present either on the source or the target database. Veridata considers the lower precision of data during comparison phase.
  • Oracle GoldenGate Veridata internally converts decimal numbers with a large scale by truncating tail digits. The limitations applies to all databases except Oracle.

    This is because Oracle GoldenGate Veridata utilizes a template to mask decimal numbers, which, in cases of a large scale, leaves few digits for representing the scale part. For example, a number like 6.710123456789e-26, with a scale of 38 digits, would be truncated by Oracle GoldenGate Veridata to retain the first 15 digits, converting it to 0. For scenarios where preserving all digits of the scale is crucial, Oracle recommends using the string or other compare formats instead of the number compare format when configuring the compare pair.

1.3.2 Satisfying Uniqueness Requirements

Oracle GoldenGate Veridata relies on a unique identifier to order rows for comparison.

  • Primary Key: By default, Oracle GoldenGate Veridata uses the primary key if one is available.

  • Unique Key: If no primary key is defined, then Oracle GoldenGate Veridata uses the smallest unique index.

  • User-defined Key: If a table doesn't have a primary or a unique key, then all columns are mapped for comparison. Later, you can define an existing index or a set of columns for comparison purposes when you define a compare pair. However, although primary or unique keys can be mapped automatically, user-defined keys must be mapped manually. If you prefer a different ordering method, then you can also use a user-defined key to override existing keys or indexes.

1.3.3 How Oracle GoldenGate Veridata Compares Data

Comparison activities consist of an initial comparison step and a confirmation step. To change the steps’ aspects, change the parameters in the Oracle GoldenGate Veridata web user interface.

1.3.3.1 Initial Comparison Step

In the initial comparison (or row hash) step, rows are retrieved from the source and target tables with a query. If the source and target databases are of different types, the columns are converted to a standardized data type format for accurate comparison. By default, Oracle GoldenGate Veridata compares rows by comparing all columns of the primary key literally (value-for-value) and by using a hash value for all non-key columns. The unique digital signature that is used to calculate the hash value shrinks the data to be transferred over the network for the comparison. The signature still provides a highly reliable (but not absolute) and efficient mechanism for determining whether two rows contain the same or different column values.

To ensure that you discover out-of-sync rows, you can configure Oracle GoldenGate Veridata to compare non-key rows column-for-column. Full-column comparisons reduce the processing performance in proportion to the number of columns, and they increase network usage.

For all supported databases, you can use the delta processing performance feature if you are using server-side sorting. For this processing, Oracle GoldenGate Veridata detects which data blocks were modified since a previous comparison and only compares the rows in those blocks. Rows in unchanged blocks are skipped. The default is to compare all rows.

Delta processing consists of two steps:

  • Collect the base modification time of the previous run for subsequent delta comparisons. This step is always included when you enable delta processing for a compare pair.

  • Compare data that was modified since the base comparison and use the information that was collected in the first step. To enable this step, click Enable Delta Processing on the Compare Pair Configuration page and the Run/Execute Job page in the Oracle GoldenGate Veridata web user interface. Disable delta comparison when modifications, such as table reorganizations, invalidate the collected delta base information.

After the initial comparison, rows that appear to be out-of-sync are stored in a maybe out-of-sync (MOOS) queue in memory, because the comparison is inconclusive. When a replication is working concurrently with a comparison, especially if there is replication latency, rows can appear to be out-of-sync. However, the current data is in flight (somewhere in the replication flow), and replication resynchronizes them.

1.3.3.2 Confirmation Step

The confirmation, or confirm-out-of-sync (COOS), step ensures accurate results by confirming the row status in a changing environment. This step involves predicated queries on the source or target database by using the rows extracted from the MOOS queue. The status is evaluated as one of the following:

  • in-flight: The row was out-of-sync in the initial comparison step, but it was updated. It is assumed that replication or another mechanism applied the change, but Oracle GoldenGate Veridata was unable to confirm that the rows were in-sync.

  • in-sync: The source row values were applied to the target row by replication or another method. An in-sync status does not guarantee that the rows are synchronized at any particular moment if the underlying tables are continuously changing, but it does indicate that replication is working.

  • persistently out-of-sync: The row was not updated since the initial comparison step, and can be assumed to be out-of-sync.

By default, confirmation processing occurs in a thread that is parallel to the initial comparison step, but the confirmation of each row waits until a specified replication latency threshold expires. For example, if latency is 60 seconds, and the initial comparison step revealed an out-of-sync row at 9:30, then the confirmation step for that row is not performed until 9:31 to allow replication to apply any change that was in-flight. After latency is accounted for, rows can be confirmed as persistently out-of-sync and are stored in one or more out-of-sync reports.

1.4 Viewing Comparison Results

When a job is completed, you can view the out-of-sync report and the comparison reports by using the Oracle GoldenGate Veridata web user interface or by viewing the files themselves.

If report encryption is enabled for the server, then use the reportutil tool to view the report files. See Encrypting Report Files. The Oracle GoldenGate Veridata web user interface automatically decrypts the file before displaying them.

You can store an OOS report in binary format, in XML format, or both (or none).
  • OOS file: When stored in binary form, the OOS report contains out-of-sync comparison results that you use for viewing row differences in the Oracle GoldenGate Veridata web user interface. You can also use the report to recompare out-of-sync rows later. To recompare rows, select run options to execute another confirmation step. The step compares the current state of just those rows and then reports which rows remain out-of-sync after replication or another restorative procedure was applied.

  • OOSXML file: When the OOS report is stored as XML, it is written to an OOSXML file and is stored to conform to an internal XML schema. Among the many advantages of XML, the most significant is that it can be easily manipulated by many tools. In its XML form, the file contains all information, including metadata, that is needed to select rows for resynchronization by external programs.

Each finished job, group, and compare pair generates a comparison report with the following type of information:
  • Comparison parameters used

  • Number of rows compared and out-of-sync

  • Timing of the comparison

  • Performance statistics

  • Source and target data values

By default, the OOS files are located in the following subdirectories of the server installation directory. You can change the default location by specifying another path for the server.veridata_data property in the veridata.cfg file:
  • OOS files: VERIDATA_DOMAIN_HOME/veridata/reports/oos

  • OOSXML files: VERIDATA_DOMAIN_HOME/veridata/reports/oosxml

These subdirectories are further organized by run ID, job name, group name, and compare pair. In the OOSXML directory, the .oosxml files are the control files. The files with sequential file extensions are the JSON chunks. The JSON data is spread into multiple files (called "chunks") for performance purposes. You can encrypt the comparison reports. For more information, see Encrypting Report Files.