Using Database Data Comparison

A data comparison operation compares data in a set of database objects in a candidate database with those in a reference database. To compare objects residing in the same database, select that database as both the reference and the candidate. You can create a comparison specifying which objects are to be compared and submit a Cloud Control job to compare them immediately or at a later time. On job completion, select the data comparison and view results. The results will be purged when you delete the comparison.

Cloud Control data comparison uses DBMS_COMPARISON package for comparison. It can compare the following types of database objects:

  • Tables

  • Single-table views

  • Materialized views

  • Synonyms for tables, single-table views, and materialized views

Database objects of different types can be compared at different databases. For example, a table at one database and a materialized view at another database can be compared.

Requirements for Database Data Comparisons

For data comparison, you will need to meet the requirements explained in this section.

The database character sets must be the same for the databases that contain the database objects being compared.

For index column, the number, timestamp, and interval columns datatypes are as follows:

  • Number columns are of the following datatypes: NUMBER, FLOAT, BINARY_FLOAT, and BINARY_DOUBLE.

  • Timestamp columns are of the following datatypes: TIMESTAMP, TIMESTAMP WITH TIME ZONE, and TIMESTAMP WITH LOCAL TIME ZONE.

  • Interval columns are of the following datatypes: INTERVAL YEAR TO MONTH and INTERVAL DAY TO SECOND.

The database objects must have one of the following types of indexes:

  • A single-column index on a number, timestamp, interval, DATE, VARCHAR2, or CHAR datatype column.

  • A composite index that only includes number, timestamp, interval, DATE, VARCHAR2, or CHAR columns. Each column in the composite index must either have a NOT NULL constraint or must be part of the primary key.

If the database objects do not have one of these types of indexes, then the EM data comparison does not support the database objects. For example, if the database objects only have a single index on an NVARCHAR2 column, then the data comparison does not support them. Or, if the database objects have only one index, and it is a composite index that includes a NUMBER column and an NCHAR column, then the data comparison does not support them.

The index columns in a comparison must uniquely identify every row involved in a comparison. The following constraints satisfy this requirement:

  • A primary key constraint.

  • A unique constraint on one or more non-NULL columns.

If you specify an index, then make sure the columns in the index meet these requirements for comparison.

Data Comparison feature in Cloud Control can compare data in columns of the following datatypes:

  • VARCHAR2

  • NVARCHAR2

  • NUMBER

  • FLOAT

  • DATE

  • BINARY_FLOAT

  • BINARY_DOUBLE

  • TIMESTAMP

  • TIMESTAMP WITH TIME ZONE

  • TIMESTAMP WITH LOCAL TIME ZONE

  • INTERVAL YEAR TO MONTH

  • INTERVAL DAY TO SECOND

  • RAW

  • CHAR

  • NCHAR

If a column with datatype TIMESTAMP WITH LOCAL TIME ZONE is compared, then the two databases must use the same time zone. Also, if a column with datatype NVARCHAR2 or NCHAR is compared, then the two databases must use the same national character set.

Data comparison feature cannot compare data in columns of the following datatypes:

  • LONG

  • LONG RAW

  • ROWID

  • UROWID

  • CLOB

  • NCLOB

  • BLOB

  • BFILE

  • User-defined types (including object types, REFs, varrays, and nested tables)

  • Oracle-supplied types (including any types, XML types, spatial types, and media types)

You can compare database objects that contain unsupported columns by excluding the unsupported columns when providing comparison specification. Edit the comparison item and include only the supported columns in the Columns To Include list of column names.

Since data comparison cannot compare LOB column values directly, their cryptographic hashes will instead be used for comparison. If you include LOB type columns to be compared, make sure that the database users connecting to the reference and candidate databases have EXECUTE privilege on SYS.DBMS_CRYPTO package. For more information about DBMS_COMPARISON, see Database PL/SQL Packages and Types Reference book for the database version of your reference database.

Note:

A Data Comparison job may fail with the error "ORA-28759: failure to open file error."

This failure occurs when data comparison tries to get data from the candidate database into the reference database over a database link in the reference database for comparing them.

The database server (candidate/source database) requires the use of TCPS protocol for connections, but the client (reference/destination database) does not have a valid wallet location. Connection over the database link fails since no wallet was specified on the client side.

This problem can be fixed by specifying a valid WALLET_LOCATION entry in sqlnet.ora file (which is by default located in the $ORACLE_HOME/network/admin directory). The following wallet location must be specified at the reference database:

WALLET_LOCATION = (SOURCE=(METHOD=FILE)(METHOD_DATA=(DIRECTORY=/net/slc05puy/scratch/dbwallets/s wallets)))

Comparing Database Data and Viewing Results

The following procedure enables you to specify which pairs of objects you want to compare in the reference and candidate databases, submit a job to process your choices, then view the differences after the job successfully completes.

  1. From the main Data Comparisons page, click Create. The Create Data Comparison page appears.

  2. Provide the required input:

    1. If you want to compare objects residing in two databases, select one database as the Reference and the other as the Candidate.

      • The Reference database always executes the comparison, so it must be version 11g or later. The Candidate database must be version 10g or later.

      • Be advised that the Reference database carries an additional processing load and requires some space to store the row IDs of differing rows (not the entire rows themselves). If you compare data between a production system and a test system, it might be appropriate to process and store the results on the test system.

    2. Click OK when you have finished. The Data Comparison Specification page appears.

      Tip:

      It is recommended that you define the comparison specification once and run it many times.

  3. Open the Actions menu, then select Add Object Pair or Add Multiple Objects. If you select Object Pair, continue with the following sub-steps. If you select Multiple Objects, go to the next step.

    • Adding an object pair consists of selecting one object from the reference database and one object from the candidate database. You can compare dissimilar object types, if desired, such as a table in the reference database and a materialized view in the candidate database.

    1. Specify the reference and candidate objects. The reference database can be the same as the candidate database. In this case, the objects are from the same database.

    2. Select one or more columns in the reference or candidate databases for comparison. The columns included must be common to both objects.

    3. Optionally select an index to be used for comparison. Columns in the comparison index must uniquely identify every row involved in a comparison. An index used for a primary key constraint or a unique constraint on one or more non-NULL columns satisfies this requirement. The comparison can use the specified index only if you select all of the columns in the list of Columns To Include.

      You can select a composite index if you want to add multiple index columns.

    4. Specify an optional Where Condition per pair of objects being compared.

    5. Either specify or let the system compute the maximum number of buckets and minimum rows per bucket.

    6. Specify the point in time you want to compare data.

      • The System Change Number (SCN) is a sequential counter that uniquely identifies a precise moment in the database. This is the most accurate way to identify a moment in time. Whenever you commit a transaction, Oracle records a new SCN. You can obtain SCNs from the alert log.

    7. When you have finished the configuration, click OK.

      The Data Comparison Specification page reappears, showing your selected objects in the list.

    8. Click OK, then go to Step 5.

  4. Open the Actions menu, then select Add Multiple Objects.

    • Adding multiple objects enables you to conveniently perform a bulk inclusion of multiple objects from the reference database into the specification. You can search and select multiple objects, such as many tables and views, from the reference database list of values, and then edit each item as needed.

      1. Specify the schema name, one or more object types, then click Search.

        The table populates with object names.

      2. Select the objects you want to compare, then click OK.

        The Data Comparison Specification page reappears, showing your selected objects in the list.

  5. Select your comparison name from the list, open the Actions menu, then select Submit Comparison Job. For information about privileges required for user credentials for the reference and candidate databases, see Overview of Change Management for Databases.

  6. Provide the required credentials in the page, schedule the job, then click OK.

    The Data Comparisons page reappears and displays the following confirmation message:

    "The job was submitted successfully. Click the link in the Job Status column to view job status."

    After the Job Status column shows Succeeded, go to the next step.

  7. Select your comparison name from the list, open the Actions menu, then select View Results. The Data Comparison Results page appears.

  8. Look for rows in the Result column with the =/= symbol, indicating that there are differences between reference row and candidate row data.

    • Data comparison attempts to compare all tables. If there is an error, you can see the error message by selecting the Messages tab. An error message is indicated with an X instead of the = or =/= symbol.

    • You can see the SQL statements that are running to perform the comparison by clicking the Executed Statements tab.

  9. Select a dissimilar Reference/Candidate row, then click View Row Differences to see a detailed, indexed list of reference-only, candidate-only, and non-identical changed rows on the Row Data Differences page.

    • The Row Source column indicates the origin of each row of data as a whole. Furthermore, data in a row differing between reference and candidate are displayed in contrasting colors, indicating whether the source of the data is the reference or candidate database.

    • The comparison is shown based on a key column (depending on a chosen unique index). If the key column value is different, the row appears as a candidate or reference-only row. If other columns are different, the row appears as a non-identical row.

Schema Mapping

By default, a reference object will be compared with a candidate object in the same-named schema as the reference schema. Using schema mapping, you can optionally compare objects in a reference schema with objects in a different candidate schema. Any schema can only be mapped once. Provide reference and candidate schema names for mapping under the Schema Mapping section of the Data Comparison Specification page. Default candidate schema will then be picked from schema mapping you specified.

You may further override the candidate schema of individual item by editing the item, clicking the Override button next to the Candidate Object field, and explicitly specifying the candidate object belonging to any schema. For such items whose candidate objects are overridden in this way, schema mapping will be ignored.

Usage of Buckets

A bucket is a range of rows in a database object that is being compared. Buckets improve performance by splitting the database object into ranges and comparing the ranges independently. Every comparison divides the rows being compared into an appropriate number of buckets. The number of buckets used depends on the size of the database object and is always less than the maximum number of buckets specified for the comparison by the maximum number of buckets specified.

When a bucket is compared, the following results are possible:

  • No differences are found —

    The comparison proceeds to the next bucket.

  • Differences are found —

    The comparison can split the bucket into smaller buckets and compare each smaller bucket. When differences are found in a smaller bucket, the bucket is split into still smaller buckets. This process continues until the minimum number of rows allowed in a bucket is reached, whereupon a comparison reports whether there are differences in the bucket and identifies each row difference.

You can adjust the maximum number of buckets and minimum rows per bucket to achieve the best performance when comparing a particular database object.

The comparison program uses the ORA_HASH function on the specified columns in all the rows in a bucket to compute a hash value for the bucket. If the hash values for two corresponding buckets match, the contents of the buckets are assumed to match. The ORA_HASH function efficiently compares buckets, because row values are not transferred between databases. Instead, only the hash value is transferred.

Note:

If an index column for a comparison is a VARCHAR2 or CHAR column, the number of buckets might exceed the value specified for the maximum number of buckets.