Oracle8i Replication
Release 2 (8.1.6)

Part Number A76959-01


2
Master Concepts & Architecture

This chapter explains the concepts and architecture of Oracle's master replication sites in both single master and multimaster replication environments. This chapter covers the following topics:

Master Replication Concepts

To understand the architectural details of master replication, you first need to understand its underlying concepts. Knowing how and why replication is used gives you a greater understanding of how the individual architectural elements work together to create a multimaster replication environment.

What is Master Replication?

Oracle has two forms of master replication: single master replication and multimaster replication.

Single master replication involves a master site supporting one or more snapshot sites. All of the connected snapshots update their master site; all data conflict detection and resolution is performed at the master site.

See Also:

Chapter 3, "Snapshot Concepts & Architecture" for more information about snapshots. 

Oracle's multimaster replication, also known as peer-to-peer or n-way replication, allows multiple sites, acting as equal peers, to manage groups of replicated database objects. Applications can update any replicated table at any site in a multimaster configuration. Figure 2-1 illustrates a multimaster replication system.

Oracle database servers operating as master sites in a multimaster replication environment automatically work to converge the data of all table replicas, and ensure global transaction consistency and data integrity.

Master Sites

A master site can be a node in a multimaster replication environment, the master for one or more snapshot sites in a single master replication environment, or both. The replicated objects are stored at the master site and are available for user access.

Master Definition Site

In a multimaster replication environment, a single master site operates as the master definition site for a master group. This particular site performs many of the administrative and maintenance tasks for the multimaster replication environment.

There can be only one master definition site for a master group, though the master definition site can be any of the master sites in the multimaster environment. Additionally, the master definition site can be changed to a different master site if necessary.

A single master site supporting snapshot replication is by default the master definition site.

Why Use Multimaster Replication?

At its most basic, replication makes sure that information (data) is available when and where you need it. The following sections describe several environments with different information delivery requirements; your replication environment may share one or more of them.

Failover

Multimaster replication can be used to protect the availability of a mission critical database. For example, a multimaster replication environment can replicate all of the data in your database to establish a failover site should the primary site become unavailable due to system or network outages. Unlike Oracle's standby database feature, such a failover site can also serve as a fully functional database supporting application access while the primary site is operational; a standby database becomes fully functional only if the primary site is unavailable.

Figure 2-1 Multimaster Replication


You can use Net8 to configure automatic connect-time failover, which enables Net8 to fail over to a different master site if the first master site fails. You configure automatic connect-time failover in your tnsnames.ora file by setting the FAILOVER option to ON and specifying multiple connect descriptors.
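A connect descriptor along the following lines enables connect-time failover. This is a minimal sketch; the net service name, host names, port, and service name are illustrative, not taken from this guide:

```
rep.world =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = master1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = master2)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = rep))
  )
```

With FAILOVER set to ON, Net8 tries the addresses in the list in order, so a connection attempt moves on to master2 if master1 cannot be reached.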

See Also:

Net8 Administrator's Guide for more information about configuring connect-time failover. 

Load Balancing

Multimaster replication is useful for transaction processing applications that require multiple points of access to database information for the purposes of distributing a heavy application load, ensuring continuous availability, or providing more localized data access.

Customer service oriented applications are a common example of applications that must distribute their load across multiple points of access.

Figure 2-2 Multimaster Replication Supporting Multiple Points of Update Access



Parallel Server vs. Replication 

There are two major areas where you may need to consider whether Oracle replication or Oracle Parallel Server better serves your needs: load balancing and survivability.

Load Balancing: Oracle Parallel Server provides read and write load balancing over multiple instances. Oracle replication provides read load balancing over multiple databases; it does not reduce the total write load, because each write is ultimately performed at every master site.

Survivability: Replication provides greater protection against natural disasters, power outages, and sabotage, because the remaining replication sites can be located in geographically separate regions. Oracle Parallel Server runs on a cluster or other massively parallel system within a single physical environment, and thus cannot protect against the physical failures that replication can protect against. 

Support for Disconnected Snapshot Environments

Snapshot replication allows users to remotely store all or a subset of replicated data from a master site in a disconnected environment. This scenario is typical of sales force automation systems where an individual's laptop (a disconnected device) stores a subset of data related to the individual salesperson.

Master sites operate as the target of the snapshot environment. Master site support can be:

Types of Master Replication

There are two types of master replication: multimaster and single master. Multimaster replication includes multiple master sites, where each master site operates as an equal peer. In single master replication, a single master site supporting snapshot replication provides the mechanisms to support potentially hundreds or thousands of snapshot sites. A single master site that supports one or more snapshot sites can also participate in a multiple master site environment, creating a hybrid replication environment (combination of multimaster and snapshot replication).

Multimaster Replication

Multimaster replication consists of multiple master sites equally participating in an update-anywhere model. Updates made to an individual master site are propagated to all other participating master sites.

Conflict resolution is independently handled at each of the master sites. Multimaster replication provides complete replicas of each replicated table at each of the master sites.

Single Master Replication

A single master site can also function as the target master site for one or more snapshot sites. Unlike multimaster replication, where updates to a single site are propagated to all other master sites, snapshots update only their target master site. If the replication environment is a hybrid environment (it has multiple master sites supporting one or more snapshot sites), the target master site propagates any of the snapshot updates to all other master sites in the multiple site replication environment.

Conflict resolution is handled only at master sites. Snapshot replication can contain complete or partial replicas of the replicated table.

See Also:

Chapter 3, "Snapshot Concepts & Architecture" for more information about snapshot replication with a master site. 

Multimaster Replication Process

There are two types of multimaster replication: asynchronous and synchronous.

Asynchronous replication, often referred to as store-and-forward replication, captures any local changes, stores them in a queue, and, at regular intervals, propagates and applies these changes at remote sites. With this form of replication, there is a period of time before all sites achieve data convergence.

Synchronous replication, also known as real-time replication, applies any changes or executes any replicated procedures at all sites participating in the replication environment as part of a single transaction. If the DML or procedure fails at any site, the entire transaction rolls back. Synchronous replication ensures data consistency at all sites in real-time.

You can change the propagation mode from asynchronous to synchronous or vice versa for a master site. If you change the propagation mode for a master site in a master group, you must regenerate replication support for all master group objects. Also, a multimaster replication environment may contain a mixture of both synchronous and asynchronous replication.

See Also:

"Understanding Mixed-Mode Multimaster Systems" for more information. 

Asynchronous Replication

Asynchronous replication independently propagates (sends) any DML or replicated procedure execution to all of the other master sites participating in the multimaster replication environment. Propagation occurs in a separate transaction after the DML or replication procedure has been executed locally.

Asynchronous replication is the default mode of replication. Asynchronous replication requires less networking and hardware resources than does synchronous replication, resulting in better availability and performance.

Asynchronous replication, however, means that the data sets at the different master sites in the replication environment may be different for a period of time before the changes have been propagated. Also, data conflicts may occur in an asynchronous replication environment.

The following describes the process of asynchronous replication:

Synchronous Replication

Synchronous replication propagates any changes made at a local site to other synchronously linked masters in a replication environment during the same transaction as the initial change. If the propagation fails at any of the master sites, the entire transaction, including the initial change at the local master site, rolls back. This strict enforcement ensures data consistency across the replication environment. Unlike asynchronous replication, there is never a period of time when the data at any of the master sites does not match.

See Also:

"Understanding Mixed-Mode Multimaster Systems" for a discussion on using both synchronous and asynchronous replication in a single environment. 

Synchronous replication also ensures that no data conflicts are introduced into the replication environment. These benefits come at the cost of significant hardware and networking resources, with no tolerance for downtime: if a single master site of a six-node multimaster environment is unavailable, no transaction can complete. In asynchronous replication, by contrast, the deferred transaction is simply held until the downed site becomes available.

Additionally, while query performance remains high because queries are performed locally, updates are slower because of the two-phase commit protocol that ensures any updates are successfully propagated and applied at the remote destination sites.

The following describes the process of synchronous replication:

Conflict Resolution Concepts

When Oracle replicates a table, any DML applied to the replicated table at any replication site (either master or snapshot site) that causes a data conflict is automatically detected by the Oracle server at a master site. Snapshot sites can neither detect nor resolve data conflicts. Any data conflicts introduced by a snapshot site are detected and resolved at the target master site of the snapshot.

For example, if the following master group is scheduled to propagate changes once an hour, consider what happens when:


Time      Master Site A                        Master Site B                        Status
--------  -----------------------------------  -----------------------------------  ---------------------------
8:00 AM   Propagate changes to Master Site B   Propagate changes to Master Site A   Data converges.
8:15 AM   Updates Row 1
8:30 AM                                        Updates Row 1
9:00 AM   Propagate changes to Master Site B   Propagate changes to Master Site A   Conflict detected on Row 1.

If the time between propagations is considered an interval and two or more sites update the same row during a single interval, a conflict occurs.

In addition to the update conflict described above, there are insert and delete conflicts. Consider the following summaries of each type of conflict. Each conflict occurs when the conflicting actions occur within the same interval:

Once a data conflict is detected, the following actions occur:

If the data conflict is logged in the error queue, the database administrator is responsible for resolving the data conflict manually.

If you choose to use Oracle or user-defined conflict resolution methods, the Oracle server automatically tries to resolve the data conflict. The conflict resolution methods you implement should conform to the business rules defined for your replication environment and should work to guarantee data convergence.

Master Replication Architecture

Though you can build a replicated environment by following the procedures and examples described in this book, in the Replication Manager online help, and in the Oracle8i Replication Management API Reference, understanding the architecture of replication gives you valuable information for setting up your database environment to support replication, tuning your replication environment, and troubleshooting your replication environment when necessary. This section describes the architecture of replication in terms of mechanisms and processes.

Master Site Mechanisms

To support a replicated environment, Oracle uses the following mechanisms at each master site that is participating in either a multimaster replication or single master replication environment. Some of the following master site mechanisms are required only in special circumstances.

Master Site Roles/Users

Depending on your security requirements, the following three roles may be consolidated into a single replication administrator. In fact, most multimaster replication environments use a single user to perform the replication administration, propagation, and receiving roles. If you have more stringent security requirements, you may assign the following roles to different users.


Note:

The term "roles" in this context is not related to the SQL term "roles." The replication roles referenced here are granted using stored PL/SQL procedures or individual privileges. 


Replication Administrator

The replication administrator performs all of the administrative functions relating to a master site in a replication environment. In general, it is preferable to have a single replication administrator for a replication environment. In addition to preparing a database to support replication, the replication administrator is responsible for building and maintaining the individual master replication groups, adding and removing participating master sites, managing the queues, and controlling the state of the replication environment (normal and quiesced). The default username for this administrator is REPADMIN, but you can use any username you wish.
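A typical setup creates this user and grants it replication privileges through the replication management API. The sketch below is illustrative: the username and password are examples, and it assumes REPADMIN will administer all replicated schemas:

```sql
-- Create the replication administrator account (name and password
-- are illustrative; any username can serve this role).
CREATE USER repadmin IDENTIFIED BY repadmin_pw;

-- Grant the privileges needed to administer replication groups
-- in any schema at this site.
BEGIN
   DBMS_REPCAT_ADMIN.GRANT_ADMIN_ANY_SCHEMA(username => 'repadmin');
END;
/
```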

Propagator

The propagator performs the task of propagating each transaction contained in the deferred transaction queue to the transaction's destinations. There is a single propagator for each database: you can have multiple replication administrators managing different schemas, but only one propagator per database.

Receiver

The receiver is responsible for receiving and applying the deferred transactions from the propagator. If the receiver does not have the appropriate privileges to apply a call in the deferred transaction, the entire deferred transaction is placed in the error queue at the destination.

Database Links

Database links provide the conduit to replicate data between master and snapshot sites. In a multimaster environment, there is a database link from each individual master site to all other master sites. Another way to look at the configuration of database links is that there are N - 1 database links for each master site, where N is the total number of master sites.

Figure 2-3 Each Arrow Represents a Database Link


In Figure 2-3, each master site has two database links to the other master sites (N-1 or in this case 3-1=2). This configuration ensures the bi-directional communication channels between master sites needed for multimaster replication.

The most basic setup would find a database link from the replication administrator at the individual master site to the replication administrators at the other participating master replication sites.

A common approach, however, adds an additional set of database links to your replication environment. Before creating any replication administrator database links, you would create public database links between all of the participating master sites, without specifying a CONNECT TO clause. The public database link would specify the target of each database link with the USING clause.

Once the public database links have been created, you can create the private replication administrator database links. Though you still must specify the CONNECT TO clause, setting up public database links eliminates the need to specify a USING clause.
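The two-layer approach might be sketched as follows. This is a minimal example; the link name, connect string, and password are assumptions, not values from this guide:

```sql
-- At master site DBS1: the public link carries the USING clause,
-- which names the net service ('dbs2' is illustrative).
CREATE PUBLIC DATABASE LINK dbs2.world USING 'dbs2';

-- The replication administrator's private link then needs only a
-- CONNECT TO clause; the path is resolved via the public link.
CREATE DATABASE LINK dbs2.world
   CONNECT TO repadmin IDENTIFIED BY repadmin_pw;
```

If the connect string for a site later changes, only the single public link must be re-created; the private links for individual users remain valid.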

The approach of using both public and private database links reduces the amount of administration needed to manage database links. Consider the following advantages:

As previously described, the replication administrator usually performs the tasks of administration and propagation in a multimaster environment. Because these tasks are performed by only a single user, only a set of private database links must be created for the replication administrator.

However, in multimaster replication environments where propagation is performed by users other than the replication administrator, the appropriate set of private database links must be created for these alternate users.

Connection Qualifiers

Connection qualifiers allow several database links pointing to the same remote database to establish connections using different paths. For example, a database named DBS1 can have two public database links named DBS1.WORLD that connect to the remote database using different paths.

For the purposes of replication, connection qualifiers also enable you to control the propagation characteristics of multiple master groups more precisely. For example, if each master site contains three separate master groups and you are not using connection qualifiers, the scheduling characteristics for the propagation of the deferred transaction queue are the same for all master groups. This may be costly if one master group propagates deferred transactions once an hour while the other two propagate deferred transactions once a day.

Associating a connection qualifier with a master group gives you the ability to define different scheduling characteristics for the propagation of the deferred transaction queue on a master group level versus on a database level as previously described.
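A qualified database link might look like the following sketch; the qualifier name and net service name are illustrative:

```sql
-- The qualifier (@myq) distinguishes this link from the unqualified
-- dbs2.world link while still pointing at the same remote database,
-- here via an alternate connect string ('dbs2_fast' is illustrative).
CREATE PUBLIC DATABASE LINK dbs2.world@myq USING 'dbs2_fast';
```

A master group associated with the @myq qualifier would then propagate over these qualified links, on a schedule independent of groups using the unqualified links.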

See Also:

Chapter 2 of Oracle8i Distributed Database Systems to learn about defining connection qualifiers for a database link. 

When you create a new master group, you can indicate that you want to use a connection qualifier for all scheduled links that correspond to the group. However, when you use connection qualifiers for a master group, Oracle propagates information only after you have created database links with connection qualifiers at every master site. After a master group is created, you cannot remove, add, or change the connection qualifier for the group.


Caution:

To preserve transaction integrity in a multimaster environment that uses connection qualified links and multiple master groups, a transaction cannot manipulate replication objects in groups with different connection qualifiers. 



Caution:

If you plan to use connection qualifiers, you probably need to increase the value of the OPEN_LINKS initialization parameter at all master sites. The default is four open links per process. Estimate the required value based on your usage. See "Initialization Parameters", and see the Oracle8i Reference, for more information about OPEN_LINKS. 


Replication Objects

The most visible part of your replicated environment is the replicated objects themselves. Of these replicated objects, replicated tables are the foundation of your replicated environment. The following sections discuss replicating the related database objects. These discussions highlight the benefits and potential limitations of replicating particular types of database objects.

Tables

In most cases, replicated tables are the foundation of your replication environment. Once a table is selected for replication and has had replication support generated, it is monitored by internal triggers to detect any DML applied to it (see "Internal Triggers").

When you replicate a table, you have the option of replicating the table structure and table data to the remote data sites or just the table structure. Additionally, if a table of the same name and structure already exists at the target replication site, you have the option of using the existing object in your replication environment.


Note:

On tables with self-referential integrity constraints, Oracle replication cannot guarantee that the deletes will be performed in the correct order. To perform deletes on tables with self-referential integrity constraints, use procedural replication. See "Using Procedural Replication" for information. 


Alternative Uses for Table Replication 

Though replicating a table is intended for replicating any table data changes to all sites participating in the replication environment, there are other uses for replicating a table.

Object and Data Transport: After an object has been replicated to a target destination site, replication support is not automatically generated. You can use this approach as an easy way to distribute objects and data to remote destinations. If you do not drop the replication objects and do not generate replication support, the table (or other objects) and the data remain at the remote destination site, and any changes at the remote destination site are not replicated. This approach enables you to distribute a standard database environment and data set to a new database environment.

Object Transport: Similarly, you can replicate a table to a target destination site without copying the data. This approach creates the object at the destination site, but does not populate it with data. Therefore, you can quickly distribute an empty database environment. 

Indexes

Any index that is used to enforce a constraint in a table is automatically created at the remote destination sites when a table is selected for replication and created at the remote site. Any index that exists purely for performance reasons, however, must be explicitly selected for replication to be created at the other master sites participating in the replication environment. When an index is replicated to other sites, it operates as if it had been created locally; there are no replication mechanisms that monitor the index to capture changes.

Packages/Package Bodies

Selecting packages and package bodies for replication and generating the needed replication support gives you the ability to do procedural replication. Procedural replication can offer performance advantages for large, batch-oriented operations on large numbers of rows that can be run serially within a replicated environment.

All parameters for a procedure with replication support must be IN parameters; the OUT and IN OUT modes are not supported. The datatypes supported for these parameters are: NUMBER, DATE, VARCHAR2, CHAR, ROWID, RAW, BLOB, CLOB, NCHAR, NVARCHAR2, and NCLOB. A replicated procedure must be declared in a package; stand-alone procedures cannot have replication support.
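A package suitable for procedural replication might be sketched as follows. The package, procedure, and parameter names are illustrative; the example assumes the familiar scott.emp demonstration table:

```sql
-- Every parameter is an IN parameter of a supported datatype,
-- as procedural replication requires.
CREATE OR REPLACE PACKAGE emp_maint AS
   PROCEDURE raise_salaries(p_deptno IN NUMBER, p_pct IN NUMBER);
END emp_maint;
/

CREATE OR REPLACE PACKAGE BODY emp_maint AS
   -- A batch operation on many rows: a good candidate for
   -- procedural replication, since only the call (not each
   -- changed row) needs to be propagated.
   PROCEDURE raise_salaries(p_deptno IN NUMBER, p_pct IN NUMBER) IS
   BEGIN
      UPDATE scott.emp
         SET sal = sal * (1 + p_pct / 100)
       WHERE deptno = p_deptno;
   END raise_salaries;
END emp_maint;
/
```

After selecting the package for replication and generating replication support, a call to the generated wrapper is replicated to the other master sites rather than the individual row changes.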

See Also:

"Using Procedural Replication" section for detailed information about using procedural replication. 


Note:

Similar to the concepts presented in the "Alternative Uses for Table Replication" sidebar, you can select a package and package body for replication but not generate replication support to use replication as an easy way to distribute the object to a remote site, though any calls made to the package are not replicated. 


Procedures/Functions

Procedures and functions not declared as part of a package cannot have replication support. Though you cannot create a procedural replication environment with stand-alone procedures and functions, you can still use replication to distribute these stand-alone procedures and functions to the sites in your replication environment. When the stand-alone procedure or function is created at the remote site using replication, the created object does not have replication support and operates as though the object was created locally.

Triggers

To make sure that any application or database logic is present at each master site, you can select triggers for replication. An important example of replicating a trigger is replicating a trigger that automatically inserts a timestamp into a table when any DML is applied to the table.

To prevent the trigger from firing its action a second time when the replicated change is applied at a remote site, it is important to insert an API call into the trigger that detects whether the trigger is being fired by a local change or by one propagated from a remote site. Otherwise, the trigger's action would be applied both at the originating site and again at each destination site. Notice line 5 in the following code example:

1) CREATE TRIGGER scott.insert_time
2)    BEFORE
3)       INSERT OR UPDATE ON scott.emp FOR EACH ROW
4)    BEGIN
5)       IF DBMS_REPUTIL.FROM_REMOTE = FALSE THEN
6)          :NEW.TIMESTAMP := SYSDATE;
7)       END IF;
8)    END;

If the DBMS_REPUTIL.FROM_REMOTE function determines that the insert or update was locally initiated, then the defined action (that is, assign timestamp) occurs. If this function determines that the insert or update is from a remote site, then the timestamp value is not assigned.

Sequences

Because two sequences at different databases can generate the same value, replicating sequences is not supported.

There are three alternatives to replicating sequences, each of which is guaranteed to generate unique values and avoid uniqueness data conflicts. Beginning with Oracle8i release 8.1.5, you can retrieve a unique identifier by executing the following select statement:

SELECT sys_guid() FROM dual;

This SQL operator returns a 16-byte globally unique identifier. The value is generated by an algorithm based on the time, the date, and the machine identifier. The globally unique identifier appears in a format similar to the following:

4595EF13AB785E73E03400400B40F58B

An alternate solution to using the sys_guid() function is to create a sequence at each of the master sites and concatenate the site name (or other globally unique value) with the local sequence. This approach helps you to avoid any potential duplicate sequence values and helps in preventing insert conflicts as described in the "Conflict Resolution Concepts" section.
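The concatenation approach might be sketched as follows, using the database's global name as the site-unique prefix; the sequence name and alias are illustrative:

```sql
-- A local sequence at this master site (name is illustrative).
CREATE SEQUENCE emp_seq;

-- Prefix the local sequence value with the site's global database
-- name; because global names are unique across sites, the combined
-- key cannot collide with a key generated at another master site.
SELECT g.global_name || '-' || emp_seq.NEXTVAL AS emp_key
  FROM global_name g;
```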

Additionally, you can create a sequence at each of the master sites so that each generates a unique value in your replication environment. You can accomplish this by using a combination of starting, incrementing, and maximum values. For example, you might configure the following:

Parameter      Master Site A        Master Site B        Master Site C
-------------  -------------------  -------------------  -------------------
start_with     1                    3                    5
increment_by   10                   10                   10
Range Example  1, 11, 21, 31, ...   3, 13, 23, 33, ...   5, 15, 25, 35, ...

Using a similar approach, you can define different ranges for each master site by specifying a START_WITH and MAXVALUE that would produce a unique range for each site.
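The Master Site A configuration above could be created as follows; the sequence name is illustrative, and each other site would use its own START WITH value:

```sql
-- At Master Site A. Sites B and C would use START WITH 3 and
-- START WITH 5 respectively, so the three sites generate
-- interleaved, non-colliding values.
CREATE SEQUENCE site_seq START WITH 1 INCREMENT BY 10;
```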

Views/Synonyms

When you replicate a view or synonym, you are simply using replication to distribute these objects to the other master sites that are involved in the replication environment. Once the object is replicated to the other sites, it operates as if the view or synonym was created locally. There is no internal trigger or package that monitors the object to capture any changes. Because it is a replicated object, though, you can still drop or modify it using either Replication Manager or the replication management API.

Internal Triggers

Oracle uses internal triggers to capture and store information about updates to replicated data. Internal triggers build RPCs to reproduce data changes made to the local site at remote replication sites. These deferred RPCs are stored in the deferred transaction queue and are propagated to the other master sites participating in the replication environment. The internal triggers supporting data replication are essentially components within the Oracle server executable. Therefore, Oracle can capture and store updates to replicated data very quickly with minimal use of system resources.

Deferred Transactions

Oracle forwards data replication information by propagating (that is, sending and executing) the RPCs that are generated by the internal triggers described above. These RPCs are stored in the deferred transaction queue. In addition to containing the execution command for the internal procedure at the destination site, each RPC also contains the data to be replicated to the target site. Oracle uses distributed transaction protocols to protect global database integrity automatically and ensure data survivability.

Internal Procedure

When a deferred RPC created by an internal trigger is propagated to the other master sites participating in a replication environment, an internal procedure at the destination site is used to apply the deferred RPC at the remote site. These internal procedures are activated automatically when you generate replication support for a table. These internal procedures are executed based on the RPCs that are received from the deferred transaction queue of the originating site.

Queues

The following queues manage the transactions that are generated by Oracle replication:

Deferred Transaction Queue

This queue stores the transactions (for example, DML) that are bound for another destination in the master group. Oracle stores RPCs produced by the internal triggers in the deferred transaction queue of a site for later propagation. Oracle also records information about initiating transactions so that all RPCs from a transaction can be propagated and applied remotely as a transaction. Oracle's replication facility implements the deferred transaction queue using Oracle's advanced queueing mechanism.

Error Queue

The error queue stores information about deferred transactions that could not be applied successfully at the local site. The error queue does not display information about errors at other master sites in the replication environment. When the error condition has been resolved, the DBA can either re-execute the transaction or delete the transaction from the error queue.
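These administrative actions are available through the DBMS_DEFER_SYS package; a hedged sketch follows, in which the transaction ID and destination are illustrative values of the kind a DBA would look up rather than real ones:

```sql
-- Retry a deferred transaction that previously failed at this site.
BEGIN
   DBMS_DEFER_SYS.EXECUTE_ERROR(deferred_tran_id => '1.12.2904',
                                destination      => 'DBS2.WORLD');
END;
/

-- Or, if the transaction should not be reapplied, remove it
-- from the error queue instead.
BEGIN
   DBMS_DEFER_SYS.DELETE_ERROR(deferred_tran_id => '1.12.2904',
                               destination      => 'DBS2.WORLD');
END;
/
```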

Job Queue

Oracle manages the propagation process using Oracle's job queue mechanism and deferred transactions. Each server has a local job queue. A server's job queue is a database table storing information about local jobs such as the PL/SQL call to execute for a job, when to run a job, and so on. Typical jobs in a replication environment include jobs to push deferred transactions to remote master sites, jobs to purge applied transactions from the deferred transaction queue, and jobs to refresh snapshot refresh groups.

Administrative Mechanisms

Several mechanisms are required to handle the administrative tasks that are often performed to support a replicated environment. These mechanisms allow you to turn on and off a replication environment, as well as monitor the administrative tasks that are generated when you build or modify a replicated environment.

Replication Modes of Operation

There are three modes of operation for a replication environment.

Normal

A replicated environment in the normal mode allows replication to occur. Any transaction against a replicated object is allowed and is appropriately propagated.

Quiescing

Quiescing is the mode that transfers a replicated environment from the normal mode to the quiesced mode. While a replication environment is quiescing, the user is no longer able to execute a transaction against a replicated object, but any existing deferred transactions are propagated. Queries against a quiescing table are allowed. When all deferred transactions have been successfully propagated to their respective destinations, the replication environment proceeds to the quiesced mode.

Quiesced

A quiesced replication environment can be considered disabled for normal replication use and is used primarily for administrative purposes (that is, adding, removing, or altering replicated objects). A quiesced state prevents users from executing any transactions against a replicated object in the quiesced master group unless they turn off replication, which can result in divergent data once replication is resumed. Transactions include DML against a replicated table or the execution of a wrapper for a replicated procedure. If master tables are quiesced, snapshots based on those master tables cannot propagate their changes to the target master tables, but local changes to the snapshot can continue.

A replication environment is quiesced at the master group level; however, all master sites participating in the replicated master group are affected. When a master group reaches a quiesced state, you can be certain that any transactions in the deferred transaction queue have been successfully propagated to the other master sites or put into the error queue. Quiescing one master group does not affect other master groups. A master group in normal mode can continue to process updates while other master groups are quiesced.

Replication Mode Control

Though there are three modes of replication operation, there are only two mechanisms to control these modes (recall that the quiescing mode is a transition from a normal to quiesced mode).

Suspend

Executing the suspend mechanism begins the quiescing mode that transfers the mode of replication operation for a master group from normal to quiesced. When the deferred transaction queue has no unpropagated deferred transactions for the master group, the replication environment proceeds to the quiesced mode.

The suspend mechanism can only be executed when the replication environment is in normal mode. Execute suspend when you need to modify the replication environment.

Resume

The resume mechanism transfers a master group from the quiesced replication mode to the normal mode. If you have been performing administrative work on your replication environment (for example, adding replicated objects), you should verify that the administrative queue (DBA_REPCATLOG) is empty before executing the resume mechanism.
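In terms of the replication management API, the suspend and resume mechanisms correspond to DBMS_REPCAT calls such as the following sketch. The group name is illustrative, and the DBA_REPCATLOG column name is assumed:

```sql
-- Quiesce the master group (execute at the master definition site).
BEGIN
  DBMS_REPCAT.SUSPEND_MASTER_ACTIVITY(gname => 'SCOTT_MG');
END;
/

-- After administrative work, confirm the administrative queue is empty...
SELECT COUNT(*) FROM dba_repcatlog WHERE gname = 'SCOTT_MG';

-- ...then return the group to normal replication mode.
BEGIN
  DBMS_REPCAT.RESUME_MASTER_ACTIVITY(gname => 'SCOTT_MG');
END;
/
```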

Administration Requests

To configure and manage a replication environment, each participating server uses Oracle's replication management API. A server's replication management API is a set of PL/SQL packages encapsulating procedures and functions administrators can use to configure Oracle's replication features. Oracle Replication Manager also uses the procedures and functions of each site's replication management API to perform work.

An administration request is a call to a procedure or function in Oracle's replication management API. For example, when you use Replication Manager to create a new master group, Replication Manager completes the task by making a call to the DBMS_REPCAT.CREATE_MASTER_REPGROUP procedure. Some administration requests generate additional replication management API calls to complete the request.
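Issuing the same request directly through the API might look like the following sketch (the group name is illustrative):

```sql
-- Create a new master group at the master definition site.
BEGIN
  DBMS_REPCAT.CREATE_MASTER_REPGROUP(gname => 'SCOTT_MG');
END;
/
```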

The Administration Request Mechanisms

When you use Replication Manager or make a call to a procedure in the DBMS_REPCAT package to administer a replication system, Oracle uses its internal mechanisms to broadcast the request synchronously. If a synchronous broadcast fails for any reason, Oracle returns an error message and rolls back the encompassing transaction.

When an Oracle server receives an administration request, it records the request in the DBA_REPCATLOG view and the corresponding DDL statement in a child table of the DBA_REPCATLOG view. When you view administration requests for a master group at a master site, you might observe requests that are awaiting a callback from another master site. Whenever you use Replication Manager to create an administration request for a replication group, Oracle automatically inserts a job into the local job queue, if one does not already exist for the group. This job periodically executes the DBMS_REPCAT.DO_DEFERRED_REPCAT_ADMIN procedure. Whenever you synchronously broadcast a request, Oracle attempts to start this job immediately in order to apply the replicated changes at each master site.

Assuming that Oracle does not encounter any errors, DO_DEFERRED_REPCAT_ADMIN is run whenever a background process is available to execute the job. The initialization parameter JOB_QUEUE_INTERVAL determines how often the background process wakes up. You may experience a delay if you do not have enough background processes available to execute the outstanding jobs.
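When no background processes are available, you can apply outstanding administration requests manually, as in this sketch (when all_sites is TRUE and the call is made at the master definition site, Oracle attempts to apply the requests at all master sites):

```sql
-- Apply pending administration requests for the group at the local site.
BEGIN
  DBMS_REPCAT.DO_DEFERRED_REPCAT_ADMIN(
    gname     => 'SCOTT_MG',   -- illustrative group name
    all_sites => FALSE);
END;
/
```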


Note:

When JOB_QUEUE_PROCESSES is set to zero at a site, you must apply administration requests manually for all groups at the site. 


See Also:

"Initialization Parameters", and the Oracle8i Reference for more information about JOB_QUEUE_INTERVAL and JOB_QUEUE_PROCESSES. 

For each call of DO_DEFERRED_REPCAT_ADMIN at a master site, the site checks the DBA_REPCATLOG view to see if there are any requests that need to be performed. When one or more administration requests are present, Oracle applies the request and updates any local views as appropriate. This event can occur asynchronously at each master site.

DO_DEFERRED_REPCAT_ADMIN executes the local administration requests in the proper order. When DO_DEFERRED_REPCAT_ADMIN is executed at a master that is not the master definition site, it does as much as possible. Some asynchronous activities such as populating a replicated table require communication with the master definition site. If this communication is not possible, DO_DEFERRED_REPCAT_ADMIN stops executing administration requests to avoid executing requests out of order. Some communication with the master definition site, such as the final step of updating or deleting an administration request at the master definition site, can be deferred and will not prevent DO_DEFERRED_REPCAT_ADMIN from executing additional requests.

The success or failure of an administration request at each master site is noted in the DBA_REPCATLOG view at each site. For each master group, Replication Manager displays the corresponding status of each administration request. Ultimately, each master site propagates the status of its administration requests to the master definition site. If a request completes successfully at a master site, Oracle removes the callback for the site from the DBA_REPCATLOG view at the master definition site. If a request completes successfully at all sites, all entries in the DBA_REPCATLOG view at all sites, including the master definition site, are removed. If a request fails at a master site other than the master definition site, Oracle removes the request at that master site and updates the corresponding AWAIT_CALLBACK request at the master definition site with the reason for the failure.

By synchronously broadcasting the change, Oracle ensures that all sites are aware of the change, and thus are capable of remaining synchronized. By allowing the change to be applied at the site at a future point in time, Oracle provides you with the flexibility to choose the most appropriate time to apply changes at a site.

If an object requires automatically generated replication support, you must regenerate replication support after altering the object. Oracle then activates the internal triggers and regenerates the packages to support replication of the altered object at all master sites.


Note:

Although the DDL must be successfully applied at the master definition site in order for these procedures to complete without error, this does not guarantee that the DDL is successfully applied at each master site. Replication Manager displays the status of all administration requests. Additionally, the DBA_REPCATLOG view contains interim status and any asynchronous error messages generated by the request. 


Any snapshot sites that are affected by a DDL change are updated the next time you perform a refresh of the snapshot site. While all master sites can communicate with one another, snapshot sites can communicate only with their associated master site.

If you must alter the shape of a snapshot as the result of a change to its master, you must drop and recreate the snapshot.

Administrative Queue

Often referred to as the administrative queue, the DBA_REPCATLOG view stores administration requests that manage and modify your replication environment. Calls to certain DBMS_REPCAT procedures appear in the administrative queue as they execute; for example, if you add a replicated table to an existing master group, you see a request naming the DBMS_REPCAT.CREATE_MASTER_REPOBJECT procedure.

You can view the administrative queue by querying the DBA_REPCATLOG view with a tool such as SQL*Plus, or by viewing the Deferred Transactions by Destination node in Replication Manager.
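A query along the following lines shows outstanding requests; the column names are per the DBA_REPCATLOG view, and MESSAGE holds any asynchronous error text:

```sql
-- Review outstanding administration requests for a master group.
SELECT id, request, status, message
  FROM dba_repcatlog
 WHERE gname = 'SCOTT_MG'   -- illustrative group name
 ORDER BY id;
```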

Each request has a status that displays the state of the request. There are four states:

The administrative queue of each master site lists only the administration requests to be performed at that master site. The master definition site for a master group, however, lists administration requests to be performed at each of the master sites. The administrative queue at the master definition site lets the DBA monitor administration requests of all the master sites in the replication environment.

Organizational Mechanisms

Oracle uses several organizational mechanisms to organize the previously described master site and administrative mechanisms into discrete replication groups. Most notable of these organizational mechanisms is the master group. An additional organizational mechanism groups the columns that are used to resolve conflicts for a replicated table.

Master Group

In a replication environment, Oracle manages replication objects using replication groups. A replication group is a collection of replication objects that are always updated in a transactionally consistent manner.

Organizing related database objects within a replication group makes it easier to administer many objects together. Typically, you create and use a replication group to organize the schema objects necessary to support a particular database application. That is not to say that replication groups and schemas must correspond with one another. Objects in a replication group can originate from several database schemas, and a schema can contain objects that are members of different replication groups. The restriction is that a replication object can be a member of only one group.

In a multimaster replication environment, the replication groups are called master groups. Corresponding master groups at different sites must contain the same set of replication objects (see "Replication Objects"). Figure 2-4 illustrates that master group "SCOTT_MG" contains an exact replica of the replicated objects at each master site.

Figure 2-4 Master Group SCOTT_MG contains same replication objects at all sites.


The master group organization at the master site plays an integral role in the organization of replication objects at a snapshot site. See the "Organizational Mechanisms" section for more information on the organizational mechanisms at a snapshot site.

Additionally, Figure 2-5 illustrates that each site may contain multiple replication groups, though each group must contain exactly the same set of objects at each master site.

Figure 2-5 Master Groups are Identical at Each Master Site


Column Groups

Column groups provide the organizational mechanism to group all columns that are involved in a conflict resolution routine. If a conflict occurs in one of the columns of the group, the remainder of the group's columns may be used to resolve the conflict. For example, if a column group for a table contains MIN_PRICE, LIST_PRICE, COST_PRICE, and TIMESTAMP columns and a conflict arises in the LIST_PRICE column, the TIMESTAMP column may be used to resolve the conflict, assuming that a timestamp conflict resolution routine has been assigned to the group.

Initially, you might think that you should put all columns in the table into a single column group. Though this makes setup and administration easier, it may decrease the performance of your replicated table and may increase the potential for data conflicts. As you will learn in the "Performance Mechanisms" section, the Min Communications feature does not send extraneous data from other column groups when a conflict occurs. Placing all columns into a single column group may negate the advantages of the Min Communications feature, unless you use the DBMS_REPCAT.SEND_OLD_VALUES and DBMS_REPCAT.COMPARE_OLD_VALUES procedures.
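Column groups are created with DBMS_REPCAT.MAKE_COLUMN_GROUP. The following sketch groups the pricing columns above; the schema and table names are placeholders:

```sql
-- Group the pricing columns so a conflict in any one of them
-- is resolved using the whole group.
BEGIN
  DBMS_REPCAT.MAKE_COLUMN_GROUP(
    sname                => 'SCOTT',        -- placeholder schema
    oname                => 'PRICE_LIST',   -- placeholder table
    column_group         => 'PRICE_CG',
    list_of_column_names => 'MIN_PRICE,LIST_PRICE,COST_PRICE,TIMESTAMP');
END;
/
```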

See Also:

Chapter 5, "Conflict Resolution Concepts & Architecture" for more information about column groups. 

Propagation Process

The propagation process is the essence of replication: it is the mechanism that distributes any actions to all other master sites in the replication environment.

Propagation Types

As the internal trigger captures any DML applied to a replicated table, the DML must be propagated or sent to the other master sites in the replication environment. Internal triggers are described in the section "Internal Triggers".

Oracle replication supports both asynchronous and synchronous replication.

Asynchronous

Typical replication configurations use asynchronous data replication. Asynchronous data replication occurs when an application updates a local replica of a table, stores replication information in a local queue, and then forwards the replication information to other replication sites at a later time. Consequently, asynchronous data replication is also called store-and-forward data replication.

As Figure 2-6 shows, Oracle uses its internal triggers, deferred transactions, deferred transaction queues, and job queues to propagate data-level changes asynchronously among master sites in a replication system, as well as from an updateable snapshot to its master table.

Figure 2-6 Asynchronous Data Replication Mechanisms


Synchronous

Oracle also supports synchronous data propagation for applications with special requirements. Synchronous data propagation occurs when an application updates a local replica of a table, and within the same transaction also updates at least one other replica of the same table. Consequently, synchronous data replication is also called real-time data replication. Use synchronous replication only when applications require that replicated sites remain continuously synchronized.

Figure 2-7 Synchronous Data Replication Mechanisms


As Figure 2-7 shows, synchronous row-level data replication uses the same internal triggers that generate RPCs for asynchronous replication. However, Oracle does not defer the execution of such RPCs. Instead, data replication RPCs execute within the boundary of the same transaction that modifies the local replica. Consequently, a data-level change must be possible at all synchronously linked sites that manage a replicated table; otherwise, a transaction rollback occurs.

Understanding Synchronous Data Propagation

As shown in Figure 2-8, whenever an application makes a DML change to a local replicated table and the replication group is using synchronous row-level replication, the change is synchronously propagated to the other master sites in the replicated environment using internal triggers. When the application applies a local change, the internal triggers issue calls to generated procedures at the remote master sites in the security context of the replication propagator. Oracle ensures that all distributed transactions either commit or roll back in the event of a failure. See the discussion of distributed updates in the book Oracle8i Distributed Database Systems.

Figure 2-8 Propagating Changes using Synchronous Row-Level Replication

Restrictions

Because of the locking mechanism used by synchronous replication, deadlocks can occur. When an application performs a synchronous update to a replicated table, Oracle first locks the local row and then uses an AFTER ROW trigger to lock the corresponding remote row. Oracle releases the locks when the transaction commits at each site.


Note:

A replication system that uses real-time propagation of replication data is highly dependent on system and network availability because it can function only when all sites in the system are concurrently available. 


Destination of Synchronously Replicated Transactions

The necessary remote procedure calls to support synchronous replication are included in the internal trigger for each object. When you generate replication support for a replicated object, Oracle activates the triggers at all master sites to add the necessary remote procedure calls for the new site. Conversely, when you remove a master site from a master group, Oracle removes the calls from the internal triggers.

Conflict Detection

If all sites of a master group communicate synchronously with one another, applications should never experience replication conflicts. However, if even one site is sending changes asynchronously to another site, applications can experience conflicts at any site in the replicated environment.

If the change is being propagated synchronously, an error is raised and a rollback is required. If the change is propagated asynchronously, Oracle automatically detects the conflicts and either logs the conflict or, if you designate an appropriate resolution method, resolves the conflict.

See Also:

Chapter 5, "Conflict Resolution Concepts & Architecture"

Understanding Mixed-Mode Multimaster Systems

In some situations, you might decide to have a mixed-mode environment in which some master sites propagate a master group's changes asynchronously and others propagate changes synchronously. The order in which you add new master sites to a group with different data propagation modes can be important.

For example, suppose that you have three master sites: A, B, and C. If you first create site A as the master definition site, and then add site B with a synchronous propagation mode, site A sends changes to site B synchronously and site B sends changes to site A synchronously. There is no need to concern yourself with the scheduling of links at either site, because neither site is creating deferred transactions.

Now suppose that you create master site C with an asynchronous propagation mode. The propagation modes are now as illustrated in Figure 2-9.

Figure 2-9 Selecting a Propagation Mode

You must now schedule propagation of the deferred transaction queue from site A to site C, from site B to site C, and from site C to sites A and B.

As another example, consider what would happen if you created site A as the master definition site, then added site C with an asynchronous propagation mode, and then added site B with a synchronous propagation mode. Now the propagation modes would be as shown in Figure 2-10.

Figure 2-10 Ordering Considerations

Each time that you add a new master site to a mixed-mode multimaster system, consider how the addition affects the data propagation modes to and from existing sites.

Initiating Propagation

When synchronous propagation is used, propagation of the DML is initiated automatically and handled immediately. If asynchronous propagation is used, there are two methods to propagate the deferred transactions. In most cases, use a job to automatically propagate the deferred transactions according to a scheduled interval of your choosing. You can also manually propagate the changes by executing a stored procedure or using Replication Manager. You may occasionally need to manually propagate your deferred transactions if you do not want to wait for the job queue to automatically propagate them.
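A manual push is performed with the DBMS_DEFER_SYS.PUSH function, as in this sketch (the database link is a placeholder):

```sql
-- Push queued deferred transactions to one destination immediately.
DECLARE
  ret BINARY_INTEGER;
BEGIN
  ret := DBMS_DEFER_SYS.PUSH(
           destination => 'NY.WORLD',   -- placeholder database link
           parallelism => 0);           -- 0 requests serial propagation
END;
/
```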

Performance Mechanisms

As with any enterprise database solution, performance is always an important issue for the database administrator. Oracle replication provides several mechanisms to help increase the performance of your replicated environment.

Parallel Propagation

With parallel propagation, Oracle asynchronously propagates replicated transactions using multiple, parallel transit streams for higher throughput. When necessary, Oracle orders the execution of dependent transactions to ensure global database integrity.

Parallel propagation uses the pool of available parallel server processes. This is the same facility Oracle uses for other parallel operations such as parallel query, parallel load, and parallel recovery. Each server process propagates transactions through a single stream. A parallel coordinator process controls these server processes. The coordinator tracks transaction dependencies, allocates work to the server processes, and tracks their progress.

Parallel processes remain associated with a parallel operation on the server throughout the execution of that operation. When the operation is complete, those server processes become available to process other parallel operations. For example, when Oracle performs a parallel push of the deferred transaction queue to its destination, all parallel server processes used to push the queue remain dedicated to the push until it is complete.

To configure a pool of parallel server processes for a server properly, you must consider several issues related to the configuration of a replication system.

To configure a database server's pool of parallel query processes, use the following initialization parameters:

Implementing Parallel Propagation

For most users, setting the parallel propagation parameter to a value of 1 satisfies their replication performance needs. However, some users may want to further tune the parallel propagation value.

To tune the parallel propagation value further, use the following procedure:

  1. Set the parallel propagation value to 1.

  2. Test your database environment and carefully measure the propagation throughput.

    If you have achieved your performance goals with a parallel propagation value of 1, skip to Step 5.


    Note:

    As you increase the value of the parallel propagation parameter, be aware of the trade-offs between increased parallel propagation and the resources required to support the extra parallel slaves. 


  3. To try to achieve greater propagation throughput than with a value of 1, set your parallel propagation value to 2.

  4. Test your database environment and carefully measure the propagation throughput.

    In many cases, you will experience propagation throughput degradation with a value of 2. This reduction is due to round-trip delays associated with the coordinator assigning dependent transactions to available slaves and waiting for the necessary commit acknowledgments before assigning additional transactions.

    Repeat Steps 3 and 4 with the parallel propagation value set to 4 and again with 8. If throughput still does not improve, it suggests that the transactions in your environment are highly dependent on each other. Reduce the parallel propagation value to 1 and proceed to Step 5.

    See Also:

    "Tuning Parallel Propagation" to learn about techniques to reduce transaction dependencies. 

    If your performance did improve with a value of 2, 4, or 8, it suggests that your transactions have a low degree of interdependence. You may even set your parallel propagation parameter to any value greater than 8. Just be sure to thoroughly test your environment and remain aware of the trade-offs between increased parallelism and the necessary resources to support those extra parallel slaves.

  5. You have completed adjusting the parallel propagation value to best meet the performance needs of your database environment.
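The parallel propagation value itself is set through the parallelism parameter of the push; for a scheduled job it might be configured as in this sketch (the database link and interval are placeholders):

```sql
-- Schedule the push with two parallel streams (as in Step 3 above).
BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH(
    destination => 'NY.WORLD',         -- placeholder database link
    interval    => 'SYSDATE + 1/24',   -- run hourly
    next_date   => SYSDATE,
    parallelism => 2);
END;
/
```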

Tuning Parallel Propagation

To gain the greatest performance benefit from parallel propagation, you should try to reduce the number of dependent transactions that are created. Remember that a dependent transaction cannot start until all of the transactions it depends on have been committed.

Keep the following tips in mind when trying to reduce the number of dependent transactions:

Min Communication

To increase the replication performance for tables, be sure to enable the Min Communication setting when generating replication support for a replicated table.

To detect and resolve an update conflict for a row, the propagating site must send a certain amount of data about the new and old versions of the row to the receiving site. Depending on your environment, the amount of data that Oracle propagates to support update conflict detection and resolution can vary.

For example, when you create a replicated table and all participating sites are Oracle release 8.0 or greater databases, you can choose to minimize the amount of data that must be communicated to detect conflicts for each changed row in the table. In this case, Oracle propagates:

In general, you should choose to minimize data propagation in Oracle release 8.0 or greater replication environments to reduce the amount of data that Oracle transmits across the network. As a result, you can help to improve overall system performance.

Alternatively, when a replicated environment uses both Oracle7 and release 8.0 or greater sites, you cannot minimize the communication of row data for update conflict detection. In this case, Oracle must propagate the entire old and new versions of each changed row to perform conflict detection.

See Also:

"Performance Mechanisms" for more information about conflict resolution, min communication, and additional conflict resolution performance techniques. 

Delay Seconds

Though not directly a performance mechanism, properly configuring the DELAY_SECONDS parameter can give you greater control over the timing of your propagation of deferred transactions.

DELAY_SECONDS controls how long a job remains aware of the deferred transaction queue. The effects of the DELAY_SECONDS parameter can best be illustrated with the following two examples:

DELAY_SECONDS = 0 (default)

If a scheduled job with a 60 minute interval wakes up at 2:30 pm and checks the deferred transaction queue, any existing deferred transactions are propagated. The propagation takes 2 minutes and therefore the job is complete at 2:32 pm.

If a deferred transaction enters the queue at 2:34 pm, the deferred transaction is not propagated because the job is complete. In this scenario, the deferred transaction will be propagated at 3:30 pm.

DELAY_SECONDS = 300

If a scheduled job with a 60 minute interval wakes up at 2:30 pm and checks the deferred transaction queue, any existing deferred transactions are propagated. The propagation takes 2 minutes and therefore the job is complete at 2:32 pm.

If a deferred transaction enters the queue at 2:34 pm, the deferred transaction is propagated because the job remains aware of the deferred transaction queue for 300 seconds (5 minutes) after the job has completed propagating whatever was in the queue. In this scenario, the deferred transaction is propagated at 2:34 pm.

One question that you might ask is "why don't I just set the job to execute more often?" Starting and stopping the job has a greater amount of overhead than starting the job and keeping it aware for a set period of time. In addition to decreasing the overhead associated with starting and stopping these jobs, using the DELAY_SECONDS parameter can reduce the amount of redo logging required to support scheduled jobs.

As with most performance features, there is a point of diminishing returns. There are several reasons why you want to keep the length of the DELAY_SECONDS parameter in check:

As a general rule of thumb, there is little visible benefit in setting the DELAY_SECONDS parameter to a value greater than 20 minutes (DELAY_SECONDS=1200).

Additionally, if you are using only serial propagation (parallel propagation = 0), weigh the benefits of large DELAY_SECONDS values carefully. Unlike parallel propagation, which checks the queue once a minute during the delay interval, serial propagation checks the queue only after the full delay seconds value has elapsed. If you can sacrifice the lock on the resources for 20 minutes (the SNP process is not available for other jobs), then you can set the delay seconds value to 1200.

If you cannot afford this resource lock, set the DELAY_SECONDS value to 10 or 20 seconds; though you will need to execute the jobs more often than if the value was set to 1200 seconds, you still gain many of the benefits of the delay seconds feature (versus a value of 0 seconds).
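DELAY_SECONDS is specified on the push job itself, as in this sketch (the database link and timing values are illustrative):

```sql
-- Keep the push job attached to the deferred transaction queue
-- for 5 minutes after each scheduled run.
BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH(
    destination   => 'NY.WORLD',         -- placeholder database link
    interval      => 'SYSDATE + 1/24',   -- run hourly
    next_date     => SYSDATE,
    delay_seconds => 300);
END;
/
```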

Conflict Resolution Mechanisms

The receiving master site in a replication environment detects update, uniqueness, and delete conflicts as follows:

Identifying Rows During Conflict Detection

To detect replication conflicts accurately, Oracle must be able to uniquely identify and match corresponding rows at different sites during data replication. Typically, Oracle's replication facility uses the primary key of a table to uniquely identify rows in the table. When a table does not have a primary key, you must designate an alternate key: a column or set of columns that Oracle can use to identify rows in the table during data replication.
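An alternate key is designated with DBMS_REPCAT.SET_COLUMNS, as in this sketch (the schema, table, and column names are placeholders):

```sql
-- Use ORDER_NO + LINE_NO as the identity columns for a table
-- that lacks a primary key.
BEGIN
  DBMS_REPCAT.SET_COLUMNS(
    sname       => 'SCOTT',         -- placeholder schema
    oname       => 'ORDER_LINES',   -- placeholder table
    column_list => 'ORDER_NO,LINE_NO');
END;
/
```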


Caution:

Do not permit applications to update the identity columns of a table. This ensures that Oracle can identify rows and preserve the integrity of replicated data. 


Resolve Data Conflicts

Oracle provides a mechanism that allows you to define a conflict resolution method that resolves a data conflict when it is detected. Oracle provides several built-in conflict resolution methods:

If the built-in Oracle conflict resolution methods do not meet the needs of your replicated environment, you have the option of writing your own conflict resolution method using PL/SQL and implementing it as a user-defined conflict resolution method. See Chapter 5, "Conflict Resolution Concepts & Architecture" to learn how conflict resolution works.
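A built-in method is attached to a column group with DBMS_REPCAT.ADD_UPDATE_RESOLUTION. This sketch applies the latest-timestamp method to the pricing column group used earlier; the schema, table, and group names are placeholders:

```sql
-- Resolve update conflicts in PRICE_CG using the newest TIMESTAMP value.
BEGIN
  DBMS_REPCAT.ADD_UPDATE_RESOLUTION(
    sname                 => 'SCOTT',        -- placeholder schema
    oname                 => 'PRICE_LIST',   -- placeholder table
    column_group          => 'PRICE_CG',
    sequence_no           => 1,              -- try this method first
    method                => 'LATEST TIMESTAMP',
    parameter_column_name => 'TIMESTAMP');
END;
/
```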

See Also:

The online help for Replication Manager to learn how to implement conflict resolution with Replication Manager, and see Chapter 6, "Conflict Resolution" in the Oracle8i Replication Management API Reference to learn how to implement conflict resolution using the replication management API 


Copyright © 1996-2000, Oracle Corporation.

All Rights Reserved.
