Oracle7 Server Distributed Systems Manual, Vol. 2 Go to Product Documentation Library
Go to books for this product
Go to Contents for this book
Go to Index

Go to previous file in sequence Go to next file in sequence

Advanced Techniques

This chapter describes advanced techniques that you might want to use in implementing your replicated environment. This chapter describes the following topics:

Writing a Conflict Resolution Routine

A conflict resolution routine is a PL/SQL function that returns either TRUE or FALSE. TRUE indicates that the routine has successfully resolved all conflicting modifications for a column group. If the conflict cannot be successfully resolved, the routine should return FALSE. Oracle continues to evaluate available conflict resolution routines, in sequence order, until either a routine returns TRUE or there are no more routines available.

If the conflict resolution routine raises an exception, Oracle stops evaluation of the routine, and, if any other routines were provided to resolve the conflict (with a later sequence number), Oracle does not evaluate them.

Conflict Resolution Routine Parameters

The parameters needed by a conflict resolution routine are determined by the type of conflict being resolved (unique, update, or delete) and the columns of the table being replicated. All conflict resolution routines take some combination of old, new, and current column values for the table.

The old value represents the value of the row at the initiating site before you made the change. The new value represents the value of the row at the initiating site after you made the change. The current value represents the value of the equivalent row at the receiving site. Recall that Oracle uses the primary key (or the key specified by SET_COLUMNS) to determine which rows to compare.

The conflict resolution function should accept as parameters the values for the columns specified in the PARAMETER_COLUMN_NAME argument to the DBMS_REPCAT.ADD_conflicttype_CONFLICT procedures. The column parameters are passed to the conflict resolution routine in the order listed in the PARAMETER_COLUMN_NAME argument, or in ascending alphabetical order if you specified `*' for this argument. Where both old and new column values are passed as parameters (for update conflicts), the old value of the column immediately precedes the new value.

Attention: Type checking of parameter columns in user-defined conflict resolution routines is not performed until the DBMS_REPCAT.GENERATE_REPLICATION_SUPPORT procedure generates and compiles the packages.

Resolving Update Conflicts

For update conflicts, the function should accept the following values for each column in the column group:

The old, new, and current values for a column are received consecutively. The final argument to the conflict resolution routine should be a boolean flag. If this flag is FALSE, it indicates that you have updated the value of the IN OUT parameter, new, and that you should update the current column value with this new value. If this flag is TRUE, it indicates that the current column value should not be changed.

Resolving Uniqueness Conflicts

Uniqueness conflicts can occur as the result of an INSERT or UPDATE. Your uniqueness conflict resolution routine should accept the new column value from the initiating site in IN OUT mode for each column in the column group. The final parameter to the conflict resolution routine should be a BOOLEAN flag.

If the routine can resolve the conflict, it should modify the new column values so that the symmetric replication facility can insert or update the current row with the new column values. Your function should set the BOOLEAN flag to TRUE if it wants to discard the new column values, and FALSE otherwise.

Because your conflict resolution routine cannot guarantee convergence for uniqueness conflicts, your routine should include a notification mechanism.

Resolving Delete Conflicts

Delete conflicts occur when you successfully delete from the local site, but the associated row cannot be found at the remote site (for example, because it had been updated). For delete conflicts, the function should accept old column values in IN OUT mode for the entire row. The final parameter to the conflict resolution routine should be a BOOLEAN flag.

If the conflict resolution routine can resolve the conflict, it modifies the old column values so that the symmetric replication facility can delete the current row that matches all old column values. Your function should set the BOOLEAN flag to TRUE if it wants to discard these column values, and FALSE otherwise.

If you perform a delete at the local site and an update at the remote site, the remote site detects the delete conflict, but the local site detects an unresolvable update conflict. This type of conflict cannot be handled automatically. The conflict will raise a NO_DATA_FOUND exception and the transaction will be placed in the error table.

Designing a mechanism to properly handle these types of update/delete conflicts is difficult. It is far easier to avoid these types of conflicts entirely, by simply "marking" deleted rows, and then purging them using procedural replication, as described [*].


You should avoid the following commands in your conflict resolution routines. Use of these commands can result in unpredictable results.

Example Conflict Resolution Routine

The following examples show variations on the standard MAXIMUM and ADDITIVE conflict resolution methods. Unlike the standard methods, these user functions are designed to handle nulls in the columns used to resolve the conflict.

Maximum User Function

-- User function similar to MAXIMUM method.
-- If curr is null or curr < new, use new values.
-- If new is null or new < curr, use current values.
-- If both are null, no resolution.
-- Does not converge with > 2 masters, unless 
-- always increasing.
FUNCTION max_null_loses(old                 IN     NUMBER,
                        new                 IN OUT NUMBER,
                        cur                 IN     NUMBER,
                        ignore_discard_flag OUT    BOOLEAN)
    IF (new IS NULL AND cur IS NULL) OR new = cur THEN
    END IF;
      ignore_discard_flag := TRUE;
      ignore_discard_flag := FALSE;
    ELSIF new < cur THEN
      ignore_discard_flag := TRUE;
        ignore_discard_flag := FALSE;
    END IF;
END max_null_loses;

Additive User Function

-- User function similar to ADDITIVE method.
-- If old is null, old = 0.
-- If new is null, new = 0.
-- If curr is null, curr = 0.
-- new = curr + (new - old) -> just like ADDITIVE method.
FUNCTION additive_nulls(old                 IN     NUMBER,
                        new                 IN OUT NUMBER,
                        cur                 IN     NUMBER,
                        ignore_discard_flag OUT    BOOLEAN)
  old_val NUMBER := 0.0;
  new_val NUMBER := 0.0;
  cur_val NUMBER := 0.0;
      old_val := old;
    END IF;
      new_val := new;
    END IF;
      cur_val := cur;
    END IF;
    new := cur_val + (new_val - old_val);
    ignore_discard_flag := FALSE;
END additive_nulls;


Survivability provides the capability to continue running applications despite system or site failures. It allows applications to be run on a fail-over system, accessing the same, or very nearly the same, data as they were on the primary system when it failed. As shown in Figure 8 - 1, the Oracle Server provides two different technologies for accomplishing survivability: the Oracle Parallel Server and the symmetric replication facility.

Figure 8 - 1. Survivability Methods: Symmetric Replication vs. Parallel Server

Oracle Parallel Server versus Symmetric Replication

The Oracle Parallel Server supports fail-over to surviving systems when a system supporting an instance of the Oracle Server fails. The Oracle Parallel Server requires a cluster or massively parallel hardware platform, and thus is applicable for protection against processor system failures in the local environment where the cluster or massively parallel system is running.

In these environments, the Oracle Parallel Server is the ideal solution for survivability -- supporting high transaction volumes with no lost transactions or data inconsistencies in the event of an instance failure. If an instance fails, a surviving instance of the Oracle Parallel Server automatically recovers any incomplete transactions. Applications running on the failed system can execute on the fail-over system, accessing all of the data in the database.

The Oracle Parallel Server does not, however, provide survivability for site failures (such as flood, fire, or sabotage) that render an entire site, and thus the entire cluster or massively parallel system, inoperable. To provide survivability for site failures, you can use the symmetric replication facility to maintain a replicate of a database at a geographically remote location.

Should the local system fail, the application can continue to execute at the remote site. Symmetric replication, however, cannot guarantee that no transactions will be lost. Also, special care must be taken to prevent data inconsistencies when the primary site is recovered.

Designing for Survivability

If you choose to use the symmetric replication facility for survivability, you should consider the following issues:

Implementing a Survivable System

Oracle's symmetric replication facility can be used to provide survivability against site failures by using multiple replicated master sites. You must configure your system using one of the following methods. These methods are listed in order of increasing implementation difficulty.

Dynamic Ownership Conflict Avoidance

This section describes a more advanced method of designing your applications to avoid conflicts. This method, known as token passing, is similar to the workflow method described in Chapter 1. Although this section describes how to use this method to control the ownership of an entire row, you can use a modified form of this method to control ownership of the individual column groups within a row.

Both workflow and token passing allow dynamic ownership of data. With dynamic ownership, only one site at a time is allowed to update a row, but ownership of the row can be passed from site to site. Both work flow and token passing use the value of one or more "identifier" columns to determine who is currently allowed to update the row.


With workflow partitioning, you can think of data ownership as being "pushed" from site to site. Only the current owner of the row is allowed to push the ownership of the row to another site, by changing the value of the "identifier" columns.

Take the simple example of separate sites for ordering, shipping, and billing. Here, the identifier columns are used to indicate the status of an order. The status determines which site can update the row. After a user at the ordering site has entered the order, he or she updates the status of this row to SHIP. Users at the ordering site are no longer allowed to modify this row -- ownership has been pushed to the shipping site.

After shipping the order, the user at the shipping site will update the status of this row to BILL, thus pushing ownership to the billing site, and so on.

To successfully avoid conflicts, applications implementing dynamic data ownership must ensure that the following conditions are met:

With workflow partitioning, only the current owner of the row can push the ownership of the row to the next site by updating the "identifier" columns. No site is given ownership unless another site has given up ownership; thus ensuring there is never more than one owner.

Because the flow of work is ordered, ordering conflicts can be resolved by applying the change from the site that occurs latest in the flow of work. Any ordering conflicts can be resolved using a form of the Priority conflict resolution method, where the priority value increases with each step in the work flow process.

The PRIORITY conflict resolution method successfully converges for more than one master as long as the priority value is always increasing.

Token Passing

Token passing uses a more generalized approach to meeting these criteria. To implement token passing, instead of the "identifier" columns, your replicated tables must have owner and epoch columns. The owner column stores the global database name of the site currently believed to own the row.

This column should be used exclusively for establishing ownership and should not otherwise be updated. The epoch column is used to resolve ordering conflicts. This number is updated each time the ownership of the row changes. Thus the change associated with the highest epoch number is the most recent change.

Once you have designed a token passing mechanism, you can use it to implement a variety of forms of dynamic partitioning of data ownership, including workflow.

You should design your application to implement token passing for you automatically. You should not allow the owner or epoch columns to be updated outside this application.

Whenever you attempt to update a row, your application should

For example, Figure 8 - 2 illustrates how ownership of employee 100 passes from the ACCT_SF database to the ACCT_NY database.

Figure 8 - 2. Grabbing the Token

Locating the Owner of a Row

To grab ownership, the ACCT_NY database uses a simple recursive algorithm to locate the owner of the row. The pseudo code for this algorithm is shown below:

-- Pseudo code for locating the token owner.
-- This is for a table TABLE_NAME with primary key PK.
-- Initial call should initialize loc_epoch to 0 and loc_owner
-- to the local global name.
get_owner(PK IN primary_key_type, loc_epoch IN OUT NUMBER, 
          loc_owner IN OUT VARCHAR2)
    -- use dynamic SQL (dbms_sql) to perform a select similar to
    -- the following:
    select owner, epoch into rmt_owner, rmt_epoch
         from TABLE_NAME@loc_owner
         where primary_key = PK for update;
    if rmt_owner = loc_owner and rmt_epoch >= loc_epoch then
      loc_owner := rmt_owner;
      loc_epoch := rmt_epoch;
    elsif rmt_epoch >= loc_epoch then 
      get_owner(PK, rmt_epoch, rmt_owner);
      loc_owner := rmt_owner;
      loc_epoch := rmt_epoch;
      raise_application_error(-20000, 'No owner for row');
    end if;

Grabbing Ownership

After locating the owner of the row, the ACCT_NY site grabs ownership from the ACCT_SF site by completing the following steps:

Applying the Change

You should design your application to implement this method of token passing for you automatically whenever you perform an update. You should not allow the owner or epoch columns to be updated outside this application. The lock that you grab when you change ownership is released when you apply your actual update. The changed information, along with the updated owner and epoch information, will be asynchronously propagated to the other sites in the usual manner.

Using Procedural Replication

Procedural replication can offer performance advantages for large batch-oriented operations operating on large numbers of rows that can be run serially within your replicated environment.

A good example of an appropriate application is a purge operation (also referred to as an archive operation) run infrequently (for example, once per quarter) during off hours to remove old data, or data that was "logically" deleted from the online database. An example using procedural replication to purge deleted rows is described [*].

Restrictions on Procedural Replication

All parameters for a replicated procedure must be IN parameters; OUT and IN/OUT modes are not supported. The datatypes supported for these parameters are: NUMBER, DATE, VARCHAR2, CHAR, ROWID and RAW.

The symmetric replication facility cannot detect update conflicts produced by replicated procedures. Replicated procedures must detect and resolve conflicts themselves. Because of the difficulties involved in writing your own conflict resolution routines, it is best to simply avoid the possibility of conflicts.

Adhering to the following guidelines will help you ensure that your tables remain consistent at all sites.

Serialization of Transactions

Serial execution ensures that your data remains consistent. The symmetric replication facility propagates and executes replicated transactions one at a time. For example, assume that you have two procedures, A and B, that perform updates on local data. Now assume that you perform the following actions, in order:

The replicas of A and B on the other nodes are executed completely serially, in the same order that they were committed at the originating site. If A and B execute concurrently at the originating site, however, they may produce different results locally than they do remotely. Executing A and B serially at the originating site ensures that all sites have identical results. Alternatively, you could write the procedures carefully, to ensure serialization. For example, you could lock any tables being updated in EXCLUSIVE mode to ensure serialization at the originating site.

Generating Support for Replicated Procedures

You must disable row-level replication support at the start of your procedure, and then re-enable support at the end. This ensures that any updates that occur as a result of executing the procedure are not propagated to other sites. Row-level replication is enabled and disabled by calling DBMS_REPUTIL.REPLICATION_ON and DBMS_REPUTIL.REPLICATION_OFF, as described [*].

When you generate replication support for your replicated package, the symmetric replication facility creates a wrapper package. The wrapper package has the same name as the original, but is prefixed with the string that you supplied when you called DBMS_REPCAT.GENERATE_REPLICATION_SUPPORT.

If you did not supply a prefix, the default, "defer_", is used. The wrapper procedure has the same parameters as the original, along with two additional parameters: CALL_LOCAL and CALL_REMOTE. These two boolean parameters determine where the procedure gets executed. When CALL_LOCAL is TRUE, the procedure is executed locally. When CALL_REMOTE is TRUE, the procedure will ultimately be executed at all other sites in the replicated environment.

The remote procedures are called directly if you are propagating changes synchronously, or the calls to these procedures are added to the deferred transaction queue, if you are propagating changes asynchronously. By default, CALL_LOCAL is FALSE, and CALL_REMOTE is TRUE.

Replication support is generated in two phases. The first phase creates the package specification at all sites. Phase two generates the package body at all sites. These two phases are necessary to support synchronous replication.

For example, suppose that you create the package UPDATE containing the procedure UPDATE_EMP, which takes one argument, AMOUNT. You replicate this object to all master sites in your replicated environment by making the following calls:

    sname               => 'acct_rec', 
    oname               => 'update', 
    type                => 'package', 
    use_existing_object => FALSE,
    retry               => FALSE,
    copy_rows           => TRUE,
    gname               => 'acct');
    sname          => 'acct_rec', 
    oname          => 'update', 
    type           => 'package',
    package_prefix => 'defer_',);

You would now invoke the replicated procedure as shown below:

defer_update.update_emp( amount      => 1.05,
                         call_local  => TRUE,
                         call_remote => TRUE);

As shown in Figure 8 - 3, the logic of the wrapper procedure ensures that the procedure is called at the local site and then at all remote sites. The logic of the wrapper procedure also ensures that when the replicated procedure is called at the remote sites, CALL_REMOTE is FALSE, ensuring that the procedure is not further propagated.

If you are operating in a mixed replicated environment with static partitioning of data ownership (that is, if you are not preventing row-level replication), the replication facility will preserve the order of operations at the remote node, since both row-level and procedural replication use the same asynchronous queue.

Figure 8 - 3. Asynchronous Procedural Replication

Modifying Tables without Replicating the Modifications

There may be times when you want to make a modification to a replicated object, but you do not want this modification replicated to the other sites in the replicated environment. For example, you might want to disable replication in the following situations:

To modify tables without replicating the modifications, use the REPLICATION_ON and REPLICATION_OFF procedures in the DBMS_REPUTIL package. These procedures take no arguments and are used as flags by the generated replication triggers.

Note: You must be granted the EXECUTE privilege on the DBMS_REPUTIL package.

Disabling the Symmetric Replication Facility

The DBMS_REPUTIL.REPLICATION_OFF procedure sets the state of the DBMS_REPUTIL.REPLICATION_IS_ON package variable for the current session to FALSE. Because all replicated triggers check the state of this variable before queuing any transactions, modifications made to the replicated tables that use row-level replication do not result in any queued deferred transactions.

Attention: Because REPLICATION_IS_ON is a variable in a PL/SQL package specification, its state is session bound. That is, other users logged on to the same schema are not restricted from placing committed changes in the deferred transaction queue.

If you are using procedural replication, you should call REPLICATION_OFF at the start of your procedure, as shown in the following example. This ensures that the symmetric replication facility does not attempt to use row-level replication to propagate the changes that you make.

  PROCEDURE update_emp(adjustment IN NUMBER);
  PROCEDURE update_emp(adjustment IN NUMBER) IS
    -- turn off row-level replication for set update
    UPDATE emp . . .;
    -- re-enable replication
     . . . 

Re-enabling the Symmetric Replication Facility

After resolving any conflicts, or at the end of your replicated procedure, be certain to call DBMS_REPUTIL.REPLICATION_ON to resume normal replication of changes to your replicated tables or snapshots. This procedure takes no arguments. Calling REPLICATION_ON sets the package variable DBMS_REPUTIL.REPLICATION_IS_ON to TRUE.

Triggers and Replication

If you have defined a replicated trigger on a replicated table, you may need to ensure that the trigger fires only once for each change that you make. Typically, you will only want the trigger to fire when the change is first made, and you will not want the remote trigger to fire when the change is replicated to the remote site.

You should check the value of the DBMS_REPUTIL.FROM_REMOTE package variable at the start of your trigger. The trigger should update the table only if the value of this variable is FALSE.

Alternatively, you can disable replication at the start of the trigger and re-enable it at the end of the trigger when modifying rows other than the one that caused the trigger to fire. Using this method, only the original change is replicated to the remote sites. Then the replicated trigger will fire at each remote site. Any updates performed by the replicated trigger will not be pushed to any other sites.

Using this approach, conflict resolution is not invoked. Therefore, you must ensure that the changes resulting from the trigger do not affect the consistency of the data.

Enabling/Disabling Replication for Snapshots

To disable all local replication triggers for snapshots at your current site, set the I_AM_A_REFRESH package state to TRUE by calling SET_I_AM_A_REFRESH, as shown in the following example:


To re-enable the triggers set the package state to FALSE, as shown below:


To determine the value of the package variable REP$WHAT_AM_I.I_AM_A_SNAPSHOT, call the I_AM_A_REFRESH function as shown below:

ref_stat := dbms_snapshot.i_am_a_refresh;

To check if a snapshot is refreshing or if a master site has replication turned off, you can also call the ACTIVE function in each table's corresponding $TP package.

Handling Deletes

To avoid encountering delete conflicts, you might find it easiest to mark rows as deleted and purge them later. This section outlines a simple technique for purging these marked rows using procedural replication.

Suppose that your database contains the following MAIL_LIST table:

Name                           Null?    Type
------------------------------ -------- --------------
CUSTNO                         NOT NULL NUMBER(4)      PRIMARY KEY
CUSTNAME                                VARCHAR2(10)   
ADDR1                                   VARCHAR2(30)    
ADDR2                                   VARCHAR2(30)      
CITY                                    VARCHAR2(30)   
STATE                                   VARCHAR2(2)    
ZIP                                     NUMBER(9)    
PHONE                                   NUMBER(10)  
REMOVE_DATE                             DATE

Instead of deleting a customer when he or she requests to be removed from your mailing list, the REMOVE_DATE column would be used to indicate former customers; A NULL value would be used for current customers. After customers request removal from the mailing list, their rows are no longer updated. Such a convention avoids conflicts when the rows are actually deleted sometime later. A view of current customers could be defined as follows:

CREATE OR REPLACE VIEW corp.current_mail_list AS
  SELECT custno, custname, addr1, addr2, city, state, zip, phone 
    FROM corp.mail_list WHERE remove_date IS NULL;

Periodically, perhaps once a year after the holiday sales, the former customers would be purged from the table using the REMOVE_DATE field. Such a delete could be performed using row-level replication just by performing the following delete:

DELETE corp.mail_list WHERE remove_date IS NOT NULL AND 
                      remove_date < '01-JAN-95';

However, for a large company with an extensive mail order business, the number of former customers could be quite large resulting in a lot of undesired network traffic and database overhead. Instead, the procedural replication could be used using the following package:

  PROCEDURE remove_cust(purge_date IN DATE);
  PROCEDURE remove_cust(purge_date IN DATE) IS
    -- turn off row-level replication for set delete
    -- prevent phantom reads
    LOCK TABLE corp.mail_list IN EXCLUSIVE MODE;
    DELETE corp.mail_list WHERE remove_date IS NOT NULL AND 
                                remove_date < purge_date;

The DBMS_REPCAT.GENERATE_REPLICATION_SUPPORT procedure would have been used to generate the DEFER_PURGE package during the initial replication setup. Then, the procedural replication package could be called as follows by a single master site:


The procedure, PURGE.REMOVE_CUST, would be executed locally and asynchronously executed at each master, resulting in many rows being deleted with only minimal network traffic.

To ensure that there are no outstanding transactions against the rows to be purged, your application should be written to never update logically deleted rows and the REMOVE_DATE should be old enough to ensure that the logical delete of the row is propagated before the row is purged. Thus, in the previous example, it is probably not necessary to lock the table in EXCLUSIVE mode; although this is another method of guaranteeing that these rows not be updated during the purge.

Using Connection Qualifiers

Connection qualifiers provide a way to have several database links of the same type (for example, public) that point to the same remote database, yet establish those connections using different communications pathways (for example, an ethernet link or a modem link).

See Chapter 2 of Oracle7 Server Distributed Systems, Volume I for information about defining connection qualifiers for a database link.

In a replicated system, after the connection qualifier has been defined as described in Oracle7 Server Distributed Systems, Volume I, use the procedure CREATE_MASTER_REPGROUP in the DBMS_REPCAT package to create the master replication object group that will use that connection qualifier.

For example if you have defined the connection qualifier @ETHERNET and want to create the master replicated object group ACCT to use that qualifier:

         gname          => 'acct',
         group_comment  => 'created by '||user||' on '||SYSDATE,
         master_comment => 'created by '||user||' on '||SYSDATE,
         qualifier      => '@ethernet')

When you create the snapshot replication group, use the CREATE_SNAPSHOT_REPGROUP procedure:

     gname              => 'acct';
     master             => '',
     comment            => 'created on ...',
     propagation_mode   => 'asynchronous',
     qualifier          => '');

After you generate replication support, replication for this site will occur via the ethernet connection specified by the connection qualifier @ETHERNET. The complete database link, with qualifier would look like this in a select statement:


If you have a concurrent modem link to ACCT_HQ.HQ.COM, you can also define a connection qualifier, @MODEM, and use the procedures above to create a master replication group and snapshot sites that update via the modem link. A select statement using the @MODEM connection qualifier would look like:


Attention: If you plan to use connection qualifiers, you will probably need to increase the value of the INIT.ORA parameter, OPEN_LINKS. The default is four open links per process. You will need to estimate the required value based on your usage. See the Oracle7 Server Reference for more information about the parameter OPEN_LINKS.

Go to previous file in sequence Go to next file in sequence
Prev Next
Copyright © 1996 Oracle Corporation.
All Rights Reserved.
Go to Product Documentation Library
Go to books for this product
Go to Contents for this book
Go to Index