Oracle Fusion Middleware
Oracle WebLogic Server API Reference
11g Release 1 (10.3.6)

Part Number E13941-06

weblogic.cluster.singleton
Class ReplicatedLeasingBasis

java.lang.Object
  extended by weblogic.cluster.singleton.SimpleLeasingBasis
      extended by weblogic.cluster.singleton.ReplicatedLeasingBasis
All Implemented Interfaces:
LeasingBasis

public class ReplicatedLeasingBasis
extends SimpleLeasingBasis

LeasingBasis that delegates to replicated remote instances. Lampson and others advocate that high-performance, high-availability distributed systems utilize hierarchical lease managers. The reasoning is that consensus algorithms are generally slow and costly (this is true of both Paxos and our DatabaseLeasingBasis) and therefore cannot accommodate fast failover. Instead, the primordial lease manager (PLM) is identified using Paxos or some other consensus mechanism with long leases, and is used to manage short leases for some other master lease manager, which we shall call the hierarchical lease manager (HLM). The general principle is that the identification of the PLM is slow but does not rely on singleton state. The HLM is essentially singleton state that is managed through leases rather than consensus and can therefore be fast. The HLM owns the state for the duration of its lease. It may replicate that state elsewhere, or write it to stable storage, for fault tolerance, but it is still the owner for the duration of the lease. Other parties with an interest in the state also agree to abide by the lease interval, thereby avoiding split-brain syndrome.
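
As a rough illustration of lease-based ownership, the sketch below models a table of time-bounded leases such as a PLM might hand out. The class name LeaseTable and its method signatures are hypothetical and are not the WebLogic LeasingBasis API; time is passed in explicitly so that the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the WebLogic LeasingBasis API.
final class LeaseTable {
    private static final class Entry {
        final String owner;
        final long expiresAt;
        Entry(String owner, long expiresAt) {
            this.owner = owner;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> leases = new HashMap<>();

    // Grants the lease if it is free, expired, or already held by the caller
    // (in which case the call acts as a renewal).
    synchronized boolean acquire(String leaseName, String owner, long now, long durationMillis) {
        Entry current = leases.get(leaseName);
        if (current != null && current.expiresAt > now && !current.owner.equals(owner)) {
            return false; // another owner still holds an unexpired lease
        }
        leases.put(leaseName, new Entry(owner, now + durationMillis));
        return true;
    }

    // Returns the current owner, or null if no unexpired lease exists.
    synchronized String findOwner(String leaseName, long now) {
        Entry current = leases.get(leaseName);
        return (current != null && current.expiresAt > now) ? current.owner : null;
    }
}
```

Because ownership lapses on its own once the interval passes, no second party can be granted the lease while the first still believes it holds it, provided both respect the same interval.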

In order for failover of the HLM to occur, failure must be detected and a new HLM elected. The new HLM needs to have access to the old HLM's state. The HLM renews its lease at short intervals, so missed heartbeats can be detected relatively quickly. The new HLM can be elected by having the other candidate HLMs constantly try to acquire the HLM lease from the PLM. The candidate HLMs can have access to the primary HLM's state through a replication framework.
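
The election-by-contention idea can be sketched as follows. This is a simplified simulation under assumed names (HlmLease, HlmElection); it does not model replication of the HLM's state, only how a standby candidate takes over once the primary stops renewing.

```java
import java.util.List;

// Hypothetical sketch; names are illustrative, not WebLogic APIs.
final class HlmLease {
    private String owner;    // current HLM, or null if never granted
    private long expiresAt;  // lease expiry in simulated milliseconds

    // Called by every candidate. Succeeds for the current owner (a renewal)
    // or for any candidate once the previous lease has lapsed.
    synchronized boolean tryAcquire(String candidate, long now, long duration) {
        boolean free = owner == null || expiresAt <= now;
        if (free || owner.equals(candidate)) {
            owner = candidate;
            expiresAt = now + duration;
            return true;
        }
        return false;
    }

    // Returns the holder of an unexpired lease, or null if there is none.
    synchronized String ownerAt(long now) {
        return (owner != null && expiresAt > now) ? owner : null;
    }
}

final class HlmElection {
    // One heartbeat round: every live candidate attempts the lease.
    static String round(HlmLease lease, List<String> liveCandidates, long now, long duration) {
        for (String candidate : liveCandidates) {
            lease.tryAcquire(candidate, now, duration);
        }
        return lease.ownerAt(now);
    }
}
```

In this sketch the failover delay is bounded by the lease duration: a crashed primary remains the nominal owner until its last renewal lapses, after which the first live candidate to ask wins.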

In the caching scenario the HLM is not really a lease manager but instead contains partition location information. A client would manipulate a partitioned map by first inquiring of the HLM where a partition resides. Once the location information is known, the partition is contacted directly. In order to provide fault tolerance, the partition state needs to be synchronously replicated and the HLM needs to determine which copy is the copy-of-record. This can be achieved once again by the primary and secondary leasing against the HLM. This is exactly the scenario we just described for the HLM. One might wonder why we cannot collapse the partition leasing into the HLM, and indeed we can; the issue is one of scalability and fault tolerance.
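
A minimal sketch of the lookup step, assuming a fixed number of partitions and hash-based key placement (neither of which is specified above). PartitionDirectory and its methods are hypothetical names standing in for the HLM's location table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the HLM's role in the caching scenario.
final class PartitionDirectory {
    private final int partitionCount;
    private final Map<Integer, String> copyOfRecord = new HashMap<>();

    PartitionDirectory(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    // Maps a key to a partition id; floorMod keeps the result non-negative
    // even when hashCode() is negative.
    int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Records which server currently holds the copy-of-record for a partition,
    // for example after a secondary is promoted.
    synchronized void assign(int partition, String server) {
        copyOfRecord.put(partition, server);
    }

    // Answers the client's question "where does this key live?".
    // Returns null if the partition has no owner yet.
    synchronized String locate(Object key) {
        return copyOfRecord.get(partitionFor(key));
    }
}
```

A client would call locate once and then talk to the returned server directly, returning to the directory only when that server stops answering.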

Let us consider the case of the partition owners leasing directly against the PLM first. Any cache operation first needs to determine the partition location and then perform the operation. Determining the partition location means that either (a) the caller contacts the PLM to get this, or (b) the partition table is replicated to all servers. Option (a) is clearly a single point of failure: the partition table would have to be maintained in stable storage and the failover time would equate to the server reboot time for the PLM. Option (b) is a non sequitur: the state can be replicated, but it cannot be consulted since all copies, apart from that of the PLM, are not authoritative. For the replicated copies to be authoritative, the consensus algorithm would have to be run on the partition table itself, which is a costly operation.

Let us now consider the case of the partition owners leasing against the HLM which in turn leases against the PLM. Partition location information would be determined by contacting the HLM. How does a server know where the HLM resides? In the first instance the server gets this from the PLM. This is not a lease, merely a bootstrap to the HLM. Once the HLM (and the HLM's secondary) is located a server does not need to contact the PLM again unless both the HLM and HLM secondary fail simultaneously. All well and good, but suppose the PLM fails instead. The master HLM's lease will expire and it will be unable to renew it. In this instance a failure detector can be used to determine the outcome of its current lease.

Failure detectors can be implemented in practice by having the failure detector probe each process regularly; an unresponsive process p is placed on the list of suspected processes and a broadcast message is sent to all processes (including p) announcing its death. If p has not actually crashed, then it will eventually refute its death announcement. Chandra and Toueg show that this weak and unreliable model of failure detectors allows the consensus problem to be solved.
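
The suspect-and-refute bookkeeping can be sketched as below. The class SuspectList is a hypothetical name, and the probing and broadcast transport are deliberately left out; only the state transitions are shown.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of an unreliable failure detector's suspect list.
final class SuspectList {
    private final Set<String> suspects = new HashSet<>();

    // Called when a regular probe of the process goes unanswered;
    // the caller would then broadcast the death announcement.
    synchronized void probeTimedOut(String process) {
        suspects.add(process);
    }

    // Called when the process answers its own death announcement.
    synchronized void refute(String process) {
        suspects.remove(process);
    }

    synchronized boolean isSuspected(String process) {
        return suspects.contains(process);
    }
}
```

The detector is allowed to be wrong: a slow process may be suspected and later cleared, which is why suspicion alone is never treated as proof of failure.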

In our case the master HLM should assume that the PLM has failed and announce this to interested parties. If the PLM has not failed then it will refute the claim, or if the PLM is not contactable by the master but is contactable by the secondary, then the secondary will refute the claim. In this latter case the master should relinquish its lease, since the secondary will have already obtained the lease. If no one refutes the claim then the master can continue to hold its lease until the consensus algorithm has elected a new PLM. For N HLMs we can tolerate N-1 failures. Thus if we want more reliability we need more replicas.
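
The master's decision after announcing a suspected PLM failure can be summarised in a small sketch. All names here are hypothetical. The text does not say what the master does when the PLM itself refutes the claim; resuming normal renewal is an assumption made for this example.

```java
// Hypothetical sketch of the master HLM's decision logic.
enum Refutation { NONE, BY_PLM, BY_SECONDARY }

enum Action { RESUME_RENEWAL, RELINQUISH_LEASE, KEEP_LEASE }

final class MasterHlmPolicy {
    // Decides what the master does once responses to its announcement are in.
    static Action onPlmSuspected(Refutation refutation) {
        switch (refutation) {
            case BY_PLM:
                // Assumption: the PLM is alive, so the master renews as usual.
                return Action.RESUME_RENEWAL;
            case BY_SECONDARY:
                // The secondary can reach the PLM and will already hold the lease.
                return Action.RELINQUISH_LEASE;
            case NONE:
                // Nobody can reach the PLM; hold the lease until a new PLM is elected.
                return Action.KEEP_LEASE;
            default:
                throw new IllegalArgumentException("unknown refutation: " + refutation);
        }
    }

    // With N HLM replicas, N-1 failures can be tolerated.
    static int toleratedFailures(int hlmCount) {
        return hlmCount - 1;
    }
}
```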


Nested Class Summary
 
Nested classes/interfaces inherited from class weblogic.cluster.singleton.SimpleLeasingBasis
SimpleLeasingBasis.LeaseEntry
 
Field Summary
static String BASIS_NAME
           
 
Constructor Summary
ReplicatedLeasingBasis(String leaseType)
           
 
Method Summary
protected static Map getReplicatedMap(String leaseType)
           
 
Methods inherited from class weblogic.cluster.singleton.SimpleLeasingBasis
acquire, findExpiredLeases, findOwner, findPreviousOwner, getLeaseTable, release, renewAllLeases, renewLeases
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BASIS_NAME

public static final String BASIS_NAME
See Also:
Constant Field Values
Constructor Detail

ReplicatedLeasingBasis

public ReplicatedLeasingBasis(String leaseType)
                       throws IOException
Throws:
IOException
Method Detail

getReplicatedMap

protected static Map getReplicatedMap(String leaseType)
                               throws IOException
Throws:
IOException

Copyright 1996, 2011, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
