Managing Transaction Rollbacks

In the event that a new Master is elected, it is possible for a Replica to find that some of its logs are ahead of the logs held by the Master. While this is unlikely to occur, your code must still be ready to deal with the situation. When it happens, you must roll back the transactions represented by the logs that are ahead of the Master.

You do this by closing all of your ReplicatedEnvironment handles and then reopening them. During the handshaking process that occurs when the Replica rejoins the replication group, the discrepancy in log files is resolved for you.

Note that logs on a Replica are unlikely to be ahead of the log on the Master because the election mechanism favors nodes with the most recent logs. When selecting a Master, a simple majority of nodes is required to vote on the choice, and they vote for the node with the most recent log files. When the problem does occur, though, the updates reflected in the Replica's log are discarded when the log is rolled back.
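The voting rule described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not JE's actual election implementation: the Node class, the lastLogPosition field, and the electMaster method are hypothetical names invented for the example, and a node's log is reduced to a single number.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Toy model only: none of these types or methods are part of the JE API.
final class ElectionSketch {

    // Hypothetical stand-in for a node in the replication group.
    static final class Node {
        final String name;
        final long lastLogPosition;   // higher means a more recent log
        final boolean reachable;      // whether the node can take part in the vote

        Node(String name, long lastLogPosition, boolean reachable) {
            this.name = name;
            this.lastLogPosition = lastLogPosition;
            this.reachable = reachable;
        }
    }

    // Returns the reachable node with the most recent log, or empty if
    // fewer than a simple majority of the group can take part in the vote.
    static Optional<Node> electMaster(List<Node> group) {
        List<Node> voters = group.stream()
                .filter(n -> n.reachable)
                .collect(Collectors.toList());
        int majority = group.size() / 2 + 1;
        if (voters.size() < majority) {
            return Optional.empty();
        }
        return voters.stream()
                .max(Comparator.comparingLong((Node n) -> n.lastLogPosition));
    }

    public static void main(String[] args) {
        List<Node> group = List.of(
                new Node("n1", 100L, false),  // unreachable, but holds the newest log
                new Node("n2", 90L, true),
                new Node("n3", 95L, true));
        Optional<Node> master = electMaster(group);
        System.out.println(master.map(n -> n.name).orElse("no quorum"));
    }
}
```

In this model n3 wins the election because it has the most recent log among the nodes that can vote. If n1 later rejoins the group as a Replica, its log is ahead of the new Master's log, which is the situation that requires a rollback.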

Logs on a Replica can be ahead of the logs on the Master if network or node failures result in transactions becoming durable on fewer than a majority of the nodes in the replication group. This reduced durability is more likely in cases where one or more Replicas show large replication lags relative to the Master. Administrators should monitor replication lags and evaluate whether they are caused by issues with network or host performance. Applications can reduce the chance of transaction rollbacks by avoiding the use of weak durability requirements like ReplicaAckPolicy.NONE or a ReplicationMutableConfig.NODE_PRIORITY of zero.

JE HA lets your application know that a transaction must be rolled back by throwing RollbackException. This exception can be thrown by any operation that is performing routine database access.

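    import com.sleepycat.je.rep.ReplicatedEnvironment;
    import com.sleepycat.je.rep.RollbackException;

    ReplicatedEnvironment repEnv = new ReplicatedEnvironment(...);
    boolean doWork = true;

    while (doWork) {
        try {
            // performSomeDBWork is the method that
            // performs your database access.
            doWork = performSomeDBWork();
        } catch (RollbackException rb) {
            // Close and reopen the environment. The log discrepancy is
            // resolved when the node rejoins the replication group.
            if (repEnv != null) {
                repEnv.close();
                repEnv = new ReplicatedEnvironment(...);
            }
        }
    }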
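
One possible variation on this loop is to cap the number of times the environment is reopened, so that repeated rollbacks surface as an error instead of looping indefinitely. The following is a minimal sketch of that idea, not a JE-provided facility. So that it runs without the JE library, Handle and RollbackSignal are hypothetical stand-ins for ReplicatedEnvironment and RollbackException, and runWithReopen and maxReopens are names invented for the example.

```java
import java.util.function.Supplier;

// Illustrative sketch only. Handle and RollbackSignal are hypothetical
// stand-ins for ReplicatedEnvironment and RollbackException.
final class ReopenOnRollbackSketch {

    static final class RollbackSignal extends RuntimeException {}

    interface Handle {
        boolean doWork();   // returns true while more work remains
        void close();
    }

    // Runs the work loop, closing and reopening the handle each time a
    // rollback is signalled. Rethrows the signal once maxReopens reopens
    // have already been used. Returns the number of reopens performed.
    static int runWithReopen(Supplier<Handle> opener, int maxReopens) {
        Handle handle = opener.get();
        int reopens = 0;
        try {
            boolean more = true;
            while (more) {
                try {
                    more = handle.doWork();
                } catch (RollbackSignal rb) {
                    if (reopens >= maxReopens) {
                        throw rb;
                    }
                    handle.close();
                    handle = null;   // avoid a second close if reopening fails
                    handle = opener.get();
                    reopens++;
                }
            }
        } finally {
            if (handle != null) {
                handle.close();
            }
        }
        return reopens;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Supplier<Handle> opener = () -> new Handle() {
            @Override public boolean doWork() {
                // Signal a rollback on the first call only.
                if (++calls[0] == 1) {
                    throw new RollbackSignal();
                }
                return false;
            }
            @Override public void close() {}
        };
        System.out.println("reopens: " + runWithReopen(opener, 3));
    }
}
```

In a real application the opener would construct a new ReplicatedEnvironment, and any other handles obtained from the old environment would also need to be closed before reopening.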