Restoring Log Files

During normal operations, the nodes in a replication group communicate with one another to ensure that the JE cleaner does not reclaim log files still needed by the group. The tail end of the replication stream may still be needed by a lagging Replica in order to make it current with the Master, and so the replication group makes sure the trailing log files needed to bring lagging Replicas up-to-date are not reclaimed.

However, if a node is unavailable for long enough, the cleaner will reclaim the log files needed to bring it up to date. How long a node can be unavailable before those log files are reclaimed is defined by the REP_STREAM_TIMEOUT property, which you can manage using ReplicationConfig.setConfigParam(). The default value is 24 hours.
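
As a minimal sketch of adjusting this property, the fragment below sets it on a ReplicationConfig before the environment is opened. The class name StreamTimeoutConfig and the "48 h" value are illustrative assumptions, not recommendations, and the sketch assumes a JE release in which REP_STREAM_TIMEOUT is still in effect and the JE library is on the classpath.

```java
import com.sleepycat.je.rep.ReplicationConfig;

// Hypothetical holder class, used only to make the fragment compilable.
final class StreamTimeoutConfig {
    public static void main(String[] args) {
        ReplicationConfig repConfig = new ReplicationConfig();

        // Durations are written as "<value> <unit>"; "48 h" is an
        // illustrative value chosen for this sketch.
        repConfig.setConfigParam(ReplicationConfig.REP_STREAM_TIMEOUT, "48 h");

        // Read the value back to confirm what was stored.
        System.out.println(
            repConfig.getConfigParam(ReplicationConfig.REP_STREAM_TIMEOUT));
    }
}
```

A longer timeout gives a lagging node more time to return, at the cost of retaining more log files on disk across the group.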

Once the cleaner has reclaimed the needed log files, the Replica can no longer be brought up to date using the normal replication stream. Your application code will know this has happened when the ReplicatedEnvironment constructor throws an InsufficientLogException.

When your code catches an InsufficientLogException, you must bring the Replica up to date using a mechanism other than the normal replication stream. You do this using the NetworkRestore class. A call to NetworkRestore.execute() causes the Replica to copy the missing log files from a member of the replication group that has the files and appears to be the least busy. Once the Replica has obtained the log files that it requires, it automatically re-establishes its replication stream with the Master so that the Master can finish bringing the Replica up to date.

For example:

 try {
     node = new ReplicatedEnvironment(envDir, repConfig, envConfig);
 } catch (InsufficientLogException insufficientLogEx) {

     NetworkRestore restore = new NetworkRestore();
     NetworkRestoreConfig config = new NetworkRestoreConfig();
     config.setRetainLogFiles(false); // delete obsolete log files.

     // Use the members returned by insufficientLogEx.getLogProviders() 
     // to select the desired subset of members and pass the resulting 
     // list as the argument to config.setLogProviders(), if the 
     // default selection of providers is not suitable.

     restore.execute(insufficientLogEx, config);

     // retry
     node = new ReplicatedEnvironment(envDir, repConfig, envConfig);
 } ...  
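
The comment in the example above mentions narrowing the set of log providers. One way this might look is sketched below. The class name ProviderSelection, the method restoreFrom, and the preferredNames set are hypothetical names introduced for illustration; selecting providers by node name is just one possible policy, and the sketch assumes the JE library is on the classpath.

```java
import com.sleepycat.je.rep.InsufficientLogException;
import com.sleepycat.je.rep.NetworkRestore;
import com.sleepycat.je.rep.NetworkRestoreConfig;
import com.sleepycat.je.rep.ReplicationNode;

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper class for this sketch.
final class ProviderSelection {

    /**
     * Runs a network restore using only the candidate providers whose
     * node names appear in preferredNames (an application-supplied set).
     * If none of the candidates match, no provider list is set and the
     * default selection of providers is used.
     */
    static void restoreFrom(InsufficientLogException insufficientLogEx,
                            Set<String> preferredNames) {
        List<ReplicationNode> chosen = new ArrayList<>();
        for (ReplicationNode candidate : insufficientLogEx.getLogProviders()) {
            if (preferredNames.contains(candidate.getName())) {
                chosen.add(candidate);
            }
        }

        NetworkRestoreConfig config = new NetworkRestoreConfig();
        config.setRetainLogFiles(false); // delete obsolete log files
        if (!chosen.isEmpty()) {
            config.setLogProviders(chosen);
        }

        new NetworkRestore().execute(insufficientLogEx, config);
    }
}
```

After restoreFrom() returns, the application would retry the ReplicatedEnvironment constructor as in the example above.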

Note that the replication group does not maintain information about the log files needed by secondary nodes. Instead, the system retains additional log files beyond those required for a network restore, in an amount based on the NETWORK_RESTORE_OVERHEAD property, which you can manage using ReplicationConfig.setConfigParam(). The property is a percentage: an estimate of how much more data a network restore must send over the network than replication would send using the same log files. The default value is 10, so by default the system saves files containing an additional 10 percent of log data beyond the amount needed for a network restore.
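
To make the arithmetic concrete, the sketch below takes the description above at face value and computes the total amount of log data retained for a given restore size and overhead percentage. It is an illustration of the stated percentage only, not JE's internal file-retention logic; the class name, method name, and the 2 GB figure are assumptions made for the example.

```java
// Hypothetical helper illustrating the percentage described above.
final class RestoreOverheadEstimate {

    /**
     * Returns the restore size plus the extra share implied by the
     * overhead percentage (the value of NETWORK_RESTORE_OVERHEAD).
     */
    static long retainedBytes(long restoreBytes, int overheadPercent) {
        return restoreBytes + (restoreBytes * overheadPercent) / 100;
    }

    public static void main(String[] args) {
        long restoreBytes = 2_000_000_000L; // illustrative restore size
        int overheadPercent = 10;           // default value per the text
        System.out.println(retainedBytes(restoreBytes, overheadPercent));
    }
}
```

With these inputs, the extra 10 percent amounts to 200,000,000 bytes on top of the 2,000,000,000 bytes needed for the restore.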