In JE 7.0 the HA (replication) wire format also changed in order to support the TTL feature. Until all nodes in a replication group have been upgraded to JE 7.0, the TTL feature cannot be used. An exception will be thrown if a write with a non-zero TTL is attempted, and not all nodes have been upgraded. See further below for a description of the TTL feature.
A behavior change was made to DiskOrderedCursor that may require some applications to increase the JE cache size. To prevent applications from having to reserve memory in the Java heap for the DiskOrderedCursor, memory used by the DiskOrderedCursor is now subtracted from the JE cache budget. The maximum amount of such memory is specified, as before, using DiskOrderedCursorConfig.setInternalMemoryLimit. [#24291]
In order to perform write operations in such cases, the application must now call TransactionConfig.setLocalWrite(true) and use this configuration to create a Transaction for performing writes to the non-replicated database.
In addition, it is no longer possible to use a single transaction to write to both replicated and non-replicated databases. IllegalOperationException will be thrown if this is attempted.
These changes were necessary to prevent corruption when a transaction contains write operations for both replicated and non-replicated databases, and a failover occurs that causes a rollback of this transaction. The probability of corruption is low, but it can occur under the right conditions.
For more information see the javadoc for TransactionConfig.setLocalWrite(true), and the "Non-replicated Databases in a Replicated Environment" section of the ReplicatedEnvironment class javadoc.
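The rule described above can be illustrated with a small stand-alone model. This is only a sketch: `LocalWriteSketch`, `Txn` and `tryWrite` are hypothetical names, not JE classes, and `IllegalStateException` merely stands in for the exception JE throws. It assumes that a local-write transaction may write only to non-replicated databases, and that any other transaction may write only to replicated databases.

```java
/** Toy model of the write rule described above; these are not JE classes. */
class LocalWriteSketch {

    /** Hypothetical stand-in for a transaction created with or without local-write. */
    static final class Txn {
        private final boolean localWrite;

        Txn(boolean localWrite) {
            this.localWrite = localWrite;
        }

        /**
         * Accepts a write only when the transaction kind matches the database:
         * local-write transactions write to non-replicated databases, all other
         * transactions write to replicated databases.
         */
        void write(String dbName, boolean dbIsReplicated) {
            if (dbIsReplicated == localWrite) {
                // IllegalStateException is a stand-in for the exception JE throws.
                throw new IllegalStateException(
                    "localWrite=" + localWrite + " cannot write to "
                    + (dbIsReplicated ? "replicated" : "non-replicated")
                    + " database " + dbName);
            }
        }
    }

    /** Returns true if the write is accepted and false if it is rejected. */
    static boolean tryWrite(Txn txn, String dbName, boolean dbIsReplicated) {
        try {
            txn.write(dbName, dbIsReplicated);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Txn local = new Txn(true);
        Txn normal = new Txn(false);
        System.out.println("local txn, non-replicated db: " + tryWrite(local, "localDb", false));
        System.out.println("local txn, replicated db:     " + tryWrite(local, "repDb", true));
        System.out.println("normal txn, replicated db:    " + tryWrite(normal, "repDb", true));
    }
}
```

In a real application the flag would be set with TransactionConfig.setLocalWrite(true) before the transaction is created, as described above.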
One of two utility programs must be used, which are available in the release package for JE 4.1.20, or a later release of JE 4.1. If you are currently running a release earlier than JE 4.1.20, then you must download the latest JE 4.1 release package in order to run these utilities.
The steps for upgrading are as follows.
java -jar je-4.1.20.jar DbPreUpgrade_4_1 -h <dir>
If you are using a JE ReplicatedEnvironment:
java -jar je-4.1.20.jar DbRepPreUpgrade_4_1 -h <dir> -groupName <group name> -nodeName <node name> -nodeHostPort <host:port>
The second step -- running the utility program -- does not perform data conversion. This step simply performs a special checkpoint to prepare the environment for upgrade. It should take no longer than an ordinary startup and shutdown.
During the last step -- when the application opens the JE environment using the
current release (JE 5 or later) -- all databases configured for duplicates will
automatically be converted before the
ReplicatedEnvironment constructor returns. Note that a database
might be explicitly configured for duplicates using
DatabaseConfig.setSortedDuplicates(true), or implicitly configured
for duplicates by using a DPL MANY_TO_XXX relationship.
The duplicate database conversion only rewrites internal nodes in the Btree, not leaf nodes. In a test with a 500 MB cache, conversion of a 10 million record data set (8 byte key and data) took between 1.5 and 6.5 minutes, depending on the number of duplicates per key. The high end of this range is when 10 duplicates per key were used; the low end is with 1 million duplicates per key.
To make the duplicate database conversion predictable during deployment, users
should measure the conversion time on a non-production system before upgrading
a deployed system. When duplicates are converted, the Btree internal nodes are
preloaded into the JE cache. A new configuration option,
EnvironmentConfig.ENV_DUP_CONVERT_PRELOAD_ALL, can be set to false
to optimize this process if the cache is not large enough to hold the internal
nodes for all databases. For more information, see the javadoc for this option.
If an application has no databases configured for duplicates, then the last step simply opens the JE environment normally, and no data conversion is performed.
If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility
program before opening an environment with JE 5 or later for the first time, an
exception such as the following will normally be thrown:
com.sleepycat.je.EnvironmentFailureException: (JE 6.0.1) JE 4.1 duplicate DB entries were found in the recovery interval. Before upgrading to JE 5.0, the following utility must be run using JE 4.1 (4.1.20 or later): DbPreUpgrade_4_1. See the change log. UNEXPECTED_STATE: Unexpected internal state, may have side effects.
    at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:376)
    at com.sleepycat.je.recovery.RecoveryManager.checkLogVersion8UpgradeViolations(RecoveryManager.java:2694)
    at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:549)
    at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:198)
    at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:610)
    ...
If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility
program, but no exception is thrown when the environment is opened with JE 5
or later, this is probably because the application performed an
Environment.sync before last closing the environment with JE 4.1
or earlier, and nothing else happened to be written (by the application or JE
background threads) after the sync operation. In this case, running the
upgrade utility is not necessary.
New 'get', 'put' and 'delete' API methods have been added to support the TTL feature and expansion of the API in the future. Each 'get' method has a ReadOptions parameter, and each 'put' and 'delete' method has a WriteOptions parameter. WriteOptions includes TTL parameters so that a TTL can be assigned to a record. The return value for the new methods is an OperationResult, or null if the operation fails. OperationResult includes the record's expiration time, for records that have been assigned a TTL. The new methods are as follows.
Note that the Collections API does not have new method signatures, since it conforms to the standard Java collections interfaces. Therefore, it is not currently possible to specify a TTL using the Collections API. However, it is possible to use the DPL API for writing data with a TTL, and then use EntityIndex.map or sortedMap to additionally use the Collections API.
com.sleepycat.je.Database
    OperationResult get(Transaction txn, DatabaseEntry key, DatabaseEntry data, Get getType, ReadOptions options)
    OperationResult put(Transaction txn, DatabaseEntry key, DatabaseEntry data, Put putType, WriteOptions options)
    OperationResult delete(Transaction txn, DatabaseEntry key, WriteOptions options)

com.sleepycat.je.Cursor
    OperationResult get(DatabaseEntry key, DatabaseEntry data, Get getType, ReadOptions options)
    OperationResult put(DatabaseEntry key, DatabaseEntry data, Put putType, WriteOptions options)
    OperationResult delete(WriteOptions options)

com.sleepycat.je.SecondaryDatabase
    OperationResult get(Transaction txn, DatabaseEntry key, DatabaseEntry pKey, DatabaseEntry data, Get getType, ReadOptions options)
    OperationResult delete(Transaction txn, DatabaseEntry key, WriteOptions options)

com.sleepycat.je.SecondaryCursor
    OperationResult get(DatabaseEntry key, DatabaseEntry pKey, DatabaseEntry data, Get getType, ReadOptions options)
    OperationResult delete(WriteOptions options)

com.sleepycat.je.ForwardCursor
com.sleepycat.je.JoinCursor
com.sleepycat.je.DiskOrderedCursor
    OperationResult get(DatabaseEntry key, DatabaseEntry data, Get getType, ReadOptions options) // Get.NEXT and CURRENT only

com.sleepycat.persist.PrimaryIndex
    OperationResult put(Transaction txn, E entity, Put putType, WriteOptions writeOptions) // Put.OVERWRITE and NO_OVERWRITE only

com.sleepycat.persist.EntityIndex
    EntityResult get(Transaction txn, K key, Get getType, ReadOptions readOptions) // Get.SEARCH only, more types may be supported later
    OperationResult delete(Transaction txn, K key, WriteOptions writeOptions)

com.sleepycat.persist.EntityCursor
    EntityResult get(Get getType, ReadOptions readOptions) // All Get types except SEARCH_*, which may be supported later
    OperationResult update(V entity, WriteOptions writeOptions)
    OperationResult delete(WriteOptions writeOptions)

The 'put' methods are passed a Put enum value and the 'get' methods are passed a Get enum value.
The enum values correspond to the methods of the older API. For example, Get.SEARCH corresponds to the older Cursor.getSearchKey method and Put.NO_OVERWRITE corresponds to the older Database.putNoOverwrite method. Future enhancements, like TTL, may be supported via the newer 'get' and 'put' methods, so we recommend that these methods be used instead of the older API methods. However, there are no plans to deprecate or remove the older methods at this time. In fact, the older methods still appear in most of the JE example programs and documentation.
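To show the calling convention (one method, an enum selecting the variant, and a null result on failure), here is a minimal stand-alone sketch. It does not use the JE classes: `PutTypeSketch`, its nested `Put` enum and `Result` class are hypothetical stand-ins for Database, Put and OperationResult, and only the OVERWRITE and NO_OVERWRITE variants are modeled.

```java
import java.util.TreeMap;

/** Toy key/value store illustrating an enum-selected put; not the JE API. */
class PutTypeSketch {

    /** Stand-in for the Put enum; only two variants are modeled. */
    enum Put { OVERWRITE, NO_OVERWRITE }

    /** Hypothetical result object: records whether a new record was inserted. */
    static final class Result {
        final boolean inserted;

        Result(boolean inserted) {
            this.inserted = inserted;
        }
    }

    private final TreeMap<String, String> records = new TreeMap<>();

    /** Returns a Result on success, or null if the operation fails. */
    Result put(String key, String data, Put putType) {
        boolean exists = records.containsKey(key);
        if (putType == Put.NO_OVERWRITE && exists) {
            return null; // the key is already present, so nothing is written
        }
        records.put(key, data);
        return new Result(!exists);
    }

    String get(String key) {
        return records.get(key);
    }

    public static void main(String[] args) {
        PutTypeSketch db = new PutTypeSketch();
        System.out.println(db.put("k", "v1", Put.NO_OVERWRITE) != null); // first insert succeeds
        System.out.println(db.put("k", "v2", Put.NO_OVERWRITE) != null); // second insert fails
        System.out.println(db.get("k"));
    }
}
```

Callers of this style of method test the return value for null rather than comparing against a status code.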
ReadOptions and WriteOptions contain a CacheMode parameter for specifying the cache mode on a per-operation basis. ReadOptions also contains a LockMode property, which corresponds to the LockMode parameter of the older 'get' and 'put' methods. To ease the translation of existing code, a LockMode.toReadOptions method is provided.
Another API change has to do with key-only 'get' operations, where returning the record data is not needed. Previously, returning the data and its associated overhead could be avoided only by calling DatabaseEntry.setPartial. Now, null may be passed for the data parameter instead. In fact, null may now be passed for all "output parameters", in both the new and old versions of the 'get' and 'put' methods. For more information, see the "Input and Output Parameters" section of the DatabaseEntry class javadoc.
The JE cleaner has also been enhanced to perform purging of expired data. For each data file, a histogram of expired data sizes is stored and used by the cleaner. Along with the obsolete size information that the cleaner already maintains, the histogram allows knowing when a file is ready for cleaning. New related cleaner statistics are as follows:
Another indication of expired data is shown by the DbSpace utility. This now
outputs minimum and maximum utilization and the total expired bytes. A new
option for this utility,
-t DATE-TIME, shows the utilization and
expired bytes for a specified time.
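The idea of a per-file expiration histogram, and of asking for utilization at a given time, can be sketched as follows. This is a simplified model, not the cleaner's implementation: the class and method names are hypothetical, time is an arbitrary increasing number, a record counts as expired once the time is strictly after its expiration time, and obsolete and expired bytes are assumed not to overlap.

```java
import java.util.TreeMap;

/** Toy per-file histogram of expiring bytes; not the JE cleaner's data structure. */
class ExpirationHistogramSketch {
    private final long totalBytes;
    private final long obsoleteBytes;
    // expiration time -> number of bytes in this file that expire at that time
    private final TreeMap<Long, Long> expiringBytes = new TreeMap<>();

    ExpirationHistogramSketch(long totalBytes, long obsoleteBytes) {
        if (totalBytes <= 0 || obsoleteBytes < 0 || obsoleteBytes > totalBytes) {
            throw new IllegalArgumentException("invalid file sizes");
        }
        this.totalBytes = totalBytes;
        this.obsoleteBytes = obsoleteBytes;
    }

    void addExpiring(long expirationTime, long bytes) {
        expiringBytes.merge(expirationTime, bytes, Long::sum);
    }

    /** Sums the bytes whose expiration time is strictly before the given time. */
    long expiredBytesAt(long time) {
        long sum = 0;
        for (long bytes : expiringBytes.headMap(time, false).values()) {
            sum += bytes;
        }
        return sum;
    }

    /** Percentage of the file that is neither obsolete nor expired at the given time. */
    int utilizationPercentAt(long time) {
        long live = Math.max(0, totalBytes - obsoleteBytes - expiredBytesAt(time));
        return (int) (100 * live / totalBytes);
    }

    public static void main(String[] args) {
        ExpirationHistogramSketch file = new ExpirationHistogramSketch(1000, 200);
        file.addExpiring(10, 300);
        file.addExpiring(20, 100);
        for (long t : new long[] {5, 11, 21}) {
            System.out.println("time " + t + ": expired=" + file.expiredBytesAt(t)
                + " utilization=" + file.utilizationPercentAt(t) + "%");
        }
    }
}
```

Because utilization falls as time passes, a file can become a cleaning candidate without any further writes, which is why a time argument is useful when inspecting utilization.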
The DbCacheSize utility now has a
-ttl option. Specifying this
option causes the estimated cache size to include space for an expiration time
for each record.
The RecoveryProgress.POPULATE_EXPIRATION_PROFILE phase was added to indicate that the cleaner is reading the stored histograms into cache.
EnvironmentConfig.ENV_EXPIRATION_ENABLED is a new config param that is true by default, meaning that expired data is filtered from queries and purged by the cleaner. It might be set to false to recover data after an extended down time.
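The filtering half of this setting can be modeled in a few lines. The sketch below is illustrative only: `ExpirationFilterSketch` is a hypothetical class, purging by the cleaner is not modeled, and a record is assumed to be expired once the current time is strictly after its expiration time.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy store that hides expired records unless filtering is switched off. */
class ExpirationFilterSketch {
    static final long NO_EXPIRATION = 0L;

    private static final class Rec {
        final String data;
        final long expirationTime;

        Rec(String data, long expirationTime) {
            this.data = data;
            this.expirationTime = expirationTime;
        }
    }

    private final Map<String, Rec> records = new HashMap<>();
    private boolean expirationEnabled = true; // enabled by default, as described above

    void setExpirationEnabled(boolean enabled) {
        this.expirationEnabled = enabled;
    }

    void put(String key, String data, long expirationTime) {
        records.put(key, new Rec(data, expirationTime));
    }

    /** Returns the data, or null if the key is absent or filtered out as expired. */
    String get(String key, long now) {
        Rec rec = records.get(key);
        if (rec == null) {
            return null;
        }
        boolean expired = rec.expirationTime != NO_EXPIRATION && now > rec.expirationTime;
        return (expired && expirationEnabled) ? null : rec.data;
    }

    public static void main(String[] args) {
        ExpirationFilterSketch db = new ExpirationFilterSketch();
        db.put("session", "abc", 100);
        System.out.println(db.get("session", 50));  // visible before expiration
        System.out.println(db.get("session", 150)); // filtered after expiration
        db.setExpirationEnabled(false);
        System.out.println(db.get("session", 150)); // visible again with filtering off
    }
}
```

In this model, turning filtering off makes a record visible again only because nothing has removed it; in JE, data that the cleaner has already purged would not come back.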
In addition, the cleaner "backlog" mechanism has been removed, meaning that EnvironmentStats.getCleanerBacklog and EnvironmentConfig.CLEANER_MAX_BATCH_FILES are now deprecated. The backlog mechanism has not been beneficial for some time and was due for removal. When using TTL, because two-pass cleaning can occur even when true utilization is below EnvironmentConfig.CLEANER_MIN_UTILIZATION, the cleaner backlog statistic would have been misleading.
Caused by: java.lang.ArrayIndexOutOfBoundsException: -96
    at com.sleepycat.je.tree.BINDeltaBloomFilter.setBit(BINDeltaBloomFilter.java:257)
    at com.sleepycat.je.tree.BINDeltaBloomFilter.add(BINDeltaBloomFilter.java:113)
    at com.sleepycat.je.tree.BIN.createBloomFilter(BIN.java:1863)
    at com.sleepycat.je.tree.IN.serialize(IN.java:6037)
    at com.sleepycat.je.tree.IN.writeToLog(IN.java:6021)
    at com.sleepycat.je.log.entry.INLogEntry.writeEntry(INLogEntry.java:349)
    at com.sleepycat.je.log.LogManager.marshallIntoBuffer(LogManager.java:731)
    at com.sleepycat.je.log.LogManager.log(LogManager.java:346)
    ...
[#24896] (7.0.0)
The following assertion would rarely occur, and only if assertions were enabled of course.
com.sleepycat.je.EnvironmentFailureException: (JE 6.4.10) UNEXPECTED_STATE: Unexpected internal state, may have side effects.
    at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:397)
    at com.sleepycat.je.tree.IN.getKnownChildIndex(IN.java:782)
    at com.sleepycat.je.evictor.OffHeapCache.freeRedundantBIN(OffHeapCache.java:1974)
    at com.sleepycat.je.tree.IN.updateLRU(IN.java:695)
    at com.sleepycat.je.tree.IN.latchShared(IN.java:600)
    at com.sleepycat.je.recovery.DirtyINMap.selectDirtyINsForCheckpoint(DirtyINMap.java:277)
    at com.sleepycat.je.recovery.Checkpointer.doCheckpoint(Checkpointer.java:816)
    at com.sleepycat.je.recovery.Checkpointer.onWakeup(Checkpointer.java:593)
    at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:184)
    at java.lang.Thread.run(Thread.java:745)
The following assertion would occur more often, whether or not assertions were enabled.
com.sleepycat.je.EnvironmentFailureException: (JE 6.4.10) UNEXPECTED_STATE_FATAL: Failed adding new IN ...
    at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:441)
    at com.sleepycat.je.dbi.INList.add(INList.java:204)
    at com.sleepycat.je.tree.IN.addToMainCache(IN.java:2966)
    at com.sleepycat.je.tree.IN.postLoadInit(IN.java:2939)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:2513)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:2279)
    at com.sleepycat.je.tree.Tree.searchSplitsAllowed(Tree.java:1919)
    at com.sleepycat.je.tree.Tree.searchSplitsAllowed(Tree.java:1857)
    at com.sleepycat.je.tree.Tree.searchSplitsAllowed(Tree.java:1775)
    at com.sleepycat.je.tree.Tree.findBinForInsert(Tree.java:1746)
    at com.sleepycat.je.dbi.CursorImpl.insertRecordInternal(CursorImpl.java:1381)
    at com.sleepycat.je.dbi.CursorImpl.insertOrUpdateRecord(CursorImpl.java:1280)
    at com.sleepycat.je.Cursor.putNoNotify(Cursor.java:2504)
    at com.sleepycat.je.Cursor.putNotify(Cursor.java:2365)
    at com.sleepycat.je.Cursor.putNoDups(Cursor.java:2223)
    at com.sleepycat.je.Cursor.putInternal(Cursor.java:2060)
    at com.sleepycat.je.Cursor.put(Cursor.java:730)
[#24564] (6.4.11)
Data was missed by preload when BIN-deltas were present in cache. If the preload was performed immediately after opening the Environment, this would normally happen only after a crash-recovery (a normal shutdown did not occur). If the preload was performed later on, BIN-deltas might also be in cache due to eviction.
In addition, the list of waiters will now contain the locker or Transaction requesting the lock, for which the LockConflictException is thrown.
The fix does NOT apply to the information output when EnvironmentConfig.TXN_DUMP_LOCKS is set to true. This information is by nature somewhat inaccurate, because normal locking operations are not frozen when this dump is occurring, so changes to the state of the lock table are occurring concurrently.
The fix also does NOT apply to the deadlock information that is sometimes included in the exception message. This information can also be inaccurate due to concurrent locking operations. This is a larger problem that will be fixed in a future release.
Example exception:
Exception in thread "main" com.sleepycat.je.EnvironmentFailureException: (JE 6.4.9) ... last LSN=0x20c7b6/0xa986dc LOG_INTEGRITY: Log information is incorrect, problem is likely persistent. Environment is invalid and must be closed.
    at com.sleepycat.je.recovery.RecoveryManager.traceAndThrowException(RecoveryManager.java:3176)
    at com.sleepycat.je.recovery.RecoveryManager.readINs(RecoveryManager.java:1039)
    at com.sleepycat.je.recovery.RecoveryManager.buildINs(RecoveryManager.java:842)
    at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:757)
    at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:387)
    at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:717)
    at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:254)
    at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:287)
    at com.sleepycat.je.Environment.<init>(Environment.java:268)
    at com.sleepycat.je.Environment.<init>(Environment.java:212)
    at com.sleepycat.je.util.DbDump.openEnv(DbDump.java:422)
    at com.sleepycat.je.util.DbDump.listDbs(DbDump.java:316)
    at com.sleepycat.je.util.DbDump.main(DbDump.java:296)
Caused by: com.sleepycat.je.EnvironmentFailureException: (JE 6.4.9) ... fetchIN of 0x20c756/0x4e81bd parent IN=2785507 IN class=com.sleepycat.je.tree.IN lastFullLsn=0x20c7af/0xc81b2d lastLoggedLsn=0x20c7af/0xc81b2d parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:2523)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:2293)
    at com.sleepycat.je.tree.Tree.getParentINForChildIN(Tree.java:1418)
    at com.sleepycat.je.recovery.RecoveryManager.recoverChildIN(RecoveryManager.java:1338)
    at com.sleepycat.je.recovery.RecoveryManager.recoverIN(RecoveryManager.java:1166)
    at com.sleepycat.je.recovery.RecoveryManager.replayOneIN(RecoveryManager.java:1130)
    at com.sleepycat.je.recovery.RecoveryManager.readINs(RecoveryManager.java:1021)
    ... 11 more
Caused by: java.io.FileNotFoundException: .../0020c756.jdb (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
    at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3226)
    at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3254)
    at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1333)
    at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1204)
    at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1136)
    at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:823)
    at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:788)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:2345)
    ... 17 more
Thanks to Alexander Kharichev for reproducing this bug and capturing the data files that allowed us to find the problem. This took many months of persistence, and special instrumentation for use with the CLEANER_EXPUNGE option in a production environment.
In addition, a change was made to allow off-heap LNs to be evicted sooner, in order to delay eviction of off-heap BINs (or their mutation to BIN-deltas). Previously, when a BIN was evicted from main cache and moved off-heap, its off-heap LNs were made "hot" in the off-heap cache. This no longer occurs.
com.sleepycat.je.EnvironmentFailureException.unexpectedException(EnvironmentFailureException.java:351)
    at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:496)
    at com.sleepycat.je.log.LogManager.logItem(LogManager.java:438)
    at com.sleepycat.je.log.LogManager.log(LogManager.java:350)
    at com.sleepycat.je.tree.LN.logInternal(LN.java:752)
    at com.sleepycat.je.tree.LN.optionalLog(LN.java:473)
    at com.sleepycat.je.dbi.CursorImpl.updateRecordInternal(CursorImpl.java:1689)
    at com.sleepycat.je.dbi.CursorImpl.insertOrUpdateRecord(CursorImpl.java:1321)
    at com.sleepycat.je.Cursor.putNoNotify(Cursor.java:2509)
    at com.sleepycat.je.Cursor.putNotify(Cursor.java:2370)
    at com.sleepycat.je.Cursor.putForReplay(Cursor.java:2038)
    at com.sleepycat.je.DbInternal.putForReplay(DbInternal.java:186)
    at com.sleepycat.je.rep.impl.node.Replay.applyLN(Replay.java:1012)
    ... 2 more
Caused by: java.lang.NullPointerException
    at com.sleepycat.je.rep.vlsn.VLSNIndex.decrement(VLSNIndex.java:526)
    at com.sleepycat.je.rep.impl.RepImpl.decrementVLSN(RepImpl.java:840)
    at com.sleepycat.je.log.LogManager.serialLogWork(LogManager.java:710)
    at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:481)
    ... 13 more
[#24281] (6.4.0)
However, other utilities that read the entire log (e.g., DbPrintLog) must perform a directory listing to skip over gaps in the sequence of file numbers caused by log file deletion (cleaning). Therefore, when a large data set is expected or possible, the file size (EnvironmentConfig.LOG_FILE_MAX) should be configured to a larger size. A file size of one GB is recommended for large data sets.
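As a rough illustration of why file size matters here, the sketch below parses log file names into file numbers and estimates how many files a data set produces. It is not JE code: the class and method names are hypothetical, and the name format (eight hexadecimal digits followed by ".jdb") is inferred from file names that appear in stack traces elsewhere in this change log.

```java
import java.util.Arrays;
import java.util.List;

/** Toy helpers for reasoning about log file numbers; not part of JE. */
class LogFileNumberSketch {

    /** Parses a name such as "0020c756.jdb"; returns -1 for any other name. */
    static long fileNumber(String name) {
        if (!name.matches("[0-9a-fA-F]{8}\\.jdb")) {
            return -1;
        }
        return Long.parseLong(name.substring(0, 8), 16);
    }

    /** Returns the file numbers present in a directory listing, in ascending order. */
    static long[] existingFileNumbers(List<String> names) {
        return names.stream()
            .mapToLong(LogFileNumberSketch::fileNumber)
            .filter(n -> n >= 0)
            .sorted()
            .toArray();
    }

    /** Approximate number of log files needed to hold the given number of bytes. */
    static long approxFileCount(long logBytes, long fileMaxBytes) {
        return (logBytes + fileMaxBytes - 1) / fileMaxBytes;
    }

    public static void main(String[] args) {
        List<String> listing =
            Arrays.asList("00000000.jdb", "0020c756.jdb", "je.lck", "00038369.jdb");
        System.out.println(Arrays.toString(existingFileNumbers(listing)));

        long oneTb = 1L << 40;
        System.out.println("files at 10 MB each: " + approxFileCount(oneTb, 10L * 1024 * 1024));
        System.out.println("files at 1 GB each:  " + approxFileCount(oneTb, 1L << 30));
    }
}
```

With a larger file size the same data set is spread over far fewer files, so the directory listing that a full-log reader must perform stays short.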
com.sleepycat.je.EnvironmentFailureException: (JE 6.3.7) Problem in ReadWindow.fill, reading from = 0 UNEXPECTED_EXCEPTION: Unexpected internal Exception, may have side effects. MasterFeederSource fetching vlsn=5,096,275 waitTime=1000 Uncaught exception in feeder thread:Thread[Feeder Output for rg1-rn5,5,main] Originally thrown by HA thread: MASTER rg1-rn1(1)
    at com.sleepycat.je.EnvironmentFailureException.unexpectedException(EnvironmentFailureException.java:366)
    at com.sleepycat.je.rep.stream.FeederReader$SwitchWindow.fillNext(FeederReader.java:572)
    at com.sleepycat.je.log.FileReader.readData(FileReader.java:822)
    at com.sleepycat.je.log.FileReader.readNextEntryAllowExceptions(FileReader.java:379)
    at com.sleepycat.je.log.FileReader.readNextEntry(FileReader.java:276)
    at com.sleepycat.je.rep.stream.FeederReader.scanForwards(FeederReader.java:308)
    at com.sleepycat.je.rep.stream.MasterFeederSource.getWireRecord(MasterFeederSource.java:100)
    at com.sleepycat.je.rep.impl.node.Feeder$OutputThread.writeAvailableEntries(Feeder.java:1219)
    at com.sleepycat.je.rep.impl.node.Feeder$OutputThread.run(Feeder.java:1109)
Caused by: java.io.FileNotFoundException: /scratch/suitao/dctesting/kvroot/mystore/sn3/rg1-rn1/env/00000000.jdb (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
    at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3201)
    at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3229)
    at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1308)
    at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1179)
    at com.sleepycat.je.rep.stream.FeederReader$SwitchWindow.fillNext(FeederReader.java:511)
    ... 7 more
[#24299] (6.4.0)
Please be aware of the following limitations in the initial release of this feature:
The following additional API additions are associated with the off-heap cache.
INFO to emphasize that the protection of files from deletion is expected behavior
EnvironmentFailureException being thrown from the method
Environment.beginTransaction(), when a replicated environment was closed at a master while new transactions were being concurrently initiated. The following representative stack trace is symptomatic of this problem (the specifics of the stack trace may vary depending on the JE release):
...
    at com.sleepycat.je.EnvironmentFailureException.unexpectedException(EnvironmentFailureException.java:351)
    at com.sleepycat.je.rep.utilint.RepUtils$ExceptionAwareCountDownLatch.awaitOrException(RepUtils.java:268)
    at com.sleepycat.je.rep.utilint.SizeAwaitMap.sizeAwait(SizeAwaitMap.java:106)
    at com.sleepycat.je.rep.impl.node.FeederManager.awaitFeederReplicaConnections(FeederManager.java:528)
    at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureReplicasForCommit(DurabilityQuorum.java:74)
    at com.sleepycat.je.rep.impl.RepImpl.txnBeginHook(RepImpl.java:944)
    at com.sleepycat.je.rep.txn.MasterTxn.txnBeginHook(MasterTxn.java:158)
    at com.sleepycat.je.txn.Txn.initTxn(Txn.java:365)
    at com.sleepycat.je.txn.Txn.<init>(Txn.java:275)
    at com.sleepycat.je.txn.Txn.<init>(Txn.java:254)
    at com.sleepycat.je.rep.txn.MasterTxn.<init>(MasterTxn.java:114)
    at com.sleepycat.je.rep.txn.MasterTxn$1.create(MasterTxn.java:102)
    at com.sleepycat.je.rep.txn.MasterTxn.create(MasterTxn.java:380)
    at com.sleepycat.je.rep.impl.RepImpl.createRepUserTxn(RepImpl.java:924)
    at com.sleepycat.je.txn.Txn.createUserTxn(Txn.java:301)
    at com.sleepycat.je.txn.TxnManager.txnBegin(TxnManager.java:182)
    at com.sleepycat.je.dbi.EnvironmentImpl.txnBegin(EnvironmentImpl.java:2366)
    at com.sleepycat.je.Environment.beginTransactionInternal(Environment.java:1437)
    at com.sleepycat.je.Environment.beginTransaction(Environment.java:1319)
    ...
Caused by: java.lang.IllegalStateException: FeederManager shutdown
    at com.sleepycat.je.rep.impl.node.FeederManager.shutdownFeeders(FeederManager.java:498)
    at com.sleepycat.je.rep.impl.node.FeederManager.runFeeders(FeederManager.java:462)
    at com.sleepycat.je.rep.impl.node.RepNode.run(RepNode.java:1479)
[#23970] (6.3.0)
Normally, records (key-value pairs) are stored on disk as individual byte sequences called LNs (leaf nodes) and they are accessed via a Btree. Specifically, the bottom layer nodes of the Btree (called BINs) contain an array of slots, where each slot represents an associated data record. Among other things, it stores the key of the record and the most recent disk address of that record. Records and BTree nodes share the disk space (are stored in the same kind of files), but LNs are stored separately from BINs, i.e., there is no clustering or co-location of a BIN and its child LNs.
With embedded LNs, a whole record may be stored inside a BIN (i.e., a BIN slot may contain both the key and the data portion of a record). A record will be "embedded" if the size (in bytes) of its data portion is less than or equal to the value of the new EnvironmentConfig.TREE_MAX_EMBEDDED_LN configuration parameter. The decision to embed a record or not is taken on a record-by-record basis. As a result, a BIN may contain both embedded and non-embedded records. The "embeddedness" of a record is a dynamic property: a size-changing update may turn a non-embedded record into an embedded one or vice versa.
The performance trade-offs of embedding or not embedding records are described in the javadoc for the TREE_MAX_EMBEDDED_LN configuration parameter.
To exploit embedded LNs during disk ordered scans, a new "binsOnly" mode has been added in DiskOrderedCursorConfig. In this mode, only the BINs of a database will be accessed (not the LNs). As a result, the scan will be faster, but the data portion of a record will be returned only if the record is embedded. This is most useful when we expect that all the records in a database will be embedded.
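The embedding rule and its effect on a "binsOnly" scan can be sketched with a toy BIN. This is a simplified model rather than JE's Btree code: `EmbeddedLnSketch` and its methods are hypothetical, keys are strings for brevity, and a slot holds either the embedded data or null when the data would live in a separate LN.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy BIN whose slots embed record data up to a size threshold; not JE code. */
class EmbeddedLnSketch {
    private final int maxEmbeddedLn;
    // key -> embedded data, or null when the data is stored separately as an LN
    private final Map<String, byte[]> slots = new TreeMap<>();

    EmbeddedLnSketch(int maxEmbeddedLn) {
        this.maxEmbeddedLn = maxEmbeddedLn;
    }

    /** Inserts or updates a record; the embedding decision is re-made on every write. */
    void put(String key, byte[] data) {
        slots.put(key, data.length <= maxEmbeddedLn ? data.clone() : null);
    }

    boolean isEmbedded(String key) {
        return slots.get(key) != null;
    }

    /** Scan that reads only the BIN: data is available only for embedded records. */
    Map<String, byte[]> scanBinsOnly() {
        return new TreeMap<>(slots);
    }

    public static void main(String[] args) {
        EmbeddedLnSketch bin = new EmbeddedLnSketch(16);
        bin.put("small", new byte[8]);
        bin.put("large", new byte[100]);
        for (Map.Entry<String, byte[]> e : bin.scanBinsOnly().entrySet()) {
            System.out.println(e.getKey() + " -> "
                + (e.getValue() == null ? "key only" : e.getValue().length + " bytes"));
        }
        bin.put("small", new byte[32]); // a size-changing update un-embeds the record
        System.out.println("small embedded after update: " + bin.isEmbedded("small"));
    }
}
```

The scan returns every key but only some of the data, which is why the mode is most useful when all records are expected to fall under the threshold.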
Finally, a new statistic has been added to the PreloadStats class. It is the number of embedded LNs encountered during the preload() operation, and is accessible via the getNEmbeddedLNs() method.
First, we clarified the documented definition of partial comparators, although the actual behavior of partial comparators did not change. The documentation change is subtle and will only be interesting to those currently using the PartialComparator interface. See the PartialComparator javadoc for details.
The second change is a fix for a bug that could occur only if a PartialComparator was used (and as a result record keys were updatable). In this case and under some rare situations, updates done on keys could be lost.
1. The only potential benefit of KEEP_HOT, as compared to DEFAULT, is that KEEP_HOT attempts to keep the record's leaf-node (LN) and its containing bottom internal node (BIN) in cache even if it is not accessed frequently. We don't know of a use case for this behavior.
2. There are currently implementation problems with KEEP_HOT. The current implementation of the cache evictor is based on an LRU list, and there is no practical way to keep all BINs accessed with KEEP_HOT at the hot end of the LRU list. The current implementation moves it to the hot end when it reaches the cold end (as other BINs are accessed and moved to the hot end), if the BIN has not been accessed since it was made "keep hot". But if the BIN again moves to the cold end, it is evicted to try to prevent the cache from overflowing when KEEP_HOT is used for many operations. This approach does not really guarantee that the cache won't overflow, and also does not really force the node to stay hot.
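The second-chance behavior described in point 2 can be modeled with a small LRU list. This is a sketch of the description only, not the evictor's implementation: `KeepHotLruSketch` is a hypothetical class, BINs are identified by strings, and any later access without KEEP_HOT is assumed to cancel a pending second chance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU list with a one-time "keep hot" second chance; not the JE evictor. */
class KeepHotLruSketch {
    // Iteration order runs from the cold end (first) to the hot end (last).
    // The value is true while a KEEP_HOT second chance is still pending.
    private final LinkedHashMap<String, Boolean> lru = new LinkedHashMap<>();

    /** Records an access, moving the BIN to the hot end. */
    void access(String bin, boolean keepHot) {
        lru.remove(bin);
        lru.put(bin, keepHot);
    }

    /** Evicts and returns the coldest BIN without a pending second chance, or null if empty. */
    String evictOne() {
        while (!lru.isEmpty()) {
            Map.Entry<String, Boolean> coldest = lru.entrySet().iterator().next();
            String bin = coldest.getKey();
            boolean secondChance = coldest.getValue();
            lru.remove(bin);
            if (!secondChance) {
                return bin;
            }
            // Use up the second chance: move to the hot end once, evict next time.
            lru.put(bin, false);
        }
        return null;
    }

    public static void main(String[] args) {
        KeepHotLruSketch cache = new KeepHotLruSketch();
        cache.access("A", true);  // accessed with KEEP_HOT
        cache.access("B", false);
        cache.access("C", false);
        String evicted;
        while ((evicted = cache.evictOne()) != null) {
            System.out.println("evicted " + evicted);
        }
    }
}
```

In this model a BIN accessed with KEEP_HOT outlives the BINs that were colder than it once, but it is still evicted on its second trip to the cold end, which matches the observation that the node is not really forced to stay hot.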
1. MAKE_COLD was originally added in an attempt to avoid perturbing the cache for full Database scans, etc. The UNCHANGED mode should really be used for this purpose, especially given the improvements made to this mode (discussed above).
2. The main difference between MAKE_COLD and the new behavior of UNCHANGED is that MAKE_COLD always evicts the LN and BIN, regardless of whether they have been made "hot" by other operations. Again, we don't know of a use case for this behavior.
Also, the javadoc for Environment.close now talks about performing an extra checkpoint prior to calling close and disabling the cleaner threads. This is related to the "batch cleaning" process described in the cleanLogFile javadoc.
Caused by: com.sleepycat.je.util.LogVerificationException: Log is invalid, fileName: 00038369.jdb fileNumber: 0x38369 logEntryOffset: 0x84 verifyState: INVALID reason: Header prevOffset=0x26 but prevEntryStart=0x45
[#24211] (6.3.4)
Caused by: java.lang.IllegalArgumentException: Host and port pair was missing
    at com.sleepycat.je.rep.utilint.HostPortPair.getSocket(HostPortPair.java:29)
    at com.sleepycat.je.rep.utilint.HostPortPair.getSockets(HostPortPair.java:56)
    at com.sleepycat.je.rep.impl.RepImpl.getHelperSockets(RepImpl.java:1499)
    at com.sleepycat.je.rep.impl.node.RepNode.findMaster(RepNode.java:1214)
    at com.sleepycat.je.rep.impl.node.RepNode.startup(RepNode.java:787)
    at com.sleepycat.je.rep.impl.node.RepNode.joinGroup(RepNode.java:1988)
    at com.sleepycat.je.rep.impl.RepImpl.joinGroup(RepImpl.java:523)
    at com.sleepycat.je.rep.ReplicatedEnvironment.joinGroup(ReplicatedEnvironment.java:525)
    at com.sleepycat.je.rep.ReplicatedEnvironment.
When an empty string is specified for the helper host/port, the parameter is not used by JE. [#24234] (6.3.6)
For background and previous work in this area, see the changelog for the 6.1 release. In this release we have extended the set of CRUD operations that are performed in BIN-deltas, without the need to mutate them to full BINs (and thus saving the disk reads that would be required to fetch the full BINs in memory). Specifically, the following additional operations can now exploit BIN-deltas:
Insertions and updates, when no tree node splits are required and the key of the record to be inserted/updated is found in a BIN-delta.
Blind operations: we say that a record operation (insertion, update, or deletion) is performed "blindly" in a BIN-delta, when the delta does not contain a slot with the operation's key and we don't need to access the full BIN to check whether such a slot exists there or to extract any information from the full-BIN slot, if it exists. The condition that no tree node splits are required applies to blind operations as well. The following operations can be performed blindly: - Replay of insertions at replica nodes. - Insertions during recovery redo. - Updates and deletes during recovery redo, for databases with duplicates.
A new statistic has been added to count the number of blind operations performed,
including the blind put operations described below. This count can be obtained
Normally, blind puts are not possible: we need to know whether the put is actually an update or an insertion, i.e., whether the key exists in the full BIN or not. Furthermore, in case of update we also need to know the location of the previous record version to make the current update abortable. However, it is possible to answer at least the key existence question by adding a small amount of extra information in the deltas. If we do so, puts that are actual insertions can be done blindly.
To answer whether a key exists in a full BIN or not, each BIN-delta stores a bloom filter, which is a very compact, approximate representation of the set of keys in the full BIN. Bloom filters can answer set membership questions with no false negatives and very low probability of false positives. As a result, put operations that are actual insertions can almost always be performed blindly.
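A minimal Bloom filter shows why a negative answer is enough to allow a blind insertion. The sketch below is a generic textbook construction, not JE's BIN-delta filter: the class name, the sizing parameters and the hash derivation from `Arrays.hashCode` are all assumptions chosen for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Generic Bloom filter over byte-array keys; not JE's BIN-delta implementation. */
class BloomFilterSketch {
    private final BitSet bits;
    private final int nBits;
    private final int nHashes;

    BloomFilterSketch(int nBits, int nHashes) {
        if (nBits <= 0 || nHashes <= 0) {
            throw new IllegalArgumentException("nBits and nHashes must be positive");
        }
        this.bits = new BitSet(nBits);
        this.nBits = nBits;
        this.nHashes = nHashes;
    }

    void add(byte[] key) {
        for (int i = 0; i < nHashes; i++) {
            bits.set(index(key, i));
        }
    }

    /** False means the key was definitely never added; true means it may have been. */
    boolean mightContain(byte[] key) {
        for (int i = 0; i < nHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** A put can be treated as an insertion without reading the full BIN. */
    boolean canInsertBlindly(byte[] key) {
        return !mightContain(key);
    }

    // Derives the i-th bit index from two hash values (double hashing).
    // Math.floorMod keeps the index non-negative even when the sum overflows.
    private int index(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = Integer.rotateLeft(h1, 16) | 1;
        return Math.floorMod(h1 + i * h2, nBits);
    }

    public static void main(String[] args) {
        BloomFilterSketch filter = new BloomFilterSketch(1024, 3);
        byte[] existing = "key-in-full-bin".getBytes(StandardCharsets.UTF_8);
        filter.add(existing);
        System.out.println("existing key may be present: " + filter.mightContain(existing));
        System.out.println("existing key blind insert allowed: " + filter.canInsertBlindly(existing));
    }
}
```

A false positive only costs the optimization (the full BIN is consulted unnecessarily), while the absence of false negatives is what makes skipping the full BIN safe.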
To make the blind puts optimization possible in JE databases that use custom
BTree and/or duplicates comparators, these comparators must perform "binary
equality"; that is, they must consider two keys (byte arrays) to be equal if
and only if they have the same length and are equal byte for byte. To
communicate to the JE engine that a comparator does binary equality, the
comparator must implement the new
Exception in thread "main" com.sleepycat.je.DatabaseNotFoundException: (JE 6.1.5) Attempted to remove non-existent database ...
    at com.sleepycat.je.dbi.DbTree.lockNameLN(DbTree.java:869)
    at com.sleepycat.je.dbi.DbTree.doRemoveDb(DbTree.java:1130)
    at com.sleepycat.je.dbi.DbTree.dbRemove(DbTree.java:1183)
    at com.sleepycat.je.Environment$1.runWork(Environment.java:947)
    at com.sleepycat.je.Environment$DbNameOperation.runOnce(Environment.java:1172)
    at com.sleepycat.je.Environment$DbNameOperation.run(Environment.java:1155)
    at com.sleepycat.je.Environment.removeDatabase(Environment.java:941)
    ...
A workaround for the problem in earlier releases is to avoid using read-committed for a transaction used to perform a DB remove or truncate operation.
com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 6.1.0) ...
    at com.sleepycat.je.EnvironmentFailureException.unexpectedException(EnvironmentFailureException.java:315)
    at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:477)
    at com.sleepycat.je.log.LogManager.logItems(LogManager.java:419)
    at com.sleepycat.je.log.LogManager.multiLog(LogManager.java:324)
    at com.sleepycat.je.log.LogManager.log(LogManager.java:272)
    at com.sleepycat.je.log.LogManager.log(LogManager.java:261)
    at com.sleepycat.je.log.LogManager.log(LogManager.java:223)
    at com.sleepycat.je.dbi.EnvironmentImpl.rewriteMapTreeRoot(EnvironmentImpl.java:1285)
    at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:701)
    at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:274)
    at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:137)
    at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:148)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 111
    at com.sleepycat.util.PackedInteger.writeInt(PackedInteger.java:188)
    at com.sleepycat.je.log.LogUtils.writePackedInt(LogUtils.java:155)
    at com.sleepycat.je.cleaner.DbFileSummary.writeToLog(DbFileSummary.java:79)
    at com.sleepycat.je.dbi.DatabaseImpl.writeToLog(DatabaseImpl.java:2410)
    at com.sleepycat.je.dbi.DbTree.writeToLog(DbTree.java:2050)
    at com.sleepycat.je.log.entry.SingleItemEntry.writeEntry(SingleItemEntry.java:114)
    at com.sleepycat.je.log.LogManager.marshallIntoBuffer(LogManager.java:745)
    at com.sleepycat.je.log.LogManager.serialLogWork(LogManager.java:611)
    at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:461)
    ... 11 more
Another instance of the same problem with a slightly different stack trace is below:
java.nio.BufferOverflowException UNEXPECTED_EXCEPTION_FATAL: Unexpected internal Exception, unable to continue. Environment is invalid and must be closed.
    at com.sleepycat.je.EnvironmentFailureException.unexpectedException(EnvironmentFailureException.java:315)
    at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:481)
    at com.sleepycat.je.log.LogManager.logItems(LogManager.java:423)
    at com.sleepycat.je.log.LogManager.multiLog(LogManager.java:325)
    at com.sleepycat.je.log.LogManager.log(LogManager.java:273)
    at com.sleepycat.je.tree.LN.logInternal(LN.java:600)
    at com.sleepycat.je.tree.LN.log(LN.java:411)
    at com.sleepycat.je.cleaner.FileProcessor.processFoundLN(FileProcessor.java:1070)
    at com.sleepycat.je.cleaner.FileProcessor.processLN(FileProcessor.java:884)
    at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:673)
    at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:278)
    at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:137)
    at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:148)
Caused by: java.nio.BufferOverflowException
    at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:189)
    at java.nio.ByteBuffer.put(ByteBuffer.java:859)
    at com.sleepycat.je.log.LogUtils.writeBytesNoLength(LogUtils.java:350)
    at com.sleepycat.je.log.entry.LNLogEntry.writeBaseLNEntry(LNLogEntry.java:371)
    at com.sleepycat.je.log.entry.LNLogEntry.writeEntry(LNLogEntry.java:333)
    at com.sleepycat.je.log.entry.BaseReplicableEntry.writeEntry(BaseReplicableEntry.java:48)
    at com.sleepycat.je.log.entry.LNLogEntry.writeEntry(LNLogEntry.java:52)
    at com.sleepycat.je.log.LogManager.marshallIntoBuffer(LogManager.java:751)
    at com.sleepycat.je.log.LogManager.serialLogWork(LogManager.java:617)
    at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:465)
Here is the specific scenario:
In addition, the EnvironmentConfig.ENV_LATCH_TIMEOUT parameter has been exposed to provide control over the timeout interval for atypical applications. This parameter has been present internally since latch timeouts were added in JE 6.0.3; however, the parameter was previously undocumented.
...
LockManager.findDeadlock1
LockManager.findDeadlock
LockManager.makeTimeoutMsgInternal
...
"THREAD-USING-READ-COMMITTED":
    at com.sleepycat.je.txn.Txn.setState(Txn.java:2039)
    - waiting to lock <0x000000078953b720> (a com.sleepycat.je.txn.Txn)
    at com.sleepycat.je.txn.Txn.setOnlyAbortable(Txn.java:1887)
    at com.sleepycat.je.txn.BuddyLocker.setOnlyAbortable(BuddyLocker.java:158)
    at com.sleepycat.je.OperationFailureException.<init>(OperationFailureException.java:200)
    at com.sleepycat.je.LockConflictException.<init>(LockConflictException.java:135)
    at com.sleepycat.je.LockTimeoutException.<init>(LockTimeoutException.java:48)
    at com.sleepycat.je.txn.LockManager.newLockTimeoutException(LockManager.java:665)
    at com.sleepycat.je.txn.LockManager.makeTimeoutMsgInternal(LockManager.java:623)
    at com.sleepycat.je.txn.SyncedLockManager.makeTimeoutMsg(SyncedLockManager.java:97)
    - locked <0x000000079068eaa8> (a com.sleepycat.je.latch.Latch)
    at com.sleepycat.je.txn.LockManager.lockInternal(LockManager.java:390)
    at com.sleepycat.je.txn.LockManager.lock(LockManager.java:276)
    ...
"ANOTHER-THREAD-LOCKING-THE-SAME-RECORD":
    at com.sleepycat.je.txn.SyncedLockManager.attemptLock(SyncedLockManager.java:73)
    - waiting to lock <0x000000079068eaa8> (a com.sleepycat.je.latch.Latch)
    at com.sleepycat.je.txn.LockManager.lockInternal(LockManager.java:292)
    at com.sleepycat.je.txn.LockManager.lock(LockManager.java:276)
    - locked <0x000000078953b720> (a com.sleepycat.je.txn.Txn)
    ...
(JE 6.2.6) ... Latch not held: BIN17923 currentThread: ... currentTime: ... exclusiveOwner: -none- UNEXPECTED_STATE_FATAL: Unexpected internal state, unable to continue. Environment is invalid and must be closed.
    at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:405)
    at com.sleepycat.je.latch.LatchImpl.release(LatchImpl.java:109)
    at com.sleepycat.je.tree.IN.releaseLatch(IN.java:519)
    at com.sleepycat.je.dbi.CursorImpl.skipInternal(CursorImpl.java:2737)
    at com.sleepycat.je.dbi.CursorImpl.skip(CursorImpl.java:2612)
    at com.sleepycat.je.Cursor.countHandleDups(Cursor.java:4055)
    at com.sleepycat.je.Cursor.countInternal(Cursor.java:4028)
    at com.sleepycat.je.Cursor.count(Cursor.java:1804)
    at ...
The last line above is a call to Cursor.count. The same problem could happen if Cursor.skipNext or skipPrev is called, and only the last few lines of the stack trace above would be different.
java.lang.AssertionError
    at com.sleepycat.je.dbi.CursorImpl.getCurrentKey(CursorImpl.java:500)
    at com.sleepycat.je.dbi.CursorImpl.getCurrentKey(CursorImpl.java:483)
    at com.sleepycat.je.Cursor.dupsGetNextOrPrevDup(Cursor.java:2882)
    at com.sleepycat.je.Cursor.retrieveNextHandleDups(Cursor.java:2836)
    at com.sleepycat.je.Cursor.retrieveNext(Cursor.java:2816)
    at com.sleepycat.je.Cursor.getNextDup(Cursor.java:1150)
    [ app specific portion ... ]
In the stack trace above the Cursor.getNextDup method is being called. There are other operations where the same thing could happen. The common factor is the call to the internal CursorImpl.getCurrentKey method, which fires the assertion.
com.sleepycat.je.EnvironmentFailureException: (JE 6.2.29) ... last LSN=0x533/0x41f59 LOG_INTEGRITY: Log information is incorrect, problem is likely persistent. Environment is invalid and must be closed.
    at com.sleepycat.je.recovery.RecoveryManager.traceAndThrowException(RecoveryManager.java:3031)
    at com.sleepycat.je.recovery.RecoveryManager.readINs(RecoveryManager.java:1010)
    at com.sleepycat.je.recovery.RecoveryManager.buildINs(RecoveryManager.java:804)
    at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:717)
    at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:352)
    at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:670)
    at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:208)
    at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:251)
    at com.sleepycat.je.Environment.<init>(Environment.java:232)
    at com.sleepycat.je.Environment.<init>(Environment.java:188)
    at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:573)
    at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:443)
    [ app specific portion ... ]
Caused by: com.sleepycat.je.EnvironmentFailureException: (JE 6.2.29) ... fetchIN of 0x35c/0x3f7f9 parent IN=11688 IN class=com.sleepycat.je.tree.IN lastFullVersion=0x533/0x5d47d lastLoggedVersion=0x533/0x5d47d parent.getDirty()=false state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:1866)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:1764)
    at com.sleepycat.je.tree.Tree.getParentINForChildIN(Tree.java:1346)
    at com.sleepycat.je.recovery.RecoveryManager.recoverChildIN(RecoveryManager.java:2025)
    at com.sleepycat.je.recovery.RecoveryManager.recoverIN(RecoveryManager.java:1834)
    at com.sleepycat.je.recovery.RecoveryManager.replayOneIN(RecoveryManager.java:1099)
    at com.sleepycat.je.recovery.RecoveryManager.readINs(RecoveryManager.java:988)
    ... 16 more
Caused by: java.io.FileNotFoundException: .../0000035c.jdb (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
    at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3260)
    at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3288)
    at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1311)
    at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1183)
    at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1135)
    at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:822)
    at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:787)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:1801)
    ... 22 more
[#23990] (6.2.31)
com.sleepycat.je.EnvironmentFailureException: (JE 6.2.9) ... fetchIN of 0x10cbc/0x696373 parent IN=84363 IN class=com.sleepycat.je.tree.IN lastFullVersion=0x10e00/0x82006e lastLoggedVersion=0x10e00/0x82006e parent.getDirty()=false state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:1866)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:1752)
    at com.sleepycat.je.tree.Tree.search(Tree.java:2293)
    at com.sleepycat.je.tree.Tree.search(Tree.java:2193)
    at com.sleepycat.je.tree.Tree.getParentBINForChildLN(Tree.java:1481)
    at com.sleepycat.je.cleaner.FileProcessor.processLN(FileProcessor.java:836)
    ... 5 more
Caused by: java.io.FileNotFoundException: /local/pyrox/DS2/asinst_1/OUD/db/Europe/00010cbc.jdb (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
    at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3208)
    at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3236)
    at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1305)
    at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1177)
    at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1151)
    at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:843)
    at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:808)
    at com.sleepycat.je.tree.IN.fetchINWithNoLatch(IN.java:1801)
    ... 10 more
[#24046] (6.2.31)
For Oracle NoSQL DB users only, record versions are now discarded using a separate eviction step. This means that the record versions can be discarded to free cache memory without discarding the entire BIN (bottom internal node). In general, this makes better use of memory and reduces IO for some workloads.
The improvements to DbCacheSize are as follows.
When -je.rep.preserveRecordVersion true is passed on the command line, more information is output by the utility. See the new Record Versions and Oracle NoSQL Database section of the DbCacheSize javadoc for more information.
-je.log.fileMax LENGTH on the command line as described in the javadoc.
ReplicaWriteException. Previously an attempt to serialize this exception could fail with the following characteristic stack trace when the StateChangeEvent object was encountered during serialization:
Caused by: java.io.NotSerializableException: com.sleepycat.je.rep.StateChangeEvent
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1181)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
    at java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:439)
    at java.util.logging.LogRecord.writeObject(LogRecord.java:470)
    ...
[#23578] (6.1.1)
Before JE 6.0, BIN-deltas were used as a disk optimization only: to reduce the number of bytes written to disk every time a new BIN version had to be logged. BIN-deltas would never appear in the in-memory BTrees, and if the most recently logged version of a BIN was a delta, fetching that BIN into the in-memory tree required 2 disk reads: one for the delta and one for the most recent full-BIN version.
Starting with JE 6.0, BIN-deltas can appear in the in-memory BTree. Specifically, if a full dirty BIN is selected for eviction, rather than evicting the whole BIN (and incurring a disk write), the BIN is converted to a delta that stays in the cache. If a subsequent operation needs the full BIN and the delta is still in the cache, only one disk read will be done.
Further disk-read savings can be realized, because many operations can (under certain conditions) be performed directly on the BIN-delta, without the need for the full BIN. However, in 6.0, only a small subset of background operations were modified to exploit BIN-deltas. In JE 6.1, the set of operations that can be performed on BIN-deltas has been extended. Examples of such operations include key searches in BTrees, when the search key is found in a BIN-delta, and deletion or update of the record a cursor is positioned on, when the cursor is positioned on a BIN-delta. These changes affect internal operations as well as the search, delete, and putCurrent methods of the Database and Cursor API classes.
Typically, thread synchronization during BTree searches is done via latch coupling: at most 2 tree nodes (a parent and a child) are latched at a time. Furthermore, a node is latched in shared (SH) mode, unless it is expected that it will be updated, in which case it is latched in exclusive (EX) mode. Finally, SH latches are not upgradeable to EX latches (to avoid deadlocks and reduce latching overhead).
JE follows this general latch-coupling technique. However, it also has to deal with the JE-specific fact that fetching a missing child node into the cache requires that its memory-resident parent be updated (because the parent points to its children via direct Java object references). As a result, during a JE BTree search every node is potentially updated, which precludes the use of SH latches. To cope with this complication, JE has been using one of the following approaches during its various kinds of BTree searches: (a) use SH latches, but if a missing child needs to be fetched, release the SH latch on the parent and restart the search from the beginning, using EX latches on all nodes this time, (b) do grandparent latching: use SH latches but keep a latch on the grandparent node so that if we need to fetch a missing child of the parent node, the SH latch on the parent can be released, and then the parent can be relatched in EX mode, (c) do latch-coupling with EX latches only. Obviously, (c) is the worst choice, but all of the 3 approaches result in more and longer-held EX latches than necessary. As a result, some JE applications have experienced performance problems due to excessive latch contention during BTree searches.
In JE 6.1, a new latching algorithm has been implemented to replace all of (a), (b), and (c) above. The new algorithm uses SH latches, but if a missing child needs to be fetched, it first "pins" the parent (to prevent its eviction), then releases the SH latch on the parent, and finally reads the child node from the log (without any latches held). After the child is fetched, it latches the remembered parent in EX mode, unpins it, and checks whether it is still the correct parent for the search and for the child node that was fetched. If so, the search continues down the tree. If not, it restarts the search from the beginning. Compared to approach (a) above, this new algorithm may restart a search multiple times; however, the probability of even a single restart is lower than in (a), and each restart uses SH latches. Furthermore, no latches are held during the long random disk read done to fetch a missing child.
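The pin, release, fetch, and relatch sequence described above can be sketched for a single parent and child slot as follows. This is an illustration only, not JE's code: the `Node` class, its `pinCount` and `inTree` fields, and the `readFromLog` supplier are hypothetical stand-ins, and the sketch omits the latch coupling that a real descent would do once the child is found.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

public class PinAndFetchSketch {
    /** Hypothetical tree node with a shared/exclusive latch and a pin count. */
    static final class Node {
        final ReentrantReadWriteLock latch = new ReentrantReadWriteLock();
        final AtomicInteger pinCount = new AtomicInteger(); // > 0 would block eviction
        volatile boolean inTree = true; // false once the node left the tree
        Node child;                     // null means "not in cache"; guarded by latch
    }

    /**
     * Returns the child of parent, fetching it if it is missing. Returns null
     * when the parent is no longer valid, in which case the caller restarts
     * the search from the root.
     */
    static Node getOrFetchChild(Node parent, Supplier<Node> readFromLog) {
        parent.latch.readLock().lock();          // SH latch
        try {
            if (parent.child != null) {
                return parent.child;             // common case: child is cached
            }
            parent.pinCount.incrementAndGet();   // pin before dropping the latch
        } finally {
            parent.latch.readLock().unlock();
        }

        Node fetched;
        try {
            fetched = readFromLog.get();         // slow read, no latches held
        } catch (RuntimeException e) {
            parent.pinCount.decrementAndGet();   // do not leak the pin on failure
            throw e;
        }

        parent.latch.writeLock().lock();         // EX latch on the remembered parent
        try {
            parent.pinCount.decrementAndGet();   // unpin
            if (!parent.inTree) {
                return null;                     // parent changed: restart the search
            }
            if (parent.child == null) {
                parent.child = fetched;          // attach the fetched child
            }
            return parent.child;                 // another thread may have attached first
        } finally {
            parent.latch.writeLock().unlock();
        }
    }
}
```

The point of the design is that the EX latch is held only for the brief attach step, while the disk read itself runs with no latch held.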
com.sleepycat.je.EnvironmentFailureException: Node5(5):... VLSN 3,182,883 should be held within this tracker.
or
com.sleepycat.je.EnvironmentFailureException: Node5(5):...end of last bucket should match end of range ...
[#23491]
Counting the number of records in a database is now implemented using a disk-ordered-scan (DOS), similar to the one used by DiskOrderedCursor. DOS may consume a large amount of memory, and to avoid OutOfMemoryErrors, it requires that a limit on its memory consumption is provided. As a result, a new method, Database.count(long memoryLimit), has been implemented that takes this memory limit as a parameter. The existing Database.count() method is still available and uses an internally established limit.
This change fixes two problems of the previous implementation (based on the SortedLSNTreeWalker class): 1. There was no upper bound on the memory consumption of the previous implementation and 2. It was buggy in the case where concurrent thread activity could cause full BINs to be mutated to deltas or vice versa.
Iterating over the records of a database via a DiskOrderedCursor would cause a crash if a BIN delta was encountered in the in-memory BTree (because in this case a copy of the BIN delta was created and cached for later use, but the copy did not contain all the needed information from the original). This bug was introduced in JE 6.0.11.
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.97) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.97) ... java.io.FileNotFoundException: ...\ffffffff.jdb (The system cannot find the file specified) LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
    at com.sleepycat.je.EnvironmentFailureException.wrapSelf(EnvironmentFailureException.java:210)
    at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1594)
    at com.sleepycat.je.dbi.DiskOrderedCursorImpl.checkEnv(DiskOrderedCursorImpl.java:234)
    at com.sleepycat.je.DiskOrderedCursor.checkState(DiskOrderedCursor.java:367)
    at com.sleepycat.je.DiskOrderedCursor.getNext(DiskOrderedCursor.java:324)
    ...
[#23676] (6.1.3)
In order to perform write operations in such cases, the application must now call TransactionConfig.setLocalWrite(true).
In addition, it is no longer possible to use a single transaction to write to both a replicated and a non-replicated database. IllegalOperationException will be thrown if this is attempted.
These changes were necessary to prevent corruption when a transaction contains write operations for both replicated and non-replicated databases, and a failover occurs that causes a rollback of this transaction. The probability of corruption is low, but it can occur under the right conditions.
For more information see the javadoc for TransactionConfig.setLocalWrite(true), and the "Non-replicated Databases in a Replicated Environment" section of the ReplicatedEnvironment class javadoc.
Durability.READ_ONLY_TXN has been deprecated and TransactionConfig.setReadOnly should be used instead.
The conditions that cause the bug are:
If this bug is encountered, it can be corrected by upgrading to the JE release containing this fix, and no data loss will occur.
This bug is similar to another bug that was fixed in JE 5.0.70 [#22052]. This bug differs in that the transaction must write records in multiple databases, and at least one but not all of the databases must be removed or truncated between the two abnormal shutdowns.
com.sleepycat.je.EnvironmentFailureException: (JE 6.1.3) Node1(-1):... last LSN=0x3/0x4427 LOG_INTEGRITY: Log information is incorrect, problem is likely persistent. Environment is invalid and must be closed.
    at com.sleepycat.je.recovery.RecoveryManager.traceAndThrowException(RecoveryManager.java:3012)
    at com.sleepycat.je.recovery.RecoveryManager.undoLNs(RecoveryManager.java:1253)
    at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:741)
    at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:352)
    at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:654)
    at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:208)
    at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:252)
    at com.sleepycat.je.Environment.<init>(Environment.java:232)
    at com.sleepycat.je.Environment.<init>(Environment.java:188)
    at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:573)
    at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:443)
    ... [app creates a new ReplicatedEnvironment here] ...
Caused by: java.lang.NullPointerException
    at com.sleepycat.je.log.entry.LNLogEntry.postFetchInit(LNLogEntry.java:412)
    at com.sleepycat.je.txn.TxnChain.<init>(TxnChain.java:133)
    at com.sleepycat.je.txn.TxnChain.<init>(TxnChain.java:84)
    at com.sleepycat.je.recovery.RollbackTracker$RollbackPeriod.getChain(RollbackTracker.java:1009)
    at com.sleepycat.je.recovery.RollbackTracker$Scanner.rollback(RollbackTracker.java:483)
    at com.sleepycat.je.recovery.RecoveryManager.undoLNs(RecoveryManager.java:1182)
    ... 11 more
[#22071] (6.1.3)
Changes include adding the
enumeration constant, and the
ReplicationGroup.getDataNodes methods. [#22482] (6.0.1)
As part of the performance improvement work, the following statistics were added.
nCachedBINDeltas: EnvironmentStats.getNCachedBINDeltas -- Number of BIN-deltas (partial BINs) in cache.
nBINDeltasFetchMiss: EnvironmentStats.getNBINDeltasFetchMiss -- Number of BIN-deltas fetched to satisfy btree operations.
nBINsMutated: EnvironmentStats.getNBINsMutated -- The number of BINs mutated to BIN-deltas by eviction.
lastCheckpointInterval: EnvironmentStats.getLastCheckpointInterval -- Byte length from last checkpoint start to the previous checkpoint start.
In addition, the EnvironmentConfig.TREE_MAX_DELTA param has been deprecated. As of JE 5.0, the benefit from logging BIN-deltas is unrelated to the number of deltas that have been logged since the last full BIN. To configure BIN-delta logging, use EnvironmentConfig.TREE_BIN_DELTA.
As described under 'Upgrading from JE 5.0 or earlier' at the top of this document, to support this cleaner optimization a change was made involving partial Btree and duplicate comparators. Partial comparators are an advanced feature that few applications use. As of JE 6.0, using partial comparators is not recommended. Applications that do use partial comparators must now change their comparator classes to implement the new PartialComparator tag interface, before running the application with JE 6. Failure to do so may cause incorrect behavior during transaction aborts. See the PartialComparator javadoc for more information.
ReplicationConfig.REP_STREAM_TIMEOUT parameter. The system does not store information about replication progress for secondary replicas, though, so a different approach has been added.
The modified algorithm estimates the costs of replication replay and network restore, and protects log files from deletion that could be used for replay if there is sufficient disk space and replay would be less expensive than network restore. These computations apply to all replicas, but are particularly useful for secondary replicas, for which log files will not otherwise be retained if the replicas become temporarily unreachable. Note that disk space calculations are only performed when running with Java 7 or later.
ReplicationConfig parameters were added:
REPLAY_COST_PERCENT - The cost of replaying the replication stream as compared to the cost of performing a network restore.
REPLAY_FREE_DISK_PERCENT - The target amount of free disk space to maintain when selecting log files to retain for use in replay.
To prevent these problems, the size of each logged record is now stored in the Btree BINs (bottom internal nodes), so that utilization can be calculated correctly during record updates and deletions, while still avoiding a fetch of the old version of the record. With this change, the utilization adjustment facility in the log cleaner, which attempted to compensate for this problem by estimating utilization, is no longer needed by most applications.
Therefore the EnvironmentConfig.CLEANER_ADJUST_UTILIZATION parameter is now false by default rather than true, and will be disabled completely in a future version of JE. For more information, see the javadoc for this parameter.
In addition, the new eviction approach implements a more accurate LRU which ensures that dirty nodes are evicted last and thereby reduces unnecessary logging.
As part of this change, the following configuration parameters were deprecated and are ignored by JE:
EnvironmentConfig.EVICTOR_NODES_PER_SCAN
EnvironmentConfig.EVICTOR_LRU_ONLY
And the following configuration parameter was added:
Caused by: com.sleepycat.je.EnvironmentFailureException: (JE 5.0.97) node2(2):foo\node2 Read invisible log entry at 0x0/0xcb776 hdr type="INS_LN_TX/8" vlsn v="19,373" isReplicated="1" isInvisible="1" prev="0xcb74c" size="17" cksum="2626620732" LOG_INTEGRITY: Log information is incorrect, problem is likely persistent. fetchTarget of 0x0/0xcb776 parent IN=29 IN class=com.sleepycat.je.tree.BIN lastFullVersion=0x0/0xf154c lastLoggedVersion=0x0/0xf588e parent.getDirty()=true state=3
    at com.sleepycat.je.log.LogManager.getLogEntryFromLogSource(LogManager.java:1054)
    at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:906)
    at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:867)
    at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1427)
    at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1250)
    at com.sleepycat.je.recovery.RecoveryManager.undo(RecoveryManager.java:2415)
    at com.sleepycat.je.recovery.RecoveryManager.rollbackUndo(RecoveryManager.java:2268)
    ...
[#22848] (6.0.10)