Oracle Berkeley DB Java Edition 12c R1 Change Log

Release 6.1.5

Log File On-Disk Format Changes:

In JE 6.0 the on-disk file format version changed to 9. The file format did not change in JE 6.1.

The JE 6.0 file format change is forward compatible in that JE files created with release 5.0 and earlier can be read when opened with JE 6.0 or later. The change is not backward compatible in that files created with JE 6.0 or later cannot be read by earlier releases. Note that when an existing environment is opened read/write with JE 6.0 or later, a new log file is written, and from then on the environment can no longer be read by earlier releases.

Upgrading from JE 6.0 or earlier

Although no file format change was made in JE 6.1, there is a mandatory API change for HA applications that use non-replicated databases under certain conditions. See the change log entry [#23330] for more information.

Upgrading from JE 5.0 or earlier

In addition to a file format change, a change was made involving partial Btree and duplicate comparators. Partial comparators are an advanced feature that few applications use. As of JE 6.0, using partial comparators is not recommended. Applications that do use partial comparators must change their comparator classes to implement the new PartialComparator tag interface, before running the application with JE 6. Failure to do so may cause incorrect behavior during transaction aborts. See the PartialComparator javadoc for more information.

Upgrading from JE 4.1 or earlier

There are two important notes about the file format change in JE 5.0.
  1. The file format change enabled significant improvements in operation performance, memory and disk footprint, and concurrency of databases with duplicate keys. Due to these changes, an upgrade utility must be run before opening an environment with this release, if the environment was created using JE 4.1 or earlier. See the Upgrade Procedure below for more information.
  2. An application which uses JE replication may not upgrade directly from JE 4.0 to JE 5.0 or later. Instead, the upgrade must be done from JE 4.0 to JE 4.1 and then to JE 5.0 or later. Applications already at JE 4.1 are not affected. Upgrade guidance can be found in the new chapter, "Upgrading a JE Replication Group", in the "Getting Started with BDB JE High Availability" guide.

Upgrade Procedure

Due to the format changes in JE 5, a special utility program must be run for an environment created with JE 4.1 or earlier, prior to opening the environment with JE 5.0 or later. The utility program is part of JE 4.1. JE 4.1.20, or a later version of JE 4.1, must be used.

One of two utility programs must be used. Both are available in the release package for JE 4.1.20, or a later release of JE 4.1. If you are currently running a release earlier than JE 4.1.20, then you must download the latest JE 4.1 release package in order to run these utilities.

The steps for upgrading are as follows.

  1. Stop the application using BDB JE.
  2. Run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility. If you are using a regular non-replicated Environment:
        java -jar je-4.1.20.jar DbPreUpgrade_4_1 -h <dir>
    If you are using a JE ReplicatedEnvironment:
        java -jar je-4.1.20.jar DbRepPreUpgrade_4_1
             -h <dir>
             -groupName <group name>
             -nodeName <node name>
             -nodeHostPort <host:port>
  3. Finally, start the application using the current JE 5.0 (or later) release of BDB JE.

The second step -- running the utility program -- does not perform data conversion. This step simply performs a special checkpoint to prepare the environment for upgrade. It should take no longer than an ordinary startup and shutdown.
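
For deployments that script the upgrade, step 2 can also be launched from Java. The following is a minimal sketch, not part of JE: the class and method names (PreUpgradeCommand, standalone, replicated, run) are hypothetical, and the only things taken from the procedure above are the utility names and their command-line options. It assumes that "java" is on the PATH and that the JE 4.1 jar path is supplied by the caller.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: class and method names are hypothetical, not part of JE. */
final class PreUpgradeCommand {

    /** Command line for a regular, non-replicated Environment. */
    static List<String> standalone(String jeJar, String envDir) {
        return new ArrayList<>(Arrays.asList(
            "java", "-jar", jeJar, "DbPreUpgrade_4_1", "-h", envDir));
    }

    /** Command line for a ReplicatedEnvironment. */
    static List<String> replicated(String jeJar, String envDir,
                                   String groupName, String nodeName,
                                   String nodeHostPort) {
        return new ArrayList<>(Arrays.asList(
            "java", "-jar", jeJar, "DbRepPreUpgrade_4_1",
            "-h", envDir,
            "-groupName", groupName,
            "-nodeName", nodeName,
            "-nodeHostPort", nodeHostPort));
    }

    /** Runs the command with inherited stdout/stderr and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

A deployment script would call, for example, PreUpgradeCommand.run(PreUpgradeCommand.standalone("je-4.1.20.jar", envDir)) after stopping the application and before starting it with the new JE release.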

During the last step -- when the application opens the JE environment using the current release (JE 5 or later) -- all databases configured for duplicates will automatically be converted before the Environment or ReplicatedEnvironment constructor returns. Note that a database might be explicitly configured for duplicates using DatabaseConfig.setSortedDuplicates(true), or implicitly configured for duplicates by using a DPL MANY_TO_XXX relationship (Relationship.MANY_TO_ONE or Relationship.MANY_TO_MANY).

The duplicate database conversion only rewrites internal nodes in the Btree, not leaf nodes. In a test with a 500 MB cache, conversion of a 10 million record data set (8 byte key and data) took between 1.5 and 6.5 minutes, depending on the number of duplicates per key. The high end of this range is when 10 duplicates per key were used; the low end is with 1 million duplicates per key.

To make the duplicate database conversion predictable during deployment, users should measure the conversion time on a non-production system before upgrading a deployed system. When duplicates are converted, the Btree internal nodes are preloaded into the JE cache. A new configuration option, EnvironmentConfig.ENV_DUP_CONVERT_PRELOAD_ALL, can be set to false to optimize this process if the cache is not large enough to hold the internal nodes for all databases. For more information, see the javadoc for this property.

If an application has no databases configured for duplicates, then the last step simply opens the JE environment normally, and no data conversion is performed.

If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility program before opening an environment with JE 5 or later for the first time, an exception such as the following will normally be thrown by the Environment or ReplicatedEnvironment constructor:
  (JE 6.0.1) JE 4.1 duplicate DB
  entries were found in the recovery interval. Before upgrading to JE 5.0, the
  following utility must be run using JE 4.1 (4.1.20 or later):
  DbPreUpgrade_4_1.  See the change log.
  UNEXPECTED_STATE: Unexpected internal state, may have side effects.

If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility program, but no exception is thrown when the environment is opened with JE 5 or later, this is probably because the application performed an Environment.sync before last closing the environment with JE 4.1 or earlier, and nothing else happened to be written (by the application or JE background threads) after the sync operation. In this case, running the upgrade utility is not necessary.

Changes in 6.1.5

  1. Made an improvement to eviction for Oracle NoSQL DB users, and several improvements to the DbCacheSize utility.

    For Oracle NoSQL DB users only, record versions are now discarded using a separate eviction step. This means that the record versions can be discarded to free cache memory without discarding the entire BIN (bottom internal node). In general, this makes better use of memory and reduces IO for some workloads.

    The improvements to DbCacheSize are as follows.

    [#23550] (6.1.0)

  2. Fixed a bug that prevented serialization of ReplicaWriteException. Previously an attempt to serialize this exception could fail with the following characteristic stack trace when the StateChangeEvent object was encountered during serialization:
     Caused by:
        at java.util.logging.LogRecord.writeObject(
    [#23578] (6.1.1)
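
    Applications that pass exceptions across process boundaries (for example over RMI) may want to check that an exception type can be serialized at all. The sketch below is a generic illustration of this class of failure using only the JDK. It does not show how JE fixed ReplicaWriteException, which the entry above does not describe. All class names are hypothetical: EventDetails stands in for any object that does not implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

/** Generic illustration; none of these classes are part of JE. */
final class SerializableExceptionSketch {

    /** Stand-in for an object that does not implement Serializable. */
    static final class EventDetails {
        final String state;
        EventDetails(String state) { this.state = state; }
    }

    /** Cannot be serialized: it holds a non-serializable, non-transient field. */
    static final class BrokenException extends RuntimeException {
        private static final long serialVersionUID = 1L;
        final EventDetails details;
        BrokenException(String msg, EventDetails details) {
            super(msg);
            this.details = details;
        }
    }

    /** Can be serialized: the problematic field is transient, so it is skipped. */
    static final class FixedException extends RuntimeException {
        private static final long serialVersionUID = 1L;
        final transient EventDetails details;
        FixedException(String msg, EventDetails details) {
            super(msg);
            this.details = details;
        }
    }

    /** Returns true if Java serialization accepts the object graph. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

    A check such as canSerialize(exception) in a unit test is one way to catch this kind of regression before it shows up in a remote call.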

  3. The JE HA replica replay mechanism now uses a separate thread to write replica acknowledgements and heartbeat responses to the network. This change results in two improvements:
    1. The replay of changes sent by the master can make progress even in the presence of brief network stalls, thus increasing replica replay throughput; improvements in the range of 5 to 10% have been observed in internal test scenarios.
    2. This new thread is also used to send spontaneous heartbeat response messages, making the heartbeat mechanism, used to detect node failures, more robust.
    [#23195] (6.1.1)
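
    The general pattern behind this change, handing network writes to a dedicated thread so that the replay loop never blocks on the network, can be sketched with a queue. This is a toy model under stated assumptions, not JE's implementation: the class name AckWriterSketch, the use of a long transaction id as the acknowledgement, and the sleep that simulates a network stall are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy model; names are hypothetical and not part of JE. */
final class AckWriterSketch {
    private static final long STOP = -1L;  // sentinel that ends the writer thread

    private final BlockingQueue<Long> pending = new LinkedBlockingQueue<>();
    private final List<Long> sent = new ArrayList<>();  // written only by the writer thread
    private final Thread writer = new Thread(this::drain, "ack-writer");

    void start() {
        writer.start();
    }

    /** Called by the replay thread; it queues the ack and returns immediately. */
    void replay(long txnId) {
        // ... apply the replicated change locally (omitted) ...
        pending.add(txnId);
    }

    /** Writer thread: takes acks in order and "sends" them, absorbing stalls. */
    private void drain() {
        try {
            for (long id = pending.take(); id != STOP; id = pending.take()) {
                Thread.sleep(5);  // simulated brief network stall
                sent.add(id);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops the writer after it has drained the queue and returns what was sent. */
    List<Long> shutdown() {
        pending.add(STOP);
        try {
            writer.join();  // join makes the writer's updates to 'sent' visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sent;
    }
}
```

    In this model the replay thread can queue many acknowledgements while the writer is stalled, which is the effect described in point 1 above. A heartbeat response could be queued the same way.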

  4. Performance enhancement: executing a subset of CRUD and internal operations on memory-resident BIN-deltas.

    Before JE 6.0, BIN-deltas were used as a disk optimization only: to reduce the number of bytes written to disk every time a new BIN version had to be logged. BIN-deltas would never appear in the in-memory BTrees, and if the most recently logged version of a BIN was a delta, fetching that BIN into the in-memory tree required 2 disk reads: one for the delta and one for the most recent full-BIN version.

    Starting with JE 6.0, BIN-deltas can appear in the in-memory BTree. Specifically, if a full dirty BIN is selected for eviction, rather than evicting the whole BIN (and incurring a disk write), the BIN is converted to a delta that stays in the cache. If a subsequent operation needs the full BIN and the delta is still in the cache, only one disk read will be done.

    Further disk-read savings can be realized, because many operations can (under certain conditions) be performed directly on the BIN-delta, without the need for the full BIN. However, in 6.0, only a small subset of background operations were modified to exploit BIN-deltas. In JE 6.1, the set of operations that can be performed on BIN-deltas has been extended. Examples of such operations include key searches in BTrees, if the search key is found in a BIN-delta, and deletion or update of the record a cursor is positioned on, if the cursor is positioned on a BIN-delta. These changes affect internal operations as well as the search, delete, and putCurrent methods of the Database and Cursor API classes.

    [#23428] (6.1.1)
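
    The saving can be illustrated with a toy model. The sketch below is not JE code: BinDeltaSketch and its methods are hypothetical, plain maps stand in for the BIN-delta and for the last full BIN in the log, and a counter stands in for disk reads. It assumes, as described above, that a full BIN is the last logged full version with the delta applied on top.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of operating on a cached BIN-delta; not JE's implementation. */
final class BinDeltaSketch {
    private final Map<String, String> delta;          // slots changed since the last full BIN
    private final Map<String, String> fullBinOnDisk;  // stand-in for the logged full BIN
    private Map<String, String> fullBin;              // null while only the delta is cached
    private int diskReads;

    BinDeltaSketch(Map<String, String> delta, Map<String, String> fullBinOnDisk) {
        this.delta = new HashMap<>(delta);
        this.fullBinOnDisk = new HashMap<>(fullBinOnDisk);
    }

    /** Answers from the delta when the key is there; otherwise needs the full BIN. */
    String search(String key) {
        if (fullBin == null) {
            String hit = delta.get(key);
            if (hit != null) {
                return hit;  // no disk read
            }
            mutateToFullBin();
        }
        return fullBin.get(key);
    }

    /** Updates in place on the delta when the key is there; otherwise needs the full BIN. */
    void update(String key, String value) {
        if (fullBin == null) {
            if (delta.containsKey(key)) {
                delta.put(key, value);  // no disk read
                return;
            }
            mutateToFullBin();
        }
        fullBin.put(key, value);
    }

    /** One simulated disk read: fetch the full BIN and apply the delta on top. */
    private void mutateToFullBin() {
        diskReads++;
        fullBin = new HashMap<>(fullBinOnDisk);
        fullBin.putAll(delta);
    }

    int diskReads() {
        return diskReads;
    }
}
```

    In this model, operations on keys present in the delta never touch the disk, and the first operation on any other key costs a single read.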

  5. Performance enhancement: Reduced latch contention during BTree searches.

    Typically, thread synchronization during BTree searches is done via latch coupling: at most 2 tree nodes (a parent and a child) are latched at a time. Furthermore, a node is latched in shared (SH) mode, unless it is expected that it will be updated, in which case it is latched in exclusive (EX) mode. Finally, SH latches are not upgradeable to EX latches (to avoid deadlocks and reduce latching overhead).

    JE follows this general latch-coupling technique. However, it also has to deal with the JE-specific fact that fetching a missing child node into the cache requires that its memory-resident parent be updated (because the parent points to its children via direct Java object references). As a result, during a JE BTree search every node is potentially updated, which precludes the straightforward use of SH latches. To cope with this complication, JE has been using one of the following approaches during its various kinds of BTree searches:
    (a) Use SH latches, but if a missing child needs to be fetched, release the SH latch on the parent and restart the search from the beginning, using EX latches on all nodes this time.
    (b) Do grandparent latching: use SH latches but keep a latch on the grandparent node so that if a missing child of the parent node needs to be fetched, the SH latch on the parent can be released, and then the parent can be relatched in EX mode.
    (c) Do latch-coupling with EX latches only.
    Obviously, (c) is the worst choice, but all three approaches result in more and longer-held EX latches than necessary. As a result, some JE applications have experienced performance problems due to excessive latch contention during BTree searches.

    In JE 6.1, a new latching algorithm has been implemented to replace all of (a), (b), and (c) above. The new algorithm uses SH latches, but if a missing child needs to be fetched, it first "pins" the parent (to prevent its eviction), then releases the SH latch on the parent, and finally reads the child node from the log (without any latches held). After the child is fetched, it latches the remembered parent in EX mode, unpins it, and checks whether it is still the correct parent for the search and for the child node that was fetched. If so, the search continues down the tree. If not, it restarts the search from the beginning. Compared to approach (a) above, this new algorithm may restart a search multiple times. However, the probability of even a single restart is lower than with (a), and each restart uses SH latches. Furthermore, no latches are held during the long random disk read done to fetch a missing child.

    [#18617] (6.1.1)
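
    The pin, release, fetch, relatch, and validate sequence described above can be sketched as follows. This is a simplified model, not JE's code: the names (PinnedFetchSearch, Node, childLsn, resident, inTree) are hypothetical, java.util.concurrent read/write locks stand in for SH/EX latches, a map stands in for the log, and the validity check is reduced to "the parent is still in the tree and the slot still refers to the same log address". Eviction itself is not modeled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified model of the search; all names are hypothetical. */
final class PinnedFetchSearch {

    static final class Node {
        final String name;
        final ReentrantReadWriteLock latch = new ReentrantReadWriteLock();
        final AtomicInteger pins = new AtomicInteger();       // > 0 means "do not evict"
        final Map<Integer, Long> childLsn = new HashMap<>();  // slot -> log address
        final Map<Integer, Node> resident = new HashMap<>();  // slot -> cached child
        volatile boolean inTree = true;                       // false once evicted or replaced
        Node(String name) { this.name = name; }
    }

    final Node root;
    final Map<Long, Node> log;  // stand-in for the on-disk log (read-only here)
    final AtomicInteger fetches = new AtomicInteger();
    final AtomicInteger restarts = new AtomicInteger();

    PinnedFetchSearch(Node root, Map<Long, Node> log) {
        this.root = root;
        this.log = log;
    }

    /** Follows the given slots from the root and returns the name of the node reached. */
    String search(int... path) {
        for (;;) {
            String found = trySearch(path);
            if (found != null) {
                return found;
            }
            restarts.incrementAndGet();  // parent changed while unlatched: start over
        }
    }

    private String trySearch(int[] path) {
        Node parent = root;
        parent.latch.readLock().lock();                       // SH latch
        for (int slot : path) {
            Node child = parent.resident.get(slot);
            if (child != null) {
                child.latch.readLock().lock();                // ordinary latch coupling
                parent.latch.readLock().unlock();
            } else {
                long lsn = parent.childLsn.get(slot);
                parent.pins.incrementAndGet();                // 1. pin the parent
                parent.latch.readLock().unlock();             // 2. release the SH latch
                Node fetched = log.get(lsn);                  // 3. "disk read", no latches held
                fetches.incrementAndGet();
                parent.latch.writeLock().lock();              // 4. relatch the parent in EX mode
                parent.pins.decrementAndGet();                // 5. unpin
                Long current = parent.childLsn.get(slot);
                if (!parent.inTree || current == null || current != lsn) {
                    parent.latch.writeLock().unlock();        // 6. no longer valid: restart
                    return null;
                }
                child = parent.resident.get(slot);
                if (child == null) {                          // nobody attached it meanwhile
                    child = fetched;
                    parent.resident.put(slot, child);
                }
                child.latch.readLock().lock();
                parent.latch.writeLock().unlock();
            }
            parent = child;
        }
        String name = parent.name;
        parent.latch.readLock().unlock();
        return name;
    }
}
```

    Note that the EX latch in this model is held only for the short validate-and-attach step, never across the simulated read.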

  6. Fixed a bug that could result in the following exception in a JE HA application: 
     Node5(5):... VLSN 3,182,883 should be held within this tracker.
     Node5(5):...end of last bucket should match end of range ...

  7. Improved the Monitor's ability to discover group status changes, which should improve the robustness of notifications after the monitor has been down or has lost network connectivity.

    [#23631] (6.1.2)

  8. A new implementation for Database.count() and a new variant of Database.count() that takes a memoryLimit as input.

    Counting the number of records in a database is now implemented using a disk-ordered-scan (DOS), similar to the one used by DiskOrderedCursor. DOS may consume a large amount of memory, and to avoid OutOfMemoryErrors, it requires that a limit on its memory consumption be provided. As a result, a new method, Database.count(long memoryLimit), has been implemented that takes this memory limit as a parameter. The existing Database.count() method is still available and uses an internally established limit.

    This change fixes two problems in the previous implementation (based on the SortedLSNTreeWalker class): (1) its memory consumption had no upper bound, and (2) it did not behave correctly when concurrent thread activity caused full BINs to be mutated to deltas or vice versa.

    [#23646] (6.1.2)

  9. Fixed a bug in DiskOrderedCursor.

    Iterating over the records of a database via a DiskOrderedCursor would cause a crash if a BIN delta was encountered in the in-memory BTree (because in this case a copy of the BIN delta was created and cached for later use, but the copy did not contain all the needed information from the original). This bug was introduced in JE 6.0.11.

    [#23646] (6.1.2)

  10. Fixed a bug in DiskOrderedCursor for DeferredWrite databases. An example of the stack trace when the bug occurs is below. Note that although the exception message indicates that a file is missing, actually the problem was transient and no file was missing. Upgrading to the current JE release will fix the problem without requiring data conversion or restoring from a backup.
    (JE 5.0.97) Environment must be closed, caused by:
    Environment invalid because of previous exception:
    (JE 5.0.97) ... ...\ffffffff.jdb
    (The system cannot find the file specified) LOG_FILE_NOT_FOUND:
    Log file missing, log is likely invalid.
    Environment is invalid and must be closed.
    [#23676] (6.1.3)

  11. An API change requires application changes if write operations are performed on a non-replicated database in a replicated environment. A code change is necessary for applications with the following characteristics:

    In order to perform write operations in such cases, the application must now call TransactionConfig.setLocalWrite(true).

    In addition, it is no longer possible to use a single transaction to write to both a replicated and a non-replicated database. IllegalOperationException will be thrown if this is attempted.

    These changes were necessary to prevent corruption when a transaction contains write operations for both replicated and non-replicated databases, and a failover occurs that causes a rollback of this transaction. The probability of corruption is low, but it can occur under the right conditions.

    For more information see the javadoc for TransactionConfig.setLocalWrite(true), and the "Non-replicated Databases in a Replicated Environment" section of the ReplicatedEnvironment class javadoc.

    [#23330] (6.1.3)

  12. Read-only transactions are now supported. A read-only transaction prohibits write operations and, more importantly, in a replicated environment it automatically uses Durability.ReplicaAckPolicy.NONE. A read-only transaction on a Master will thus not be held up, or throw InsufficientReplicasException, if the Master is not in contact with a sufficient number of Replicas at the time the transaction is initiated. To configure a read-only transaction, call TransactionConfig.setReadOnly(true). See this method's javadoc for more information.

    Durability.READ_ONLY_TXN has been deprecated and TransactionConfig.setReadOnly should be used instead.

    [#23330] (6.1.3)

  13. Fixed a bug that could cause a NullPointerException, such as the one below, when a ReplicatedEnvironment is opened on an HA replica node. This prevents the environment from being opened.

    The conditions that cause the bug are:

    1. a replica has been restarted after an abnormal shutdown (ReplicatedEnvironment.close was not called),
    2. a transaction writing records in multiple databases was in progress at the time of the abnormal shutdown,
    3. one of the databases, but not all of them, is then removed or truncated, and finally
    4. another abnormal shutdown occurs.

    If this bug is encountered, it can be corrected by upgrading to the JE release containing this fix, and no data loss will occur.

    This bug is similar to another bug that was fixed in JE 5.0.70 [#22052]. This bug differs in that the transaction must write records in multiple databases, and at least one but not all of the databases must be removed or truncated between the two abnormal shutdowns.
    (JE 6.1.3) Node1(-1):...
    last LSN=0x3/0x4427 LOG_INTEGRITY: Log information is incorrect, problem is
    likely persistent. Environment is invalid and must be closed.
        ... [app creates a new ReplicatedEnvironment here] ...
    Caused by: java.lang.NullPointerException
        ... 11 more
    [#22071] (6.1.3)

  14. Fixed a bug where a transaction configured for no-wait (using TransactionConfig.setNoWait(true)) behaved as a normal (wait) transaction when the ReadCommitted isolation mode was also used. Due to this bug, a LockTimeoutException was thrown when a LockNotAvailableException should have been thrown instead, and the transaction was invalidated when it should not have been. [#23653] (6.1.4)

  15. Fixed an eviction bug for shared-cache environments. The bug caused LRU corruption and potential memory leaks in certain cases. The bug was introduced in JE 6.0. Note that the bug affects only environments that use a shared cache, that is, environments configured with EnvironmentConfig.setSharedCache(true). [#23696] (6.1.4)