Chapter 7. Databases

Table of Contents

Opening Databases
Deferred Write Databases
Temporary Databases
Closing Databases
Database Properties
Administrative Methods
Database Example

In Berkeley DB Java Edition, a database is a collection of records. Records, in turn, consist of key/data pairings.

Conceptually, you can think of a Database as a two-column table in which column 1 holds a key and column 2 holds data. Both the key and the data are managed using DatabaseEntry class instances (see Database Records for details on this class). Fundamentally, then, using a JE Database involves putting, getting, and deleting database records, which in turn involves efficiently managing the information encapsulated by DatabaseEntry objects. The next several chapters of this book are dedicated to those activities.
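The two-column picture can be made concrete with a small model that uses only the JDK. This is a conceptual sketch, not the JE API: the class RecordTableSketch and its methods are hypothetical names, keys and data are held as byte arrays to mirror the idea that a record is an opaque key/data pairing, and the unsigned byte-wise key ordering is an assumption made for the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeMap;

// Conceptual model only: a "database" as a sorted table of key/data byte arrays.
// RecordTableSketch is a hypothetical name and is not part of JE.
class RecordTableSketch {
    // Column 1 is the key, column 2 is the data. Keys are compared as
    // unsigned bytes (an assumption chosen for this sketch).
    private final TreeMap<byte[], byte[]> table =
        new TreeMap<byte[], byte[]>((a, b) -> Arrays.compareUnsigned(a, b));

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // "Put": store or overwrite the data for a key.
    void put(String key, String data) {
        table.put(bytes(key), bytes(data));
    }

    // "Get": return the data for a key, or null if no such record exists.
    String get(String key) {
        byte[] data = table.get(bytes(key));
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    // "Delete": remove the record; returns true if a record was removed.
    boolean delete(String key) {
        return table.remove(bytes(key)) != null;
    }

    public static void main(String[] args) {
        RecordTableSketch t = new RecordTableSketch();
        t.put("apple", "red");
        t.put("banana", "yellow");
        System.out.println(t.get("apple"));
        t.delete("apple");
        System.out.println(t.get("apple"));
    }
}
```

In a real application the byte arrays would be wrapped in DatabaseEntry objects and passed to a Database handle, as described in the chapters that follow.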

Note that on disk, databases are stored in sequentially numbered log files in the directory where the opening environment is located. JE log files are described in Databases and Log Files.
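To see these files for a given environment, you can list the environment directory. The following is a minimal sketch using only the JDK. It assumes that JE log files carry a .jdb suffix and fixed-width, zero-padded names (so that sorting by name matches numeric order); neither detail is stated in this section, so check it against your JE release. LogFileLister is a hypothetical helper name, and /export/dbEnv is simply the environment home used in this chapter's examples.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of JE: lists the log files in an environment home.
class LogFileLister {
    // Returns the names of the *.jdb files in envHome, sorted by name.
    // Assumption: log file names are fixed-width and zero-padded, so
    // lexical order is the same as numeric order.
    static List<String> listLogFiles(Path envHome) {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(envHome, "*.jdb")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Pass the environment home as the first argument, or fall back
        // to the directory used in this chapter's examples.
        Path envHome = Paths.get(args.length > 0 ? args[0] : "/export/dbEnv");
        for (String name : listLogFiles(envHome)) {
            System.out.println(name);
        }
    }
}
```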

Also, note that in the previous section of this book, Programming with the Direct Persistence Layer, we described the DPL. The DPL handles all database management for you, including creating all primary and secondary databases as required by your application. That said, if you are using the DPL you can access the underlying database for a given index if necessary. See the Javadoc for the DPL for more information.

Opening Databases

You open a database by using the Environment.openDatabase() method (environments are described in Database Environments). This method creates and returns a Database object handle. You must provide Environment.openDatabase() with a database name.

You can optionally provide Environment.openDatabase() with a DatabaseConfig object. DatabaseConfig allows you to set properties for the database, such as whether it can be created if it does not currently exist, whether you are opening it read-only, and whether the database is to support transactions.

Note that by default, JE does not create databases if they do not already exist. To override this behavior, set the allow-create property to true using DatabaseConfig.setAllowCreate().

Finally, if you configured your environment and database to support transactions, you can optionally provide a transaction object to Environment.openDatabase(). Transactions are described in the Berkeley DB, Java Edition Getting Started with Transaction Processing guide.

The following code fragment illustrates a database open:

package je.gettingStarted;

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;

import java.io.File;
...

Environment myDbEnvironment = null;
Database myDatabase = null;

...

try {
    // Open the environment. Create it if it does not already exist.
    EnvironmentConfig envConfig = new EnvironmentConfig();
    envConfig.setAllowCreate(true);
    myDbEnvironment = new Environment(new File("/export/dbEnv"), 
                                      envConfig);

    // Open the database. Create it if it does not already exist.
    DatabaseConfig dbConfig = new DatabaseConfig();
    dbConfig.setAllowCreate(true);
    myDatabase = myDbEnvironment.openDatabase(null, 
                                              "sampleDatabase", 
                                              dbConfig); 
} catch (DatabaseException dbe) {
    // Exception handling goes here
}

Deferred Write Databases

By default, JE database operations that modify the database are written (logged) at the time of the operation. For transactional databases, changes become durable when the transaction is committed.

However, deferred write database operations are not written at the time of the operation. Writing is deferred for as long as possible. The changes are only guaranteed to be durable after the Database.sync() method is called or the database is properly closed.

Deferring writes in this manner has two performance advantages when performing database modifications:

  1. When multiple threads are performing writes, concurrency is increased because the bottleneck of writing to the log is avoided.

  2. Less total writing takes place. If a single record is modified more than once, or modified and deleted, then only the final result must be written. If a record is inserted and deleted before a database sync or close occurs, nothing at all is written to disk. The same advantage holds for writing internal index information.
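The second advantage can be illustrated with a toy model. The sketch below is not how JE is implemented; it is a JDK-only simulation, under the assumption that each record change costs one write, in which modifications are buffered in memory and only the final state of each record is written on sync. DeferredWriteSketch and all of its members are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only (not JE internals): buffers changes and writes the
// final state of each record when sync() is called.
class DeferredWriteSketch {
    private final Map<String, String> persisted = new HashMap<>();
    // Pending changes; Optional.empty() marks a pending delete.
    private final Map<String, Optional<String>> pending = new HashMap<>();
    private int operations = 0; // modifications requested by the caller
    private int writes = 0;     // simulated writes to storage

    void put(String key, String data) {
        operations++;
        pending.put(key, Optional.of(data));
    }

    void delete(String key) {
        operations++;
        pending.put(key, Optional.empty());
    }

    String get(String key) {
        Optional<String> change = pending.get(key);
        if (change != null) {
            return change.orElse(null);
        }
        return persisted.get(key);
    }

    // Write only the final result of each pending change.
    void sync() {
        for (Map.Entry<String, Optional<String>> e : pending.entrySet()) {
            Optional<String> change = e.getValue();
            if (change.isPresent()) {
                persisted.put(e.getKey(), change.get());
                writes++;
            } else if (persisted.remove(e.getKey()) != null) {
                // A delete costs a write only if the record was ever stored.
                writes++;
            }
        }
        pending.clear();
    }

    int operations() { return operations; }
    int writes() { return writes; }

    public static void main(String[] args) {
        DeferredWriteSketch db = new DeferredWriteSketch();
        db.put("a", "1");
        db.put("a", "2");
        db.put("a", "3"); // only this final value needs to be written
        db.put("b", "x");
        db.delete("b");   // inserted and deleted before sync: nothing written
        db.sync();
        // prints "operations=5 writes=1"
        System.out.println("operations=" + db.operations()
            + " writes=" + db.writes());
    }
}
```

A write-through store in the same model would perform one write per operation, five in this run, which is the difference the list above describes.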

Deferred write databases are useful for applications that perform a large number of database modifications (record additions, deletions, and so forth). By delaying the data write, you delay the disk I/O. Depending on your workload, this can significantly improve your data throughput.

While the durability of a deferred write database is only guaranteed when Database.sync() is called or the database is properly closed, writing may also occur at other times. For example, a JE checkpoint will effectively perform a Database.sync() on all deferred write databases that are open at the time of the checkpoint. If you are using deferred write to load a large data set, and you want to reduce writing as much as possible during the load, consider disabling the JE checkpointer.

Also, if the JE cache overflows as database modifications occur, information discarded from the cache is written to disk in order to avoid losing the changes. If you wish to reduce this writing to a minimum, configure your cache to be large enough to hold the entire data set being modified, or as large as possible.

Note

Despite the examples noted in the previous paragraphs, there is no guarantee that changes to a deferred write database are durable unless Database.sync() is called or the database is closed. If you need guaranteed durability for an operation, consider using transactions instead of deferred write.

You should also be aware that Database.sync() is a relatively expensive operation because all outstanding changes to the database are written, including internal index information. If you find that you are calling Database.sync() frequently, consider using transactions.

All other rules of behavior pertain to deferred write databases as they do to normal databases. Deferred write databases must be named and created just as you would a normal database. If you want to delete the deferred write database, you must remove it just as you would a normal database. This is true even if the deferred write database is empty because its name persists in the environment's namespace until such a time as the database is removed.

Note that whether a database is deferred write is a configuration option. It is therefore possible to switch a database between "normal" mode and deferred write mode. You might want to do this if, for example, you want to load a large amount of data into the database. In this case, loading data while the database is in deferred write mode is faster than in "normal" mode, because you avoid much of the normal disk I/O overhead during the load. Once the load is complete, sync the database, close it, and then reopen it as a normal database. You can then continue operations as if the database had been created as a "normal" database.

To configure a database as deferred write, call DatabaseConfig.setDeferredWrite() with a value of true and then open the database with that DatabaseConfig instance.

Note

If you are using the DPL, then you configure your entire store to be deferred write using StoreConfig.setDeferredWrite(). You can also sync every database in your store using EntityStore.sync().

For example, the following code fragment opens and closes a deferred write database:

package je.gettingStarted;

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;

import java.io.File;
...

Environment myDbEnvironment = null;
Database myDatabase = null;

...

try {
    // Open the environment. Create it if it does not already exist.
    EnvironmentConfig envConfig = new EnvironmentConfig();
    envConfig.setAllowCreate(true);
    myDbEnvironment = new Environment(new File("/export/dbEnv"), 
                                      envConfig);

    // Open the database. Create it if it does not already exist.
    DatabaseConfig dbConfig = new DatabaseConfig();
    dbConfig.setAllowCreate(true);
    // Make it deferred write
    dbConfig.setDeferredWrite(true);
    myDatabase = myDbEnvironment.openDatabase(null, 
                                              "sampleDatabase", 
                                              dbConfig); 

    ...
    // do work
    ...
    // Do this when you want the work to be persistent at a
    // specific point, prior to closing the database.
    myDatabase.sync();

    // then close the database and environment here
    // (described later in this chapter).

} catch (DatabaseException dbe) {
    // Exception handling goes here
}

Temporary Databases

By default, all JE databases are durable; that is, the data that you put in them will remain in them across program runs, unless you explicitly delete the data. However, it is possible to configure a temporary database that is not durable. A temporary database is automatically deleted when it is closed or after a crash occurs.

Temporary databases are essentially in-memory only databases. Therefore, they are particularly useful for applications that want databases which are truly temporary.

Note that temporary databases do not always avoid disk I/O. In particular, a temporary database can page to disk if the cache is not large enough to hold the database's entire contents. Therefore, temporary database performance is best when your in-memory cache is large enough to hold the database's entire data set.

A temporary database operates internally in deferred write mode and has the same performance advantages as described above for deferred write databases (see Deferred Write Databases). However, unlike deferred write databases, a temporary database is not written during checkpoints and this provides an additional performance advantage.

Temporary databases must be named and created just as you would a normal database. To configure a database as temporary, call DatabaseConfig.setTemporary() with a value of true and then open the database with that DatabaseConfig instance.

For example:

package je.gettingStarted;

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;

import java.io.File;
...

Environment myDbEnvironment = null;
Database myDatabase = null;

...

try {
    // Open the environment. Create it if it does not already exist.
    EnvironmentConfig envConfig = new EnvironmentConfig();
    envConfig.setAllowCreate(true);
    myDbEnvironment = new Environment(new File("/export/dbEnv"), 
                                      envConfig);

    // Open the database. Create it if it does not already exist.
    DatabaseConfig dbConfig = new DatabaseConfig();
    dbConfig.setAllowCreate(true);
    // Make it a temporary database
    dbConfig.setTemporary(true);
    myDatabase = myDbEnvironment.openDatabase(null, 
                                              "sampleDatabase", 
                                              dbConfig); 

    ...
    // do work
    ...

    // then close the database and environment here
    // (see the next section)

} catch (DatabaseException dbe) {
    // Exception handling goes here
}

Closing Databases

Once you are done using the database, you must close it. You use the Database.close() method to do this.

Closing a database causes it to become unusable until it is opened again. If any cursors are opened for the database, JE warns you about the open cursors, and then closes them for you. Active cursors during a database close can cause unexpected results, especially if any of those cursors are writing to the database in another thread. You should always make sure that all your database accesses have completed before closing your database.

It is recommended that you close all your databases before closing the environment to which they belong.

Cursors are described in Using Cursors later in this manual.

The following illustrates database and environment close:

import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Database;
import com.sleepycat.je.Environment;

...

try {
    if (myDatabase != null) {
        myDatabase.close();
    }

    if (myDbEnvironment != null) {
        myDbEnvironment.close();
    }
} catch (DatabaseException dbe) {
    // Exception handling goes here
}
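In the fragment above, an exception thrown by myDatabase.close() skips the call to myDbEnvironment.close(). One way to keep the recommended order while still attempting every close is a small helper such as the following sketch. It uses only the JDK; OrderedClose and closeInOrder are hypothetical names, and the lambdas in main() are stand-ins for real handles rather than JE objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JE: closes handles in the order given.
class OrderedClose {
    // Closes every non-null handle in order. All closes are attempted;
    // the first failure is rethrown with later failures attached as
    // suppressed exceptions.
    static void closeInOrder(AutoCloseable... handles) throws Exception {
        Exception first = null;
        for (AutoCloseable handle : handles) {
            if (handle == null) {
                continue;
            }
            try {
                handle.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        List<String> order = new ArrayList<>();
        // Stand-ins for a database handle whose close fails and an
        // environment handle whose close succeeds.
        AutoCloseable database = () -> {
            order.add("database");
            throw new IllegalStateException("close failed");
        };
        AutoCloseable environment = () -> order.add("environment");

        try {
            closeInOrder(database, environment);
        } catch (Exception e) {
            System.out.println("reported: " + e.getMessage());
        }
        // prints "[database, environment]": the environment close still ran
        System.out.println(order);
    }
}
```

With real handles that are known to be non-null, the same helper could be called as closeInOrder(myDatabase::close, myDbEnvironment::close), databases first and the environment last.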