External File support

The external file threshold
Creating external files
External file access
External file storage
External Files and Replication

External file support is designed for efficient storage of large objects. An object is considered to be large if it is more than a third of the size of a page. Without external file support, large objects must be broken up into smaller pieces, and then reassembled and/or disassembled every time the record is read or updated. Berkeley DB external file support avoids this assembly/disassembly process by storing the large object in a special directory set aside for the purpose. The data itself is not kept in the database, nor is it placed into the in-memory cache.
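
The "more than a third of a page" rule above is simple arithmetic. A minimal sketch follows; the function name `is_large_object` is hypothetical and not part of the Berkeley DB API, and the check only illustrates the rule as stated here rather than reproducing the library's internal test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Berkeley DB function): returns true when
 * an object of object_size bytes is more than one third of a page of
 * page_size bytes.  Multiplying instead of dividing avoids losing the
 * remainder of page_size / 3.
 */
bool is_large_object(uint32_t object_size, uint32_t page_size)
{
    assert(page_size > 0);
    return (uint64_t)object_size * 3 > (uint64_t)page_size;
}
```

With a 4096-byte page, for instance, an object of 1366 bytes counts as large under this rule while one of 1365 bytes does not.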

External files can only be stored using the data portion of a key/data pair. They are supported only for Btree, Hash, and Heap databases, and only so long as the database is not configured for duplicate records or sorted duplicate records. In addition, the DBT that you use to access the external file data cannot be configured as a partial DBT if you want to access the data using the external file's streaming interface (introduced below).

Note that if the environment is transactionally-protected, then all access to the external file is also transactionally protected.

The external file threshold

The external file threshold is an integer size, in bytes, which indicates how large an object must be before it is considered an external file. By default, the external file threshold for any given database is 0, which means that no object will ever be considered an external file. In other words, the external file feature is disabled by default for Berkeley DB databases.

To use the external file feature, you must set the external file threshold to a positive integer value. You do this for a given database using the DB->set_blob_threshold() method. Note that this value must be set before you create the database. Calls to this method made after the database has been created are ignored.

In addition, if you are using an environment, you can change the default threshold for databases created in that environment to something other than 0 by using the DB_ENV->set_blob_threshold() method.

You can retrieve the external file threshold set for a database using the DB->get_blob_threshold() method. You can retrieve the default external file threshold set for your environment using the DB_ENV->get_blob_threshold() method.

Creating external files

There are two ways to create an external file. Before you can use either mechanism, you must set the external file threshold to a non-zero positive integer value (see the previous section for details). Once the external file threshold has been set, you create an external file using one of the two following mechanisms:

  • Configure the DBT used to access the external file data (that is, the DBT used for the data portion of the record) with the DB_DBT_BLOB flag. This causes the data to be stored as an external file regardless of its size, so long as the database otherwise supports external files.

  • Alternatively, creating a data item with a size greater than the external file threshold will cause that data item to be automatically stored as an external file.
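
The rules described so far (supported access methods, no duplicates, a non-zero threshold, and either the flag or a large enough data item) can be summarized as a small decision function. This is only a model for illustration: every type and function name below is hypothetical, nothing here calls Berkeley DB, and the size comparison follows the "greater than" wording above, so the exact boundary behavior should be confirmed against the API reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; none of these names come from Berkeley DB. */
enum model_access_method {
    MODEL_BTREE, MODEL_HASH, MODEL_HEAP, MODEL_QUEUE, MODEL_RECNO
};

struct model_db {
    enum model_access_method method;
    bool has_duplicates;   /* configured for duplicates or sorted duplicates */
    uint32_t threshold;    /* external file threshold in bytes; 0 disables */
};

/* Only Btree, Hash, and Heap databases without duplicates qualify. */
bool model_supports_external_files(const struct model_db *db)
{
    assert(db != NULL);
    if (db->has_duplicates)
        return false;
    return db->method == MODEL_BTREE ||
        db->method == MODEL_HASH ||
        db->method == MODEL_HEAP;
}

/*
 * Returns true when, under the rules in the text, a data item would be
 * stored as an external file.  flag_set models a DBT configured with
 * the external file flag.
 */
bool model_stored_externally(const struct model_db *db,
    uint32_t data_size, bool flag_set)
{
    if (!model_supports_external_files(db) || db->threshold == 0)
        return false;
    return flag_set || data_size > db->threshold;
}
```

For a Btree database with a threshold of 1000 bytes, this model reports a 2000-byte item as external, a 10-byte item as external only when the flag is set, and nothing as external once the threshold is 0.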

External file access

External files may be accessed in the same way as other DBT data, so long as the data itself will fit into memory. More likely, you will find it necessary to use the external file streaming API to read and write external file data. You open an external file stream using the DBC->db_stream() method, close it with the DB_STREAM->close() method, write to it using the DB_STREAM->write() method, and read it using the DB_STREAM->read() method.

The following example code fragment can be found in your DB distribution at .../db/examples/c/ex_external_file.c.

...
    /* Some necessary variable declarations */
    DBC *dbc;       /* Cursor handle */
    DB_ENV *dbenv;  /* Environment handle */
    DB *dbp;        /* Database handle */
    DB_STREAM *dbs; /* Stream handle */
    DB_TXN *txn;    /* Transaction handle */
    DBT data, key;  /* DBT handles */
    int ret;
    db_off_t size;

    ...

    /* Environment creation skipped for brevity's sake */

    ...

    /* Enable external files and set the size threshold. */
    if ((ret = dbenv->set_ext_file_threshold(dbenv, 1000, 0)) != 0) {
        dbenv->err(dbenv, ret, "set_ext_file_threshold");
        goto err;
    }

    ...

    /* Database and DBT creation skipped for brevity's sake */

    ...

    /* 
        Access the external file using the DB_STREAM API. 
    */
    if ((ret = dbenv->txn_begin(dbenv, NULL, &txn, 0)) != 0){
        dbenv->err(dbenv, ret, "txn");
        goto err;
    }

    if ((ret = dbp->cursor(dbp, txn, &dbc, 0)) != 0) {
        dbenv->err(dbenv, ret, "cursor");
        goto err;
    }

    /*
     * Set the cursor to an external file.  Use DB_DBT_PARTIAL with
     * dlen == 0 to avoid getting any external file data.
     */
    data.flags = DB_DBT_USERMEM | DB_DBT_PARTIAL;
    data.dlen = 0;
    if ((ret = dbc->get(dbc, &key, &data, DB_FIRST)) != 0) {
        dbenv->err(dbenv, ret, "Positioning cursor on first record");
        goto err;
    }
    data.flags = DB_DBT_USERMEM;

    /* Create a stream on the external file the cursor points to.  */
    if ((ret = dbc->db_stream(dbc, &dbs, DB_STREAM_WRITE)) != 0) {
        dbenv->err(dbenv, ret, "Creating stream.");
        goto err;
    }

    /* Get the size of the external file.  */
    if ((ret = dbs->size(dbs, &size, 0)) != 0) {
        dbenv->err(dbenv, ret, "Stream size.");
        goto err;
    }
    /* Read the whole external file, starting at offset 0. */
    if ((ret = dbs->read(dbs, &data, 0, (u_int32_t)size, 0)) != 0) {
        dbenv->err(dbenv, ret, "Stream read.");
        goto err;
    }
    /*
     * Write the data back starting at the midpoint of the external
     * file, which increases its size.
     */
    if ((ret = dbs->write(dbs, &data, size/2, 0)) != 0) {
        dbenv->err(dbenv, ret, "Stream write.");
        goto err;
    }
    /* Close the stream. */
    if ((ret = dbs->close(dbs, 0)) != 0) {
        dbenv->err(dbenv, ret, "Stream close.");
        goto err;
    }
    dbs = NULL;
    dbc->close(dbc);
    dbc = NULL;
    txn->commit(txn, 0);
    txn = NULL;
    free(data.data);
    data.data = NULL; 

    ...

    /* Handle clean up skipped. */ 
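
The fragment above reads the entire external file with a single read call, which only works when the file fits in one buffer. For larger files, the offset can instead be advanced in fixed-size steps. The sketch below shows just that offset arithmetic and deliberately does not call Berkeley DB: `chunk_reader`, `consume_in_chunks`, `mem_source`, and `mem_reader` are hypothetical names, and the callback merely stands in for a stream read at a given offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical callback standing in for a stream read: copy up to len
 * bytes starting at offset into buf and return the number copied.
 */
typedef size_t (*chunk_reader)(void *ctx, int64_t offset,
    void *buf, size_t len);

/*
 * Walk total bytes in pieces of at most buf_size bytes, reusing buf
 * for each piece.  Returns the number of bytes actually consumed.
 */
int64_t consume_in_chunks(chunk_reader rd, void *ctx, int64_t total,
    void *buf, size_t buf_size)
{
    int64_t offset = 0;

    assert(rd != NULL && buf != NULL && buf_size > 0);
    while (offset < total) {
        int64_t remaining = total - offset;
        size_t want = remaining < (int64_t)buf_size ?
            (size_t)remaining : buf_size;
        size_t got = rd(ctx, offset, buf, want);

        if (got == 0)   /* Nothing more to read: stop early. */
            break;
        offset += (int64_t)got;
    }
    return offset;
}

/* In-memory stand-in for an external file, used to exercise the loop. */
struct mem_source {
    const unsigned char *bytes;
    size_t size;
};

size_t mem_reader(void *ctx, int64_t offset, void *buf, size_t len)
{
    const struct mem_source *src = ctx;
    size_t avail, n;

    if (offset < 0 || (uint64_t)offset >= src->size)
        return 0;
    avail = src->size - (size_t)offset;
    n = len < avail ? len : avail;
    memcpy(buf, src->bytes + offset, n);
    return n;
}
```

In a real application the callback body would be replaced by a read on the stream handle at the given offset, with the total taken from the stream's reported size.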

External file storage

External files are not stored in the normal database files on disk in the same way as other data managed by DB. Instead, they are stored as binary files in a special directory set aside for the purpose.

If you are not using environments, this special external file directory is created relative to the current working directory from which your application is running. You can modify this default location using the DB->set_blob_dir() method, and retrieve the current external file directory using DB->get_blob_dir().

If you are using an environment, then by default the external file directory is created within the environment's home directory. You can change this default location using DB_ENV->set_blob_dir() and retrieve the current default location using DB_ENV->get_blob_dir(). (Note that DB_ENV->get_blob_dir() can successfully retrieve the external file directory only if DB_ENV->set_blob_dir() was previously called.)

Note that because external files are stored outside of the Berkeley DB database files, they are not confined by the four gigabyte limit used for Berkeley DB key and data items. The external file size limit is system dependent. It is the maximum value, in bytes, of a signed 32-bit integer (if the Berkeley DB-defined type db_off_t is four bytes in size), or of a signed 64-bit integer (if db_off_t is eight bytes in size).
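
To make the two possible limits concrete, the following sketch maps the width of the offset type to the corresponding maximum size. The function name `max_external_file_size` is hypothetical; in an application built against Berkeley DB, the width argument would presumably be `sizeof(db_off_t)`, which is not used here so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Berkeley DB function): returns the largest
 * external file size, in bytes, for an offset type that is
 * offset_type_bytes wide.  Only the two widths named in the text
 * (four and eight bytes) are accepted.
 */
int64_t max_external_file_size(size_t offset_type_bytes)
{
    assert(offset_type_bytes == 4 || offset_type_bytes == 8);
    return offset_type_bytes == 4 ? (int64_t)INT32_MAX : INT64_MAX;
}
```

A four-byte offset type therefore caps an external file at 2,147,483,647 bytes (just under 2 GiB), while an eight-byte type raises the cap to 9,223,372,036,854,775,807 bytes.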

External Files and Replication

Replication supports external files without any special requirements. However, enabling external files in a replicated environment can result in long synchronization times between the client and master sites. To avoid this, execute a transaction checkpoint after updating or deleting one or more external file records.