Allocating Storage and Compressing Data

In This Section:

Storage Allocation

Data Compression

Storage Allocation

Essbase uses a data file to store data blocks. By default, a data file is located in its associated database folder. Data files follow the naming convention essn.pag, where n is greater than or equal to one and less than or equal to 65,535.

Essbase uses an index file to store the index for a database. By default, an index file is located in its associated database folder. Index files follow the naming convention essn.ind, where n is greater than or equal to 1 and less than or equal to 65,535.

Essbase automatically allocates storage for data and index files. You can use disk volumes to control how storage is allocated for these files.

  To specify disk volumes so that you control how storage is allocated:

  1. Verify how much space Essbase uses to store index and data files. See Checking Index and Data File Sizes for information about how to check sizes.

  2. Choose a technique to control storage:

    • Specify which volumes (drives) Essbase uses to store these files. See Specifying Disk Volumes.

    • Install Essbase on one volume and store files on another.

Checking Index and Data File Sizes

  To view index file (.ind file) and data file (.pag file) names, counts, sizes, and totals, and to determine whether each file is open in Essbase, use a tool:

Tool

Topic

Location

Administration Services

Checking Index and Data File Sizes

Oracle Essbase Administration Services Online Help

ESSCMD

LISTFILES

Oracle Essbase Technical Reference

Note:

The file size information that is provided by Windows for index and data files that reside on NTFS volumes may not be accurate. The file size information provided by Administration Services and by LISTFILES is accurate.

Specifying Disk Volumes

Use disk volumes to specify where you want to store Essbase index files (essn.ind) and data files (essn.pag).

Files are written to the disk volume in the following directory structure:

.../app/app_name/db_name

If you do not use the disk volumes setting, Essbase stores data only on the volume where the ARBORPATH directory resides. If the ARBORPATH variable is not set, Essbase stores data only on the volume where the server was started.

For new files, changes to the disk volumes setting take effect when you next start the database. Existing files and volumes are not affected.

Note:

For information about how to check the size of the index and data files, see Checking Index and Data File Sizes.

You can specify disk volumes using Administration Services, MaxL, or ESSCMD. When you use disk volumes, Essbase provides the following options for each disk volume:

  • Volume name

  • Maximum space to use on the volume (called Partition Size in Administration Services and Volume Size in ESSCMD)

  • File type

    You can specify index files, data files, or both. The default is index and data files on the same volume.

  • Maximum file size

    The default and recommended value is 2,097,152 KB (2 GB). When Essbase reaches the maximum file size, it creates a file and names it incrementally. For example, when ess00001.ind is filled to maximum size, Essbase creates ess00002.ind.

    Caution!

    If you specify a volume name but not a volume size, Essbase uses all available space on the volume.

Essbase creates data files and index files in these situations:

  • If the total sizes of all files reach the maximum size that you specified in the disk volumes setting

    By default, the total is the sum of all index and data file sizes. If you specify Index as the file type, the total refers to the sum of all index files on a volume. If you specify Data as the file type, the total refers to the sum of all data files on a volume.

    For example, suppose you want to use up to 12 GB for Essbase files on volume E, 16 GB on volume F, and 16 GB on volume G. Essbase creates a file on volume F when, on volume E, the sizes of the index and data files reach 12 GB and more data needs to be written to disk.

  • If the size of an individual index or data file on any volume reaches 2 GB

    Using the previous example (12 GB for volume E; 16 GB for volume F; and 16 GB for volume G), suppose volumes E and F have reached their capacities and Essbase is using volume G. On volume G, Essbase creates ess00001.ind, filling it to the default limit of 2 GB, and creates ess00001.pag, filling it to 1 GB. 3 GB of the 16 GB on volume G have been used. Because ess00001.ind has reached the maximum file size of 2 GB, the next time Essbase needs storage space when writing index files to disk, it creates ess00002.ind. When ess00002.ind reaches the 2 GB limit, Essbase creates ess00003.ind. Essbase follows the same procedures for data files.

Essbase names files consecutively, starting with ess00001.xxx, where xxx is ind for an index file and pag for a data file, and continuing up to ess65535.xxx. This naming convention applies to each volume, so in the above example, volumes E, F, and G each have files named ess00001.pag and ess00001.ind.

Keep in mind the following guidelines when specifying disk volumes:

  • Specify the disk volumes in the order in which you want the volumes to be used. You do not need to specify the volume on which Essbase is installed as one of the volumes; you can install on one volume and store data on other volumes.

  • If a volume reaches capacity, Essbase moves to the next volume.

  • If all specified volumes reach capacity, Essbase stops ongoing database operations, issues an error message, and performs fatal error handling. See Understanding Fatal Error Handling. If these events occur, shut down the database, allocate more disk space, and restart the database.

  • You can tell Essbase to stop storing files on a volume. Essbase can still access the volume as needed, but it no longer stores additional index and data information on the volume. To stop storing information on a volume, select the volume definition that you want to remove and click Delete.

  • You set disk volumes on a per-database basis. Multiple databases can use space on the same volume, so allocate space carefully. For example, if you specify 7 GB on Volume A for Database 1 and 7 GB on Volume A for Database 2, you have allocated 14 GB for Essbase files on Volume A.

Specifying Disk Volumes with Administration Services

  To specify disk volumes with Administration Services, see “Setting Disk Volumes” in Oracle Essbase Administration Services Online Help.

Specifying Disk Volumes with ESSCMD

  To allocate a new volume, see the ESSCMD SETDBSTATEITEM 23 in the Oracle Essbase Technical Reference.

ESSCMD prompts you for the number of new disk volumes you want to add, unless you supply the number on the command line.

Then, for each new volume, ESSCMD prompts you for the following values, unless you supply them on the command line.

  • Volume name (for each volume)

  • Volume size (maximum space to use on the volume)

    The default value is Unlimited; the minimum setting is 8 MB.

When you use ESSCMD, you can specify volume size in bytes (B), kilobytes (K), megabytes (M), gigabytes (G), or terabytes (T). ESSCMD displays minimum, maximum, and current values and 0 for unlimited.

  • File type

    You can specify index files, data files, or both. The default is 3 - Index + Data (index and data files on the same volume).

  • File size (maximum size for each file specified in file type before Essbase creates a new file)

    The default value is 2 GB; the minimum setting is 8 MB.

The following example allocates up to 10 GB on Volume E, sets a maximum file size of 2 GB, and specifies that data files should be stored only on E:

SETDBSTATEITEM 23 "SAMPLE" "BASIC" "1" "E" "10G" "2" "2G"

  To change the settings on an allocated volume, enter SETDBSTATEITEM 24 in ESSCMD and either follow the prompts or supply the required values on the command line.

ESSCMD prompts you for the following values, unless you supply them on the command line:

  • Volume number

    Use the GETDBSTATE command in ESSCMD to see a list of the currently defined disk volumes and to see the number assigned to each volume.

  • Volume name

  • Volume size

  • File type

  • File size

The following example allocates up to 20 GB on Volume C and sets a maximum file size of 2 GB:

SETDBSTATEITEM 24 "SAMPLE" "BASIC" "1" "C" "20G" "3" "2G"

  To stop Essbase from storing additional files on a volume, enter SETDBSTATEITEM 25 in ESSCMD and either follow the prompts or supply the required values on the command line. Essbase continues accessing files on the deallocated volume but does not write new files to it.

ESSCMD prompts you for the following value, unless you supply it on the command line—Delete which volume definition. Use the GETDBSTATE command in ESSCMD to see a list of the currently defined disk volumes and to see the number assigned to each volume.

The following example deallocates the volume that is specified as fourth:

SETDBSTATEITEM 25 "SAMPLE" "BASIC" "4"

Note:

If you delete an application or database, Essbase does not remove the directory containing the application or database on a disk volume. The computer's operating system still shows the folder and file labels on the disk. However, you can reuse the same name of the application or database that you had removed on the disk volume.

For more syntax information, see the Oracle Essbase Technical Reference.

On UNIX, volume_name is a mounted UNIX file system. You must enter a fully qualified pathname to the Essbase installation directory (ESSBASEPATH). Essbase automatically appends the app directory to the path; you do not specify the app directory.

Consider the following example:

/vol2/EssbaseServer 10M

Volume size is the maximum space, in KB, allocated to the volume. The default value is unlimited—Essbase uses all available space on that volume.

Reviewing an Example of Specifying Volumes to Control Storage

Assume you want to use up to 20 GB for Essbase files on Volume E, 25 GB on Volume F, and 25 GB on Volume G. You are using the default file size limit of 2 GB. When you load data, Essbase stores up to 20 GB on Volume E; if the database is larger than 20 GB, Essbase stores the next 25 GB on Volume F, and so on.

Table 154 shows and example of information about each disk volume.

Table 154. Disk Volume Information

Disk VolumePartition SizeFile TypeFile Size
E20971520KIndex+Data20971520K
F26214400KIndex+Data20971520K
G26214400KIndex+Data20971520K

Data Compression

Essbase allows you to choose whether data blocks that are stored on disk are compressed, as well as which compression scheme to use. When data compression is enabled, Essbase compresses data blocks when it writes them out to disk. Essbase fully expands the compressed data blocks, including empty cells, when the blocks are swapped into the data cache.

Generally, data compression optimizes storage use. You can check compression efficiency by checking the compression ratio statistic. See Checking the Compression Ratio.

Essbase provides several options for data compression:

  • Bitmap compression, the default

    Essbase stores only nonmissing values and uses a bitmapping scheme.

  • Run-length encoding (RLE)

    Essbase compresses repetitive, consecutive values, including zeros and #MISSING values.

  • zlib compression

    Essbase builds a data dictionary based on the actual data being compressed.

  • Index Value Pair compression

    Essbase applies this compression if the block density is less than 3%.

  • No compression

    Essbase does not compress data blocks when they are written to disk.

Because Essbase compresses data blocks as they are written to disk, it is possible for bitmap, RLE, and uncompressed data blocks to coexist in the same data file. Keep in mind the following rules:

  • When a compressed data block is brought into the data cache, Essbase expands the block to its full size, regardless of the scheme that was used to compress it.

  • When Essbase stores a block on disk, Essbase treats the block the same whether it was compressed or uncompressed when it was brought into the data cache. In either case, Essbase compresses the block according to the specified compression type (including not compressing it if no compression is specified).

  • If compression is not enabled, Essbase writes out the fully expanded block to disk.

You may want to disable data compression if blocks have very high density (90% or greater) and have few consecutive, repeating data values. Under these conditions, enabling compression consumes resources unnecessarily.

Bitmap Data Compression

When a data block is not compressed, Essbase uses 8 bytes to store every nonmissing cell (cells with data values). Essbase also uses a 72-byte block header for each block, whether or not the block is compressed.

With bitmap compression, Essbase uses a bitmap to represent data cells and stores only the bitmap, the block header, and the other control information. A bitmap uses one bit (represented as “1” or “0”) for each cell in the data block, whether the cell value is missing (“0”) or nonmissing (“1”). Therefore, the bitmap scheme provides a fixed overhead for data compression.

When using bitmap compression, Essbase stores only nonmissing values and does not compress repetitive values or zeros (contrast with RLE compression, described in RLE Data Compression). When Essbase places a data block into the data cache, it fully expands the data block, using the bitmap to recreate the missing values.

The following is a representation of a portion of an uncompressed data block, with eight cells (cells 1–4 in the first row and cells 5–8 in the second row). In the first row, the data values are: 25, #MISSING, #MISSING, and #MISSING. In the second row, the data values are: #MISSING, 16, 7, and #MISSING.

Uncompressed data block

25         #MISSING   #MISSING   #MISSING
#MISSING   16         7          #MISSING

When the data block is fully expanded in memory, Essbase uses 64 bytes (8 bytes * 8 cells). When the data is stored uncompressed on disk, Essbase uses 24 bytes (8 bytes * 3 cells with data—cells 1, 6, 7).

When the data is stored compressed on disk, Essbase uses 1 byte (1 bit * 8 cells; 8 bits = 1 byte) to store the bitmap. The following is a representation of the bitmap of the uncompressed data block described above:

Bitmap of compressed data block

1          0          0          0
0          1          1          0

In most cases, bitmap compression conserves disk space more efficiently. However, much depends on the configuration of the data.

RLE Data Compression

When using the run-length encoding (RLE) compression scheme, Essbase compresses consecutive, repetitive values—any value, including zero, that repeats three or more times consecutively. Essbase tracks each repeating value and the number of times it repeats consecutively.

With RLE compression, Essbase uses 8 bytes, plus a 16-byte repetition factor, for a total of 24 bytes for the set of three or more consecutive, repetitive cells. For values that are not repeated or are repeated only twice, Essbase uses 8 bytes for each value.

In the following representation of a data block, Essbase considers the three consecutive #MISSING values in the first row (cells 2, 3, and 4) and the left-most #MISSING value in the second row (cell 5) as repeating values. The right-most #MISSING value in the second row (cell 8) is not a repeating value because it is separated from the cell 5 #MISSING value by cells 6 and 7, which have data values (16 and 7, respectively).

Data values in data block

25         #MISSING   #MISSING   #MISSING
#MISSING   16         7          #MISSING

When the data block is fully expanded in memory, Essbase uses 64 bytes (8 bytes * 8 cells). When the data is stored uncompressed on disk, Essbase uses 24 bytes (8 bytes * 3 cells with data—cells 1, 6, 7).

When the data is stored compressed on disk, Essbase uses 56 bytes:

  • Cell 1—8 bytes

  • Cells 2-5—24 bytes (8 bytes + 16 bytes)

  • Cell 6—8 bytes

  • Cell 7—8 bytes

  • Cell 8—8 bytes

Essbase also uses a 72-byte block header for each block, whether or not the block is compressed.

zlib Compression

This method is used in packages such as PNG, Zip, and gzip. Calculation and data loading are faster with direct I/O and zlib compression than with buffered I/O and zlib compression. If data storage is your greatest limiting factor, use zlib, but be aware that, under some circumstances, data loads may be up to 10% slower than bitmap compression. The size of the database, however, is generally significantly smaller when you use zlib compression.

In contrast to bitmap compression, which uses an algorithm to track which values are missing and does not interact with any other type of data, zlib compression builds a data dictionary based on the actual data being compressed (including any missing values). Therefore, zlib compression should provide greater compression ratios over bitmap compression, given extremely dense data. However, because the effectiveness of the zlib algorithm is dependent (at the bit level) on the actual data being compressed, general guidelines about when zlib compression provides greater compression than bitmap compression based solely on density are not available. Unlike other compression methods, the storage space saved has little or no relationship to the number of missing cells or the number of contiguous cells of equal value. Generally, the more dense or heterogeneous the data is, the better zlib will compress it in comparison to bitmap or RLE compression. However, under some circumstances, it is possible that zlib will not yield better results than using bitmap or RLE compression. It is best to test with a representative data sample.

To estimate the storage savings you may obtain with zlib, create a small database using your usual compression technique (bitmap or RLE) with a small sampling of real data and shut down Essbase Server. Note the size of the created data files. Then clear the data in the sample database, change the compression setting to zlib, reload the same sample data, and shut down Essbase Server again. Now note the difference in the storage used. You can also use the small sample database to estimate any changes in calculation or data loading speed.

Index Value Pair Compression

Index Value Pair addresses compression on databases with larger block sizes, where the blocks are highly sparse. This compression algorithm is not selectable but is automatically used whenever appropriate by the database. The user must still choose between the compression types None, bitmap, RLE, and zlib through Administration Services.

Table 155 lists the available compression types the user can choose and the compression types that Essbase evaluates and then applies.

Table 155. Index Value Pair Compression

Chosen Compression Type

Evaluated Compression Type

None

None

Bitmap

Bitmap, Index Value Pair

RLE

RLE, Bitmap, Index Value Pair

zlib

zlib

For example, if the user selects RLE, Essbase reviews each block and evaluates the following compression types for highest compression: RLE, bitmap, or Index Value Pair. If the user chooses zlib, zlib is the only compression type applied.

Deciding Which Compression Type to Use

You can choose from four compression settings: bitmap (the default), RLE, zlib, or None.

In most cases, you need not worry about choosing a setting. Bitmap compression almost always provides the best combination of fast performance and small data files. However, much depends on the configuration of the data.

Data compression is CPU-intensive. Consider the trade-offs of computation costs versus I/O costs and disk space costs when choosing a compression setting.

In general, a database compresses better using the RLE setting than the bitmap setting if a large number of repeated nonmissing data cells for a given block have the same value. Using RLE compression is computationally more expensive than using bitmap compression. If your database shrinks significantly using RLE compression, however, you may see a performance improvement due to decreased I/O costs.

Databases usually shrink when using zlib compression, but not always. Using zlib compression significantly increases CPU processing. For most databases, this extra processing outweighs the benefits of the decreased block size. But if your database shrinks significantly using zlib compression, you may see a performance improvement due to decreased I/O costs.

The None compression setting does not reduce the disk usage of a database compared to bitmap compression. In fact, no compression may make no difference to improve the performance of the database, because bitmap compression is extremely fast.

Remember that each database is unique, and the previous statements are general characteristics of compression types. Although the default bitmap compression works well for most databases, the best way to determine the best compression setting for your database is to try each one.

Changing Data Compression Settings

Changes to the data compression setting take effect immediately as Essbase writes data blocks to disk. For blocks already on disk, Essbase does not change compression schemes or enable or disable compression. When you change the data compression settings of blocks already on disk, Essbase uses the new compression scheme the next time Essbase accesses, updates, and stores the blocks.

  To view or change the current settings, use a tool:

Tool

Topic

Location

Administration Services

Selecting a Data Compression Method

Oracle Essbase Administration Services Online Help

MaxL

alter database

Oracle Essbase Technical Reference

ESSCMD

To enable or disable data compression: SETDBSTATE

or:

SETDBSTATEITEM 14

To set the data compression type: SETDBSTATEITEM 15

Oracle Essbase Technical Reference

Example of Using SETDBSTATEITEM

  To enable or disable data compression, enter SETDBSTATEITEM 14 in ESSCMD and either follow the prompts or supply the required values on the command line.

ESSCMD prompts you for the following values, unless you supply them on the command line:

  • Data Compression on Disk? Enter Y (Yes, the default) or N (No).

  • Data Compression Type. Enter 1 (run-length encoding) or 2 (bitmap, the default).

  To specify the data compression type, enter SETDBSTATEITEM 15 in ESSCMD and either follow the prompts or supply the required values on the command line. ESSCMD prompts you for a value of “1” (run length encoding) or “2” (bitmap, the default).

The following example enables Bitmap compression:

SETDBSTATEITEM 14 "SAMPLE" "BASIC" "Y" "2"

For more syntax information, see the Oracle Essbase Technical Reference.

Checking the Compression Ratio

The compression ratio represents the ratio of the compressed block size (including overhead) to the uncompressed block size, regardless of the compression type in effect. Overhead is the space required by mechanisms that manage compression/expansion.

  To check the compression ratio, use a tool:

Tool

Topic

Location

Administration Services

Checking the Compression Ratio

Oracle Essbase Administration Services Online Help

ESSCMD

GETDBSTATS

Oracle Essbase Technical Reference

Note:

The larger the number, the more compression. The compression ratio can vary widely from block to block.

Data Block Size

Data block size is determined by the amount of data in a particular combination of dense dimensions. For example, when you change the dense or sparse configuration of one or more dimensions in the database, the data block size changes. Data block size is 8n bytes, where n is the number of cells that exist for that combination of dense dimensions.

Note:

The optimum size range is 8 KB to 100 KB.

For information about determining the size of a data block, see Size of Expanded Data Block.

  To view the block size for a database, use a tool:

Tool

Topic

Location

Administration Services

Checking Data Block Statistics

Oracle Essbase Administration Services Online Help

ESSCMD

GETDBSTATS

Oracle Essbase Technical Reference