15 Serialization Paged Cache
This chapter includes the following sections:
- Understanding Serialization Paged Cache
  Coherence provides support for efficient caching of huge amounts of automatically-expiring data using potentially high-latency storage mechanisms such as disk files.
- Configuring Serialization Paged Cache
  The primary configuration for the Serialization Paged Cache is composed of two parameters: the number of pages that the cache manages, and the length of time a page is active.
- Optimizing a Partitioned Cache Service
  Coherence provides an optimization for the partitioned cache service that takes advantage of the fact that the data stored in any of the serialization maps and caches is entirely binary in form.
- Configuring for High Availability
  Serialization Paged Cache includes support for the high-availability features of the partitioned cache service by providing a configuration that can be used for the primary storage of the data and a configuration that is optimized for the backup storage of the data.
- Configuring Load Balancing and Failover
  When using a serialization paged cache with the distributed cache service, special considerations should be made for load balancing and failover purposes.
- Supporting Huge Caches
  To support huge caches (for example, terabytes) of expiring data, the expiration processing is performed concurrently on a daemon thread with no interruption to the cache processing.
Parent topic: Using Caches
Understanding Serialization Paged Cache
Serialization Paged Cache is defined as follows:
- Serialization implies that objects stored in the cache are serialized and stored in a Binary Store; refer to the existing features Serialization Map and Serialization Cache.
- Paged implies that the objects stored in the cache are segmented for efficiency of management.
- Cache implies that there can be limits specified to the size of the cache; in this case, the limit is the maximum number of concurrent pages that the cache manages before expiring pages, starting with the oldest page.
The result is a feature that organizes data in the cache based on the time that the data was placed in the cache, and then can efficiently expire that data from the cache, an entire page at a time, and typically without having to reload any data from disk.
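The page-at-a-time expiry model can be illustrated with a short, self-contained sketch. This is a toy illustration of the idea only, not Coherence's SerializationPagedCache implementation: writes go to the newest page, reads search from newest to oldest, and once the page limit is exceeded the oldest page is discarded whole.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy sketch (not Coherence code): entries are grouped into pages by
// insertion time; when the page limit is exceeded, the oldest page is
// dropped whole, expiring all of its entries in one operation.
class PagedCacheSketch {
    private final int pageLimit;                        // max concurrent pages
    private final Deque<Map<String, String>> pages = new ArrayDeque<>();

    PagedCacheSketch(int pageLimit) {
        this.pageLimit = pageLimit;
        pages.addLast(new HashMap<>());                 // initial active page
    }

    /** Writes always go to the newest (active) page. */
    void put(String key, String value) {
        pages.peekLast().put(key, value);
    }

    /** Reads search from the newest page back to the oldest. */
    String get(String key) {
        Iterator<Map<String, String>> it = pages.descendingIterator();
        while (it.hasNext()) {
            String value = it.next().get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    /**
     * Invoked when the page duration elapses: open a new active page
     * and, if over the limit, expire the oldest page in its entirety.
     */
    void advancePage() {
        pages.addLast(new HashMap<>());
        while (pages.size() > pageLimit) {
            pages.removeFirst();                        // whole-page expiry
        }
    }
}
```

With a page limit of two, an entry written before two page advances disappears along with its entire page, while entries on newer pages remain readable.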
Parent topic: Serialization Paged Cache
Configuring Serialization Paged Cache
Each page of data in the cache is managed by a separate Binary Store. The cache requires a Binary Store Manager, which provides the means to create and destroy these Binary Stores. Coherence provides Binary Store Managers for all of the built-in Binary Store implementations, including Berkeley DB.
Serialization paged caches are configured within the <external-scheme> and <paged-external-scheme> elements in the cache configuration file. See external-scheme and paged-external-scheme.
Parent topic: Serialization Paged Cache
Optimizing a Partitioned Cache Service
Coherence provides an optimization for the partitioned cache service that takes advantage of the fact that the data stored in any of the serialization maps and caches is entirely binary in form.
Parent topic: Serialization Paged Cache
Configuring for High Availability
Serialization Paged Cache includes support for the high-availability features of the partitioned cache service by providing a configuration that can be used for the primary storage of the data and a configuration that is optimized for the backup storage of the data.
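One way to sketch this pattern in the cache configuration is a distributed scheme whose backing map uses one paged scheme while its backup storage references another. The scheme names below are hypothetical, and the exact element layout should be checked against the cache configuration element reference:

```xml
<distributed-scheme>
  <scheme-name>partitioned-paged</scheme-name>
  <!-- Primary storage: a paged scheme for the primary copies -->
  <backing-map-scheme>
    <paged-external-scheme>
      <scheme-ref>primary-paged</scheme-ref>
    </paged-external-scheme>
  </backing-map-scheme>
  <!-- Backup storage: a separate scheme optimized for backup copies -->
  <backup-storage>
    <type>scheme</type>
    <scheme-name>backup-paged</scheme-name>
  </backup-storage>
  <autostart>true</autostart>
</distributed-scheme>
```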
Parent topic: Serialization Paged Cache
Configuring Load Balancing and Failover
When using a serialization paged cache with the distributed cache service, special considerations should be made for load balancing and failover purposes.
Parent topic: Serialization Paged Cache
Supporting Huge Caches
When the cache is used for large amounts of data, the pages are typically disk-backed. Because the cache eventually expires each page, thus releasing the disk resources, the cache uses a virtual erase optimization by default: data that is explicitly removed or expired from the cache is not actually removed from the underlying Binary Store, but when a page (a Binary Store) is completely emptied, it is erased in its entirety. This reduces I/O by a considerable margin, particularly during expiry processing and during operations such as load balancing that have to redistribute large amounts of data within the cluster. The cost of this optimization is that the disk files (if a disk-based Binary Store option is used) tend to be larger than the data that they are managing would otherwise imply. Because disk space is considered to be inexpensive compared to other factors such as response times, the virtual erase optimization is enabled by default, but it can be disabled. Note that the disk space is typically allocated locally to each server; thus, a terabyte cache partitioned over one hundred servers would use only about 20GB of disk space per server (10GB for the primary store and 10GB for the backup store, assuming one level of backup).
Parent topic: Serialization Paged Cache