This chapter provides information on Coherence storage using backing maps.
The partitioned (distributed) cache service in Coherence has three distinct layers:
Client View – The client view represents a virtual layer that provides access to the underlying partitioned data. Access to this tier is provided using the
NamedCache interface. In this layer you can also create synthetic data structures such as NearCache or ContinuousQueryCache.
Storage Manager – The storage manager is the server-side tier that is responsible for processing cache-related requests from the client tier. It manages the data structures that hold the actual cache data (primary and backup copies) and information about locks, event listeners, map triggers, and so on.
Backing Map – The Backing Map is the server-side data structure that holds actual data.
Coherence allows users to configure a number of out-of-the-box backing map implementations as well as custom ones. Basically, the only constraint that all these Map implementations have to be aware of is that the Storage Manager provides all keys and values in internal (Binary) format. To deal with conversions of that internal data to and from an Object format, the Storage Manager can supply Backing Map implementations with a BackingMapManagerContext reference.
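As a brief illustration, the following sketch shows how a custom backing map implementation might use that context to convert between the two formats. The helper class itself is hypothetical; the converter accessors are part of the BackingMapManagerContext API.

    import com.tangosol.net.BackingMapManagerContext;
    import com.tangosol.util.Binary;

    // Hypothetical helper (not part of Coherence): converts between the
    // internal Binary form used by the Storage Manager and Object form.
    public class FormatHelper {
        public static Object valueFromInternal(BackingMapManagerContext ctx, Binary binValue) {
            // Deserialize the internal (Binary) representation into an Object
            return ctx.getValueFromInternalConverter().convert(binValue);
        }

        public static Binary valueToInternal(BackingMapManagerContext ctx, Object oValue) {
            // Serialize an Object back into the internal (Binary) representation
            return (Binary) ctx.getValueToInternalConverter().convert(oValue);
        }
    }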
Figure 12-1 shows a conceptual view of backing maps.
Local storage refers to the data structures that actually store or cache the data that is managed by Coherence. For an object to provide local storage, it must support the same standard collections interface,
java.util.Map. When a local storage implementation is used by Coherence to store replicated or distributed data, it is called a backing map, because Coherence is actually backed by that local storage implementation. Other common uses of local storage are in front of a distributed cache and as a backup behind the distributed cache.
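As a sketch of that contract, a custom backing map can be as simple as a java.util.Map implementation that extends one of the stock classes. The class below is hypothetical; it would typically be wired in through a <class-scheme> element within a <backing-map-scheme>.

    import com.tangosol.util.SafeHashMap;

    // Hypothetical custom backing map: extends the stock SafeHashMap and
    // logs every put. At this level, keys and values arrive in internal
    // (Binary) form rather than as application objects.
    public class LoggingBackingMap extends SafeHashMap {
        public Object put(Object oKey, Object oValue) {
            System.out.println("backing map put: " + oKey);
            return super.put(oKey, oValue);
        }
    }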
Typically, Coherence uses one of the following local storage implementations:
Safe HashMap: This is the default lossless implementation. A lossless implementation is one, like Java's Hashtable class, that is neither size-limited nor auto-expiring. In other words, it is an implementation that never evicts ("loses") cache items on its own. This particular
HashMap implementation is optimized for extremely high thread-level concurrency. (For the default implementation, use class
com.tangosol.util.SafeHashMap; when an implementation is required that provides cache events, use
com.tangosol.util.ObservableHashMap; a brief event-listener sketch appears after this list. These implementations are thread-safe.)
Local Cache: This is the default size-limiting and/or auto-expiring implementation. The local cache is covered in more detail below, but the primary points to remember about it are that it can limit the size of the cache, and it can automatically expire cache items after a certain period. (For the default implementation, use
com.tangosol.net.cache.LocalCache; this implementation is thread-safe and supports cache events,
CacheStore, and configurable/pluggable eviction policies.)
Read/Write Backing Map: This is the default backing map implementation for caches that load from a database on a cache miss. It can be configured as a read-only cache (consumer model) or as either a write-through or a write-behind cache (for the consumer/producer model). The write-through and write-behind modes are intended only for use with the distributed cache service. If used with a near cache and the near cache must be kept synchronous with the distributed cache, it is possible to combine the use of this backing map with a Seppuku-based near cache (for near cache invalidation purposes); however, given these requirements, it is suggested that the versioned implementation be used. (For the default implementation, use class com.tangosol.net.cache.ReadWriteBackingMap.)
Versioned Backing Map: This is an optimized version of the read/write backing map that optimizes its handling of the data by utilizing a data versioning technique. For example, to invalidate near caches, it simply provides a version change notification, and to determine whether cached data must be written back to the database, it can compare the persistent (database) version information with the transient (cached) version information. The versioned implementation can provide very balanced performance in large scale clusters, both for read-intensive and write-intensive data. (For the default implementation, use class
com.tangosol.net.cache.VersionedBackingMap; with this backing map, you can optionally use the
com.tangosol.net.cache.VersionedNearCache as a near cache implementation.)
Binary Map (Java NIO): This is a backing map implementation that can store its information in memory but outside of the Java heap, or even in memory-mapped files, which means that it does not affect the Java heap size and the related JVM garbage-collection performance that can be responsible for application pauses. This implementation is also available for distributed cache backups, which is particularly useful for read-mostly and read-only caches that require backup for high availability purposes, because it means that the backup does not affect the Java heap size yet it is immediately available in case of failover.
Serialization Map: This is a backing map implementation that translates its data to a form that can be stored on disk, referred to as a serialized form. It requires a separate
com.tangosol.io.BinaryStore object into which it stores the serialized form of the data; usually, this is the built-in LH disk store implementation, but the Serialization Map supports any custom implementation of
BinaryStore. (For the default implementation of Serialization Map, use com.tangosol.net.cache.SerializationMap.)
Serialization Cache: This is an extension of the
SerializationMap that supports an LRU eviction policy. This can be used to limit the size of disk files, for example. (For the default implementation of Serialization Cache, use com.tangosol.net.cache.SerializationCache.)
Overflow Map: An overflow map does not actually provide storage, but it deserves mention in this section because it can tie together two local storage implementations so that when the first one fills up, it overflows into the second. (For the default implementation of Overflow Map, use com.tangosol.net.cache.OverflowMap.)
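As referenced in the Safe HashMap item above, the following is a minimal event-listener sketch; the class name is illustrative, while ObservableHashMap and MultiplexingMapListener are stock Coherence classes.

    import com.tangosol.util.MapEvent;
    import com.tangosol.util.MultiplexingMapListener;
    import com.tangosol.util.ObservableHashMap;

    // Minimal sketch: unlike SafeHashMap, ObservableHashMap supports cache
    // events, so a MapListener can observe inserts, updates, and deletes.
    public class EventExample {
        public static void main(String[] args) {
            ObservableHashMap map = new ObservableHashMap();
            map.addMapListener(new MultiplexingMapListener() {
                protected void onMapEvent(MapEvent evt) {
                    System.out.println("event: " + evt);
                }
            });
            map.put("key", "value"); // fires an "inserted" MapEvent
        }
    }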
There are a number of operation types performed against the Backing Map:
Natural access and update operations caused by application usage. For example, a
NamedCache.get() call naturally causes a
Map.get() call on a corresponding Backing Map; a
NamedCache.invoke() call may cause a sequence of
Map.get() followed by
Map.put() (see the entry processor sketch after this list); a
NamedCache.keySet(filter) call may cause a
Map.entrySet().iterator() loop, and so on.
Remove operations caused by the time-based expiry or the size-based eviction. For example, a
NamedCache.size() call from the client tier could cause a
Map.remove() call due to an entry expiry timeout; or a
NamedCache.put() call could cause a number of
Map.remove() calls (for different keys) when the total amount of data in a backing map reaches the configured high water-mark value.
Insert operations caused by a
CacheStore.load() operation (for backing maps configured with read-through or read-ahead features).
Synthetic access and updates caused by the partition distribution (which in turn could be caused by cluster node failover or failback). In this case, without any application tier call, a number of entries could be inserted into or removed from the backing map.
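To make the first category concrete, here is a minimal entry processor sketch; the cache name and processor class are illustrative. Invoking it against a key results in a Map.get() followed by a Map.put() on the backing map that owns that key.

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.processor.AbstractProcessor;

    // Illustrative processor: NamedCache.invoke() with this processor causes
    // a Map.get() followed by a Map.put() on the corresponding backing map.
    public class IncrementProcessor extends AbstractProcessor {
        public Object process(InvocableMap.Entry entry) {
            Integer count = (Integer) entry.getValue(); // -> Map.get()
            int next = (count == null) ? 1 : count + 1;
            entry.setValue(Integer.valueOf(next));      // -> Map.put()
            return Integer.valueOf(next);
        }

        public static void main(String[] args) {
            NamedCache cache = CacheFactory.getCache("example"); // illustrative cache name
            Object result = cache.invoke("counter", new IncrementProcessor());
            System.out.println("counter = " + result);
        }
    }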
Depending on the actual implementation, the Backing Map stores the cache data in one of the following ways:
on-heap memory
off-heap memory
disk (memory-mapped files or in-process DB)
combination of any of the above
Keeping data in memory naturally provides dramatically smaller access and update latencies and is most commonly used.
More often than not, applications need to ensure that the total amount of data placed into the data grid does not exceed some predetermined amount of memory. This can be done either directly by the application tier logic or automatically using size- or expiry-based eviction. Naturally, the total amount of data held in a Coherence cache equals the sum of the data volumes in all corresponding backing maps (one for each cluster node that runs the corresponding partitioned cache service in storage-enabled mode).
Consider the following cache configuration excerpts:
<backing-map-scheme>
  <local-scheme/>
</backing-map-scheme>
The backing map above is an instance of
com.tangosol.net.cache.LocalCache; it has no pre-determined size constraints, so its size must be controlled explicitly by the application. Failure to do so could cause the JVM to run out of memory.
<backing-map-scheme>
  <local-scheme>
    <high-units>100m</high-units>
    <unit-calculator>BINARY</unit-calculator>
    <eviction-policy>LRU</eviction-policy>
  </local-scheme>
</backing-map-scheme>
The backing map above is also a
com.tangosol.net.cache.LocalCache, with a capacity limit of 100MB. When the total amount of data held by this backing map exceeds that high watermark, some entries are removed from the backing map, bringing the volume down to the low watermark value (the
<low-units> configuration element, which defaults to 75% of the
<high-units>). The entries to be removed are chosen based on the LRU (Least Recently Used) eviction policy. Other options are LFU (Least Frequently Used) and Hybrid (a combination of the LRU and LFU). The value of
<high-units> is limited to 2GB. To overcome that limitation (but maintain backward compatibility), Coherence uses the
<unit-factor> element. For example, a
<high-units> value of
8192 with a
<unit-factor> of
1048576 results in a high watermark value of 8GB.
<backing-map-scheme>
  <local-scheme>
    <expiry-delay>1h</expiry-delay>
  </local-scheme>
</backing-map-scheme>
The backing map above automatically evicts any entries that have not been updated for more than an hour. Note that such eviction is "lazy" and can happen at any time after the hour since the last update has elapsed; the only guarantee Coherence provides is that entries more than one hour old are not returned to a caller.
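For comparison, here is a minimal programmatic sketch of the same semantics, assuming direct construction of a LocalCache rather than a <local-scheme> configuration.

    import com.tangosol.net.cache.LocalCache;

    // Sketch: a LocalCache with no practical unit limit whose entries expire
    // one hour after the last update. Eviction is lazy, but an expired entry
    // is never returned: get() returns null once the hour has passed.
    public class ExpiryExample {
        public static void main(String[] args) {
            LocalCache cache = new LocalCache(Integer.MAX_VALUE, 3600 * 1000);
            cache.put("key", "value");
            Object v = cache.get("key"); // "value" within the hour, null after
        }
    }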
<backing-map-scheme>
  <external-scheme>
    <high-units>100</high-units>
    <unit-calculator>BINARY</unit-calculator>
    <unit-factor>1048576</unit-factor>
    <nio-memory-manager>
      <initial-size>1MB</initial-size>
      <maximum-size>100MB</maximum-size>
    </nio-memory-manager>
  </external-scheme>
</backing-map-scheme>
The backing map above is an instance of
com.tangosol.net.cache.SerializationCache, which stores values in extended (NIO) memory and has a capacity limit of 100MB (100*1048576). You should also configure the backup storage for this cache to be off-heap (or file-mapped):
<backup-storage>
  <type>off-heap</type>
  <initial-size>1MB</initial-size>
  <maximum-size>100MB</maximum-size>
</backup-storage>
Prior to Coherence 3.5, a backing map would contain entries for all partitions owned by the corresponding node. (During partition transfer, it could also hold "in flight" entries that, from the clients' perspective, are temporarily not owned by anyone.)
Figure 12-2 shows a conceptual view of the conventional backing map implementation.
Coherence 3.5 introduced the concept of a partitioned backing map, which is essentially a multiplexer of actual Map implementations, each of which contains only entries that belong to the same partition.
Figure 12-3 shows a conceptual view of the partitioned backing map implementation.
To configure a partitioned backing map, add a
<partitioned> element with a value of
true. For example:
<backing-map-scheme>
  <partitioned>true</partitioned>
  <external-scheme>
    <high-units>8192</high-units>
    <unit-calculator>BINARY</unit-calculator>
    <unit-factor>1048576</unit-factor>
    <nio-memory-manager>
      <initial-size>1MB</initial-size>
      <maximum-size>50MB</maximum-size>
    </nio-memory-manager>
  </external-scheme>
</backing-map-scheme>
This backing map is an instance of
com.tangosol.net.partition.PartitionSplittingBackingMap, with the individual partition-holding maps being instances of
com.tangosol.net.cache.SerializationCache that each store values in extended (NIO) memory. The individual NIO buffers have a limit of 50MB, while the backing map as a whole has a capacity limit of 8GB (8192*1048576). Again, you would need to configure the backup storage for this cache to be off-heap or file-mapped.