14 Storage and Backing Map

This chapter provides information on Coherence storage using backing maps. The following sections are included in this chapter:

  • Cache Layers

  • Operations

  • Capacity Planning

  • Partitioned Backing Maps

14.1 Cache Layers

The partitioned (distributed) cache service in Coherence has three distinct layers:

  • Client View – The client view represents a virtual layer that provides access to the underlying partitioned data. Access to this tier is provided using the NamedCache interface. In this layer you can also create synthetic data structures such as a NearCache or ContinuousQueryCache (see the sketch after this list).

  • Storage Manager – The storage manager is the server-side tier that is responsible for processing cache-related requests from the client tier. It manages the data structures that hold the actual cache data (primary and backup copies) and information about locks, event listeners, map triggers, and so on.

  • Backing Map – The backing map is the server-side data structure that holds the actual cache data.
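For example, the following is a minimal sketch of the client view tier. The cache name "example-cache" is a hypothetical example; the ContinuousQueryCache shown simply materializes all entries locally using an AlwaysFilter.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.net.cache.ContinuousQueryCache;
import com.tangosol.util.filter.AlwaysFilter;

public class ClientViewExample
{
    public static void main(String[] args)
    {
        // Client view: access to the underlying partitioned data through NamedCache
        NamedCache cache = CacheFactory.getCache("example-cache");
        cache.put("key-1", "value-1");

        // A synthetic data structure created in the client tier: a
        // ContinuousQueryCache that keeps a local, real-time view of all entries
        ContinuousQueryCache cqc = new ContinuousQueryCache(cache, new AlwaysFilter());
        System.out.println(cqc.get("key-1")); // served from the local view
    }
}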

Coherence allows users to configure both out-of-the-box and custom backing map implementations. The only constraint that all these Map implementations must respect is that the Storage Manager provides all keys and values in an internal (Binary) format. To deal with conversions of that internal data to and from the Object format, the Storage Manager can supply backing map implementations with a BackingMapManagerContext reference.
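For example, a custom backing map (or code running close to it) might use that context to convert between the internal Binary form and the Object form. The following is a minimal sketch; the class and method names are illustrative only, while BackingMapManagerContext, Converter, and Binary are standard Coherence types.

import com.tangosol.net.BackingMapManagerContext;
import com.tangosol.util.Binary;
import com.tangosol.util.Converter;

public class ConversionHelper
{
    private final BackingMapManagerContext ctx;

    public ConversionHelper(BackingMapManagerContext ctx)
    {
        this.ctx = ctx;
    }

    // Convert an internal (Binary) key, as seen by the backing map,
    // back into its application-level Object form.
    public Object keyToObject(Binary binKey)
    {
        Converter conv = ctx.getKeyFromInternalConverter();
        return conv.convert(binKey);
    }

    // Convert an application-level value into the internal (Binary)
    // form that the Storage Manager stores in the backing map.
    public Object valueToInternal(Object oValue)
    {
        Converter conv = ctx.getValueToInternalConverter();
        return conv.convert(oValue);
    }
}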

Figure 14-1 shows a conceptual view of backing maps.

Figure 14-1 Backing Map Storage


14.2 Operations

There are a number of operation types performed against the backing map:

  • Natural access and update operations caused by application usage. For example, a NamedCache.get() call naturally causes a Map.get() call on the corresponding backing map; a NamedCache.invoke() call may cause a Map.get() followed by a Map.put(); a NamedCache.keySet(filter) call may cause a Map.entrySet().iterator() loop; and so on (see the sketch after this list).

  • Remove operations caused by time-based expiry or size-based eviction. For example, a NamedCache.get() or NamedCache.size() call from the client tier could cause a Map.remove() call due to an entry expiry timeout; or a NamedCache.put() call could cause multiple Map.remove() calls (for different keys) when the total amount of data in the backing map reaches the configured high-water mark value.

  • Insert operations caused by a CacheStore.load() operation (for backing maps configured with read-through or read-ahead features).

  • Synthetic access and update operations caused by partition distribution (which, in turn, could be caused by cluster node failover or failback). In this case, without any application tier call, many entries could be inserted into or removed from the backing map.
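The following is a minimal sketch of client calls and the backing map operations they typically trigger on the storage-enabled member that owns the affected keys. The cache name "example-cache" and the anonymous entry processor are hypothetical examples, not part of the Coherence API.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.InvocableMap;
import com.tangosol.util.filter.AlwaysFilter;
import com.tangosol.util.processor.AbstractProcessor;

public class OperationsExample
{
    public static void main(String[] args)
    {
        NamedCache cache = CacheFactory.getCache("example-cache");

        // NamedCache.put() results in a Map.put() on the owning member's backing map
        cache.put("key-1", "value-1");

        // NamedCache.get() results in a Map.get() on the corresponding backing map
        Object value = cache.get("key-1");

        // NamedCache.invoke() typically results in a Map.get() followed by a Map.put()
        cache.invoke("key-1", new AbstractProcessor()
        {
            public Object process(InvocableMap.Entry entry)
            {
                String current = (String) entry.getValue(); // read: Map.get()
                entry.setValue(current + "-updated");       // write: Map.put()
                return null;
            }
        });

        // NamedCache.keySet(filter) may result in a Map.entrySet().iterator()
        // loop over the backing map on each storage-enabled member
        cache.keySet(new AlwaysFilter());
    }
}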

14.3 Capacity Planning

Depending on the actual implementation, the Backing Map stores the cache data in the following ways:

  • on-heap memory

  • off-heap memory

  • disk (memory-mapped files or in-process DB)

  • combination of any of the above

Keeping data in memory naturally provides dramatically lower access and update latencies and is the most commonly used approach.

More often than not, applications must ensure that the total amount of data placed into the data grid does not exceed some predetermined amount of memory. This can be done either directly by the application tier logic or automatically using size- or expiry-based eviction. Quite naturally, the total amount of data held in a Coherence cache equals the sum of the data volumes in all corresponding backing maps (one for each cluster node that runs the corresponding partitioned cache service in storage-enabled mode). For example, if the backing map on each of 12 storage-enabled nodes is capped at 2GB, the cache as a whole can hold roughly 24GB of primary data.

Consider the following cache configuration excerpts:

<backing-map-scheme>
  <local-scheme/>
</backing-map-scheme>

The backing map above is an instance of com.tangosol.net.cache.LocalCache. It does not have any predetermined size constraints and has to be controlled explicitly; failure to do so could cause the JVM to run out of memory.

<backing-map-scheme>
  <local-scheme>
    <eviction-policy>LRU</eviction-policy>
    <high-units>100m</high-units>
    <unit-calculator>BINARY</unit-calculator>
  </local-scheme>
</backing-map-scheme>

This backing map is also an instance of com.tangosol.net.cache.LocalCache and has a capacity limit of 100MB. When the total amount of data held by this backing map exceeds that high watermark, some entries are removed from the backing map, bringing the volume down to the low watermark value (the <low-units> configuration element, which defaults to 75% of <high-units>). The choice of the removed entries is based on the LRU (Least Recently Used) eviction policy; other options are LFU (Least Frequently Used) and Hybrid (a combination of LRU and LFU). The value of <high-units> is limited to 2GB. To overcome that limitation (but maintain backward compatibility), Coherence uses the <unit-factor> element. For example, a <high-units> value of 8192 with a <unit-factor> of 1048576 results in a high watermark value of 8GB (see the partitioned backing map configuration excerpt at the end of this chapter).

<backing-map-scheme>
  <local-scheme>
    <expiry-delay>1h</expiry-delay>
  </local-scheme>
</backing-map-scheme>

This backing map automatically evicts any entries that have not been updated for more than an hour. Note that such eviction is "lazy" and can happen at any time after the hour has elapsed; the only guarantee Coherence provides is that entries that are at least one hour old are not returned to a caller.
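The following minimal sketch illustrates that guarantee. The cache name "hour-cache" is a hypothetical name assumed to be mapped to the expiry scheme above.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class ExpiryExample
{
    public static void main(String[] args)
    {
        // "hour-cache" is assumed to map to the <expiry-delay>1h</expiry-delay> scheme
        NamedCache cache = CacheFactory.getCache("hour-cache");
        cache.put("key-1", "value-1");

        // ... more than one hour later, from any client ...

        // Even if the backing map has not yet physically removed the entry,
        // an expired entry is never returned to the caller: this yields null.
        Object value = cache.get("key-1");
        System.out.println(value);
    }
}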

<backing-map-scheme>
  <external-scheme>
    <nio-memory-manager>
      <initial-size>1MB</initial-size>
      <maximum-size>100MB</maximum-size>
    </nio-memory-manager>
    <high-units>100</high-units>
    <unit-calculator>BINARY</unit-calculator>
    <unit-factor>1048576</unit-factor>
  </external-scheme>
</backing-map-scheme>

This backing map is an instance of com.tangosol.net.cache.SerializationCache, which stores values in extended (NIO) memory, and has a capacity limit of 100MB (100*1048576 bytes). Quite naturally, you would also configure the backup storage for this cache to be off-heap (or file-mapped):

<backup-storage>
  <type>off-heap</type>
  <initial-size>1MB</initial-size>
  <maximum-size>100MB</maximum-size>
</backup-storage>

14.4 Partitioned Backing Maps

The conventional backing map implementation contains entries for all partitions owned by the corresponding node. (During partition transfer, it can also hold "in flight" entries that, from the clients' perspective, are temporarily not owned by anyone.)

Figure 14-2 shows a conceptual view of the conventional backing map implementation.

Figure 14-2 Conventional Backing Map Implementation


A partitioned backing map is essentially a multiplexer of actual Map implementations, each of which contains only the entries that belong to a single partition (see the conceptual sketch following Figure 14-3).

Figure 14-3 shows a conceptual view of the partitioned backing map implementation.

Figure 14-3 Partitioned Backing Map Implementation

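The following is a conceptual sketch of that multiplexing idea, not the actual PartitionSplittingBackingMap implementation: one delegate map per partition, selected by partition id, so that an entire partition can be inserted or released as a unit during partition transfer.

import java.util.HashMap;
import java.util.Map;

// Conceptual illustration only; the real implementation is
// com.tangosol.net.partition.PartitionSplittingBackingMap.
public class ConceptualPartitionedMap
{
    // One delegate map per partition id
    private final Map<Integer, Map<Object, Object>> partitions = new HashMap<>();

    public Object put(int partitionId, Object key, Object value)
    {
        return partitions
                .computeIfAbsent(partitionId, id -> new HashMap<>())
                .put(key, value);
    }

    public Object get(int partitionId, Object key)
    {
        Map<Object, Object> partition = partitions.get(partitionId);
        return partition == null ? null : partition.get(key);
    }

    // When a partition is transferred to another member, the whole
    // per-partition map can be discarded in a single operation.
    public void releasePartition(int partitionId)
    {
        partitions.remove(partitionId);
    }
}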

To configure a partitioned backing map, add a <partitioned> element with a value of true. For example:

<backing-map-scheme>
  <partitioned>true</partitioned>
  <external-scheme>
    <nio-memory-manager>
      <initial-size>1MB</initial-size>
      <maximum-size>50MB</maximum-size>
    </nio-memory-manager>
    <high-units>8192</high-units>
    <unit-calculator>BINARY</unit-calculator>
    <unit-factor>1048576</unit-factor>
  </external-scheme>
</backing-map-scheme>

This backing map is an instance of com.tangosol.net.partition.PartitionSplittingBackingMap, with the individual partition-holding maps being instances of com.tangosol.net.cache.SerializationCache that each store values in extended (NIO) memory. The individual NIO buffers have a limit of 50MB, while the backing map as a whole has a capacity limit of 8GB (8192*1048576 bytes). Again, you must configure the backup storage for this cache to be off-heap or file-mapped.