Learn about Coherence clustering, configuration, caching, data storage, and serialization.
This section includes the following topics:
At the core of Coherence is the concept of clustered data management. This implies the following goals:
A fully coherent, single system image (SSI)
Scalability for both read and write access
Fast, transparent failover and failback
Linear scalability for storage and processing
No Single-Points-of-Failure (SPOFs)
Cluster-wide locking and transactions
Built on top of this foundation are the various services that Coherence provides, including database caching, HTTP session management, grid agent invocation and distributed queries. Before going into detail about these features, some basic aspects of Coherence should be discussed.
Coherence supports many topologies for clustered data management. Each of these topologies has a trade-off in terms of performance and fault-tolerance. By using a single API, the choice of topology can be deferred until deployment if desired. This allows developers to work with a consistent logical view of Coherence, while providing flexibility during tuning or as application needs change.
Coherence provides several cache implementations:
Local Cache — Local on-heap caching for non-clustered caching. See Understanding Local Caches.
Distributed Cache — True linear scalability for both read and write access. Data is automatically, dynamically and transparently partitioned across nodes. The distribution algorithm minimizes network traffic and avoids service pauses by incrementally shifting data. See Understanding Distributed Caches.
Near Cache — Provides the performance of local caching with the scalability of distributed caching. Several different near-cache strategies are available and offer a trade-off between performance and synchronization guarantees. See Understanding Near Caches.
Replicated Cache — Perfect for small, read-heavy caches. See Understanding Replicated Caches.
In-process caching provides the highest level of raw performance, since objects are managed within the local JVM. This benefit is most directly realized by the Local, Replicated, Optimistic and Near Cache implementations.
Out-of-process (client/server) caching provides the option of using dedicated cache servers. This can be helpful when you want to partition workloads (to avoid stressing the application servers). This is accomplished by using the Partitioned cache implementation and simply disabling local storage on client nodes through a single command-line option or a one-line entry in the XML configuration.
Tiered caching (using the Near Cache functionality) enables you to couple local caches on the application server with larger, partitioned caches on the cache servers, combining the raw performance of local caching with the scalability of partitioned caching. This is useful for both dedicated cache servers and co-located caching (cache partitions stored within the application server JVMs).
See Using Caches.
While most customers use on-heap storage combined with dedicated cache servers, Coherence has several options for data storage:
On-heap—The fastest option, though it can affect JVM garbage collection times.
Journal—A combination of RAM storage and disk storage, optimized for solid state disks, that uses a journaling technique. Journal-based storage requires serialization/deserialization.
File-based—Uses a Berkeley Database JE storage system.
Coherence storage is transient: the disk-based storage options are for managing cached data only. For persistent storage, Coherence offers backing maps coupled with a CacheLoader/CacheStore.
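For illustration, the following is a minimal CacheStore sketch. A real implementation would read from and write to a database or other external resource; here a simple in-memory map stands in for the data source so the example is self-contained. The SimpleCacheStore name and the stand-in map are illustrative assumptions, not part of the product.

import com.tangosol.net.cache.CacheStore;

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SimpleCacheStore implements CacheStore {

    // Stand-in for the external system of record (illustrative only).
    private final Map<Object, Object> dataSource = new ConcurrentHashMap<>();

    @Override
    public Object load(Object key) {
        // Called on a cache miss when read-through is configured.
        return dataSource.get(key);
    }

    @Override
    public Map loadAll(Collection keys) {
        Map<Object, Object> result = new HashMap<>();
        for (Object key : keys) {
            Object value = dataSource.get(key);
            if (value != null) {
                result.put(key, value);
            }
        }
        return result;
    }

    @Override
    public void store(Object key, Object value) {
        // Called synchronously (write-through) or deferred (write-behind).
        dataSource.put(key, value);
    }

    @Override
    public void storeAll(Map entries) {
        dataSource.putAll(entries);
    }

    @Override
    public void erase(Object key) {
        dataSource.remove(key);
    }

    @Override
    public void eraseAll(Collection keys) {
        for (Object key : keys) {
            dataSource.remove(key);
        }
    }
}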
Because serialization is often the most expensive part of clustered data management, Coherence provides the following options for serializing/deserializing data:
com.tangosol.io.pof.PofSerializer — The Portable Object Format (also referred to as POF) is a language agnostic binary format. POF was designed to be incredibly efficient in both space and time and is the recommended serialization option in Coherence (a brief example follows this list). See Using Portable Object Format.
java.io.Serializable — The simplest, but slowest option.
java.io.Externalizable — This requires developers to implement serialization manually, but can provide significant performance benefits. Compared to java.io.Serializable, this can cut serialized data size by a factor of two or more (especially helpful with Distributed caches, as they generally cache data in serialized form). Most importantly, CPU usage is dramatically reduced.
com.tangosol.io.ExternalizableLite — This is very similar to java.io.Externalizable, but offers better performance and less memory usage by using a more efficient IO stream implementation.
com.tangosol.run.xml.XmlBean — A default implementation of ExternalizableLite.
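For example, a value object can be made POF-serializable by implementing com.tangosol.io.pof.PortableObject, as in the following minimal sketch. The Trade class, its properties, and its registration (with a unique type id) in a POF configuration file are illustrative assumptions.

import com.tangosol.io.pof.PofReader;
import com.tangosol.io.pof.PofWriter;
import com.tangosol.io.pof.PortableObject;

import java.io.IOException;

public class Trade implements PortableObject {

    private String symbol;
    private int quantity;

    public Trade() {
        // POF deserialization requires a public no-argument constructor.
    }

    public Trade(String symbol, int quantity) {
        this.symbol = symbol;
        this.quantity = quantity;
    }

    @Override
    public void readExternal(PofReader in) throws IOException {
        // Property indexes must match the order used in writeExternal.
        symbol = in.readString(0);
        quantity = in.readInt(1);
    }

    @Override
    public void writeExternal(PofWriter out) throws IOException {
        out.writeString(0, symbol);
        out.writeInt(1, quantity);
    }

    public String getSymbol() { return symbol; }
    public int getQuantity() { return quantity; }
}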
Coherence's API provides access to all Coherence functionality. The most commonly used subset of this API is exposed through simple XML options to minimize effort for typical use cases. There is no penalty for mixing direct configuration through the API with the easier XML configuration.
Coherence is designed to allow the replacement of its modules as needed. For example, the local "backing maps" (which provide the actual physical data storage on each node) can be easily replaced as needed. The vast majority of the time, this is not required, but it is there for the situations that require it. The general guideline is that 80% of tasks are easy, and the remaining 20% of tasks (the special cases) require a little more effort, but certainly can be done without significant hardship.
Coherence is organized as a set of services. At the root is the Cluster service. A cluster is defined as a set of Coherence instances (one instance per JVM, with one or more JVMs on each computer). See Introduction to Coherence Clusters. Under the cluster service are the various services that comprise the Coherence API. These include the various caching services (Replicated, Distributed, and so on) and the Invocation Service (for deploying agents to various nodes of the cluster). Each instance of a service is named, and there is typically a default service instance for each type. The cache services contain named caches (com.tangosol.net.NamedCache), which are analogous to database tables—that is, they typically contain a set of related objects.
The NamedCache API is the primary interface used by applications to get and interact with cache instances.
This section includes the following topics:
The following source code returns a reference to a NamedCache instance. The underlying cache service is started if necessary.
import com.tangosol.net.*;
...
NamedCache cache = CacheFactory.getCache("MyCache");
Coherence scans the cache configuration XML file for a name mapping for MyCache. This is similar to Servlet name mapping in a web container's web.xml file. Coherence's cache configuration file contains (in the simplest case) a set of mappings (from cache name to cache scheme) and a set of cache schemes.
By default, Coherence uses the coherence-cache-config.xml file found at the root of coherence.jar. This can be overridden on the JVM command-line with -Dcoherence.cacheconfig=file.xml. This argument can reference either a file system path, or a Java resource path.
The com.tangosol.net.NamedCache interface extends several other interfaces (a short usage sketch follows this list):
java.util.Map—basic Map methods such as get(), put(), remove().
com.tangosol.net.cache.CacheMap—methods for getting a collection of keys (as a Map) that are in the cache and for putting objects in the cache. Also supports adding an expiry value when putting an entry in a cache.
com.tangosol.util.QueryMap—methods for querying the cache. See Querying Data In a Cache.
com.tangosol.util.InvocableMap—methods for server-side processing of cache data. See Processing Data In a Cache.
com.tangosol.util.ObservableMap—methods for listening to cache events. See Using Map Events.
com.tangosol.util.ConcurrentMap—methods for concurrent access such as lock() and unlock(). See Performing Transactions.
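The following minimal sketch exercises a few of these inherited methods. The cache name MyCache follows the earlier example; the keys and values are illustrative.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class NamedCacheExample {
    public static void main(String[] args) {
        NamedCache cache = CacheFactory.getCache("MyCache");

        // java.util.Map operations
        cache.put("key1", "value1");
        Object value = cache.get("key1");

        // CacheMap: put with an expiry of 10 seconds
        cache.put("key2", "value2", 10_000L);

        // ConcurrentMap: explicit locking around a read-modify-write
        if (cache.lock("key1", -1)) {   // -1 waits indefinitely for the lock
            try {
                cache.put("key1", value + "-updated");
            } finally {
                cache.unlock("key1");
            }
        }

        CacheFactory.shutdown();
    }
}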
There are two general approaches to using a NamedCache:
As a clustered implementation of java.util.Map with several added features (queries, concurrency), but with no persistent backing (a "side" cache).
As a means of decoupling access to external data sources (an "inline" cache). In this case, the application uses the NamedCache interface, and the NamedCache takes care of managing the underlying database (or other resource).
Typically, an inline cache is used to cache data from:
a database—The most intuitive use of a cache—simply caching database tables (in the form of Java objects).
a service—Mainframe, web service, service bureau—any service that represents an expensive resource to access (either due to computational cost or actual access fees).
calculations—Financial calculations, aggregations, data transformations. Using an inline cache makes it very easy to avoid duplicating calculations. If the calculation is complete, the result is simply pulled from the cache. Since any serializable object can be used as a cache key, it is a simple matter to use an object containing calculation parameters as the cache key (see the sketch after this list).
See Caching Data Sources.
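As a hypothetical sketch of the calculations case above, the key object below carries the calculation parameters, and the lookup method pulls the result from a cache before recomputing. The PriceKey class, the calculations cache name, and the movingAverage calculation are illustrative assumptions.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

import java.io.Serializable;
import java.math.BigDecimal;
import java.util.Objects;

public class CalculationCacheExample {

    /** Immutable, serializable key holding the calculation parameters. */
    public static class PriceKey implements Serializable {
        private final String symbol;
        private final int days;

        public PriceKey(String symbol, int days) {
            this.symbol = symbol;
            this.days = days;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof PriceKey)) {
                return false;
            }
            PriceKey other = (PriceKey) o;
            return days == other.days && symbol.equals(other.symbol);
        }

        @Override
        public int hashCode() {
            return Objects.hash(symbol, days);
        }
    }

    /** Check the cache first; compute and cache the result only on a miss. */
    public static BigDecimal movingAverage(String symbol, int days) {
        NamedCache cache = CacheFactory.getCache("calculations");
        PriceKey key = new PriceKey(symbol, days);
        BigDecimal result = (BigDecimal) cache.get(key);
        if (result == null) {
            result = computeMovingAverage(symbol, days);  // expensive calculation
            cache.put(key, result);
        }
        return result;
    }

    private static BigDecimal computeMovingAverage(String symbol, int days) {
        return BigDecimal.ZERO;  // placeholder for the real calculation
    }
}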
Write-back options:
write-through—Ensures that the external data source always contains up-to-date information. Used when data must be persisted immediately, or when sharing a data source with other applications.
write-behind—Provides better performance by caching writes to the external data source. Not only can writes be buffered to even out the load on the data source, but multiple writes can be combined, further reducing I/O. The trade-off is that data is not immediately persisted to disk; however, it is immediately distributed across the cluster, so the data survives the loss of a server. Furthermore, if the entire data set is cached, this option means that the application can temporarily survive a complete failure of the data source, as neither cache reads nor writes require synchronous access to the data source.
To query a NamedCache instance, all objects should implement a common interface (or base class). Any field of an object can be queried; indexes are optional, and used to increase performance. With a replicated cache, queries are performed locally, and do not use indexes. See Querying Data In a Cache.
To add an index to a NamedCache, you first need a value extractor (which accepts as input a value object and returns an attribute of that object). Indexes can be added blindly (duplicate indexes are ignored). Indexes can be added at any time, before or after inserting data into the cache.
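A minimal sketch of adding an index and running a filter-based query follows; the trades cache name and the getSymbol attribute are illustrative assumptions.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.extractor.ReflectionExtractor;
import com.tangosol.util.filter.EqualsFilter;

import java.util.Set;

public class IndexAndQueryExample {
    public static void main(String[] args) {
        NamedCache trades = CacheFactory.getCache("trades");

        // Add an ordered index on the getSymbol() attribute; duplicate
        // addIndex calls for the same extractor are ignored.
        trades.addIndex(new ReflectionExtractor("getSymbol"), true, null);

        // The query uses the index automatically where applicable.
        Set matchingKeys = trades.keySet(new EqualsFilter("getSymbol", "ORCL"));
        System.out.println("Matching entries: " + matchingKeys.size());
    }
}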
It should be noted that queries apply only to cached data. For this reason, queries should not be used unless the entire data set has been loaded into the cache, or unless additional support is added to manage partially loaded sets.
Developers have the option of implementing additional custom filters for queries, thus taking advantage of Coherence's parallel query behavior. For particularly performance-sensitive queries, developers may implement index-aware filters, which can access Coherence's internal indexing structures.
Coherence includes a built-in optimizer, and applies indexes in the optimal order. Because of the focused nature of the queries, the optimizer is both effective and efficient. No maintenance is required.
The invocation service is accessed through the InvocationService interface and includes the following two methods:
public void execute(Invocable task, Set setMembers, InvocationObserver observer);
public Map query(Invocable task, Set setMembers);
An instance of the service can be retrieved from the CacheFactory class.
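The following sketch shows one way to define and run a simple agent. The "InvocationService" service name must match an invocation scheme configured in the cache configuration file, and the FreeMemoryTask agent is an illustrative assumption.

import com.tangosol.net.AbstractInvocable;
import com.tangosol.net.CacheFactory;
import com.tangosol.net.InvocationService;

import java.util.Map;

public class InvocationExample {

    /** Trivial agent that reports the amount of free heap on each member. */
    public static class FreeMemoryTask extends AbstractInvocable {
        @Override
        public void run() {
            setResult(Runtime.getRuntime().freeMemory());
        }
    }

    public static void main(String[] args) {
        // The service name is assumed to match an <invocation-scheme> entry.
        InvocationService service =
                (InvocationService) CacheFactory.getService("InvocationService");

        // Passing null for the member set runs the task on all service members.
        Map results = service.query(new FreeMemoryTask(), null);
        for (Object member : results.keySet()) {
            System.out.println(member + " free heap: " + results.get(member));
        }
    }
}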
Coherence implements the WorkManager API for task-centric processing.
The event programming models are:
Live Events — The live event programming model uses user-defined event interceptors that are registered to receive different types of events. Applications decide what action to take based on the event type. Many events that are available through the use of map events are also supported using live events. See Using Live Events.
Map Events — The map event programming model uses user-defined map listeners that are attached to the underlying map implementation. Map events offer customizable server-based filters and lightweight events that can minimize network traffic and processing. Map listeners follow the JavaBean paradigm and can distinguish between system cache events (for example, eviction) and application cache events (for example, get/put operations). See Using Map Events. A minimal listener sketch follows this list.
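A minimal map listener sketch, assuming a cache named MyCache, might look like the following; the printed messages are purely illustrative.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.MapEvent;
import com.tangosol.util.MapListener;

public class ListenerExample {
    public static void main(String[] args) {
        NamedCache cache = CacheFactory.getCache("MyCache");

        // Listen for all events on the cache; a key or filter can also be
        // supplied to addMapListener to narrow the events received.
        cache.addMapListener(new MapListener() {
            @Override
            public void entryInserted(MapEvent event) {
                System.out.println("inserted: " + event.getKey());
            }

            @Override
            public void entryUpdated(MapEvent event) {
                System.out.println("updated: " + event.getKey()
                        + " -> " + event.getNewValue());
            }

            @Override
            public void entryDeleted(MapEvent event) {
                System.out.println("deleted: " + event.getKey());
            }
        });

        cache.put("key1", "value1");   // triggers entryInserted
    }
}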
Coherence offers several transaction options: explicit locking using the ConcurrentMap interface and the EntryProcessor API, partition-level transactions using implicit locking and the EntryProcessor API, atomic transactions using the Transaction Framework API, and atomic transactions with full XA support using the Coherence resource adapter. See Performing Transactions.
Using Coherence session management does not require any changes to the application. Coherence*Web uses near caching to provide fully fault-tolerant caching, with almost unlimited scalability (to several hundred cluster nodes without issue).
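As a minimal sketch of the EntryProcessor option described above (the processed entry is implicitly locked at the partition level), the following hypothetical processor atomically increments a counter. The counters cache name and the IncrementProcessor class are illustrative assumptions.

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.InvocableMap;
import com.tangosol.util.processor.AbstractProcessor;

public class IncrementExample {

    /** Atomically increments an Integer value held under the processed key. */
    public static class IncrementProcessor extends AbstractProcessor {
        @Override
        public Object process(InvocableMap.Entry entry) {
            Integer current = (Integer) entry.getValue();
            Integer updated = (current == null ? 1 : current + 1);
            entry.setValue(updated);
            return updated;
        }
    }

    public static void main(String[] args) {
        NamedCache counters = CacheFactory.getCache("counters");
        Object newValue = counters.invoke("page-views", new IncrementProcessor());
        System.out.println("page-views = " + newValue);
    }
}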
This chapter includes the following sections:
Client View — The client view represents a virtual layer that provides access to the underlying partitioned data. Access to this tier is provided using the NamedCache interface. In this layer you can also create synthetic data structures such as NearCache or ContinuousQueryCache.
Storage Manager — The storage manager is the server-side tier that is responsible for processing cache-related requests from the client tier. It manages the data structures that hold the actual cache data (primary and backup copies) and information about locks, event listeners, map triggers, and so on.
Backing Map — The Backing Map is the server-side data structure that holds actual data.
Coherence allows users to configure out-of-the-box and custom backing map implementations. The only constraint for a Map implementation is the understanding that the Storage Manager provides all keys and values in internal (Binary) format. To deal with conversions of that internal data to and from an Object format, the Storage Manager can supply Backing Map implementations with a BackingMapManagerContext reference.
Figure 13-1 shows a conceptual view of backing maps.
Figure 13-1 Backing Map Storage
A local storage implementation is any implementation of java.util.Map. When a local storage implementation is used by Coherence to store replicated or distributed data, it is called a backing map because Coherence is actually backed by that local storage implementation. The other common uses of local storage are in front of a distributed cache and as a backup behind the distributed cache.
Caution:
Be careful when using any backing map that does not store data on heap, especially if storing more data than can actually fit on heap. Certain cache operations (for example, unindexed queries) can potentially traverse a large number of entries that force the backing map to bring those entries onto the heap. Also, partition transfers (for example, restoring from backup or transferring partition ownership when a new member joins) force the backing map to bring lots of entries onto the heap. This can cause GC problems and potentially lead to OutOfMemory errors.
Coherence supports the following local storage implementations:
Safe HashMap: This is the default lossless implementation. A lossless implementation is one, like the Java Hashtable class, that is neither size-limited nor auto-expiring. In other words, it is an implementation that never evicts ("loses") cache items on its own. This particular HashMap implementation is optimized for extremely high thread-level concurrency. For the default implementation, use class com.tangosol.util.SafeHashMap; when an implementation is required that provides cache events, use com.tangosol.util.ObservableHashMap. These implementations are thread-safe.
Local Cache: This is the default size-limiting and auto-expiring implementation. See Capacity Planning. A local cache limits the size of the cache and automatically expires cache items after a certain period. For the default implementation, use com.tangosol.net.cache.LocalCache; this implementation is thread safe and supports cache events, com.tangosol.net.CacheLoader, CacheStore and configurable/pluggable eviction policies. A brief programmatic sketch follows this list.
Read/Write Backing Map: This is the default backing map implementation for caches that load from a backing store (such as a database) on a cache miss. It can be configured as a read-only cache (consumer model) or as either a write-through or a write-behind cache (for the consumer/producer model). The write-through and write-behind modes are intended only for use with the distributed cache service. If used with a near cache and the near cache must be kept synchronous with the distributed cache, it is possible to combine the use of this backing map with a Seppuku-based near cache (for near cache invalidation purposes). For the default implementation, use class com.tangosol.net.cache.ReadWriteBackingMap.
Binary Map (Java NIO): This is a backing map implementation that can store its information outside of the Java heap in memory-mapped files, which means that it does not affect the Java heap size and the related JVM garbage-collection performance that can be responsible for application pauses. This implementation is also available for distributed cache backups, which is particularly useful for read-mostly and read-only caches that require backup for high availability purposes, because it means that the backup does not affect the Java heap size yet it is immediately available in case of failover.
Serialization Map: This is a backing map implementation that translates its data to a form that can be stored on disk, referred to as a serialized form. It requires a separate com.tangosol.io.BinaryStore object into which it stores the serialized form of the data. Serialization Map supports any custom implementation of BinaryStore. For the default implementation of Serialization Map, use com.tangosol.net.cache.SerializationMap.
Serialization Cache: This is an extension of the SerializationMap that supports an LRU eviction policy. For example, a serialization cache can limit the size of disk files. For the default implementation of Serialization Cache, use com.tangosol.net.cache.SerializationCache.
Journal: This is a backing map implementation that stores data to either RAM, disk, or both RAM and disk. Journaling uses the com.tangosol.io.journal.JournalBinaryStore class. See Using the Elastic Data Feature to Store Data.
Overflow Map: An overflow map does not actually provide storage, but it deserves mention in this section because it can combine two local storage implementations so that when the first one fills up, it overflows into the second. For the default implementation of OverflowMap, use com.tangosol.net.cache.OverflowMap.
Natural access and update operations caused by the application usage. For example, a NamedCache.get() call naturally causes a Map.get() call on a corresponding Backing Map; the NamedCache.invoke() call may cause a sequence of Map.get() followed by Map.put(); the NamedCache.keySet(filter) call may cause a Map.entrySet().iterator() loop, and so on.
Remove operations caused by time-based expiry or size-based eviction. For example, a NamedCache.get() or NamedCache.size() call from the client tier could cause a Map.remove() call due to an entry expiry timeout; or a NamedCache.put() call could cause some Map.remove() calls (for different keys) when the total amount of data in a backing map reaches the configured high water-mark value.
Insert operations caused by a CacheStore.load() operation (for backing maps configured with read-through or read-ahead features).
Synthetic access and updates caused by the partition distribution (which in turn could be caused by cluster node failover or failback). In this case, without any application tier call, some entries could be inserted into or removed from the backing map.
The total amount of data held in a Coherence cache equals the sum of data volume in all corresponding backing maps (one for each cluster node that runs the corresponding partitioned cache service in storage-enabled mode).
Consider the following cache configuration excerpts:
<backing-map-scheme>
  <local-scheme/>
</backing-map-scheme>
The backing map above is an instance of com.tangosol.net.cache.LocalCache and does not have any pre-determined size constraints, so its size has to be controlled explicitly. Failure to do so could cause the JVM to run out of memory. The following example configures size constraints on the backing map:
<backing-map-scheme>
  <local-scheme>
    <eviction-policy>LRU</eviction-policy>
    <high-units>100</high-units>
    <unit-calculator>BINARY</unit-calculator>
  </local-scheme>
</backing-map-scheme>
The backing map above is also a com.tangosol.net.cache.LocalCache and has a capacity limit of 100MB. As the total amount of data held by this backing map exceeds that high watermark, some entries are removed from the backing map, bringing the volume down to the low watermark value (the <low-units> configuration element, which defaults to 80% of the <high-units>). If the value exceeds Integer.MAX_VALUE, then a unit factor is automatically used and the values for <high-units> and <low-units> are adjusted accordingly. The choice of the removed entries is based on the LRU (Least Recently Used) eviction policy. Other options are LFU (Least Frequently Used) and Hybrid (a combination of the LRU and LFU).
The following backing map automatically evicts any entries that have not been updated for more than an hour. Entries that exceed one hour are not returned to a caller and are lazily removed from the cache.
<backing-map-scheme>
  <local-scheme>
    <expiry-delay>1h</expiry-delay>
  </local-scheme>
</backing-map-scheme>
A backing map within a distributed scheme also supports sliding expiry. If enabled:
Read operations extend the expiry of the accessed cache entries. The read operations include get, getAll, invoke and invokeAll without mutating the entries (for example, only entry.getValue in an entry processor).
Any enlisted entries that are not mutated (for example, from interceptors or triggers) also have their expiry extended.
The backup (for expiry change) is done asynchronously if the operation is read access only. If a mutating operation is involved (for example, an eviction occurred during a get or getAll operation), then the backup is done synchronously.
Note:
Sliding expiry is not performed for entries that are accessed based on query requests like aggregate and query operations.
To enable sliding expiry, set the <sliding-expiry> element, within a <backing-map-scheme> element, to true and ensure that the <expiry-delay> element is set to a value greater than zero. For example:
<distributed-scheme>
  <scheme-name>dist-expiry</scheme-name>
  <service-name>DistributedExpiry</service-name>
  <backing-map-scheme>
    <sliding-expiry>true</sliding-expiry>
    <local-scheme>
      <expiry-delay>3s</expiry-delay>
    </local-scheme>
  </backing-map-scheme>
</distributed-scheme>
Figure 13-2 shows a conceptual view of the conventional backing map implementation.
Figure 13-2 Conventional Backing Map Implementation
A partitioned backing map is a multiplexer of actual Map implementations, each of which contains only entries that belong to the same partition. Partitioned backing maps raise the storage limit (induced by the java.util.Map API) from 2G for a backing map to 2G for each partition. Partitioned backing maps are typically used whenever a solution may reach the 2G backing map limit, which is often possible when using the elastic data feature. See Using the Elastic Data Feature to Store Data.
Figure 13-3 shows a conceptual view of the partitioned backing map implementation.
Figure 13-3 Partitioned Backing Map Implementation
To configure a partitioned backing map, add a <partitioned> element with a value of true. For example:
<backing-map-scheme>
  <partitioned>true</partitioned>
  <external-scheme>
    <nio-memory-manager>
      <initial-size>1MB</initial-size>
      <maximum-size>50MB</maximum-size>
    </nio-memory-manager>
    <high-units>8192</high-units>
    <unit-calculator>BINARY</unit-calculator>
  </external-scheme>
</backing-map-scheme>
This backing map is an instance of com.tangosol.net.partition.PartitionSplittingBackingMap, with individual partition holding maps being instances of com.tangosol.net.cache.SerializationCache that each store values in the extended (nio) memory. The individual nio buffers have a limit of 50MB, while the backing map as a whole has a capacity limit of 8GB (8192*1048576).
Elastic data contains two distinct components: the RAM journal for storing data in-memory and the flash journal for storing data to disk-based devices. The two can be used in different combinations and are typically used for backing maps and backup storage, but they can also be used with composite caches (for example, a near cache). The RAM journal can work with the flash journal to enable seamless overflow to disk.
Caches that use RAM and flash journals are configured as part of a cache scheme definition within a cache configuration file. Journaling behavior is configured, as required, by using an operational override file to override the out-of-box configuration.
This section includes the following topics:
Journaling refers to the technique of recording state changes in a sequence of modifications called a journal. As changes occur, the journal records each value for a specific key and a tree structure that is stored in memory keeps track of which journal entry contains the current value for a particular key. To find the value for an entry, you find the key in the tree which includes a pointer to the journal entry that contains the latest value.
As changes in the journal become obsolete due to new values being written for a key, stale values accumulate in the journal. At regular intervals, the stale values are evacuated making room for new values to be written in the journal.
The Elastic Data feature includes a RAM journal implementation and a Flash journal implementation that work seamlessly with each other. If, for example, the RAM journal runs out of memory, the Flash journal can automatically accept the overflow from the RAM journal, allowing caches to expand far beyond the size of RAM.
Note:
Elastic data is ideal when performing key-based operations and is typically not recommended for large filter-based operations. When journaling is enabled, additional capacity planning is required if you are performing data grid operations (such as queries and aggregations) on large result sets. See General Guidelines in Administering Oracle Coherence.
A resource manager controls journaling. The resource manager creates and utilizes a binary store to perform operations on the journal. The binary store is implemented by the JournalBinaryStore class. All reads and writes through the binary store are handled by the resource manager. There is a resource manager for RAM journals (RamJournalRM) and one for flash journals (FlashJournalRM).
The <ramjournal-scheme> and <flashjournal-scheme> elements are used to configure RAM and Flash journals (respectively) in a cache configuration file. See ramjournal-scheme and flashjournal-scheme.
This section includes the following topics:
To configure a RAM journal backing map, add the <ramjournal-scheme> element within the <backing-map-scheme> element of a cache definition. The following example creates a distributed cache that uses a RAM journal for the backing map. The RAM journal automatically delegates to a flash journal when the RAM journal exceeds the configured memory size. See Changing Journaling Behavior.
<distributed-scheme>
  <scheme-name>distributed-journal</scheme-name>
  <service-name>DistributedCacheRAMJournal</service-name>
  <backing-map-scheme>
    <ramjournal-scheme/>
  </backing-map-scheme>
  <autostart>true</autostart>
</distributed-scheme>
To configure a flash journal backing map, add the <flashjournal-scheme> element within the <backing-map-scheme> element of a cache definition. The following example creates a distributed scheme that uses a flash journal for the backing map.
<distributed-scheme>
  <scheme-name>distributed-journal</scheme-name>
  <service-name>DistributedCacheFlashJournal</service-name>
  <backing-map-scheme>
    <flashjournal-scheme/>
  </backing-map-scheme>
  <autostart>true</autostart>
</distributed-scheme>
The RAM and flash journal schemes both support the use of scheme references to reuse scheme definitions. The following example creates a distributed cache and configures a RAM journal backing map by referencing the RAM scheme definition called default-ram.
<caching-schemes>
  <distributed-scheme>
    <scheme-name>distributed-journal</scheme-name>
    <service-name>DistributedCacheJournal</service-name>
    <backing-map-scheme>
      <ramjournal-scheme>
        <scheme-ref>default-ram</scheme-ref>
      </ramjournal-scheme>
    </backing-map-scheme>
    <autostart>true</autostart>
  </distributed-scheme>
  <ramjournal-scheme>
    <scheme-name>default-ram</scheme-name>
  </ramjournal-scheme>
</caching-schemes>
The RAM and flash journals can be size-limited. They can restrict the number of entries to store and automatically evict entries when the journal becomes full. Furthermore, both the sizing of entries and the eviction policies can be customized. The following example defines expiry and eviction settings for a RAM journal:
<distributed-scheme>
  <scheme-name>distributed-journal</scheme-name>
  <service-name>DistributedCacheFlashJournal</service-name>
  <backing-map-scheme>
    <ramjournal-scheme>
      <eviction-policy>LFU</eviction-policy>
      <high-units>100</high-units>
      <low-units>80</low-units>
      <unit-calculator>Binary</unit-calculator>
      <expiry-delay>0</expiry-delay>
    </ramjournal-scheme>
  </backing-map-scheme>
  <autostart>true</autostart>
</distributed-scheme>
Journal schemes are used for backup storage as well as for backing maps. By default, the flash journal is used as the backup storage. This default behavior can be modified by explicitly specifying the storage type within the <backup-storage> element. The following configuration uses a RAM journal for the backing map and explicitly configures a RAM journal for backup storage:
<caching-schemes>
  <distributed-scheme>
    <scheme-name>default-distributed-journal</scheme-name>
    <service-name>DistributedCacheJournal</service-name>
    <backup-storage>
      <type>scheme</type>
      <scheme-name>example-ram</scheme-name>
    </backup-storage>
    <backing-map-scheme>
      <ramjournal-scheme/>
    </backing-map-scheme>
    <autostart>true</autostart>
  </distributed-scheme>
  <ramjournal-scheme>
    <scheme-name>example-ram</scheme-name>
  </ramjournal-scheme>
</caching-schemes>
Journal schemes can be configured to use a custom backing map as required. Custom map implementations must extend the CompactSerializationCache class and declare the exact same set of public constructors.
To enable a custom implementation, add a <class-name> element whose value is the fully qualified name of the custom class. Any parameters that are required by the custom class can be defined using the <init-params> element. The following example enables a custom map implementation called MyCompactSerializationCache.
<flashjournal-scheme>
  <scheme-name>example-flash</scheme-name>
  <class-name>package.MyCompactSerializationCache</class-name>
</flashjournal-scheme>
A resource manager controls journaling behavior. There is a resource manager for RAM journals (RamJournalRM) and a resource manager for Flash journals (FlashJournalRM). The resource managers are configured for a cluster in the tangosol-coherence-override.xml operational override file. The resource managers' default out-of-box settings are used if no configuration overrides are set.
This section includes the following topics:
The <ramjournal-manager> element is used to configure RAM journal behavior. The following list summarizes the default characteristics of a RAM journal. See ramjournal-manager.
Binary values are limited by default to 64KB (and a maximum of 4MB). A flash journal is automatically used if a binary value exceeds the configured limit.
An individual buffer (a journal file) is limited by default to 2MB (and a maximum of 2GB). The maximum file size should not be changed.
A journal is composed of up to 512 files. 511 files are usable files and one file is reserved for depleted states.
The total memory used by the journal is limited to 1GB by default (and a maximum of 64GB). A flash journal is automatically used if the total memory of the journal exceeds the configured limit.
To configure a RAM journal resource manager, add a <ramjournal-manager> element within a <journaling-config> element and define any subelements that are to be overridden. The following example demonstrates overriding RAM journal subelements:
<?xml version='1.0'?>
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
  <cluster-config>
    <journaling-config>
      <ramjournal-manager>
        <maximum-value-size>64K</maximum-value-size>
        <maximum-size system-property="coherence.ramjournal.size">2G</maximum-size>
      </ramjournal-manager>
    </journaling-config>
  </cluster-config>
</coherence>
The <flashjournal-manager> element is used to configure flash journal behavior. The following list summarizes the default characteristics of a flash journal. See flashjournal-manager.
Binary values are limited by default to 64MB.
An individual buffer (a journal file) is limited by default to 2GB (and maximum 4GB).
A journal is composed of up to 512 files. 511 files are usable files and one file is reserved for depleted states. A journal is limited by default to 1TB, with a theoretical maximum of 2TB.
A journal has a high journal size of 11GB by default. The high size determines when to start removing stale values from the journal. This is not a hard limit on the journal size, which can still grow to the maximum file count (512).
Keys remain in memory in a compressed format. For values, only the unwritten data (being queued or asynchronously written) remains in memory. When sizing the heap, a reasonable estimate is to allow 50 bytes for each entry to hold key data (this is true for both RAM and Flash journals) and include additional space for the buffers (16MB). The entry size is increased if expiry or eviction is configured.
A flash journal is automatically used as overflow when the capacity of the RAM journal is reached. The flash journal can be disabled by setting the maximum size of the flash journal to 0, which means journaling exclusively uses a RAM journal.
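As a rough illustration of the sizing estimate above, journaling 10 million entries would reserve approximately 10,000,000 × 50 bytes ≈ 500MB of heap for key data, plus the 16MB buffer allowance (and more per entry if expiry or eviction is configured).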
To configure a flash journal resource manager, add a <flashjournal-manager> element within a <journaling-config> element and define any subelements that are to be overridden. The following example demonstrates overriding flash journal subelements:
<?xml version='1.0'?>
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
  <cluster-config>
    <journaling-config>
      <flashjournal-manager>
        <maximum-value-size>64K</maximum-value-size>
        <maximum-file-size>8M</maximum-file-size>
        <block-size>512K</block-size>
        <maximum-pool-size>32M</maximum-pool-size>
        <directory>/coherence_storage</directory>
        <async-limit>32M</async-limit>
        <high-journal-size system-property="coherence.flashjournal.highjournalsize">11GB</high-journal-size>
      </flashjournal-manager>
    </journaling-config>
  </cluster-config>
</coherence>
Note:
The directory specified for storing journal files must exist. If the directory does not exist, a warning is logged and the default temporary file directory, as designated by the JVM, is used.
Asynchronous backup is typically used to increase client performance. However, applications that use asynchronous backup must handle the possible effects on data integrity. Specifically, cache operations may complete before backup operations complete (successfully or unsuccessfully) and backup operations may complete in any order. Consider using asynchronous backup if an application does not require backups (that is, data can be restored from a system of record if lost) but the application still wants to offer fast recovery in the event of a node failure.
Note:
The use of asynchronous backups together with rolling restarts requires the use of the shutdown method to perform an orderly shut down of cluster members instead of the stop method or kill -9. Otherwise, a member may shut down before asynchronous backups are complete. The shutdown method guarantees that all updates are complete.
To enable asynchronous backup for a distributed cache, add an <async-backup> element, within a <distributed-scheme> element, that is set to true. For example:
<distributed-scheme>
  ...
  <async-backup>true</async-backup>
  ...
</distributed-scheme>
To enable asynchronous backup for all instances of the distributed cache service type, override the partitioned cache service's async-backup initialization parameter in an operational override file. For example:
<?xml version='1.0'?>
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
  <cluster-config>
    <services>
      <service id="3">
        <init-params>
          <init-param id="27">
            <param-name>async-backup</param-name>
            <param-value system-property="coherence.distributed.asyncbackup">false</param-value>
          </init-param>
        </init-params>
      </service>
    </services>
  </cluster-config>
</coherence>
The coherence.distributed.asyncbackup system property is used to enable asynchronous backup for all instances of the distributed cache service type instead of using the operational override file. For example:
-Dcoherence.distributed.asyncbackup=true
Delta backup uses a compressor that compares two in-memory buffers containing an old and a new value and produces a result (called a delta) that can be applied to the old value to create the new value. Coherence provides standard delta compressors for POF and non-POF formats. Custom compressors can also be created and configured as required.
This section includes the following topics:
Delta backup is only available for distributed caches and is disabled by default. Delta backup is enabled either individually for each distributed cache or for all instances of the distributed cache service type.
To enable delta backup for a distributed cache, add a <compressor> element, within a <distributed-scheme> element, that is set to standard. For example:
<distributed-scheme>
  ...
  <compressor>standard</compressor>
  ...
</distributed-scheme>
To enable delta backup for all instances of the distributed cache service type, override the partitioned cache service's compressor initialization parameter in an operational override file. For example:
<?xml version='1.0'?>
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
  <cluster-config>
    <services>
      <service id="3">
        <init-params>
          <init-param id="22">
            <param-name>compressor</param-name>
            <param-value system-property="coherence.distributed.compressor">standard</param-value>
          </init-param>
        </init-params>
      </service>
    </services>
  </cluster-config>
</coherence>
The coherence.distributed.compressor system property is used to enable delta backup for all instances of the distributed cache service type instead of using the operational override file. For example:
-Dcoherence.distributed.compressor=standard
To use a custom compressor for performing delta backup, include an <instance> subelement and provide a fully qualified class name that implements the DeltaCompressor interface. See instance. The following example enables a custom compressor that is implemented in the MyDeltaCompressor class.
<distributed-scheme>
  ...
  <compressor>
    <instance>
      <class-name>package.MyDeltaCompressor</class-name>
    </instance>
  </compressor>
  ...
</distributed-scheme>
As an alternative, the <instance> element supports the use of a <class-factory-name> element to use a factory class that is responsible for creating DeltaCompressor instances, and a <method-name> element to specify the static factory method on the factory class that performs object instantiation. The following example gets a custom compressor instance using the getCompressor method on the MyCompressorFactory class.
<distributed-scheme>
  ...
  <compressor>
    <instance>
      <class-factory-name>package.MyCompressorFactory</class-factory-name>
      <method-name>getCompressor</method-name>
    </instance>
  </compressor>
  ...
</distributed-scheme>
Any initialization parameters that are required for an implementation can be specified using the <init-params> element. The following example sets the iMaxTime parameter to 2000.
<distributed-scheme>
  ...
  <compressor>
    <instance>
      <class-name>package.MyDeltaCompressor</class-name>
      <init-params>
        <init-param>
          <param-name>iMaxTime</param-name>
          <param-value>2000</param-value>
        </init-param>
      </init-params>
    </instance>
  </compressor>
  ...
</distributed-scheme>