Cache Topologies
Overview
Coherence supports many cache topologies, which generally fall into a few categories. Because Coherence uses the TCMP clustering protocol, it can support each of these without compromise. However, each topology has inherent advantages, and there are trade-offs between them.
Topology refers to where the data physically resides and how it is accessed in a distributed environment. Regardless of where the data physically resides and which topology is in use, every cluster participant has the same logical view of the data and uses exactly the same API to access it. Generally, this means that the topology can be tuned or even selected at deployment time.
- Peer-to-Peer: For most purposes, a peer-to-peer topology is the easiest to configure and offers very good performance. A peer-to-peer topology is one in which each server both provides and consumes the same clustered services. For example, in a cluster of J2EE application servers, if those servers are all managing and consuming data, and the data are shared via replicated and/or distributed caches, that would be a peer-to-peer topology. This topology spreads the load evenly over as many servers as possible, minimizing configuration details and offering the cache services the greatest overall amounts of memory and CPU resources.
- Centralized (Cache Servers): A centralized cache topology confines cache management to a cluster of cache servers while still providing cache access to other servers. This has several benefits, including the ability to reduce the resource requirements of the servers that use the cache by completely offloading cache management from them. Additionally, the cache servers (those that actually manage the cache data) can be hosted on machines explicitly configured for that purpose, and those machines do not require a J2EE application server or any software other than a standard JVM. The cluster of cache servers (so designated by their storage-enabled attribute) provides a unified cache image, so this model could almost be described as a cache client/cache server architecture, except that the server part is composed of a cluster of any number of actual servers. The obvious benefits of the clustered cache server architecture are transparent failover and failback, and the ability to provision new servers to expand the caching resources, both in processing power and in in-memory cache size. This topology uses the Coherence Distributed Cache Service, which is a cluster-partitioned cache.
- Multi-Tier (n-tier): While the Centralized topology is a two-tier architecture, it can be extended to three or more tiers by making each tier a client of the tier behind it. The tiers are coupled either over a clustered protocol (such as TCMP) or via JMS in cases where real-time cache coherency is not required and communication protocols are limited (such as production server tiers in which only certain protocols are permitted).
- Hybrid (Near Caching): To accelerate cache accesses for Centralized and Multi-Tier topologies, Coherence supports a hybrid topology using Near Cache technology. A Near Cache provides local cache access to recently and/or frequently used data, backed by a centralized or multi-tiered cache from which data is loaded on demand when a local cache miss occurs. Near Caches have configurable levels of cache coherency, from the most basic expiry-based caches and invalidation-based caches, up to advanced data-versioning caches that can provide guaranteed coherency. The result is a tunable balance between the preservation of local memory resources and the performance benefits of truly local caches.
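The Near Cache behavior described in the last item can be sketched in plain Java. This is a minimal illustration only, not the Coherence implementation or its API: the class name NearCacheSketch, its methods, and the use of a plain Map as a stand-in for the centralized cache are all assumptions made for this example. It shows a small local front cache that loads from a back tier on a miss, and an explicit invalidation call that models the invalidation-based coherency level.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative stand-in for a near cache; hypothetical, not part of the Coherence API. */
public class NearCacheSketch<K, V> {
    private final Map<K, V> back;   // stand-in for the centralized (back-tier) cache
    private final Map<K, V> front;  // small local cache holding recently used entries
    private int backReads;          // counts how often a read had to go to the back tier

    public NearCacheSketch(Map<K, V> back, final int frontCapacity) {
        this.back = back;
        // An access-ordered LinkedHashMap evicts the least recently used entry
        // once the front cache grows beyond its capacity.
        this.front = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > frontCapacity;
            }
        };
    }

    /** Serve from the front cache if possible; otherwise load on demand from the back tier. */
    public synchronized V get(K key) {
        V value = front.get(key);
        if (value == null) {
            backReads++;
            value = back.get(key);
            if (value != null) {
                front.put(key, value);
            }
        }
        return value;
    }

    /** Write to the back tier and keep the local copy in step. */
    public synchronized void put(K key, V value) {
        back.put(key, value);
        front.put(key, value);
    }

    /** Drop the local copy, as an invalidation event from the back tier would. */
    public synchronized void invalidate(K key) {
        front.remove(key);
    }

    public synchronized int backReads() {
        return backReads;
    }

    public static void main(String[] args) {
        Map<String, String> central = new ConcurrentHashMap<>();
        central.put("a", "1");
        NearCacheSketch<String, String> near = new NearCacheSketch<>(central, 2);

        near.get("a");                        // miss: loaded from the back tier
        near.get("a");                        // hit: served locally
        System.out.println(near.backReads()); // prints 1

        central.put("a", "2");                // another participant updates the back tier
        System.out.println(near.get("a"));    // prints 1 (stale local copy)
        near.invalidate("a");                 // invalidation removes the stale copy
        System.out.println(near.get("a"));    // prints 2
    }
}
```

The stale read in the middle of the example is the trade-off the item describes: a larger or longer-lived front cache saves trips to the back tier, while a stricter coherency level (here, invalidation) keeps local copies current at some extra cost.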
The extent of the cluster and of its tiers is fully definable in the tangosol-coherence.xml configuration file. This includes the ability to lock down the set of servers that can access and manage the cache for security purposes. The selection of which topology to use is typically driven by the cache configuration file, which by default is named coherence-cache-config.xml; however, the topology can also be driven entirely by the Coherence programmatic API, if the developer so chooses.