The EclipseLink cache is an in-memory repository that stores recently read or written objects, keyed by class and primary key. The cache improves performance by serving these objects from memory to minimize database access, and it also manages locking, isolation level, and object identity.
EclipseLink defines the following entity caching annotations:
@Cache
@TimeOfDay
@ExistenceChecking
EclipseLink also provides a number of persistence unit properties that you can specify to configure the EclipseLink cache. These properties may complement or provide an alternative to annotations. For more information on these annotations and properties, see Oracle Fusion Middleware Java Persistence API (JPA) Extensions Reference for Oracle TopLink.
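As a minimal sketch of these annotations in use, an entity might combine @Cache and @ExistenceChecking as follows (the entity, attribute values, and names are hypothetical, not from the source):

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.Cache;
import org.eclipse.persistence.annotations.CacheType;
import org.eclipse.persistence.annotations.ExistenceChecking;
import org.eclipse.persistence.annotations.ExistenceType;

// Hypothetical entity: a soft cache holding up to 500 instances, with
// existence checks resolved against the cache rather than the database.
@Entity
@Cache(type = CacheType.SOFT, size = 500)
@ExistenceChecking(ExistenceType.CHECK_CACHE)
public class Employee {
    @Id
    private long id;
    private String name;
}
```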
This chapter contains the following sections:
EclipseLink uses two types of cache: the shared persistence unit cache (L2) maintains objects retrieved from and written to the data source; and the isolated persistence context cache (L1) holds objects while they participate in transactions. When a persistence context (entity manager) successfully commits to the data source, EclipseLink updates the persistence unit cache accordingly. Conceptually, the persistence context cache is represented by the EntityManager, and the persistence unit cache is represented by the EntityManagerFactory.
Internally, EclipseLink stores the persistence unit cache on an EclipseLink session, and the persistence context cache on an EclipseLink unit of work. As Figure 9-1 shows, the persistence unit cache (session cache) and the persistence context cache (unit of work cache) work together with the data source connection to manage objects in an EclipseLink application. The object life cycle relies on these three mechanisms.
Figure 9-1 Object Life Cycle and the EclipseLink Caches
The persistence unit cache is a shared cache (L2) that services clients attached to a given persistence unit. When you read objects from or write objects to the data source using an EntityManager object, EclipseLink saves a copy of the objects in the persistence unit's cache and makes them accessible to all other processes accessing the same persistence unit.
EclipseLink adds objects to the persistence unit cache from the following:
The data store, when EclipseLink executes a read operation
The persistence context cache, when a persistence context successfully commits a transaction
EclipseLink defines three cache isolation levels: Isolated, Shared and Protected. For more information on these levels, see Section 9.1.3, "Shared, Isolated, Protected, Weak, and Read-only Caches."
There is a separate persistence unit cache for each unique persistence unit name. Although the cache is conceptually stored with the EntityManagerFactory, two factories with the same persistence unit name will share the same cache (and effectively be the same persistence unit instance). If the same persistence unit is deployed in two separate applications in Java EE, their full persistence unit names will normally still be unique, and they will use separate caches. Certain persistence unit properties, such as the data source, database URL, user, and tenant id, can affect the unique name of the persistence unit and result in separate persistence unit instances and separate caches. The eclipselink.session.name persistence unit property can be used to force two persistence units to resolve to the same instance and share a cache.
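For example, the session name could be supplied through a properties map when creating the factory; a sketch follows (the unit name and session name are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class SharedSessionExample {
    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        // Force this factory to resolve to the named session instance,
        // sharing its cache with any other unit using the same name.
        props.put("eclipselink.session.name", "shared-hr-session"); // hypothetical name
        EntityManagerFactory emf =
                Persistence.createEntityManagerFactory("hr-unit", props); // hypothetical unit
    }
}
```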
The persistence context cache is an isolated cache (L1) that services operations within an EntityManager. It maintains and isolates objects from the persistence unit cache, and writes changed or new objects to the persistence unit cache after the persistence context commits changes to the data source.
Note:
Only committed changes are merged into the shared persistence unit cache; flush and other operations do not affect the persistence unit cache until the transaction is committed.
The life cycle of the persistence context cache differs between application-managed and container-managed persistence contexts. The unit of work cache services operations within the unit of work. It maintains and isolates objects from the session cache, and writes changed or new objects to the session cache after the unit of work commits changes to the data source.
An application-managed persistence context is created by the application from an EntityManagerFactory. The application-managed persistence context's cache remains until the EntityManager is closed or clear() is called. It is important to keep application-managed persistence contexts short-lived, or to make use of clear(), to prevent the persistence context cache from growing too big or becoming out of sync with the persistence unit cache and the database. Typically, a separate EntityManager should be created for each transaction or request.
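The per-transaction pattern described above can be sketched as follows (the entity, persistence unit name, and method are hypothetical):

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class PerRequestExample {
    private static final EntityManagerFactory emf =
            Persistence.createEntityManagerFactory("hr-unit"); // hypothetical unit

    // Create a short-lived EntityManager per request so the persistence
    // context cache is discarded when the work is done.
    public static void updateEmployeeName(long id, String newName) {
        EntityManager em = emf.createEntityManager();
        try {
            em.getTransaction().begin();
            Employee e = em.find(Employee.class, id);
            e.setName(newName);
            em.getTransaction().commit();
        } finally {
            em.close(); // discards the persistence context cache
        }
    }
}

@javax.persistence.Entity
class Employee {
    @javax.persistence.Id long id;
    String name;
    void setName(String n) { this.name = n; }
}
```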
An extended persistence context has the same caching behavior as an application managed persistence context, even if it is managed by the container.
EclipseLink also supports a WEAK reference mode option for long-lived persistence contexts, such as two-tier applications. See Section 9.1.3.4, "Weak Reference Mode."
A container-managed persistence context is typically injected into a SessionBean or other managed object by a Java EE container, or by frameworks such as Spring. The container-managed persistence context's cache remains only for the duration of the transaction. Entities read in a transaction become detached after the completion of the transaction and must be merged to be edited in subsequent transactions.
Note:
EclipseLink supports accessing an entity's LAZY relationships after the persistence context has been closed.
EclipseLink defines three cache isolation levels. The cache isolation level defines how caching for an entity is performed by the persistence unit and the persistence context. The cache isolation levels are:
Isolated—entities are only cached in the persistence context, not in the persistence unit. See Section 9.1.3.1, "Isolated Cache."
Shared—entities are cached both in the persistence context and persistence unit, read-only entities are shared and only cached in the persistence unit. See Section 9.1.3.2, "Shared Cache."
Protected—entities are cached both in the persistence context and persistence unit, read-only entities are isolated and cached in the persistence unit and persistence context. See Section 9.1.3.3, "Protected Cache."
The isolated cache (L1) is the cache stored in the persistence context. It is a transactional or user session based cache. Setting the cache isolation to ISOLATED for an entity disables its shared cache. With an isolated cache, all queries and find operations access the database unless the object has already been read into the persistence context and refreshing is not used.
Use an isolated cache to do the following:
avoid caching highly volatile data in the shared cache
achieve serializable transaction isolation
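An entity's cache isolation can be set with the isolation attribute of the @Cache annotation; a sketch, with a hypothetical entity holding volatile data:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.Cache;
import org.eclipse.persistence.config.CacheIsolationType;

// Hypothetical entity: disable the shared (L2) cache so every query
// or find operation goes to the database.
@Entity
@Cache(isolation = CacheIsolationType.ISOLATED)
public class StockQuote {
    @Id
    private long id;
    private double price;
}
```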
Each persistence context owns an initially empty isolated cache. The persistence context's isolated cache is discarded when the persistence context is closed or the EntityManager.clear() operation is used.
When you use an EntityManager to read an isolated entity, the EntityManager reads the entity directly from the database and stores it in the persistence context's isolated cache. When you read a read-only entity, it is still stored in the isolated cache, but is not change tracked.
The persistence context can access the database using a connection pool or an exclusive connection. The persistence unit property eclipselink.jdbc.exclusive-connection.mode can be used to configure an exclusive connection. Using an exclusive connection provides improved user-based security for reads and writes. Specific queries can also be configured to use the persistence context's exclusive connection.
Note:
If an EntityManager contains an exclusive connection, you must close the EntityManager when you are finished using it. We do not recommend relying on the finalizer to release the connection when the EntityManager is garbage collected. If you are using a managed persistence context, then you do not need to close it.
The shared cache (L2) is the cache stored in the persistence unit. It is a shared object cache for the entire persistence unit. Setting the cache isolation to SHARED for an entity enables its shared cache. With a shared cache, queries and find operations resolve against the shared cache unless refreshing is used.
Use a shared cache to do the following:
improve performance by avoiding database access when finding or querying an entity by Id or index;
improve performance by avoiding database access when accessing an entity's relationships;
preserve object identity across persistence contexts for read-only entities.
When you use an EntityManager to find a shared entity, the EntityManager first checks the persistence unit's shared cache. If the entity is not in the shared cache, it is read from the database and stored in the shared cache, and a copy is also stored in the persistence context's isolated cache. Any query that is not by Id or by an indexed attribute will first access the database. For each query result row, if the object is already in the shared cache, the shared object (with its relationships) is used; otherwise, a new object is built from the row and put into the shared cache, and a copy is put into the isolated cache. The isolated copy is always returned, unless read-only is used; for read-only entities the shared object is returned, as the isolated copy is not required.
The size and memory usage of the shared cache depend on the entity's cache type. The JPA Cache and EclipseLink JpaCache APIs can also be used to invalidate or clear the cache.
The protected cache option allows shared objects to reference isolated objects. Setting the cache isolation to PROTECTED for an entity enables its shared cache. The protected option is mostly the same as the shared option, except that protected entities can have relationships to isolated entities, whereas shared entities cannot.
Use a protected cache to do the following:
improve performance by avoiding database access when finding or querying an entity by Id or index;
improve performance by avoiding database access when accessing an entity's relationships to shared entities;
ensure read-only entities are isolated to the persistence context;
allow relationships to isolated entities.
Protected entities have the same life cycle as shared entities, except with respect to relationships and read-only behavior. Protected entities' relationships to shared entities are cached in the shared cache, but their relationships to isolated entities are isolated and not cached in the shared cache. The @Noncacheable annotation can also be used to disable caching of a relationship to shared entities. Protected entities that are read-only are always copied into the isolated cache, but are not change tracked.
EclipseLink offers a specialized persistence context cache for long-lived persistence contexts. Normally it is best to keep persistence contexts short-lived, such as creating a new EntityManager per request or per transaction. This is referred to as a stateless model. It ensures the persistence context does not become too big, causing memory and performance issues, and that the objects cached in the persistence context do not become stale or out of sync with their committed state.
Some two-tier applications, or stateful models, require long-lived persistence contexts. EclipseLink offers a special weak reference mode option for these types of applications. A weak reference mode maintains weak references to the objects in the persistence context. This allows the objects to be garbage collected if not referenced by the application, which helps prevent the persistence context from becoming too big, reducing memory usage and improving performance. Any new, removed, or changed objects are held with strong references until a commit occurs.
A weak reference mode can be configured through the eclipselink.persistence-context.reference-mode persistence unit property. The following options can be used:
HARD—This is the default; weak references are not used. The persistence context will grow until cleared or closed.
WEAK—Weak references are used. Unreferenced, unchanged objects are eligible for garbage collection. Objects that use deferred change tracking are not eligible for garbage collection.
FORCE_WEAK—Weak references are used. Unreferenced, unchanged objects are eligible for garbage collection. Changed (but unreferenced) objects that use deferred change tracking are also eligible for garbage collection, causing any changes to be lost.
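The reference mode above is set as an ordinary persistence unit property; for example, through a properties map (the unit name is hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class WeakReferenceModeExample {
    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        // Allow unreferenced, unchanged objects in a long-lived
        // persistence context to be garbage collected.
        props.put("eclipselink.persistence-context.reference-mode", "WEAK");
        EntityManagerFactory emf =
                Persistence.createEntityManagerFactory("two-tier-unit", props); // hypothetical unit
    }
}
```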
An entity can be configured as read-only using the @ReadOnly annotation or the read-only XML attribute. A read-only entity is not tracked for changes, and any updates are ignored. Read-only entities cannot be persisted or removed. A read-only entity must not be modified, but EclipseLink does not currently enforce this; modifications to read-only objects can corrupt the persistence unit cache.
Queries can also be configured to return read-only objects using the eclipselink.read-only query hint.
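Both styles can be sketched together (the entity and query name are hypothetical):

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.NamedQuery;
import javax.persistence.QueryHint;
import org.eclipse.persistence.annotations.ReadOnly;

// Hypothetical reference-data entity: never tracked for changes,
// cannot be persisted or removed.
@Entity
@ReadOnly
@NamedQuery(
    name = "Country.findAll",
    query = "SELECT c FROM Country c",
    // The hint returns read-only instances from this query even for
    // entities that are not themselves @ReadOnly.
    hints = @QueryHint(name = "eclipselink.read-only", value = "true"))
public class Country {
    @Id
    private String code;
    private String name;
}
```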
A SHARED entity that is read-only returns the shared instance from queries; the same instance is returned from all queries in all persistence contexts. Shared read-only entities are never copied into or isolated in the persistence context. This avoids the cost of copying the object and tracking it for changes, reducing heap usage and improving performance. Object identity is also maintained across the entire persistence unit for read-only entities, allowing the application to hold references to these shared objects.
An ISOLATED or PROTECTED entity that is read-only still has an isolated copy returned from the persistence context. This gives some improvement in performance and memory usage by avoiding change tracking, but it is not as significant as for SHARED entities.
EclipseLink provides several different cache types which have different memory requirements. The size of the cache (in number of cached objects) can also be configured. The cache type and size to use depends on the application, the possibility of stale data, the amount of memory available in the JVM and on the machine, the garbage collection cost, and the amount of data in the database.
By default, EclipseLink uses a SOFT_CACHE with an initial size of 100 objects. The cache size is only the initial size, not a fixed limit; EclipseLink never ejects an object from the cache until it has been garbage collected from memory. (It will eject objects if the CACHE type is used, but this is not recommended.) For the SOFT_CACHE and HARD_CACHE types, the cache size is also the size of the soft or hard sub-cache, which determines a minimum number of objects to hold in memory.
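In annotation form, the cache type and size might be configured as follows. This is a sketch: the entity is hypothetical, and the soft cache weak identity map described here as SOFT_CACHE corresponds, as far as we can tell, to the CacheType.SOFT_WEAK value of the @Cache annotation enum:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.Cache;
import org.eclipse.persistence.annotations.CacheType;

// Hypothetical entity: a soft cache weak identity map whose sub-cache
// holds the 200 most recently used instances.
@Entity
@Cache(type = CacheType.SOFT_WEAK, size = 200)
public class Order {
    @Id
    private long id;
}
```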
You can configure how object identity is managed on a class-by-class basis. The ClassDescriptor
object provides the cache and identity map options described in Table 9-1.
Table 9-1 Cache and Identity Map Options
Option (Cache Type) | Caching | Guaranteed Identity | Memory Use
---|---|---|---
Full identity map (FULL) | Yes | Yes | Very High
Weak identity map (WEAK) | Yes | Yes | Low
Soft identity map (SOFT) | Yes | Yes | High
Soft or hard cache weak identity map (SOFT_CACHE, HARD_CACHE) | Yes | Yes | Medium-high
There are two other options, NONE and CACHE. These options are not recommended.
The FULL option provides full caching and guaranteed identity: objects are never flushed from memory unless they are deleted.
It caches all objects and does not remove them. Cache size doubles whenever the maximum size is reached. This method may be memory-intensive when many objects are read. Do not use this option on batch operations.
Oracle recommends using this identity map when the data set size is small and memory is in large supply.
The WEAK option only caches objects that have not been garbage collected. Any object still referenced by the application will still be cached.
The weak cache type uses less memory than the full identity map, but does not provide a durable caching strategy across client/server transactions. Objects are available for garbage collection when the application no longer references them on the server side (that is, from within the server JVM).
The SOFT option is similar to the weak cache type, except that the cache uses soft references instead of weak references. Any object still referenced by the application will still be cached, and objects are only removed from the cache when memory is low.
The soft identity map allows for optimal caching of the objects, while still allowing the JVM to garbage collect the objects if memory is low.
The SOFT_CACHE and HARD_CACHE options are similar to the weak cache except that they maintain a most frequently used sub-cache. The sub-cache uses soft or hard references to ensure that these objects are not garbage collected, or are garbage collected only if the JVM is low on memory.
The soft cache and hard cache provide more efficient memory use. They release objects as they are garbage-collected, except for a fixed number of most recently used objects. Note that weakly cached objects might be flushed if the transaction spans multiple client/server invocations. The size of the sub-cache is proportional to the size of the cache as specified by the size option. You should set the cache size to the number of objects you wish to hold in your transaction.
Oracle recommends using this cache in most circumstances as a means to control memory used by the cache.
The NONE and CACHE options do not preserve object identity and should only be used in very specific circumstances. NONE does not cache any objects. CACHE only caches a fixed number of objects in an LRU fashion. These cache types should only be used if there are no relationships to the objects. Oracle does not recommend using these options; to disable caching, set the cache isolation to ISOLATED instead.
Use the following guidelines when configuring your cache type:
For objects with a long life span, use a SOFT, SOFT_CACHE, or HARD_CACHE cache type. For more information on when to choose one or the other, see Section 9.2.6.1, "About the Internals of Weak, Soft, and Hard Cache Types."
For objects with a short life span, use a WEAK cache type.
For objects with a long life span that have few instances, such as reference data, use a FULL cache type.
Note:
Use the FULL cache type only if the class has a small number of finite instances. Otherwise, a memory leak will occur.
If caching is not required or desired, disable the shared cache by setting the cache isolation to ISOLATED.
Note:
Oracle does not recommend the use of the CACHE and NONE cache types.
See Section 9.2.6.1, "About the Internals of Weak, Soft, and Hard Cache Types."
The WEAK and SOFT cache types use JVM weak and soft references to ensure that any object referenced by the application is held in the cache. Once the application releases its reference to an object, the JVM is free to garbage collect it. The JVM determines when a weak or soft reference is garbage collected; in general, expect a weak reference to be garbage collected with each JVM garbage collection operation.
The SOFT_CACHE and HARD_CACHE cache types contain the following two caches:
Reference cache: implemented as a LinkedList that contains soft or hard references, respectively.
Weak cache: implemented as a Map that contains weak references.
When you create a SOFT_CACHE or HARD_CACHE cache with a specified size, the reference cache LinkedList is exactly this size. The weak cache Map has this size as its initial size: the weak cache will grow when more objects than the specified size are read in. Because EclipseLink does not control garbage collection, the JVM can reap the weakly held objects whenever it sees fit.
Because the reference cache is implemented as a LinkedList, new objects are added to the end of the list. This makes it by nature a least recently used (LRU) cache: it is fixed in size, and once the maximum size has been reached, the object at the head of the list is removed.
The SOFT_CACHE and HARD_CACHE are essentially the same type of cache. The HARD_CACHE was constructed to work around an issue with some JVMs.
If your application reaches a low system memory condition frequently enough, or if your platform's JVM treats weak and soft references the same, the objects in the reference cache may be garbage collected so often that you will not benefit from the performance improvement it provides. If this is the case, Oracle recommends that you use the HARD_CACHE. It is identical to the SOFT_CACHE except that it uses hard references in the reference cache, which guarantees that your application benefits from the performance improvement the reference cache provides.
When an object in a HARD_CACHE or SOFT_CACHE is pushed out of the reference cache, it is put in the weak cache. Although it is still cached, EclipseLink cannot guarantee that it will remain there for any length of time, because the JVM can decide to garbage collect weak references at any time.
A query that is run against the shared session cache is known as an in-memory query. Careful configuration of in-memory querying can improve performance.
By default, a query that looks for a single object based on primary key attempts to retrieve the required object from the cache first, and searches the data source only if the object is not in the cache. All other query types search the database first, by default. You can specify whether a given query runs against the in-memory cache, the database, or both.
Stale data is an artifact of caching, in which an object in the cache is not the most recent version committed to the data source. To avoid stale data, implement an appropriate cache locking strategy.
By default, EclipseLink optimizes concurrency to minimize cache locking during read or write operations. Use the default EclipseLink isolation level, unless you have a very specific reason to change it. For more information on isolation levels in EclipseLink, see Section 9.1.3, "Shared, Isolated, Protected, Weak, and Read-only Caches".
Cache locking regulates when processes read or write an object. Depending on how you configure it, cache locking determines whether a process can read or write an object that is in use within another process.
A well-managed cache makes your application more efficient. There are very few cases in which you turn the cache off entirely, because the cache reduces database access, and is an important part of managing object identity.
To make the most of your cache strategy and to minimize your application's exposure to stale data, Oracle recommends the following:
Make sure you configure a locking policy so that you can prevent or at least identify when values have already changed on an object you are modifying. Typically, this is done using optimistic locking. EclipseLink offers several locking policies such as numeric version field, time-stamp version field, and some or all fields. Optimistic and pessimistic locking are described in the following sections.
Oracle recommends using EclipseLink optimistic locking. With optimistic locking, all users have read access to the data. When a user attempts to write a change, the application checks to ensure the data has not changed since the user read the data.
You can use version or field locking policies. Oracle recommends using version locking policies. For more information, see Section 6.2.4.1, "Optimistic Version Locking Policies" and Section 6.2.4.1.3, "Optimistic Field Locking Policies".
With pessimistic locking, the first user who accesses the data with the purpose of updating it locks the data until completing the update. The disadvantage of this approach is that it may lead to reduced concurrency and deadlocks.
Consider using pessimistic locking support at the query level. See Section 6.2.4.2, "Pessimistic Locking Policies."
If other applications can modify the data used by a particular class, use a weaker style of cache for the class. For example, the SoftCacheWeakIdentityMap or WeakIdentityMap minimizes the length of time the cache maintains an object whose reference has been removed.
Any query can include a flag that forces EclipseLink to go to the data source for the most up-to-date version of selected objects and update the cache with this information.
Using the Descriptor API, you can designate an object as invalid: when any query attempts to read an invalid object, EclipseLink goes to the data source for the most up-to-date version of that object and updates the cache with this information. You can manually designate an object as invalid or use a CacheInvalidationPolicy to control the conditions under which an object is designated invalid. For more information, see Section 9.6, "About Cache Expiration and Invalidation".
If your application is primarily read-based and the changes are all being performed by the same Java application operating with multiple, distributed sessions, you may consider using the EclipseLink cache coordination feature. Although this will not prevent stale data, it should greatly minimize it. For more information, see Section 9.11, "About Cache Coordination".
Some distributed systems require only a small number of objects to be consistent across the servers in the system. Conversely, other systems require that several specific objects must always be guaranteed to be up-to-date, regardless of the cost. If you build such a system, you can explicitly refresh selected objects from the database at appropriate intervals, without incurring the full cost of distributed cache coordination.
To implement this type of strategy, do the following:
Configure a set of queries that refresh the required objects.
Establish an appropriate refresh policy.
Invoke the queries as required to refresh the objects.
When you execute a query, if the required objects are in the cache, EclipseLink returns the cached objects without checking the database for a more recent version. This reduces the number of objects that EclipseLink must build from database results, and is optimal for noncoordinated cache environments. However, this may not always be the best strategy for a coordinated cache environment.
To override this behavior, set a refresh policy that specifies that the objects from the database always take precedence over objects in the cache. This updates the cached objects with the data from the database.
You can implement this type of refresh policy on each EclipseLink descriptor, or just on certain queries, depending upon the nature of the application.
By default, objects remain in the shared cache until they are explicitly deleted or garbage collected.
You can configure any entity with an expiry that lets you specify either the number of milliseconds after which an entity instance should expire from the cache, or a time of day at which all instances of the entity class should expire from the cache. Expiry is set on the @Cache annotation or <cache> XML element, and can be configured in two ways:
expiry—The number of milliseconds after which an entity instance will expire.
expiryTimeOfDay—A @TimeOfDay representing the 24-hour time at which all instances of the entity class will expire from the cache.
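Both styles of expiry could be declared as follows (the entities and values are hypothetical):

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.Cache;
import org.eclipse.persistence.annotations.TimeOfDay;

// Hypothetical entity: each instance expires 30 minutes (1800000 ms)
// after it is cached.
@Entity
@Cache(expiry = 1800000)
class ExchangeRate {
    @Id long id;
}

// Hypothetical entity: all instances expire from the cache at 01:00
// each day.
@Entity
@Cache(expiryTimeOfDay = @TimeOfDay(hour = 1))
class DailySummary {
    @Id long id;
}
```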
When an instance expires, it is only invalidated in the cache. It is not removed from the cache, but when next accessed it will be refreshed from the database as part of the query that was used to access it.
The application can also explicitly invalidate objects in the cache using the JPA Cache API or the EclipseLink JpaCache API.
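A sketch of explicit eviction through both APIs (assumes an existing EntityManagerFactory; the entity is hypothetical, and the cast to JpaCache reflects the EclipseLink extension interface):

```java
import javax.persistence.Cache;
import javax.persistence.EntityManagerFactory;
import org.eclipse.persistence.jpa.JpaCache;

public class EvictionExample {
    public static void evict(EntityManagerFactory emf, long id) {
        Cache cache = emf.getCache();
        cache.evict(Department.class, id); // standard JPA: evict one instance
        cache.evict(Department.class);     // evict all instances of the class

        // EclipseLink extension interface for additional cache operations.
        JpaCache jpaCache = (JpaCache) cache;
        jpaCache.clear(Department.class);  // clear the class's cache
    }
}

@javax.persistence.Entity
class Department {
    @javax.persistence.Id long id;
}
```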
Expiry can also be used in the query results cache. See Section 9.8, "About Query Results Cache."
Invalidation can also be used in a cluster through cache coordination, or from database events using database event notification. See Section 9.11.4, "Coordinated Cache and Clustering."
Alternatively, you can configure any object with a CacheInvalidationPolicy that lets you specify, either with annotations or XML, under what circumstances a cached object is invalid. When any query attempts to read an invalid object, EclipseLink will go to the data source for the most up-to-date version of that object, and update the cache with this information.
For descriptions of the available CacheInvalidationPolicy instances, see "Setting Cache Expiration" in Oracle Fusion Middleware Solutions Guide for Oracle TopLink.
You can configure a cache invalidation policy in the following ways:
At the project level that applies to all objects
At the descriptor level to override the project level configuration on a per-object basis
At the query level that applies to the results returned by the query
If you configure a query to cache results in its own internal cache, the cache invalidation policy you configure at the query level applies to the query's internal cache in the same way it would apply to the session cache.
If you are using a coordinated cache you can customize how EclipseLink communicates the fact that an object has been declared invalid. See Section 9.11, "About Cache Coordination".
The EclipseLink CacheInvalidationPolicy API offers a few advanced features that are not available through annotations or XML. It is also possible to define your own expiry or invalidation policy by defining your own CacheInvalidationPolicy. Advanced configuration can be done using a DescriptorCustomizer to customize your entity's ClassDescriptor.
Here are a few of the CacheInvalidationPolicy advanced options:
isInvalidationRandomized—This allows the invalidation time to be randomized by 10% to avoid a large number of instances becoming invalid at the same time and causing a bottleneck in the database load. This is not used by default.
shouldRefreshInvalidObjectsOnClone—This ensures that an invalid object accessed through a relationship from another object will be refreshed in the persistence context. This is enabled by default.
shouldUpdateReadTimeOnUpdate—This updates an object's read time when the object is successfully updated. This is not enabled by default.
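A DescriptorCustomizer applying one of these options might be sketched as follows. This is a sketch under assumptions: the policy class and setter name are taken from the option names above and from the org.eclipse.persistence.descriptors.invalidation package, and the customizer class is hypothetical:

```java
import org.eclipse.persistence.config.DescriptorCustomizer;
import org.eclipse.persistence.descriptors.ClassDescriptor;
import org.eclipse.persistence.descriptors.invalidation.TimeToLiveCacheInvalidationPolicy;

// Sketch: expire instances 5 minutes after they are read, with the
// invalidation time randomized to spread out the resulting database load.
public class RateCustomizer implements DescriptorCustomizer {
    @Override
    public void customize(ClassDescriptor descriptor) {
        TimeToLiveCacheInvalidationPolicy policy =
                new TimeToLiveCacheInvalidationPolicy(300000); // 5 minutes in ms
        policy.setIsInvalidationRandomized(true); // assumed setter name
        descriptor.setCacheInvalidationPolicy(policy);
    }
}
```

The customizer would typically be attached to the entity with the @Customizer annotation.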
The EclipseLink cache is indexed by the entity's Id. This allows the find() operation, relationships, and queries by Id to obtain cache hits and avoid database access. The cache is not used by default for any non-Id query. All non-Id queries will access the database and then resolve against the cache for each row returned in the result set.
Applications tend to have other unique keys in their model in addition to their Id. This is quite common when a generated Id is used. The application frequently queries on these unique keys, and it is desirable to be able to obtain cache hits to avoid database access on these queries.
Cache indexes allow an in-memory index to be created in the EclipseLink cache to allow cache hits on non-Id fields. The cache index can be on a single field, or on a set of fields. The indexed fields can be updateable, and although they should be unique, this is not a requirement. Queries that contain the indexed fields will be able to obtain cache hits. Only single results can be obtained from indexed queries.
Cache indexes can be configured using the @CacheIndex and @CacheIndexes annotations or the <cache-index> XML element. A @CacheIndex can be defined on the entity, or on an attribute to index that attribute. Indexes defined on the entity must define the columnNames used for the index. An index can be configured to be re-indexed when the object is updated using the updateable attribute.
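For example, both placements of @CacheIndex might look like this (the entity and column names are hypothetical):

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.CacheIndex;

// Hypothetical entity with a generated Id and other unique keys:
// queries on the indexed fields can obtain cache hits without
// accessing the database.
@Entity
@CacheIndex(columnNames = {"F_NAME", "L_NAME"}, updateable = true) // composite index on the entity
public class Employee {
    @Id
    private long id;

    @Column(name = "SSN")
    @CacheIndex // single-attribute index
    private String ssn;

    @Column(name = "F_NAME")
    private String firstName;

    @Column(name = "L_NAME")
    private String lastName;
}
```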
It is still possible to cache query results for non-indexed queries using the query result cache. For more information, see Section 9.8, "About Query Results Cache."
The EclipseLink query results cache allows the results of named queries to be cached, similar to how objects are cached.
By default in EclipseLink all queries access the database, unless they are by Id, or by cache-indexed fields. The resulting rows will still be resolved with the cache, and further queries for relationships will be avoided if the object is cached, but the original query will always access the database. EclipseLink does have options for querying the cache, but these options are not used by default, as EclipseLink cannot assume that all of the objects in the database are in the cache. The query results cache allows for non-indexed and result list queries to still benefit from caching.
The query results cache is indexed by the name of the query and the parameters of the query. Only named queries can have their results cached; dynamic queries cannot use the query results cache. Likewise, if you modify a named query before execution, such as by setting hints or properties, it cannot use the cached results.
The query results cache does not pick up committed changes from the application as the object cache does. It should only be used to cache read-only objects, or should use an invalidation policy to avoid caching stale results. Committed changes to the objects in the result set will still be picked up, but changes that affect the result set (such as new or changed objects that should be added to or removed from the result set) will not be picked up.
The query results cache supports a fixed size, cache type, and invalidation options.
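A minimal sketch of enabling the query results cache on a named query, with an expiry to limit staleness. The Product entity and query are hypothetical; the hint names are EclipseLink's query-results-cache query hints:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.NamedQuery;
import javax.persistence.QueryHint;

// Hypothetical entity with a named query whose results are cached.
@Entity
@NamedQuery(
    name = "findActiveProducts",
    query = "SELECT p FROM Product p WHERE p.active = TRUE",
    hints = {
        // Enable the query results cache for this named query.
        @QueryHint(name = "eclipselink.query-results-cache", value = "true"),
        // Invalidate cached results after 10 minutes (600,000 ms).
        @QueryHint(name = "eclipselink.query-results-cache.expiry", value = "600000")
    })
public class Product {
    @Id
    private long id;
    private boolean active;
}
```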
By default, EclipseLink optimizes concurrency to minimize cache locking during read or write operations. Use the default EclipseLink transaction isolation configuration unless you have a very specific reason to change it.
Tune the EclipseLink cache for each class to help eliminate the need for distributed cache coordination. Always tune these settings before implementing cache coordination. For more information, see "Monitoring and Optimizing TopLink-Enabled Applications" in Oracle Fusion Middleware Solutions Guide for Oracle TopLink.
The need to maintain up-to-date data for all applications is a key design challenge for building a distributed application. The difficulty of this increases as the number of servers within an environment increases. EclipseLink provides a distributed cache coordination feature that ensures data in distributed applications remains current.
Cache coordination reduces the number of optimistic lock exceptions encountered in a distributed architecture, and decreases the number of failed or repeated transactions in an application. However, cache coordination in no way eliminates the need for an effective locking policy. To effectively ensure working with up-to-date data, cache coordination must be used with optimistic or pessimistic locking. Oracle recommends that you use cache coordination with an optimistic locking policy.
You can use cache invalidation to improve cache coordination efficiency. For more information, see Section 9.6, "About Cache Expiration and Invalidation".
As Figure 9-2 shows, cache coordination is a session feature that allows multiple, possibly distributed, instances of a session to broadcast object changes among each other so that each session's cache is either kept up-to-date or notified that the cache must update an object from the data source the next time it is read.
Note:
You cannot use isolated client sessions with cache coordination. For more information, see Section 9.1.3, "Shared, Isolated, Protected, Weak, and Read-only Caches."
When sessions are distributed, that is, when an application contains multiple sessions (in the same JVM, in multiple JVMs, possibly on different servers), as long as the servers hosting the sessions are interconnected on the network, sessions can participate in cache coordination. Coordinated cache types that require discovery services also require the servers to support User Datagram Protocol (UDP) communication and multicast configuration. For more information, see Section 9.11.2, "Coordinated Cache Architecture and Types."
This section describes the following:
Cache coordination can enhance performance and reduce the likelihood of stale data for applications that have the following characteristics:
Changes are all being performed by the same Java application operating with multiple, distributed sessions
Primarily read-based
Regularly requests and updates the same objects
To maximize performance, avoid cache coordination for applications that do not have these characteristics.
For other options to reduce the likelihood of stale data, see Section 9.4, "About Handling Stale Data."
You can configure a coordinated cache to broadcast changes using any of the following communication protocols:
JMS Coordinated Cache, for the Java Message Service (JMS)
RMI Coordinated Cache, for Remote Method Invocation (RMI)
For a JMS coordinated cache, when a particular session's coordinated cache starts up, it uses its JNDI naming service information to locate and create a connection to the JMS server. The coordinated cache is ready when all participating sessions are connected to the same topic on the same JMS server. At this point, sessions can start sending and receiving object change messages. You must therefore configure all sessions that participate in the same coordinated cache with the same JMS and JNDI naming service information.
Example 9-1 illustrates a persistence.xml file configured for a JMS coordinated cache.
Example 9-1 persistence.xml File for JMS Cache Coordination
<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_2_0.xsd"
    version="2.0">
    <persistence-unit name="acme" transaction-type="RESOURCE_LOCAL">
        <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
        <exclude-unlisted-classes>false</exclude-unlisted-classes>
        <properties>
            <property name="eclipselink.cache.coordination.protocol" value="jms"/>
            <property name="eclipselink.cache.coordination.jms.topic" value="jms/ACMETopic"/>
            <property name="eclipselink.cache.coordination.jms.factory" value="jms/ACMETopicConnectionFactory"/>
        </properties>
    </persistence-unit>
</persistence>
For more information on configuring JMS, see "Configuring JMS Cache Coordination Using Persistence Properties" in Oracle Fusion Middleware Solutions Guide for Oracle TopLink. See also your JMS provider's documentation.
For an RMI coordinated cache, when a particular session's coordinated cache starts up, the session binds its connection in its naming service (either an RMI registry or JNDI), creates an announcement message (that includes its own naming service information), and broadcasts the announcement to its multicast group. When a session that belongs to the same multicast group receives this announcement, it uses the naming service information in the announcement message to establish bidirectional connections with the newly announced session's coordinated cache. The coordinated cache is ready when all participating sessions are interconnected in this way, at which point sessions can start sending and receiving object change messages. You must configure each session with naming information that identifies the host on which the session is deployed.
Example 9-2 illustrates a persistence.xml file configured for an RMI coordinated cache.
Example 9-2 persistence.xml File for RMI Cache Coordination
<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence persistence_2_0.xsd"
    version="2.0">
    <persistence-unit name="acme" transaction-type="RESOURCE_LOCAL">
        <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
        <exclude-unlisted-classes>false</exclude-unlisted-classes>
        <properties>
            <property name="eclipselink.cache.coordination.protocol" value="rmi"/>
        </properties>
    </persistence-unit>
</persistence>
For more information, see "Configuring RMI Cache Coordination Using Persistence Properties" in Oracle Fusion Middleware Solutions Guide for Oracle TopLink.
Using the classes in the org.eclipse.persistence.sessions.coordination package, you can define your own coordinated cache for custom solutions.
An application cluster is a set of middle-tier server machines or VMs servicing requests for a single application or set of applications. Multiple servers are used to increase the scalability of the application, or to provide fault tolerance and high availability. Typically, the same application is deployed to all of the servers in the cluster, and application requests are load balanced across the set of servers. The application cluster accesses a single database or a database cluster. An application cluster may allow new servers to be added to increase scalability, and servers to be removed, for example for updates and servicing.
Application clusters can consist of Java EE servers, Web containers, or Java server applications.
EclipseLink can function in any clustered environment. The main issue in a clustered environment is utilizing a shared persistence unit (L2) cache. If you are using a shared cache (enabled by default in EclipseLink projects), then each server maintains its own cache, and each cache's data can get out of sync with the other servers and the database.
EclipseLink provides cache coordination in a clustered environment to ensure the servers' caches are in sync.
There are also many other solutions to caching in a clustered environment, including:
Disable the shared cache (by setting @Cacheable(false), or @Cache(isolation=ISOLATED)).
Only cache read-only objects.
Set a cache invalidation timeout to reduce stale data.
Use refreshing on objects/queries when fresh data is required.
Use optimistic locking to ensure write consistency (writes on stale data will fail, and will automatically invalidate the cache).
Use a distributed cache (such as TopLink Grid's integration of TopLink with Oracle Coherence).
Use database events to invalidate changed data in the cache (such as EclipseLink's support for Oracle Query Change Notification).
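The first and third options above can be sketched with EclipseLink's @Cache annotation. The Quote and Price entities are hypothetical; the isolation and expiry attributes are EclipseLink's:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.Cache;
import org.eclipse.persistence.config.CacheIsolationType;

// Hypothetical Quote entity: excluded from the shared (L2) cache entirely,
// so every persistence context reads it fresh from the database.
@Entity
@Cache(isolation = CacheIsolationType.ISOLATED)
class Quote {
    @Id
    private long id;
}

// Hypothetical Price entity: shared cache kept, but entries are
// invalidated 5 minutes (300,000 ms) after being cached, bounding staleness.
@Entity
@Cache(expiry = 300000)
class Price {
    @Id
    private long id;
}
```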
Cache coordination enables a set of persistence units deployed to different servers in the cluster (or on the same server) to synchronize their changes. Cache coordination works by each persistence unit on each server in the cluster being able to broadcast notification of transactional object changes to the other persistence units in the cluster. EclipseLink supports cache coordination over RMI and JMS. The cache coordination framework is also extensible so other options could be developed.
Cache coordination works by broadcasting changes for each transaction to the other servers in the cluster. Each other server will receive the change notification, and either invalidate the changed objects in their cache, or update the cached objects state with the changes. Cache coordination occurs after the database commit, so only committed changes are broadcast.
Cache coordination greatly reduces the chance of an application getting stale data, but does not eliminate the possibility. Optimistic locking should still be used to ensure data integrity. Even in a single server application stale data is still possible within a persistence context unless pessimistic locking is used. Optimistic (or pessimistic) locking is always required to ensure data integrity in any multi-user system.
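The optimistic locking referred to above is typically configured with a standard JPA @Version field. A minimal sketch, using a hypothetical Account entity:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

// Hypothetical entity with a standard JPA optimistic lock column.
@Entity
public class Account {
    @Id
    private long id;

    // Incremented automatically on each update. A commit against a stale
    // version throws OptimisticLockException, and the stale object is
    // invalidated in the shared cache.
    @Version
    private long version;

    private double balance;
}
```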
TopLink includes the following persistence property extensions for caching. For more information on these extensions, see Oracle Fusion Middleware Java Persistence API (JPA) Extensions Reference for Oracle TopLink.
cache.coordination.channel
cache.coordination.jms.factory
cache.coordination.jms.host
cache.coordination.jms.reuse-topic-publisher
cache.coordination.jms.topic
cache.coordination.jndi.initial-context-factory
cache.coordination.jndi.password
cache.coordination.jndi.user
cache.coordination.naming-service
cache.coordination.propagate-asynchronously
cache.coordination.protocol
cache.coordination.remove-connection-on-error
cache.coordination.rmi.announcement-delay
cache.coordination.rmi.multicast-group
cache.coordination.rmi.multicast-group.port
cache.coordination.rmi.packet-time-to-live
cache.coordination.rmi.url
cache.coordination.thread.pool.size
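As a sketch, these properties can also be supplied programmatically when creating the factory. The PersistenceUnitProperties constants are EclipseLink's; the persistence unit name "acme" and the JNDI names follow Example 9-1:

```java
import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import org.eclipse.persistence.config.PersistenceUnitProperties;

public class CoordinationConfig {
    public static EntityManagerFactory createFactory() {
        Map<String, String> props = new HashMap<>();
        // Same settings as the persistence.xml in Example 9-1, expressed in code.
        props.put(PersistenceUnitProperties.COORDINATION_PROTOCOL, "jms");
        props.put(PersistenceUnitProperties.COORDINATION_JMS_TOPIC, "jms/ACMETopic");
        props.put(PersistenceUnitProperties.COORDINATION_JMS_FACTORY,
                "jms/ACMETopicConnectionFactory");
        return Persistence.createEntityManagerFactory("acme", props);
    }
}
```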
Both RMI and JMS cache coordination work with Oracle WebLogic. When a WebLogic cluster is used, JNDI is replicated among the cluster servers, so a cache.coordination.rmi.url or cache.coordination.jms.host option is not required. For JMS cache coordination, the JMS topic should be deployed to only one of the servers (as of Oracle WebLogic 10.3.6). It may be desirable to have a dedicated JMS server if the JMS messaging traffic is heavy.
Use of other JMS services in WebLogic may have other requirements.
JMS cache coordination works with Glassfish. When a Glassfish cluster is used, JNDI is replicated among the cluster servers, so a cache.coordination.jms.host option is not required.
Use of other JMS services in Glassfish may have other requirements.
RMI cache coordination does not work when the JNDI naming service option is used in a Glassfish cluster. RMI will work if the eclipselink.cache.coordination.naming-service option is set to rmi. Each server must provide its own eclipselink.cache.coordination.rmi.url option, either by having a different persistence.xml file for each server, by setting the URL as a system property in the server, or through a customizer.
JMS cache coordination may have issues on IBM WebSphere. Use of a Message Driven Bean (MDB) may be required to allow access to JMS. To use an MDB with cache coordination, set the eclipselink.cache.coordination.protocol option to the value jms-publishing. The application must also deploy an MDB that processes cache coordination messages in its EAR file.
Example 9-3 illustrates the Java code required to configure an MDB.
Example 9-3 Cache Coordination Message Driven Bean
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.persistence.EntityManagerFactory;
import javax.persistence.PersistenceUnit;

import org.eclipse.persistence.sessions.coordination.jms.JMSTopicRemoteConnection;
import org.eclipse.persistence.sessions.server.ServerSession;

@MessageDriven
public class JMSCacheCoordinationMDB implements MessageListener {

    private JMSTopicRemoteConnection connection;

    @PersistenceUnit(unitName = "acme")
    private EntityManagerFactory emf;

    public void ejbCreate() {
        // Connect this MDB to the persistence unit's cache coordination
        // command manager so received messages can be processed.
        this.connection = new JMSTopicRemoteConnection(
                this.emf.unwrap(ServerSession.class).getCommandManager());
    }

    public void onMessage(Message message) {
        // Forward incoming cache coordination messages to EclipseLink.
        this.connection.onMessage(message);
    }
}