7.3. Datastore Cache

Table of Contents

7.3.1. Overview of Kodo JDO Datastore Caching
7.3.2. Kodo JDO Cache Usage
7.3.3. Cache Extension
7.3.4. Important notes about the DataCache
7.3.5. Known issues and limitations

7.3.1. Overview of Kodo JDO Datastore Caching

Kodo JDO includes support for an optional datastore cache that operates at the PersistenceManagerFactory level. This cache is designed to significantly increase performance while remaining in full compliance with the JDO standard. This means that turning on the caching option can transparently increase the performance of your application, with no changes to your code base.

Kodo JDO's datastore cache is not related to the PersistenceManager cache dictated by the JDO specification. The JDO specification mandates behavior for the PersistenceManager cache aimed at guaranteeing transaction isolation when operating on persistent objects. Kodo JDO's datastore cache is designed to provide significant performance increases over cacheless operation, while guaranteeing that all JDO behavior is identical whether or not the cache is enabled.

When enabled, the cache is checked before making a trip to the data store. Data is stored in the cache when objects are committed and when persistent objects are loaded from the datastore.

There are currently two versions of the datastore cache: a single JVM version and a distributed version. Both versions are bundled with Kodo JDO Enterprise Edition. They are available as optional plug-ins for Kodo JDO Standard Edition.

The single JVM version maintains and shares a data cache across all PersistenceManager instances obtained from a particular PersistenceManagerFactory. This version is not appropriate for use in a distributed environment, as caches in different JVMs or created from different PersistenceManagerFactory objects will not be synchronized.

The distributed version communicates cache invalidation information to other JVMs via JMS, TCP, or UDP, or by using a Tangosol Coherence cache. See the descriptions of the different distributed caches below for more information.

7.3.2. Kodo JDO Cache Usage

To enable the basic single-PersistenceManagerFactory cache, set the com.solarmetric.kodo.DataCacheClass property to com.solarmetric.kodo.runtime.datacache.plugins.LocalCache.

To enable a distributed PersistenceManagerFactory cache, set the com.solarmetric.kodo.DataCacheClass property to one of the distributed implementations described below; for example, com.solarmetric.kodo.runtime.datacache.plugins.UDPCache.

The default cache implementations maintain a least-recently-used map of object IDs to cache data. By default, 1000 elements are kept in the cache. This can be adjusted by setting the CacheSize key in com.solarmetric.kodo.DataCacheProperties; see the example below. Objects removed from the cache are moved to a soft reference map, so they may remain available for a while longer. Additionally, objects that are pinned into the cache are not counted when determining whether the cache size exceeds the maximum.
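
For example, a properties file that enables the single-JVM cache and raises the cache size to 5000 elements might look like this; the CacheSize value shown is only an illustration:

com.solarmetric.kodo.DataCacheClass=com.solarmetric.kodo.runtime.datacache.plugins.LocalCache
com.solarmetric.kodo.DataCacheProperties=CacheSize=5000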

Individual classes can be excluded from the data cache by setting the can-cache metadata extension to false.
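
For instance, the metadata for a class might exclude it from the cache as in the following sketch, assuming the usual Kodo extension syntax (vendor-name="kodo"); the package and class names here are placeholders, and only the can-cache extension line is significant:

<package name="com.xyz">
    <class name="Employee">
        <extension vendor-name="kodo" key="can-cache" value="false"/>
    </class>
</package>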

The DataCache API provides a mechanism for pinning objects into memory by creating hard references to them. Caching algorithms are not permitted to flush pinned objects from memory unless an explicit remove() call is made. To pin an object into memory, obtain a reference to the cache and invoke pin() on it:

Example 7.1. Pinning an object into the DataCache

	PersistenceManagerFactoryImpl factory = 
		(PersistenceManagerFactoryImpl) pm.getPersistenceManagerFactory ();
	factory.getConfiguration ().getDataCache ().pin (JDOHelper.getObjectId (o));
    

A previously pinned object can later be unpinned by invoking DataCache.unpin():

Example 7.2. Unpinning an object from the DataCache

	PersistenceManagerFactoryImpl factory = 
		(PersistenceManagerFactoryImpl) pm.getPersistenceManagerFactory ();
	factory.getConfiguration ().getDataCache ().unpin (JDOHelper.getObjectId (o));
    

It is possible to evict data from the cache, but eviction does not happen automatically when the evict() method is invoked on a PersistenceManager. Instead, you must obtain a reference to the DataCache object from a PersistenceManagerFactory and explicitly remove the object ID from the cache:

Example 7.3. Evicting an object from the DataCache

	PersistenceManagerFactoryImpl factory = 
		(PersistenceManagerFactoryImpl) pm.getPersistenceManagerFactory ();
	factory.getConfiguration ().getDataCache ().remove (JDOHelper.getObjectId (o));
    

The JMS cache can be configured by setting the com.solarmetric.kodo.DataCacheProperties property to contain the appropriate configuration settings. The JMS cache understands the following properties:

  • Topic

    Default: topic/KodoCacheTopic

    Description: The topic that the cache should publish cache updates to and subscribe to for cache updates sent from other JVMs.

  • TopicConnectionFactory

    Default: java:/ConnectionFactory

    Description: The JNDI name of the javax.jms.TopicConnectionFactory to use for creating connections to the topic.

To configure a PersistenceManagerFactory to use the JMS cache, your properties file might look like the following:

Example 7.4. Configuring a PersistenceManagerFactory to use a JMS cache update mechanism

com.solarmetric.kodo.DataCacheClass=com.solarmetric.kodo.runtime.datacache.plugins.JMSCache
com.solarmetric.kodo.DataCacheProperties=Topic=topic/KodoCacheTopic CacheSize=5000
    

The Tangosol Coherence cache can be configured by setting the com.solarmetric.kodo.DataCacheProperties property to contain the appropriate configuration settings. The Tangosol cache understands the following properties:

  • TangosolCacheName

    Default: kodo

    Description: The name of the Tangosol Coherence cache to use.

  • TangosolCacheType

    Default: distributed

    Description: The type of Tangosol Coherence cache to use. Valid values are either distributed or replicated.

To configure a PersistenceManagerFactory to use the Tangosol cache, your properties file might look like the following:

Example 7.5. Configuring a PersistenceManagerFactory to use a Tangosol cache for distributed cache needs

com.solarmetric.kodo.DataCacheClass=com.solarmetric.kodo.runtime.datacache.plugins.TangosolCache
com.solarmetric.kodo.DataCacheProperties=TangosolCacheName=kodo TangosolCacheType=distributed
    

Note that as of this writing, it is not possible to use a Tangosol Coherence 1.2.2 distributed cache type with Apple's 1.3.1 JVM. Use their replicated cache instead.

The TCP and UDP caches have several options that are defined as host specifications containing a host name or IP address and an optional port, separated by a colon. For example, the host specification saturn.solarmetric.com:1234 represents the InetAddress retrieved by invoking InetAddress.getByName ("saturn.solarmetric.com") and a port of 1234.

The TCP cache can be configured by setting the com.solarmetric.kodo.DataCacheProperties property to contain the appropriate configuration settings. The TCP cache understands the following properties:

  • Port

    Default: 5636

    Description: The TCP port that the cache should listen on for data updates.

  • Addresses

    Default: none

    Description: A semicolon-separated list of IP addresses to which invalidations should be sent.

To configure a PersistenceManagerFactory to use the TCP cache, your properties file might look like the following:

Example 7.6. Configuring a PersistenceManagerFactory to use a TCP cache update mechanism

com.solarmetric.kodo.DataCacheClass=com.solarmetric.kodo.runtime.datacache.plugins.TCPCache
com.solarmetric.kodo.DataCacheProperties=Addresses=10.0.1.10;10.0.1.11;10.0.1.12;10.0.1.13 CacheSize=5000
    

The UDP cache can be configured by setting the com.solarmetric.kodo.DataCacheProperties property to contain the appropriate configuration settings. The UDP cache understands the following properties:

  • Port

    Default: 5636

    Description: The UDP port that the cache should listen on for data updates.

  • PacketLength

    Default: 1024

    Description: The maximum packet length for which the cache should prepare a buffer.

  • UseMulticast

    Default: false

    Description: A boolean selector that controls how the cache communicates with other cache instances. If true, the cache will broadcast changes to the multicast group given by the MulticastGroup property, and will disregard the setting of Port. Otherwise, it will send a UDP packet to each IP address listed in the Addresses property.

  • MulticastGroup

    Default: none

    Description: The host specification of the multicast group to which packets should be sent.

  • Addresses

    Default: none

    Description: A semicolon-separated list of IP addresses to which invalidations should be sent. Note that the multicast mechanism is more efficient, as it generates less network traffic.

To configure a PersistenceManagerFactory to use the UDP cache, your properties file might look like the following:

Example 7.7. Configuring a PersistenceManagerFactory to use a UDP cache update mechanism

com.solarmetric.kodo.DataCacheClass=com.solarmetric.kodo.runtime.datacache.plugins.UDPCache
com.solarmetric.kodo.DataCacheProperties=UseMulticast=false Addresses=10.0.1.10;10.0.1.11;10.0.1.12;10.0.1.13 CacheSize=5000
    
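
To use the multicast mechanism instead, the same file might look like the following; the multicast group shown is only an illustration, so substitute a group address appropriate for your network:

com.solarmetric.kodo.DataCacheClass=com.solarmetric.kodo.runtime.datacache.plugins.UDPCache
com.solarmetric.kodo.DataCacheProperties=UseMulticast=true MulticastGroup=230.0.0.1:5636 CacheSize=5000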

7.3.3. Cache Extension

The provided data cache classes can be easily extended to add additional functionality. If you are adding new behavior, you should extend com.solarmetric.kodo.runtime.datacache.plugins.LocalCache, com.solarmetric.kodo.runtime.datacache.plugins.JMSCache, com.solarmetric.kodo.runtime.datacache.plugins.TCPCache, or com.solarmetric.kodo.runtime.datacache.plugins.UDPCache, as appropriate. If you want to implement a distributed cache that uses a method other than UDP for communications, extend the abstract class com.solarmetric.kodo.runtime.datacache.plugins.DistributedCache.
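
For instance, a subclass of LocalCache that adds a convenience method for pinning a whole collection of object IDs might look like the following sketch. It assumes only that LocalCache is concrete, provides a public no-argument constructor, and exposes the pin() method used in Example 7.1; check the DataCache JavaDoc before relying on these assumptions.

	import java.util.Collection;
	import java.util.Iterator;

	import com.solarmetric.kodo.runtime.datacache.plugins.LocalCache;

	// Illustrative extension: no LocalCache methods other than pin () are assumed.
	public class PinningLocalCache extends LocalCache {
		/** Pin every object ID in the given collection into the cache. */
		public void pinAll (Collection oids) {
			for (Iterator iter = oids.iterator (); iter.hasNext ();)
				pin (iter.next ());
		}
	}

To use such a subclass, set com.solarmetric.kodo.DataCacheClass to its fully-qualified class name in place of the bundled plug-in.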

For specific examples of data cache extensions, please look at the updating JMS cache sample in the Kodo JDO distribution, or contact SolarMetric at jdosupport@solarmetric.com.

7.3.4. Important notes about the DataCache

  • The default cache implementations do not automatically refresh objects held by other PersistenceManager instances when the cache is updated or invalidated; such behavior would not be compliant with the JDO specification. An example of how to extend the JMS cache to update all non-transactional PersistenceManager objects associated with a particular cache is available in the samples/ directory of the Kodo JDO distribution.

  • Invoking PersistenceManager.refresh(), PersistenceManager.evict(), or related methods does not cause the corresponding data to be dropped from the DataCache. The DataCache assumes that it is up-to-date with respect to the data store, so it is effectively an in-memory extension of the data store. If you want to force data out of the cache, use the DataCache APIs (see the DataCache JavaDoc for details), not the PersistenceManager cache control APIs.

  • The com.solarmetric.kodo.runtime.datacache.plugins.LocalCache does not communicate with other caches in the same JVM that are associated with different PersistenceManagerFactory objects. However, if you make multiple calls to JDOHelper.getPersistenceManagerFactory() with the same Properties argument, Kodo ensures that all returned PersistenceManagerFactory objects are the same instance.

  • Because of the nature of JMS, it is important that you invoke PersistenceManagerFactoryImpl.close() when you are finished with a PersistenceManagerFactory and all of its PersistenceManager objects. If you do not, a JMS-related thread will remain running in the JVM, preventing the JVM from exiting.
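
    For example, a shutdown sequence might look like the following sketch; it assumes that the Properties instance props selects Kodo's implementation, so that the factory can be cast to PersistenceManagerFactoryImpl as in the earlier examples:

	PersistenceManagerFactoryImpl factory = 
		(PersistenceManagerFactoryImpl) JDOHelper.getPersistenceManagerFactory (props);
	try {
		// obtain PersistenceManagers from the factory and do your work here
	} finally {
		// closing the factory shuts down the JMS listener so the JVM can exit
		factory.close ();
	}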

7.3.5. Known issues and limitations

  • When using data store (pessimistic) transactions in concert with the distributed caching implementations, it is possible to read stale data.

    For example, imagine two JVMs (JVM A and JVM B) communicating with each other. If JVM A obtains a data store lock on a particular object's underlying data, it is possible for JVM B to load the data from its cache without going to the data store, and therefore to read data that should be locked. This can only happen if JVM B attempts to read data that is already in its cache during the window between when JVM A locks the data and when JVM B receives and processes the corresponding invalidation notification.

    This problem cannot be solved without implementing a two-phase commit protocol for cache notifications, which would add significant overhead to the caching implementation. As a result, we recommend using optimistic locking when data caching is enabled. If you do not, be aware that some of your non-transactional data may not be consistent with the data store.

    Note that when objects are loaded within a transaction, the appropriate data store transactions will still be obtained, so transactional code will maintain its integrity.