Chapter 7. Enterprise Features

7.1. Datastore Cache
7.1.1. Overview of Kodo JDO Datastore Caching
7.1.2. Kodo JDO Cache Usage
7.1.3. Kodo JDO Query Caching
7.1.4. Kodo JDO Data Cache Configuration
7.1.5. Cache Extension
7.1.6. Important notes about the DataCache
7.1.7. Known issues and limitations
7.2. Query Extensions
7.2.1. Using Query Extensions
7.2.2. Included Query Extensions
7.2.3. Deprecated Query Extensions
7.2.4. Developing Custom Query Extensions
7.2.5. Configuring Query Extensions
7.3. Fetch Groups
7.3.1. Normal Default Fetch Group Behavior
7.3.2. Kodo JDO Fetch Group Behavior
7.3.3. Configuring a PersistenceManager to load Fetch Groups
7.4. XA Transactions
7.4.1. Overview of XA Distributed Transaction Processing
7.4.2. Requirements for using Kodo with XA transactions
7.4.3. Configuring Kodo to utilize XA transactions
7.5. Remote Commit Notification Framework
7.5.1. Kodo JDO RemoteCommitProvider Configuration
7.5.2. Event Notification Framework Customization

Kodo JDO Enterprise Edition includes a number of proprietary features targeted at the enterprise developer. This chapter outlines these features.

7.1. Datastore Cache

7.1.1. Overview of Kodo JDO Datastore Caching

Kodo JDO includes support for an optional datastore cache that operates at the PersistenceManagerFactory level. This cache is designed to significantly increase performance while remaining in full compliance with the JDO standard. This means that turning on the caching option can transparently increase the performance of your application, with no changes to your code base.

Kodo JDO's datastore cache is not related to the PersistenceManager cache dictated by the JDO specification. The JDO specification mandates behavior for the PersistenceManager cache aimed at guaranteeing transaction isolation when operating on persistent objects. Kodo JDO's datastore cache is designed to provide significant performance increases over cacheless operation, while guaranteeing that all JDO behavior will be identical in both cache-enabled and cacheless operation.

There are four ways to access data via the JDO APIs: relation traversal, JDOQL queries, direct invocation of PersistenceManager.getObjectById(), and iteration over an Extent's iterator. Kodo JDO's cache plugin accelerates three of these mechanisms. It does not provide any caching of Extent iterators. If you need higher-performance extent iteration, consider the query-based workaround described in Section 7.1.7.

Table 7.1. Data access methods

Access method                         Uses cache
Relation traversal                    Yes
JDOQL query                           Yes
PersistenceManager.getObjectById()    Yes
Iteration over an Extent              No

When enabled, the cache is checked before making a trip to the data store. Data is stored in the cache when objects are committed and when persistent objects are loaded from the datastore.
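
The check-then-load behavior described above can be pictured with a minimal read-through sketch. This is not Kodo code: the class ReadThroughCacheSketch, its loader function, and the datastoreTrips counter are hypothetical names introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; these are not Kodo classes.
final class ReadThroughCacheSketch {
    private final Map<Object, Object> cache = new HashMap<>();
    private final Function<Object, Object> datastoreLoader;
    int datastoreTrips = 0; // counts simulated datastore round trips

    ReadThroughCacheSketch(Function<Object, Object> datastoreLoader) {
        this.datastoreLoader = datastoreLoader;
    }

    // Check the cache first; only go to the datastore on a miss.
    Object load(Object oid) {
        Object data = cache.get(oid);
        if (data == null) {
            datastoreTrips++;
            data = datastoreLoader.apply(oid);
            if (data != null) {
                cache.put(oid, data); // populate the cache on load
            }
        }
        return data;
    }

    // Called after a successful commit: store the committed state.
    void committed(Object oid, Object data) {
        cache.put(oid, data);
    }

    public static void main(String[] args) {
        ReadThroughCacheSketch c =
            new ReadThroughCacheSketch(oid -> "row-" + oid);
        c.load(1);                        // miss: one datastore trip
        c.load(1);                        // hit: served from the cache
        c.committed(2, "row-2-updated");
        c.load(2);                        // hit: committed state was cached
        System.out.println("datastore trips: " + c.datastoreTrips);
    }
}
```

In this sketch only the first load of an uncached object ID reaches the simulated datastore; later loads, and loads of freshly committed data, are served from memory.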

Kodo's datastore cache can operate both in a single-JVM environment and in a multi-JVM environment. Multi-JVM caching is achieved through use of the event notification framework. The datastore caching and the distributed event notification frameworks are both bundled with Kodo JDO Enterprise Edition, and are available as optional plug-ins for Kodo JDO Standard Edition.

The single JVM mode of operation maintains and shares a data cache across all PersistenceManager instances obtained from a particular PersistenceManagerFactory. This is not appropriate for use in a distributed environment, as caches in different JVMs or created from different PersistenceManagerFactory objects will not be synchronized.

When used in conjunction with a com.solarmetric.kodo.runtime.event.RemoteCommitProvider, commit information is communicated to other JVMs via JMS or TCP, and remote caches are invalidated based on this information.
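
As a rough picture of this invalidation flow, the sketch below simulates two JVMs inside one process. NodeCache and Broadcaster are hypothetical stand-ins for a per-JVM data cache and for the JMS or TCP transport; they are not part of the Kodo API, and a real RemoteCommitProvider would send the changed object IDs over the network.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; these are not Kodo classes.
final class RemoteInvalidationSketch {

    // Stand-in for the data cache held by one JVM.
    static final class NodeCache {
        final Map<Object, Object> data = new ConcurrentHashMap<>();

        // Invoked when another JVM reports a commit: drop stale entries.
        void onRemoteCommit(Collection<?> changedOids) {
            for (Object oid : changedOids) {
                data.remove(oid);
            }
        }
    }

    // Stand-in for the JMS or TCP transport between JVMs.
    static final class Broadcaster {
        private final List<NodeCache> nodes = new ArrayList<>();

        void register(NodeCache node) {
            nodes.add(node);
        }

        // The committing node caches the new state; all others invalidate.
        void commit(NodeCache origin, Map<?, ?> changes) {
            origin.data.putAll(changes);
            for (NodeCache node : nodes) {
                if (node != origin) {
                    node.onRemoteCommit(changes.keySet());
                }
            }
        }
    }

    public static void main(String[] args) {
        Broadcaster bus = new Broadcaster();
        NodeCache a = new NodeCache();
        NodeCache b = new NodeCache();
        bus.register(a);
        bus.register(b);
        a.data.put(1, "old");
        b.data.put(1, "old");
        bus.commit(a, Collections.singletonMap(1, "new"));
        System.out.println("a=" + a.data.get(1) + " b=" + b.data.get(1));
    }
}
```

Note that the remote cache is invalidated rather than updated: the next read in the other JVM goes back to the datastore.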

When using a Tangosol Coherence cache plug-in, all remote updating of cache information is delegated to the Coherence cache. See the descriptions of the different distributed caches below for more information.

7.1.2. Kodo JDO Cache Usage

To enable the basic single-PersistenceManagerFactory cache, set the com.solarmetric.kodo.DataCacheClass property to com.solarmetric.kodo.runtime.datacache.plugins.CacheImpl, and set the com.solarmetric.kodo.RemoteCommitProviderClass to com.solarmetric.kodo.runtime.event.impl.SingleJVMRemoteCommitProvider.

To configure the PersistenceManagerFactory cache to remain up-to-date in a distributed environment, set the com.solarmetric.kodo.DataCacheClass property to com.solarmetric.kodo.runtime.datacache.plugins.CacheImpl, and set the com.solarmetric.kodo.RemoteCommitProviderClass and com.solarmetric.kodo.RemoteCommitProviderProperties appropriately. This process is described in greater depth in the remote event notification documentation.

The default cache implementations maintain a least-recently-used map of object IDs to cache data. By default, 1000 elements are kept in cache. This can be adjusted by setting com.solarmetric.kodo.DataCacheProperties appropriately -- see below for an example. Objects removed from the least-recently-used map are moved to a soft reference map, so they may remain available until the garbage collector reclaims them. Additionally, objects that are pinned into the cache are not counted when determining whether the cache size exceeds the maximum.
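
The eviction policy described above can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the actual CacheImpl: the class LruCacheSketch and all of its methods are hypothetical, and the real implementation may differ in how it handles pins and soft references.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not Kodo's CacheImpl.
final class LruCacheSketch {
    private final int maxSize;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<Object, Object> lru =
        new LinkedHashMap<>(16, 0.75f, true);
    // Entries pushed out of the LRU map survive until the GC clears them.
    private final Map<Object, SoftReference<Object>> soft = new HashMap<>();
    // Pinned entries are held by hard references and never counted.
    private final Map<Object, Object> pinned = new HashMap<>();

    LruCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    void put(Object oid, Object data) {
        if (pinned.containsKey(oid)) {
            pinned.put(oid, data);
            return;
        }
        soft.remove(oid);
        lru.put(oid, data);
        // Demote least recently used entries until the map fits again.
        Iterator<Map.Entry<Object, Object>> it = lru.entrySet().iterator();
        while (lru.size() > maxSize && it.hasNext()) {
            Map.Entry<Object, Object> eldest = it.next();
            soft.put(eldest.getKey(), new SoftReference<>(eldest.getValue()));
            it.remove();
        }
    }

    Object get(Object oid) {
        Object data = pinned.get(oid);
        if (data != null) return data;
        data = lru.get(oid);
        if (data != null) return data;
        SoftReference<Object> ref = soft.get(oid);
        return ref == null ? null : ref.get(); // null once collected
    }

    void pin(Object oid) {
        Object data = lru.remove(oid);
        if (data != null) pinned.put(oid, data);
    }

    void unpin(Object oid) {
        Object data = pinned.remove(oid);
        if (data != null) put(oid, data);
    }

    boolean inLru(Object oid) { return lru.containsKey(oid); }

    int lruSize() { return lru.size(); }

    public static void main(String[] args) {
        LruCacheSketch c = new LruCacheSketch(2);
        c.put("a", 1);
        c.put("b", 2);
        c.pin("a");    // "a" no longer counts against the maximum
        c.put("c", 3); // LRU map now holds b and c
        c.put("d", 4); // "b" is least recently used and is demoted
        System.out.println(c.inLru("b") + " " + c.lruSize());
    }
}
```

A demoted entry such as "b" may still be returned by get() for a while, which mirrors the soft reference behavior described above, but the sketch makes no guarantee about how long.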

Individual classes can be excluded from the data cache by setting the can-cache metadata extension to false.

A cache timeout value can be specified for a class by setting the timeout metadata extension to a positive decimal representing the amount of time in seconds for which a class's data is valid.
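
To illustrate what a timeout means for cached data, here is a minimal sketch in which entries stop being served once their age reaches the configured number of seconds. The class TimeoutCacheSketch and its injectable clock are hypothetical; the sketch also uses a single timeout for all entries, whereas Kodo applies the value per class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical illustration only; these are not Kodo classes.
final class TimeoutCacheSketch {
    private static final class Entry {
        final Object data;
        final long expiresAtMillis;

        Entry(Object data, long expiresAtMillis) {
            this.data = data;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<Object, Entry> entries = new HashMap<>();
    private final long timeoutMillis;
    private final LongSupplier clock; // injectable so the sketch is testable

    TimeoutCacheSketch(double timeoutSeconds, LongSupplier clock) {
        this.timeoutMillis = (long) (timeoutSeconds * 1000);
        this.clock = clock;
    }

    void put(Object oid, Object data) {
        entries.put(oid, new Entry(data, clock.getAsLong() + timeoutMillis));
    }

    // Returns null when the entry is missing or has timed out.
    Object get(Object oid) {
        Entry e = entries.get(oid);
        if (e == null) return null;
        if (clock.getAsLong() >= e.expiresAtMillis) {
            entries.remove(oid); // expired: force a datastore reload
            return null;
        }
        return e.data;
    }

    public static void main(String[] args) {
        TimeoutCacheSketch c =
            new TimeoutCacheSketch(30, System::currentTimeMillis);
        c.put(1, "row-1");
        System.out.println(c.get(1)); // still valid right after the put
    }
}
```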

The DataCache API provides a mechanism for pinning objects into memory by creating hard references to them. Caching algorithms are not permitted to flush pinned objects from memory unless an explicit remove() call is made. To pin an object into memory, obtain a reference to the cache and invoke pin() with the object's ID:

Example 7.1. Pinning an object into the DataCache

import javax.jdo.JDOHelper;
import com.solarmetric.kodo.conf.Configuration;
import com.solarmetric.kodo.runtime.PersistenceManagerImpl;

// pm is a PersistenceManager; o is a persistent object
Configuration conf = ((PersistenceManagerImpl) pm).getConfiguration ();
conf.getDataCache ().pin (JDOHelper.getObjectId (o));
    
A previously pinned object can later be unpinned by invoking DataCache.unpin():

Example 7.2. Unpinning an object from the DataCache

import javax.jdo.JDOHelper;
import com.solarmetric.kodo.conf.Configuration;
import com.solarmetric.kodo.runtime.PersistenceManagerImpl;

// pm is a PersistenceManager; o is a previously pinned persistent object
Configuration conf = ((PersistenceManagerImpl) pm).getConfiguration ();
conf.getDataCache ().unpin (JDOHelper.getObjectId (o));
    

It is possible to evict data from the cache, but cache eviction does not automatically happen when the evict() method is invoked on a PersistenceManager. Instead, you must obtain a reference to the DataCache object from a PersistenceManagerFactory and explicitly evict the object id from the cache:

Example 7.3. Evicting an object from the DataCache

import javax.jdo.JDOHelper;
import com.solarmetric.kodo.conf.Configuration;
import com.solarmetric.kodo.runtime.PersistenceManagerImpl;

// pm is a PersistenceManager; o is the persistent object to evict
Configuration conf = ((PersistenceManagerImpl) pm).getConfiguration ();
conf.getDataCache ().remove (JDOHelper.getObjectId (o));
    

It is possible for different persistence-capable classes to use different caches. This is achieved by specifying a cache name in a JDO metadata class-level extension:

Example 7.4. Specifying a non-default DataCache

<jdo>
    <package name="">
        <class name="NonDefaultCacheClass">
            <extension vendor-name="kodo" key="data-cache-name" value="small-cache"/>
        </class>
    </package>
</jdo>
    
This will cause instances of the NonDefaultCacheClass class to be stored in a cache named small-cache. Currently, Kodo does not provide a direct way to configure named caches individually. Instead, Kodo creates all caches in the same manner, using the DataCacheClass and DataCacheProperties configuration properties. Any additional configuration must take place after initialization by using the getDataCache(String) method in com.solarmetric.kodo.runtime.datacache.DataCacheStoreManager, or by overriding the newDataCache(String) method in a DataCacheStoreManager subclass.

7.1.3. Kodo JDO Query Caching

Query caching is enabled by default when datastore caching is enabled. The cache stores the object IDs returned by invocations of the Query.execute() methods. When a query is executed, Kodo assembles a key based on the query properties and the parameters used at execution time, and checks for a cached query result. If one is found, the object IDs in the cached result are looked up, and the resultant persistence-capable objects are returned. Otherwise, the query is executed against the database, and the object IDs loaded by the query are put into the cache. The object ID list is not cached until the list returned at query execution time is fully traversed.
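
The lookup described above can be sketched as follows. This is an illustration only, not Kodo's QueryCache or QueryKey: the class QueryCacheSketch, its key() helper, and the Supplier standing in for the database are all hypothetical. The sketch shows the two points from the paragraph: the key is built from the query's properties plus its parameters, and a result is cached only once its list has been fully traversed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration only; not Kodo's QueryCache or QueryKey.
final class QueryCacheSketch {
    private final Map<List<Object>, List<Object>> results = new HashMap<>();
    int datastoreQueries = 0; // counts simulated database executions

    // The key combines the query's properties with the execution parameters.
    static List<Object> key(Class<?> candidate, String filter, Object... params) {
        return Arrays.asList(candidate, filter, Arrays.asList(params));
    }

    // Returns an iterator over object IDs. On a miss, the IDs are cached
    // only after the caller has iterated to the end of the result.
    Iterator<Object> execute(final List<Object> key,
                             Supplier<List<Object>> datastore) {
        List<Object> cached = results.get(key);
        if (cached != null) {
            return cached.iterator();
        }
        datastoreQueries++;
        final Iterator<Object> source = datastore.get().iterator();
        final List<Object> seen = new ArrayList<>();
        return new Iterator<Object>() {
            public boolean hasNext() {
                if (source.hasNext()) return true;
                results.put(key, seen); // fully traversed: safe to cache
                return false;
            }

            public Object next() {
                Object oid = source.next();
                seen.add(oid);
                return oid;
            }
        };
    }

    public static void main(String[] args) {
        QueryCacheSketch cache = new QueryCacheSketch();
        List<Object> k = key(String.class, "name == p", "x");
        Supplier<List<Object>> db = () -> Arrays.<Object>asList(1, 2, 3);
        Iterator<Object> it = cache.execute(k, db);
        while (it.hasNext()) it.next(); // full traversal caches the IDs
        cache.execute(k, db);           // served from the cache
        System.out.println("database executions: " + cache.datastoreQueries);
    }
}
```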

The default query cache implementation caches 100 query executions in a least-recently-used cache. This can be changed by setting the cache size in the QueryCacheProperties configuration property: com.solarmetric.kodo.QueryCacheProperties: CacheSize=1000. Setting this to 0 will disable query caching. Setting it to -1 will disable query cache flushing, causing the cache to retain all results, only removing query data when the queries are invalidated.

There are certain situations in which the query cache is bypassed:

  • Caching is not used for in-memory queries (queries in which the candidates are a collection instead of a class or extent).

  • Caching is not used in transactions that have IgnoreCache set to false and in which modifications to classes in the query's access path have occurred. If none of the classes in the access path have been touched, then cached results are still valid and are used.

  • Caching is not used in pessimistic transactions, since Kodo must go to the database to lock the appropriate rows.
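
The three bypass rules above can be condensed into a single predicate. The class and parameter names below are hypothetical and exist only to restate the rules in code; Kodo makes this decision internally.

```java
// Hypothetical illustration only; Kodo makes this decision internally.
final class QueryCacheBypassSketch {

    // Returns true when a query execution is eligible to use the query cache.
    static boolean usesCache(boolean candidatesAreCollection,
                             boolean pessimisticTransactionActive,
                             boolean ignoreCache,
                             boolean accessPathModifiedInTransaction) {
        if (candidatesAreCollection) {
            return false; // in-memory query
        }
        if (pessimisticTransactionActive) {
            return false; // rows must be locked in the database
        }
        if (!ignoreCache && accessPathModifiedInTransaction) {
            return false; // cached results may miss this transaction's changes
        }
        return true;
    }

    public static void main(String[] args) {
        // Optimistic transaction, IgnoreCache false, nothing modified.
        System.out.println(usesCache(false, false, false, false));
    }
}
```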

Cache results are removed from the cache when instances of classes in a cached query's access path are touched. That is, if a query accesses data in class A, and instances of class A are modified, deleted, or inserted, then the cached query data is dropped from the cache.
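
A minimal sketch of this invalidation rule follows, assuming each cached result remembers the classes in its query's access path. The class AccessPathInvalidationSketch, its string query keys, and the classes() helper are hypothetical; only the method name notifyChangedClasses is borrowed from the QueryCache API shown later in this section.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Kodo's QueryCache.
final class AccessPathInvalidationSketch {

    // A cached result together with the classes its query read from.
    private static final class Entry {
        final Set<Class<?>> accessPath;
        final List<Object> oids;

        Entry(Set<Class<?>> accessPath, List<Object> oids) {
            this.accessPath = accessPath;
            this.oids = oids;
        }
    }

    private final Map<String, Entry> results = new HashMap<>();

    static Set<Class<?>> classes(Class<?>... cs) {
        return new HashSet<>(Arrays.asList(cs));
    }

    void put(String queryKey, Set<Class<?>> accessPath, List<Object> oids) {
        results.put(queryKey, new Entry(accessPath, oids));
    }

    boolean contains(String queryKey) {
        return results.containsKey(queryKey);
    }

    // Drop every cached result whose access path includes a changed class.
    void notifyChangedClasses(Set<Class<?>> changed) {
        results.values()
            .removeIf(e -> !Collections.disjoint(e.accessPath, changed));
    }

    public static void main(String[] args) {
        AccessPathInvalidationSketch cache = new AccessPathInvalidationSketch();
        cache.put("q1", classes(String.class), Arrays.<Object>asList(1, 2));
        cache.put("q2", classes(Integer.class), Arrays.<Object>asList(3));
        cache.notifyChangedClasses(classes(String.class));
        System.out.println(cache.contains("q1") + " " + cache.contains("q2"));
    }
}
```

Invalidation here is coarse by design: any change to a class drops every cached result that touches that class, regardless of which instances changed.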

Additionally, cache results are removed when an object in the query's result list is evicted from the data cache due to a timeout or space requirements. This is done because if a significant number of the objects in a cached query result are no longer in the data cache, it is more efficient to re-execute the query against the database than to load the missing objects from the database one at a time.

It is possible to tell the query cache that a class has been altered. This is only necessary when the changes occur via direct modification of the database outside of Kodo's control.

Example 7.5. Notifying the query cache of altered classes

import java.util.HashSet;
import java.util.Set;
import com.solarmetric.kodo.conf.Configuration;
import com.solarmetric.kodo.runtime.PersistenceManagerImpl;
import com.solarmetric.kodo.runtime.datacache.query.QueryCache;

Configuration conf = ((PersistenceManagerImpl) pm).getConfiguration ();
QueryCache cache = conf.getQueryCache ();

// collect the classes whose data was changed outside of Kodo
Set changed = new HashSet ();
changed.add (A.class);
changed.add (B.class);
cache.notifyChangedClasses (changed);
    
When using one of Kodo's distributed cache implementations, it is necessary to perform this notification in every JVM, because the change notification is not propagated automatically. When using a coherent cache implementation such as Kodo's Tangosol cache implementation, it is not necessary to do this in every JVM (although it won't hurt to do so), as the cache results are stored directly in the coherent cache.

Query results can also be manually dropped from the cache or pinned into it. To do so, you must first create a QueryKey for the query invocation in question.

Example 7.6. Dropping or pinning query results in the cache

import com.solarmetric.kodo.conf.Configuration;
import com.solarmetric.kodo.runtime.PersistenceManagerImpl;
import com.solarmetric.kodo.runtime.datacache.query.QueryCache;
import com.solarmetric.kodo.runtime.datacache.query.QueryKey;

Configuration conf = ((PersistenceManagerImpl) pm).getConfiguration ();
QueryCache cache = conf.getQueryCache ();

QueryKey key1 = new QueryKey (query, params1);
cache.pin (key1);

QueryKey key2 = new QueryKey (query, params2);
cache.remove (key2);
    

Pinning data into the cache instructs the cache to not expire the pinned results when cache flushing occurs. However, pinned results will be removed from the cache if an event occurs that invalidates the results.

Caching can be disabled on a per-PersistenceManager or per-Query basis:

Example 7.7. Temporarily disabling and enabling query caching

import com.solarmetric.kodo.runtime.PersistenceManagerImpl;

// temporarily disable query caching for all queries created from pm
PersistenceManagerImpl pmi = (PersistenceManagerImpl) pm;
pmi.setQueryCacheEnabled (false);

// re-enable caching for a particular query
Query q = pm.newQuery (A.class);
q.setQueryCacheEnabled (true);
    

Example 7.8. Disabling query caching via configuration properties

Alter the PersistenceManagerProperties property to indicate that query caching should be disabled by default:

com.solarmetric.kodo.PersistenceManagerProperties: QueryCacheEnabled=false
    

7.1.4. Kodo JDO Data Cache Configuration

As mentioned above, the datastore cache can work in conjunction with the event notification framework in order to support multi-JVM cache behavior. See the remote commit provider configuration documentation for more information on this process.

The Tangosol Coherence cache can be configured by setting the com.solarmetric.kodo.DataCacheProperties to contain the appropriate configuration properties. The Tangosol cache understands the following properties:

  • TangosolCacheName

    Default: kodo

    Description: The name of the Tangosol Coherence cache to use.

  • TangosolCacheType

    Default: distributed

    Description: The type of Tangosol Coherence cache to use. Valid values are either distributed or replicated.

To configure a PersistenceManagerFactory to use the Tangosol cache, your properties file might look like the following:

Example 7.9. Configuring a PersistenceManagerFactory to use a Tangosol cache for distributed cache needs

com.solarmetric.kodo.DataCacheClass= \
  com.solarmetric.kodo.datacache.plugins.TangosolCache
com.solarmetric.kodo.DataCacheProperties= \
  TangosolCacheName=kodo TangosolCacheType=distributed
    

Note that as of this writing, it is not possible to use a Tangosol Coherence 1.2.2 distributed cache type with Apple's 1.3.1 JVM. Use the replicated cache type instead.

7.1.5. Cache Extension

The provided data cache classes can be easily extended to add additional functionality. If you are adding new behavior, you should extend com.solarmetric.kodo.runtime.datacache.plugins.CacheImpl. If you want to implement a distributed cache that uses an unsupported method for communications, create an implementation of com.solarmetric.kodo.runtime.event.RemoteCommitProvider. This process is described in greater detail in the event notification customization documentation.

7.1.6. Important notes about the DataCache

  • The default cache implementations do not automatically refresh objects in other PersistenceManager objects when the cache is updated or invalidated. This behavior would not be compliant with the specification. An example of how to extend the JMS cache to update all non-transactional PersistenceManager objects associated with a particular cache is available in the Kodo JDO distribution samples/ directory.

  • Invoking PersistenceManager.refresh() or PersistenceManager.evict() or related methods does not result in the corresponding data being dropped from the DataCache. The DataCache assumes that it is up-to-date with respect to the data store, so it is effectively an in-memory extension of the data store. If you really want to force data out of the cache, you should use the DataCache APIs (see the DataCache JavaDoc for details), not the PersistenceManager cache control APIs.

  • A com.solarmetric.kodo.runtime.event.RemoteCommitProvider class must be specified (via the com.solarmetric.kodo.RemoteCommitProviderClass property) in order to use the com.solarmetric.kodo.runtime.datacache.plugins.CacheImpl, even when using the cache in a single-JVM mode. When using it in a single-JVM context, the property can be set to com.solarmetric.kodo.runtime.event.impl.SingleJVMRemoteCommitProvider.

7.1.7. Known issues and limitations

  • When using data store (pessimistic) transactions in concert with the distributed caching implementations, it is possible to read stale data when reading data outside a transaction.

    For example, if you have two JVMs (JVM A and JVM B) both communicating with each other, and JVM A obtains a data store lock on a particular object's underlying data, it is possible for JVM B to load the data from the cache without going to the data store, and therefore load data that should be locked. This will only happen if JVM B attempts to read data that is already in its cache during the period between when JVM A locked the data and JVM B received and processed the invalidation notification.

    This problem is impossible to solve without putting together a two-phase commit system for cache notifications, which would add significant overhead to the caching implementation. As a result, we recommend that people use optimistic locking when using data caching. If you do not, then understand that some of your non-transactional data may not be consistent with the data store.

    Note that when loading objects in a transaction, the appropriate data store locks will be obtained. So, transactional code will maintain its integrity.

  • Extents are not cached. So, if you plan on iterating over a list of all the objects in an extent on a regular basis, you will only benefit from caching if you do so with a query instead:

    Example 7.10. Use queries instead of extents

    import java.util.Collection;
    import java.util.Iterator;
    import javax.jdo.Extent;
    import javax.jdo.Query;

    Extent extent = pm.getExtent (A.class, false);

    // This iterator does not benefit from caching...
    Iterator uncachedIterator = extent.iterator ();

    // ... but this one does.
    Query extentQuery = pm.newQuery (extent);
    Iterator cachedIterator = ((Collection) extentQuery.execute ()).iterator ();