Partition Affinity

Overview

Partition affinity describes the concept of ensuring that a group of related cache entries is contained within a single cache partition. This ensures that all relevant data is managed on a single primary cache node (without compromising fault-tolerance).

Affinity may span multiple caches (as long as they are managed by the same cache service, which will generally be the case). For example, in a master-detail pattern such as "Order-LineItem", the Order object may be co-located with the entire collection of LineItem objects that are associated with it.

The benefit is two-fold. First, only a single cache node is required to manage queries and transactions against a set of related items. Second, all concurrency operations can be managed locally, avoiding the need for clustered synchronization.

A number of standard Coherence operations can benefit from affinity, including cache queries, InvocableMap operations and the getAll/putAll/removeAll methods.

Specifying Affinity

Affinity is specified in terms of a relationship to a partitioned key. In the Order-LineItem example above, the Order objects would be partitioned normally, and the LineItem objects would be associated with the appropriate Order object.

The association does not need to be tied directly to the actual parent key; it only needs to be a functional mapping of the parent key. It could be a single field of the parent key (even if it is non-unique), or an integer hash of the parent key. All that matters is that all child keys of a given parent return the same associated key; it does not matter whether the associated key is an actual key (it is simply a "group id"). This may help minimize the size impact on child key classes that do not already contain the parent key information: because the group id is derived data, its size may be decided explicitly, and it does not affect the behavior of the key.

Note that making the association too general (having too many keys associated with the same "group id") can cause a "lumpy" distribution. In the extreme case, if all child keys return the same association key regardless of what the parent key is, the child keys will all be assigned to a single partition and will not be spread across the cluster.
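The effect of the group id on placement can be illustrated with a small simulation. This is a sketch under stated assumptions, not Coherence's actual partitioning logic: the `AffinitySketch` class, the `partitionFor` helper, the partition count of 257, and the hash-modulo mapping are all hypothetical stand-ins for whatever strategy the cache service really uses. It shows that keys sharing a group id land in one partition, and that a constant group id collapses every key into a single partition.

```java
import java.util.HashSet;
import java.util.Set;

final class AffinitySketch {
    // Hypothetical partition count; a real service reads this from its configuration.
    static final int PARTITION_COUNT = 257;

    // Simplified stand-in for partition assignment: only the group id is hashed,
    // so every key that reports the same group id maps to the same partition.
    static int partitionFor(Object groupId) {
        return Math.floorMod(groupId.hashCode(), PARTITION_COUNT);
    }

    // Counts the distinct partitions occupied by the line items of orderCount orders.
    // With constantGroupId set, every line item reports the same group id (the
    // "too general" association described above).
    static int partitionsUsed(int orderCount, int itemsPerOrder, boolean constantGroupId) {
        Set<Integer> used = new HashSet<>();
        for (int order = 0; order < orderCount; order++) {
            for (int item = 0; item < itemsPerOrder; item++) {
                Object groupId = constantGroupId ? Integer.valueOf(0) : Integer.valueOf(order);
                used.add(partitionFor(groupId));
            }
        }
        return used.size();
    }
}
```

Under these assumptions, `AffinitySketch.partitionsUsed(100, 5, false)` spreads the 500 line items over one partition per order, while passing `true` places all of them in a single partition.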

There are two ways to ensure that a set of cache entries is co-located: the key class may implement KeyAssociation, or the application may supply a custom KeyAssociator. Note that association is based on the cache key, not the value (otherwise updating a cache entry could cause it to change partitions). Also note that while the Order will be co-located with its child LineItems, Coherence at present does not support composite operations that span multiple caches (e.g. updating the Order and the collection of LineItems within a single invocation request using a com.tangosol.util.InvocableMap.EntryProcessor).

For application-defined keys, the class (of the cache key) may implement com.tangosol.net.cache.KeyAssociation as follows:

import com.tangosol.net.cache.KeyAssociation;

public class LineItemId implements KeyAssociation
   {
   // {...}

   public Object getAssociatedKey()
       {
       return getOrderId();
       }

   // {...}
   }
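A fuller version of such a key class might look like the sketch below. Several parts are assumptions made for illustration: the `orderId` and `lineNumber` fields are hypothetical, and `KeyAssociationSketch` is a local stand-in that mirrors the single method of com.tangosol.net.cache.KeyAssociation so that the example compiles without Coherence on the classpath. A real key class would implement the Coherence interface instead. As with any Java map key, `equals` and `hashCode` cover the whole key, while the associated key is derived from the parent identifier alone.

```java
import java.io.Serializable;
import java.util.Objects;

// Local stand-in for com.tangosol.net.cache.KeyAssociation (assumption for this sketch).
interface KeyAssociationSketch {
    Object getAssociatedKey();
}

final class LineItemKeySketch implements KeyAssociationSketch, Serializable {
    private final long orderId;    // hypothetical parent identifier
    private final int lineNumber;  // hypothetical position within the order

    LineItemKeySketch(long orderId, int lineNumber) {
        this.orderId = orderId;
        this.lineNumber = lineNumber;
    }

    // Only the parent identifier takes part in the association, so all
    // line items of one order report the same associated key.
    @Override
    public Object getAssociatedKey() {
        return Long.valueOf(orderId);
    }

    // Key identity still uses every field; two line items of the same
    // order remain distinct keys.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof LineItemKeySketch)) {
            return false;
        }
        LineItemKeySketch that = (LineItemKeySketch) o;
        return orderId == that.orderId && lineNumber == that.lineNumber;
    }

    @Override
    public int hashCode() {
        return Objects.hash(orderId, lineNumber);
    }
}
```

Here two keys such as `new LineItemKeySketch(1234L, 1)` and `new LineItemKeySketch(1234L, 2)` are unequal as keys but return equal associated keys, which is the property the association relies on.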

Applications may also provide a custom KeyAssociator:

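import com.tangosol.net.PartitionedService;
import com.tangosol.net.partition.KeyAssociator;

public class LineItemAssociator implements KeyAssociator
    {
    public Object getAssociatedKey(Object oKey)
        {
        if (oKey instanceof LineItemId)
            {
            // a line item key is associated with the key of its parent order
            return ((LineItemId) oKey).getOrderId();
            }
        else if (oKey instanceof OrderId)
            {
            // an order key is its own association key
            return oKey;
            }
        else
            {
            // no association for any other key type
            return null;
            }
        }

    public void init(PartitionedService service)
        {
        // no initialization required
        }
    }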

The key associator may be configured for a NamedCache in the associated distributed-scheme element:

<distributed-scheme>
    <!-- ... -->
    <key-associator>LineItemAssociator</key-associator>
</distributed-scheme>

The following example uses affinity for an efficient query (NamedCache.keySet(Filter)) and efficient cache access (NamedCache.getAll(Collection)):

OrderId orderId = new OrderId(1234);

// this Filter will be applied to all LineItem objects in order to fetch those
// for which getOrderId() returns the specified order identifier
// "select * from LineItem where OrderId = :orderId"
Filter filterEq = new EqualsFilter("getOrderId", orderId);

// this Filter will direct the query to the cluster node that currently owns
// the Order object with the given identifier
Filter filterAsc = new KeyAssociatedFilter(filterEq, orderId);

// run the optimized query to get the LineItemId keys
Set setLineItemKeys = cacheLineItems.keySet(filterAsc);

// get all the LineItem objects at once (getAll returns a Map of key to value)
Map mapLineItems = cacheLineItems.getAll(setLineItemKeys);

// or remove all at once
cacheLineItems.keySet().removeAll(setLineItemKeys);