Oracle® Coherence Developer's Guide
Release 3.7.1
Part Number E22837-01

28 Working with Partitions

This chapter provides instructions for using data affinity to ensure data is located in specific partitions and also includes instructions for changing the default partition distribution strategy.

The following sections are included in this chapter:

Specifying Data Affinity

Changing the Partition Distribution Strategy

28.1 Specifying Data Affinity

Data affinity describes the concept of ensuring that a group of related cache entries is contained within a single cache partition. This guarantees that all relevant data is managed on a single primary cache node (without compromising fault tolerance).

Affinity may span multiple caches (if they are managed by the same cache service, which generally is the case). For example, in a master-detail pattern such as an Order-LineItem, the Order object may be co-located with the entire collection of LineItem objects that are associated with it.

There are two benefits to using data affinity. First, only a single cache node is required to manage queries and transactions against a set of related items. Second, all concurrency operations are managed locally, avoiding the need for clustered synchronization.

Several standard Coherence operations can benefit from affinity, including cache queries, InvocableMap operations and the getAll, putAll, and removeAll methods.

Note:

Data affinity is specified in terms of entry keys (not values). As a result, the association information must be present in the key class. Similarly, the association logic applies to the key class, not the value class.

Affinity is specified in terms of a relationship to a partitioned key. In the Order-LineItem example above, the Order objects would be partitioned normally, and the LineItem objects would be associated with the appropriate Order object.

The association does not have to be directly tied to the actual parent key; it only must be a functional mapping of the parent key. It could be a single field of the parent key (even if that field is non-unique), or an integer hash of the parent key. All that matters is that all child keys return the same associated key; the associated key does not need to be an actual key in any cache (it is simply a "group id"). This fact can help minimize the size impact on child key classes that do not contain the parent key information: because the associated key is derived data, its size can be chosen explicitly, and it does not affect the behavior of the key.

Note that making the association too general (that is, having too many keys associated with the same "group id") can cause a "lumpy" distribution. For example, if all child keys return the same association key regardless of the parent key, then all child keys are assigned to a single partition and are not spread across the cluster.
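For example, a child key might carry only an integer hash of its parent key and return that hash as its "group id". The following is a hypothetical sketch (the ChildId class and its fields are illustrative, not part of the product) using the KeyAssociation interface described below:

import com.tangosol.net.cache.KeyAssociation;

// hypothetical child key that does not store the full parent key:
// it keeps only an integer hash derived from the parent key and
// returns that hash as the association "group id"
public class ChildId implements KeyAssociation
    {
    private final int m_nChildId;
    private final int m_nParentHash;

    public ChildId(int nChildId, Object oParentKey)
        {
        m_nChildId    = nChildId;
        m_nParentHash = oParentKey.hashCode();
        }

    // all children of the same parent return the same Integer,
    // so they are all assigned to the same partition
    public Object getAssociatedKey()
        {
        return Integer.valueOf(m_nParentHash);
        }

    // equals() and hashCode() omitted for brevity
    }

Note that for the parent entry itself to share that partition, the parent key must map to the same group id as well, for example through a KeyAssociator that applies the same hash to both parent and child keys.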

There are two ways to ensure that a set of cache entries is co-located, as described in the following sections. Note that association is based on the cache key, not the value (otherwise, updating a cache entry could cause it to change partitions). Also, note that while the Order is co-located with its child LineItems, Coherence does not currently support composite operations that span multiple caches (for example, updating both the Order and the collection of LineItems within a single com.tangosol.util.InvocableMap.EntryProcessor invocation request).

28.1.1 Specifying Data Affinity with a KeyAssociation

For application-defined keys, the class (of the cache key) may implement com.tangosol.net.cache.KeyAssociation as follows:

Example 28-1 Creating a Key Association

import com.tangosol.net.cache.KeyAssociation;

public class LineItemId implements KeyAssociation
   {
   private OrderId m_orderId;   // key of the parent Order

   public OrderId getOrderId()
       {
       return m_orderId;
       }

   // return the parent OrderId so that this LineItemId is assigned
   // to the same partition as its parent Order
   public Object getAssociatedKey()
       {
       return getOrderId();
       }

   // constructors, equals(), hashCode(), and serialization omitted
   }
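For reference, the examples in this section assume a parent key class along the following lines. This is a hypothetical sketch; any serializable key class with consistent equals and hashCode semantics works:

public class OrderId implements java.io.Serializable
    {
    private final int m_nOrderId;

    public OrderId(int nOrderId)
        {
        m_nOrderId = nOrderId;
        }

    public boolean equals(Object o)
        {
        return o instanceof OrderId
            && ((OrderId) o).m_nOrderId == m_nOrderId;
        }

    public int hashCode()
        {
        return m_nOrderId;
        }
    }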

28.1.2 Specifying Data Affinity with a KeyAssociator

Applications may also provide a custom KeyAssociator:

Example 28-2 A Custom KeyAssociator

import com.tangosol.net.PartitionedService;
import com.tangosol.net.partition.KeyAssociator;

public class LineItemAssociator implements KeyAssociator
    {
    public Object getAssociatedKey(Object oKey)
        {
        if (oKey instanceof LineItemId)
            {
            return ((LineItemId) oKey).getOrderId();
            }
        else if (oKey instanceof OrderId)
            {
            return oKey;
            }
        else
            {
            return null;
            }
        }

    public void init(PartitionedService service)
        {
        // no initialization required for this associator
        }
    }

The key associator may be configured for a NamedCache in the associated <distributed-scheme> element:

Example 28-3 Configuring a Key Associator

<distributed-scheme>
    <!-- ... -->
    <key-associator>
        <class-name>LineItemAssociator</class-name>
    </key-associator>
</distributed-scheme>
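Because affinity can only span caches that are managed by the same cache service, both the Order and LineItem caches must map to the same distributed scheme. The following cache configuration sketch shows one way to arrange this; the cache names and service name are hypothetical:

<cache-config xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config">
   <caching-scheme-mapping>
      <cache-mapping>
         <cache-name>orders</cache-name>
         <scheme-name>order-scheme</scheme-name>
      </cache-mapping>
      <cache-mapping>
         <cache-name>line-items</cache-name>
         <scheme-name>order-scheme</scheme-name>
      </cache-mapping>
   </caching-scheme-mapping>
   <caching-schemes>
      <distributed-scheme>
         <scheme-name>order-scheme</scheme-name>
         <service-name>OrderCacheService</service-name>
         <key-associator>
            <class-name>LineItemAssociator</class-name>
         </key-associator>
         <backing-map-scheme>
            <local-scheme/>
         </backing-map-scheme>
         <autostart>true</autostart>
      </distributed-scheme>
   </caching-schemes>
</cache-config>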

28.1.3 Example of Using Affinity

Example 28-4 illustrates how to use affinity to create a more efficient query (NamedCache.keySet(Filter)) and cache access (NamedCache.getAll(Collection)).

Example 28-4 Using Affinity for a More Efficient Query

import com.tangosol.util.Filter;
import com.tangosol.util.filter.EqualsFilter;
import com.tangosol.util.filter.KeyAssociatedFilter;

import java.util.Set;

OrderId orderId = new OrderId(1234);

// this Filter is applied to all LineItem objects to fetch those
// for which getOrderId() returns the specified order identifier
// "select * from LineItem where OrderId = :orderId"
Filter filterEq = new EqualsFilter("getOrderId", orderId);

// this Filter directs the query to the cluster node that currently owns
// the Order object with the given identifier
Filter filterAsc = new KeyAssociatedFilter(filterEq, orderId);

// run the optimized query to get the LineItem keys
Set setLineItemKeys = cacheLineItems.keySet(filterAsc);

// get all the LineItem objects immediately
Set setLineItems = cacheLineItems.getAll(setLineItemKeys);

// or remove them all immediately
cacheLineItems.keySet().removeAll(setLineItemKeys);
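Affinity benefits InvocableMap operations in the same way. Continuing from Example 28-4, the key-associated filter can direct an entry processor to the single member that owns the entire order. The following is a sketch; the setStatus property of the LineItem value class is hypothetical:

import com.tangosol.util.processor.UpdaterProcessor;

// update every LineItem of the order in one invocation request; because
// all of the entries reside in one partition, the processor executes on
// a single cluster member without any cross-node coordination
cacheLineItems.invokeAll(filterAsc, new UpdaterProcessor("setStatus", "SHIPPED"));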

28.2 Changing the Partition Distribution Strategy

Partition distribution defines how partitions are assigned to storage-enabled cluster members. There are two methods of distribution available:

Autonomous distribution: distribution decisions are made independently by each storage-enabled cluster member.

Centralized distribution: distribution decisions are made centrally for the whole service by a single cluster member, based on a pluggable partition assignment strategy.

The autonomous distribution strategy is enabled by default. To use a centralized distribution strategy, an assignment strategy implementation must be provided. The assignment strategy must implement the com.tangosol.net.partition.PartitionAssignmentStrategy interface. Coherence includes a default assignment strategy implementation, and custom assignment strategies can be written and configured as required.

28.2.1 Specifying the Simple Partition Assignment Strategy

The simple partition assignment strategy is a centralized distribution method that uses a partition-balancing algorithm similar to the one used by the default autonomous distribution. The simple strategy attempts to distribute a fair-share number of partitions to each storage-enabled server while maintaining (as much as the service membership topology allows) machine, rack, or site safety. The assignment strategy is implemented in the com.tangosol.net.partition.SimpleAssignmentStrategy class, which can be extended as required.

To configure the simple partition assignment strategy for a specific partitioned cache service, add a <partition-assignment-strategy> element within a distributed cache definition and include the fully qualified class name within the <instance> element. For example:

<distributed-scheme>
   ...        
   <partition-assignment-strategy>
      <instance>
         <class-name>com.tangosol.net.partition.SimpleAssignmentStrategy
         </class-name>
      </instance>
   </partition-assignment-strategy>
   ...
</distributed-scheme>

To configure the partition assignment strategy for all instances of the distributed cache service type, override the partitioned cache service's partition-assignment-strategy initialization parameter in an operational override file. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <services>
         <service id="3">
            <init-params>
               <init-param id="21">
                  <param-name>partition-assignment-strategy</param-name>
                  <param-value>com.tangosol.net.partition.SimpleAssignmentStrategy
                  </param-value>
               </init-param>
            </init-params>
         </service>
      </services>
   </cluster-config>
</coherence>

28.2.2 Enabling a Custom Partition Assignment Strategy

To specify a custom partition assignment strategy, include an <instance> subelement within the <partition-assignment-strategy> element and provide a fully qualified class name that implements the com.tangosol.net.partition.PartitionAssignmentStrategy interface. See "instance" for detailed instructions on using the <instance> element. The following example enables a partition assignment strategy that is implemented in the MyPAStrategy class.

<distributed-scheme>
   ...        
   <partition-assignment-strategy>
      <instance>
         <class-name>package.MyPAStrategy</class-name>
      </instance>
   </partition-assignment-strategy>
   ...
</distributed-scheme>
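A simple way to satisfy the interface is to extend the bundled SimpleAssignmentStrategy and override only the behavior that must differ. The following sketch uses the placeholder names from the example above; the package name is hypothetical:

package com.example;

import com.tangosol.net.partition.SimpleAssignmentStrategy;

// inherits the complete PartitionAssignmentStrategy implementation from
// SimpleAssignmentStrategy; a public no-argument constructor allows
// Coherence to instantiate the class by name from the configuration
public class MyPAStrategy extends SimpleAssignmentStrategy
    {
    }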

As an alternative, the <instance> element supports the use of a <class-factory-name> element to use a factory class that is responsible for creating PartitionAssignmentStrategy instances, and a <method-name> element to specify the static factory method on the factory class that performs object instantiation. The following example gets a strategy instance using the getStrategy method on the MyPAStrategyFactory class.

<distributed-scheme>
   ...        
   <partition-assignment-strategy>
      <instance>
         <class-factory-name>package.MyPAStrategyFactory</class-factory-name>
         <method-name>getStrategy</method-name>
      </instance>
   </partition-assignment-strategy>
   ...
</distributed-scheme>
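A matching factory class might look like the following sketch (hypothetical; the only requirement is that the static method named in the <method-name> element returns a PartitionAssignmentStrategy instance):

package com.example;

import com.tangosol.net.partition.PartitionAssignmentStrategy;

public class MyPAStrategyFactory
    {
    // static factory method named in the <method-name> element; it may
    // construct and configure the strategy in any way before returning it
    public static PartitionAssignmentStrategy getStrategy()
        {
        return new MyPAStrategy();
        }
    }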

Any initialization parameters that are required for an implementation can be specified using the <init-params> element. The following example sets the iMaxTime parameter to 2000.

<distributed-scheme>
   ...        
   <partition-assignment-strategy>
      <instance>
         <class-name>package.MyPAStrategy</class-name>
         <init-params>
            <init-param>
               <param-name>iMaxTime</param-name>
               <param-value>2000</param-value>
            </init-param>
         </init-params>
      </instance>
   </partition-assignment-strategy>
   ...
</distributed-scheme>