Priority Tasks

Coherence Priority Tasks give applications with critical response time requirements better control over the execution of processes within Coherence. Execution and request timeouts can be configured to limit wait time for long-running threads. In addition, a custom task API allows applications to control queue processing. Note that these features should be used with extreme caution because they can dramatically affect the performance and throughput of the data grid.

Priority Tasks - Timeouts

Overview

Care should be taken when configuring Coherence task execution timeouts, especially for Coherence applications that pre-date this feature and thus do not handle timeout exceptions. If a write-through in a CacheStore is blocked (for instance, if a database query is hung) and exceeds the configured timeout value, the Coherence Task Manager will attempt to interrupt the execution of the thread and an exception will be thrown. In a similar fashion, queries or aggregations that exceed configured timeouts will be interrupted and an exception will be thrown. Applications that use this feature should make sure that they handle these exceptions correctly to ensure system integrity. Because this configuration is done on a service-by-service basis, changing these settings on existing caches or services that were not designed with this feature in mind should be done with great care.
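The interrupt-and-exception behavior described above can be illustrated without Coherence. The following is a minimal sketch that uses only java.util.concurrent to mimic a blocked write being interrupted after a deadline; it is not the Coherence Task Manager, and the names InterruptibleWriteSketch, runWithTimeout, and blockedWrite are hypothetical. The point is that the blocked code has to respond to the interrupt and leave its state consistent.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class InterruptibleWriteSketch {

    /** Possible outcomes of a bounded write attempt. */
    enum Outcome { COMPLETED, TIMED_OUT, FAILED }

    /**
     * Runs slowWrite on a worker thread and interrupts it if it exceeds
     * timeoutMillis. This only mimics the server-side behavior with JDK
     * classes; it does not call any Coherence API.
     */
    static Outcome runWithTimeout(Runnable slowWrite, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> future = pool.submit(slowWrite);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.COMPLETED;
        } catch (TimeoutException e) {
            future.cancel(true); // delivers Thread.interrupt() to the worker
            return Outcome.TIMED_OUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return Outcome.FAILED;
        } catch (ExecutionException e) {
            return Outcome.FAILED;
        } finally {
            pool.shutdownNow();
        }
    }

    /** Simulates a blocked database call that responds to interruption. */
    static Runnable blockedWrite(long blockMillis) {
        return () -> {
            try {
                Thread.sleep(blockMillis);
            } catch (InterruptedException e) {
                // Roll back or release resources here, then restore the flag.
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(blockedWrite(10), 1000));  // COMPLETED
        System.out.println(runWithTimeout(blockedWrite(5000), 100)); // TIMED_OUT
    }
}
```

Code that swallows the interrupt and keeps blocking would not stop in this sketch either, which is why existing CacheStore implementations should be reviewed before timeouts are enabled.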

Configuration

When configuring execution timeouts, three values need to be considered: request-timeout, task-timeout, and task-hung-threshold (see the reference below). The request-timeout is the amount of time the client will wait for a request to return. The task-timeout is the amount of time that the server will allow the thread to execute before interrupting execution. The task-hung-threshold is the amount of time that a thread can execute before the server reports the thread as "hung." The "hung" status is for reporting purposes only. These timeout settings are in milliseconds and are configured in the coherence-cache-config.xml or by using command line parameters. See the reference and examples below.

Reference

Setting	Description
`<task-hung-threshold>`	Specifies the amount of time in milliseconds that a task can execute before it is considered "hung". Note: A posted task that has not yet started is never considered hung. This attribute is applied only if the thread pool is used (the "thread-count" value is positive).
`<task-timeout>`	Specifies the default timeout value for tasks that can be timed out (e.g. implement the `PriorityTask` interface), but don't explicitly specify the task execution timeout value. The task execution time is measured on the server side and does not include the time spent waiting in a service backlog queue before being started. This attribute is applied only if the thread pool is used (the "thread-count" value is positive).
`<request-timeout>`	Specifies the default timeout value for requests that can time-out (e.g. implement the `PriorityTask` interface), but don't explicitly specify the request timeout value. The request time is measured on the client side as the time elapsed from the moment a request is sent for execution to the corresponding server node(s) and includes the following: (1) the time it takes to deliver the request to an executing node (server). (2) the interval between the time the task is received and placed into a service queue until the execution starts. (3) the task execution time. (4) the time it takes to deliver a result back to the client.

Examples

To set the distributed cache thread count to 7 with a task timeout of 5000 milliseconds and a task hung threshold of 10000 milliseconds, add the following to the coherence-cache-config.xml for the node:

<caching-schemes>
    <distributed-scheme>
      <scheme-name>example-distributed</scheme-name>
      <service-name>DistributedCache</service-name>
      <thread-count>7</thread-count>
      <task-timeout>5000ms</task-timeout>
      <task-hung-threshold>10000ms</task-hung-threshold>
    </distributed-scheme>
</caching-schemes>

To set the client request timeout to 15000 milliseconds (15 seconds):

    <distributed-scheme>
      <scheme-name>example-distributed</scheme-name>
      <service-name>DistributedCache</service-name>
      <request-timeout>15000ms</request-timeout>
    </distributed-scheme>

Note: The request-timeout should always be longer than the task-hung-threshold or the task-timeout.
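The relationship in the note above can be checked before a configuration is deployed. The following is a minimal sketch; the class and method names are hypothetical, and it reads the note as requiring the request timeout to exceed both server-side values, which is an interpretation rather than a documented rule.

```java
final class TimeoutSettingsCheck {

    /**
     * Returns true when the client-side request timeout is longer than both
     * server-side values. All values are in milliseconds.
     */
    static boolean isConsistent(long requestTimeout, long taskTimeout, long taskHungThreshold) {
        return requestTimeout > taskTimeout && requestTimeout > taskHungThreshold;
    }

    public static void main(String[] args) {
        // Values taken from the configuration examples above.
        System.out.println(isConsistent(15000, 5000, 10000)); // true
        // A 15 ms request timeout would expire long before the task could time out.
        System.out.println(isConsistent(15, 5000, 10000));    // false
    }
}
```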

Command Line options

The command line options set the node-wide default for each service type (distributed cache, invocation, proxy, and so on).

Option	Description
tangosol.coherence.replicated.request.timeout	The default client request timeout for the Replicated cache service
tangosol.coherence.optimistic.request.timeout	The default client request timeout for the Optimistic cache service
tangosol.coherence.distributed.request.timeout	The default client request timeout for distributed cache services
tangosol.coherence.distributed.task.timeout	The default server execution timeout for distributed cache services
tangosol.coherence.distributed.task.hung	The default time before a thread is reported as hung by distributed cache services
tangosol.coherence.invocation.request.timeout	The default client request timeout for invocation services
tangosol.coherence.invocation.task.timeout	The default server execution timeout for invocation services
tangosol.coherence.invocation.task.hung	The default time before a thread is reported as hung by invocation services
tangosol.coherence.proxy.request.timeout	The default client request timeout for proxy services
tangosol.coherence.proxy.task.timeout	The default server execution timeout for proxy services
tangosol.coherence.proxy.task.hung	The default time before a thread is reported as hung by proxy services
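As an illustration of how the options above might be supplied, the following sketch builds JVM arguments from property/value pairs. It assumes the options are passed as Java system properties using the -D flag; the helper class TimeoutJvmArgs is hypothetical and the values are example milliseconds, not recommendations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TimeoutJvmArgs {

    /** Turns property/value pairs into -Dname=value arguments, preserving order. */
    static List<String> toJvmArgs(Map<String, String> props) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            args.add("-D" + e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] argv) {
        Map<String, String> props = new LinkedHashMap<>();
        // Property names come from the table above; values are examples only.
        props.put("tangosol.coherence.distributed.request.timeout", "15000");
        props.put("tangosol.coherence.distributed.task.timeout", "5000");
        props.put("tangosol.coherence.distributed.task.hung", "10000");
        // Prints the arguments on one line, ready to paste after "java".
        System.out.println(String.join(" ", toJvmArgs(props)));
    }
}
```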

Priority Task Execution - Custom Objects

Overview

The PriorityTask interface allows an application to control the order in which a service schedules tasks for execution using a thread pool, and to hold their execution time to a specified limit. Instances of PriorityTask typically also implement either the Invocable or the Runnable interface. Priority task execution is only relevant when a task backlog exists.

Execution Type

SCHEDULE_STANDARD - a task will be scheduled for execution in a natural (based on the request arrival time) order;
SCHEDULE_FIRST - a task will be scheduled in front of any equal or lower scheduling priority tasks and executed as soon as any worker thread becomes available;
SCHEDULE_IMMEDIATE - a task will be immediately executed by any idle worker thread; if all of them are active, a new thread will be created to execute this task.
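The effect of the three execution types on a backlog can be sketched with a simple queue model. This is an illustration of the ordering rules as worded above, not the service's actual scheduler; the Priority enum and the class SchedulingOrderSketch are hypothetical stand-ins, and all worker threads are assumed to be busy while tasks arrive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class SchedulingOrderSketch {

    // Hypothetical stand-ins for the PriorityTask scheduling constants.
    enum Priority { STANDARD, FIRST, IMMEDIATE }

    private final Deque<String> backlog = new ArrayDeque<>();
    private final List<String> executed = new ArrayList<>();

    /** Accepts a task while all worker threads are busy. */
    void submit(String name, Priority priority) {
        switch (priority) {
            case IMMEDIATE:
                executed.add(name);     // runs at once, on a new thread if needed
                break;
            case FIRST:
                backlog.addFirst(name); // goes in front of everything queued
                break;
            default:
                backlog.addLast(name);  // natural arrival order
        }
    }

    /** Simulates workers becoming free and draining the backlog in order. */
    List<String> drain() {
        while (!backlog.isEmpty()) {
            executed.add(backlog.pollFirst());
        }
        return executed;
    }

    public static void main(String[] args) {
        SchedulingOrderSketch s = new SchedulingOrderSketch();
        s.submit("a", Priority.STANDARD);
        s.submit("b", Priority.STANDARD);
        s.submit("c", Priority.FIRST);
        s.submit("d", Priority.IMMEDIATE);
        System.out.println(s.drain()); // [d, c, a, b]
    }
}
```

With an empty backlog all three types behave the same in this model, which matches the statement that priority task execution only matters when a backlog exists.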

Developing

Coherence provides four classes to aid developers in creating priority task objects:

PriorityProcessor can be extended to create a custom entry processor.
PriorityFilter can be extended to create a custom priority filter.
PriorityAggregator can be extended to create a custom aggregation.
PriorityTask can be implemented to create a priority invocation class.

After extending one of these classes, the developer needs to implement several methods. The return values for getRequestTimeoutMillis, getExecutionTimeoutMillis, and getSchedulingPriority should be stored on a class-by-class basis in your application configuration parameters.

Method	Description
public long getRequestTimeoutMillis()	Obtain the maximum amount of time a calling thread is willing to wait for a result of the request execution. The request time is measured on the client side as the time elapsed from the moment a request is sent for execution to the corresponding server node(s) and includes: the time it takes to deliver the request to the executing node(s); the interval between the time the task is received and placed into a service queue until the execution starts; the task execution time; the time it takes to deliver a result back to the client. The value of TIMEOUT_DEFAULT indicates a default timeout value configured for the corresponding service; the value of TIMEOUT_NONE indicates that the client thread is willing to wait indefinitely until the task execution completes or is canceled by the service due to a task execution timeout specified by the getExecutionTimeoutMillis() value.
public long getExecutionTimeoutMillis()	Obtain the maximum amount of time this task is allowed to run before the corresponding service will attempt to stop it. The value of TIMEOUT_DEFAULT indicates a default timeout value configured for the corresponding service; the value of TIMEOUT_NONE indicates that this task can execute indefinitely. If, by the time the specified amount of time passed, the task has not finished, the service will attempt to stop the execution by using the Thread.interrupt() method. In the case that interrupting the thread does not result in the task's termination, the runCanceled method will be called.
public int getSchedulingPriority()	Obtain this task's scheduling priority. Valid values are SCHEDULE_STANDARD, SCHEDULE_FIRST, and SCHEDULE_IMMEDIATE.
public void runCanceled(boolean fAbandoned)	This method will be called if and only if all attempts to interrupt this task were unsuccessful in stopping the execution or if the execution was canceled before it had a chance to run at all. Since this method is usually called on a service thread, implementors must exercise extreme caution since any delay introduced by the implementation will cause a delay of the corresponding service.
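The shape of an implementation of these four methods can be sketched as follows. To stay self-contained, the example declares a local stand-in interface, TaskContract, that mirrors the signatures in the table above; it is not the Coherence PriorityTask type, and the constant values shown are assumptions made for illustration. A real implementation would implement the Coherence interface instead and read its limits from application configuration.

```java
final class PriorityTaskSketch {

    /** Local stand-in mirroring the four methods in the table; not the Coherence type. */
    interface TaskContract {
        long TIMEOUT_DEFAULT = 0L;  // assumed value, for illustration only
        int SCHEDULE_STANDARD = 0;  // assumed value, for illustration only
        int SCHEDULE_FIRST = 1;     // assumed value, for illustration only

        long getRequestTimeoutMillis();
        long getExecutionTimeoutMillis();
        int getSchedulingPriority();
        void runCanceled(boolean fAbandoned);
    }

    /** A task whose limits are supplied by the application's configuration. */
    static final class ConfiguredTask implements TaskContract, Runnable {
        private final long requestTimeoutMillis;
        private final long executionTimeoutMillis;
        private final int schedulingPriority;
        private volatile boolean canceled;

        ConfiguredTask(long requestTimeoutMillis, long executionTimeoutMillis, int schedulingPriority) {
            this.requestTimeoutMillis = requestTimeoutMillis;
            this.executionTimeoutMillis = executionTimeoutMillis;
            this.schedulingPriority = schedulingPriority;
        }

        @Override public void run() {
            // The task's actual work would go here.
        }

        @Override public long getRequestTimeoutMillis() { return requestTimeoutMillis; }

        @Override public long getExecutionTimeoutMillis() { return executionTimeoutMillis; }

        @Override public int getSchedulingPriority() { return schedulingPriority; }

        @Override public void runCanceled(boolean fAbandoned) {
            // Keep this trivial: it may run on a service thread, so only set a flag.
            canceled = true;
        }

        boolean isCanceled() { return canceled; }
    }

    public static void main(String[] args) {
        // Request timeout of 15 s, service default for execution, scheduled first.
        ConfiguredTask task = new ConfiguredTask(
                15000L, TaskContract.TIMEOUT_DEFAULT, TaskContract.SCHEDULE_FIRST);
        task.run();
        task.runCanceled(false);
        System.out.println(task.getRequestTimeoutMillis() + " " + task.isCanceled());
    }
}
```

Passing the three values through the constructor keeps them out of the class body, in line with the advice to hold them in configuration on a class-by-class basis.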

Errors

When a task timeout occurs, the node receives a RequestTimeoutException, for example:

com.tangosol.net.RequestTimeoutException: Request timed out after 4015 millis
	at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.checkRequestTimeout(Service.CDB:8)
	at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.poll(Service.CDB:52)
	at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.poll(Service.CDB:18)
	at com.tangosol.coherence.component.util.daemon.queueProcessor.service.InvocationService.query(InvocationService.CDB:17)
	at com.tangosol.coherence.component.util.safeService.SafeInvocationService.query(SafeInvocationService.CDB:1)