17 Tuning Oracle Human Workflow

This chapter describes how to tune Oracle Human Workflow for optimal performance.

This chapter contains the following sections:

Section 17.1, "About Oracle Human Workflow"
Section 17.2, "Tuning Human Workflow"
Section 17.3, "Using Other Tuning Strategies"

17.1 About Oracle Human Workflow

Oracle Human Workflow is a service engine running in Oracle SOA Service Infrastructure that allows the execution of interactive human driven processes. A human workflow provides the human interaction support such as approve, reject, and reassign actions within a process or outside of any process. The Human Workflow service consists of a number of services that handle various aspects of human interaction with a business process.

For more information, see "Using the Human Workflow Service Component" in Developing SOA Applications with Oracle SOA Suite.

See also the Oracle Human Workflow web site at http://www.oracle.com/technology/products/soa/hw/index.html.

17.2 Tuning Human Workflow

This section discusses how to optimize taskflow. The following suggestions are all applicable to API usage.

Table 17-1 Essential Human Workflow Tuning Strategies

Name	Description	Recommendation
Minimize Client Response Time	Since workflow client applications are interactive, it is important to have good response time at the client. Some of the factors that affect the response time include service call performance impacts, querying time to determine the set of qualifying tasks for the request, and the amount of additional information to be retrieved for each qualifying task.	Review your performance metrics to determine how response time can be improved.
Choose the Right Workflow Service Client	In most cases, the remote client is the best option in terms of performance. If the client runs in the same JVM as the workflow services (the soa-infra application), the API calls are optimized so that no remote method invocation (RMI) is involved. If the client runs in a different JVM, RMI is used, which can impact performance because data passed to and from the API methods must be serialized and deserialized. The SOAP client is preferred for standardization because it is based on web services, but it carries additional performance costs compared to the RMI used by the remote client: the web services technology stack performs extra processing to marshal and unmarshal API method arguments to and from XML.	If the client application is based on Java EE technology, choose the client based on your use case scenarios. If the client application is based on .NET technologies, only the SOAP workflow services can be used.
Narrow Qualifying Tasks Using Precise Filters	When a task list is retrieved, the query should be as precise as possible so the maximum filtering can be done at the database level.	Use precise filters to improve response time.
Retrieve Subset of Qualifying Tasks (Paging)	The query API has paging parameters that control the start row and the number of qualifying rows returned to the user.	Set the `startRow` and `endRow` parameters to a narrow range that limits the number of returned records. This decreases the query time, the application processing time, and the amount of data returned to the client.
Fetch Only the Information That Is Needed for a Qualifying Task	Typically only some of the payload fields are needed for displaying the task list.	When using the `queryTask` service, consider reducing the amount of optional information retrieved for each task returned in the list. In the rare cases where the entire payload is needed, the payload information can be requested.
Reduce the Number of Return Query Columns	When using the `queryTask` service, consider reducing the number of query columns to improve the SQL time.	Try using the common columns as they are the most likely indexed columns. This allows the SQL to execute faster.
Use the Aggregate API for Charting Task Statistics	Sometimes it is necessary to display charts or statistics to summarize task information.	Consider using the new aggregate APIs to compute the statistics at the database level rather than fetching all the tasks using the query API and computing the statistics at the client layer.
Use the Count API Methods for Counting the Number of Tasks	Sometimes it is only necessary to count how many tasks exist that match certain criteria.	Call the `countTasks` API method, which returns only the number of matching tasks.
Create Indexes On Demand for Flexfields	The workflow schema table WFTASK contains several flexfield attribute columns that can be used for storing task payload values in the workflow schema. Because there are numerous columns, and their use is optional, the installed schema does not contain indexes for these columns.	Create indexes on these columns when specific mapped flexfield columns are frequently used in query predicates.
Use the `doesTaskExist` Method	Sometimes it is necessary to check whether a task exists that matches particular query criteria.	Consider using `doesTaskExist` instead of the default of `countTasks`. The `doesTaskExist` method performs an optimized query that simply checks if any rows exist that match the specified criteria. This method may achieve better results than calling the `countTasks` method.
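
The query-related strategies in Table 17-1 share one idea: let the database do the filtering, paging, counting, and aggregating, and return only what the client needs. The following minimal sketch illustrates that idea with Python's built-in sqlite3 module and a small in-memory table. The `tasks` table, its columns, the user names, and the helper functions are hypothetical stand-ins chosen for illustration; they are not the WFTASK schema or the workflow query API.

```python
import sqlite3

# Hypothetical stand-in for a task table; NOT the real WFTASK schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    " task_id INTEGER PRIMARY KEY, assignee TEXT, state TEXT,"
    " priority INTEGER, title TEXT, payload TEXT)"
)
rows = [
    (i, "jcooper" if i % 2 else "jstein",
     "ASSIGNED" if i % 3 else "COMPLETED",
     1 + i % 5, f"Task {i}", "x" * 1000)
    for i in range(1, 101)
]
conn.executemany("INSERT INTO tasks VALUES (?, ?, ?, ?, ?, ?)", rows)


def fetch_page(assignee, state, start_row, end_row):
    """Precise filter, paging, and only the columns a list view needs."""
    sql = (
        "SELECT task_id, title, priority FROM tasks"
        " WHERE assignee = ? AND state = ?"
        " ORDER BY task_id LIMIT ? OFFSET ?"
    )
    # start_row and end_row are 1-based and inclusive in this sketch.
    limit = end_row - start_row + 1
    return conn.execute(sql, (assignee, state, limit, start_row - 1)).fetchall()


def count_tasks(assignee, state):
    """Count in the database instead of fetching rows just to count them."""
    sql = "SELECT COUNT(*) FROM tasks WHERE assignee = ? AND state = ?"
    return conn.execute(sql, (assignee, state)).fetchone()[0]


def does_task_exist(assignee, state):
    """Existence check that can stop at the first matching row."""
    sql = "SELECT EXISTS(SELECT 1 FROM tasks WHERE assignee = ? AND state = ?)"
    return bool(conn.execute(sql, (assignee, state)).fetchone()[0])


def tasks_per_priority(assignee):
    """Aggregate in the database for charts and statistics."""
    sql = (
        "SELECT priority, COUNT(*) FROM tasks"
        " WHERE assignee = ? GROUP BY priority ORDER BY priority"
    )
    return dict(conn.execute(sql, (assignee,)).fetchall())


first_page = fetch_page("jcooper", "ASSIGNED", 1, 10)
```

Note that `fetch_page` never selects the large `payload` column, and that the count, existence, and aggregate helpers each return a single small result regardless of how many tasks match.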

17.3 Using Other Tuning Strategies

Once you have applied the strategies listed in the previous section, you can consider using the following strategies to further improve performance.

17.3.1 Improving Server Performance

Server performance essentially determines the scalability of the system under heavily loaded conditions. In Section 17.2, "Tuning Human Workflow", the strategy "Minimize Client Response Time" lists several ways in which client response times can be minimized by fetching the right amount of information and reducing the potential performance impact associated with querying. These techniques also reduce the database and service logic performance impacts on the server and can improve server performance. In addition, a few other configuration changes can be made to improve server performance:

Table 17-2 Essential Server Performance Tuning Strategies

Name	Description	Recommendation
Archive Completed Instances Periodically	The database scalability of a system is largely dependent on the amount of data in the system. Since business processes and workflows are temporal in nature, once they are processed, they are not queried frequently.	Consider using an archival scheme to periodically move completed instances to another system that can be used to query historical data. Archival should be done carefully to avoid orphan task instances.
Select the Appropriate Workflow Callback Functionality	The workflow callback functionality can be used to query or update external systems after any significant workflow event, such as assignment or task completion.	Ensure that there are sufficient resources to update the external system after the task is completed instead of after every workflow event. If a callback cannot be avoided, then consider using a Java callback instead of a BPEL callback. Java callbacks do not have the performance impact associated with a BPEL callback since the callback method is executed in the same thread.
Minimize Performance Impacts from Notification	Notifications are useful for alerting users that they have a task to execute. In environments where most approvals happen through email, actionable notifications are especially useful. This also implies that there is not much load in terms of worklist usage.	Minimize the notification to alert a user only when a task is assigned instead of sending out notifications for each workflow event. Also consider making the notifications secure, in which case only a link to the task is sent in the notification and not the task content itself.
Deploy Clustered Nodes	All workflow instances and state information are stored in the dehydration database. Workflow services are stateless which means they can be used concurrently on a cluster of nodes.	When performance is critical and a highly scalable system is needed, a clustered environment can be used for supporting workflow.
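
The archival recommendation in Table 17-2 warns about orphan task instances. One way to avoid them is to move a completed task together with its dependent rows inside a single transaction, so that either everything moves or nothing does. The following is a minimal sketch of that idea using Python's sqlite3 module. All table names, columns, and the cutoff policy are hypothetical; this is not the workflow schema and does not describe any archive or purge utility shipped with the product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical parent/child tables plus matching archive tables.
conn.executescript("""
CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, state TEXT, end_date TEXT);
CREATE TABLE task_comments (comment_id INTEGER PRIMARY KEY, task_id INTEGER, body TEXT);
CREATE TABLE tasks_archive (task_id INTEGER PRIMARY KEY, state TEXT, end_date TEXT);
CREATE TABLE task_comments_archive (comment_id INTEGER PRIMARY KEY, task_id INTEGER, body TEXT);
""")
conn.executemany("INSERT INTO tasks VALUES (?, ?, ?)", [
    (1, "COMPLETED", "2023-01-10"),
    (2, "COMPLETED", "2024-06-01"),
    (3, "ASSIGNED", None),
])
conn.executemany("INSERT INTO task_comments VALUES (?, ?, ?)", [
    (1, 1, "ok"), (2, 1, "done"), (3, 3, "pending"),
])
conn.commit()


def archive_completed(conn, cutoff):
    """Move tasks completed before `cutoff` (ISO date) and their comments."""
    with conn:  # one transaction: commit on success, roll back on any error
        ids = [row[0] for row in conn.execute(
            "SELECT task_id FROM tasks WHERE state = 'COMPLETED' AND end_date < ?",
            (cutoff,),
        )]
        for task_id in ids:
            # Children first, then the parent, all within the same transaction.
            conn.execute(
                "INSERT INTO task_comments_archive"
                " SELECT * FROM task_comments WHERE task_id = ?", (task_id,))
            conn.execute("DELETE FROM task_comments WHERE task_id = ?", (task_id,))
            conn.execute(
                "INSERT INTO tasks_archive SELECT * FROM tasks WHERE task_id = ?",
                (task_id,))
            conn.execute("DELETE FROM tasks WHERE task_id = ?", (task_id,))
    return len(ids)


archived = archive_completed(conn, "2024-01-01")
```

Only completed tasks older than the cutoff are moved; tasks that are still assigned, and their comments, stay in the live tables.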

17.3.2 Completing Workflows Faster

The time it takes for a workflow to complete depends on the routing type specified for the workflow. The workflow functionality provides some options that can be used to decrease the amount of time it takes to complete workflows. Some of these options are discussed in this section:

Table 17-3 Essential Workflow Completion Tuning Strategies

Name	Description	Recommendation
Use Workflow Reports to Monitor Progress	Several workflow reports (and corresponding views) are available that can make monitoring and proactive problem fixing easier.	By checking the unattended tasks report, you can assign tasks that have been in the queue for a long time to specific users. By monitoring cycle time and other statistics, you can add staff to groups that are overloaded or take a longer time to complete their tasks.
Specify Escalation Rules	To ensure that tasks do not get stuck at any user, you can specify escalation rules. For example, you can move a task to a manager if a certain amount of time passes without any action being taken on the task. Custom escalation rules can also be plugged in if the task must be escalated to some other user based on alternative routing logic.	By specifying proper escalation rules, you can reduce workflow completion times.
Specify User and Group Rules for Automated Assignment	Rules can help significantly reduce workflow waiting time, which results in faster workflow completion.	Instead of manually reassigning tasks to other users or members of a group, you can use user and group rules to perform automated reassignment. This ensures that workflows get timely attention.
Use Task Views to Prioritize Work	A user's inbox can contain tasks of various types with various due dates. The user has to manually sift through the tasks or sort them to find out which one he or she should work on next.	By creating task views where tasks are filtered based on due dates or priority, users can get their work prioritized automatically so they can focus on completing their tasks instead of wasting their time on deciding which tasks to work on.
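
To make the escalation and task view strategies in Table 17-3 concrete, the following sketch models their logic in plain Python. In the product these behaviors are configured rather than hand-coded, so treat this purely as an illustration: the `Task` class, the `MANAGERS` hierarchy, the user names, and both functions are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Task:
    task_id: int
    assignee: str
    priority: int          # 1 = highest priority in this sketch
    due: datetime
    last_action: datetime


# Hypothetical management chain used to pick the escalation target.
MANAGERS = {"jcooper": "jstein", "jstein": "wfaulk"}


def escalate_stale(tasks, now, max_idle):
    """Reassign tasks idle longer than max_idle to the assignee's manager."""
    result = []
    for task in tasks:
        idle = now - task.last_action
        if idle > max_idle and task.assignee in MANAGERS:
            task = replace(task, assignee=MANAGERS[task.assignee], last_action=now)
        result.append(task)
    return result


def priority_view(tasks, assignee, now, horizon):
    """Tasks for one user that are due within the horizon, most urgent first."""
    due_soon = [t for t in tasks
                if t.assignee == assignee and t.due <= now + horizon]
    return sorted(due_soon, key=lambda t: (t.priority, t.due))


now = datetime(2024, 5, 10, 9, 0)
tasks = [
    Task(1, "jcooper", 3, datetime(2024, 5, 11), datetime(2024, 5, 9, 8, 0)),
    Task(2, "jcooper", 1, datetime(2024, 5, 12), datetime(2024, 5, 1)),
    Task(3, "jcooper", 1, datetime(2024, 5, 10, 17, 0), datetime(2024, 5, 10, 8, 0)),
    Task(4, "jcooper", 2, datetime(2024, 6, 1), datetime(2024, 5, 10, 8, 0)),
]

escalated = escalate_stale(tasks, now, max_idle=timedelta(days=3))
view = priority_view(escalated, "jcooper", now, horizon=timedelta(days=2))
```

Here task 2 has been idle for more than three days, so it moves to the manager, and the remaining view for `jcooper` lists only the tasks due within two days, ordered by priority and then by due date.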

17.3.3 Tuning the Identity Provider

The workflow service uses information from the identity provider in constructing the SQL query to determine the tasks qualifying for a user based on his or her role/group membership. The identity provider is also queried for role information to determine a user's privileges when fetching the details of a task and to determine which actions the user can perform on the task. There are a few ways to speed up requests made to the identity provider.

Set the search base in the identity configuration file to node(s) as specific as possible. Ideally you should populate workflow-related groups under a single node to minimize traversal for search and lookup. This is not always possible; for example, you may need to use existing groups and grant membership to groups located in other nodes. If it is possible to specify filters that can narrow down the nodes to be searched, then you should specify them in the identity configuration file.
Index all critical attributes such as dn and cn in the identity provider. This ensures that when a search or a lookup is done, only a subset of the nodes is traversed instead of the full tree.
Use an identity provider that supports caching. Not all LDAP providers support caching, but Oracle Internet Directory does, which can make lookup and search queries faster.
If you use Oracle Internet Directory as the identity provider, ensure that you run oidstats.sql to gather the latest statistics on the database after the data shape has changed.

17.3.4 Tuning the Database

The Human Workflow schema is shipped with several indexes defined on the most important columns. Based on the type of request, different SQL queries are generated to fetch the task list for a user. The database optimizer evaluates the cost of different plan alternatives (for example, a full table scan or table access by index) and chooses the plan with the lowest estimated cost. For the optimizer to work correctly, the index statistics should be current at all times. As with any database usage, it is important to make sure the database statistics are updated at regular intervals and that other tunable parameters such as memory, tablespace, and partitions are used effectively to get maximum performance.
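
The effect of an index and of current statistics on the optimizer's plan can be observed directly. The following minimal sketch uses Python's sqlite3 module as a stand-in database: it compares the plan for a query on an unindexed column before and after an index is created and statistics are refreshed. The `tasks` table and the `text_attr1` column (standing in for a mapped flexfield column) are hypothetical, and SQLite's `EXPLAIN QUERY PLAN` and `ANALYZE` are used only for illustration; on Oracle Database, plans and statistics are managed with that database's own tools, such as the DBMS_STATS package.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; text_attr1 stands in for a mapped flexfield column.
conn.execute(
    "CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, title TEXT, text_attr1 TEXT)"
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [(i, f"Task {i}", f"PO-{i % 500}") for i in range(1, 5001)],
)

QUERY = "SELECT task_id, title FROM tasks WHERE text_attr1 = ?"


def plan(sql, params):
    """Return the optimizer's plan as one string (detail column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index, the only option is to scan the whole table.
plan_before = plan(QUERY, ("PO-42",))

# Index the frequently filtered column, then refresh optimizer statistics.
conn.execute("CREATE INDEX idx_tasks_text_attr1 ON tasks (text_attr1)")
conn.execute("ANALYZE")

# With the index and current statistics, the optimizer can use an index lookup.
plan_after = plan(QUERY, ("PO-42",))
matches = conn.execute(QUERY, ("PO-42",)).fetchall()
```

Printing `plan_before` and `plan_after` shows the switch from a table scan to a search that names the new index, while the query result itself is unchanged.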

For more information on tuning the database, see Section 2.6, "Tuning Database Parameters".