Oracle JCA Adapter Tuning Guide

17.1 Oracle JCA Adapter Framework Performance and Tuning

Understand the most important tuning practices and parameters for the Oracle JCA Adapter framework.

17.1.1 payloadSizeThreshold

This parameter controls the maximum size (measured in bytes) acceptable to any JCA adapter when data is consumed, that is, either asynchronously received (JCA Service) during endpoint polling, or synchronously read using a JCA Reference.

If the size of a (native) message being consumed exceeds the configured payloadSizeThreshold (configured on a per SCA Service or Reference basis), the message is rejected.

When a native message is parsed into Oracle XDK XML, it requires 10-15 times the amount of memory occupied by the native message. For example, a 1 MB flat file in the file system consumes 10-15 MB on the JVM heap (as DOM).

The default value for this property is -1, which means "unlimited".

A global payloadSizeThreshold (in the AdapterConfigMBean) can be set using Fusion Middleware Control. It becomes the default limit for all JCA endpoints that have not explicitly configured a specific value for payloadSizeThreshold.

In other words, the endpoint setting always takes precedence over the global setting.

This setting must typically be configured if there is any doubt about the message size that can be expected.

If the application itself is generating the messages being consumed, assumptions related to the application can be made that would alleviate the need for configuring this property.

The goal of the property is to limit the amount of JVM heap space occupied by incoming adapter messages. The value would be (at least) impacted by the following factors within synchronous or asynchronous business flow:

Synchronous flow. Each inbound consumed message must stay on the JVM heap for a longer time, that is, for the duration of the synchronous instance execution. The number of adapter polling threads determines how many message instances exist concurrently in the JVM. The more polling threads there are, the lower the payloadSizeThreshold must be.
Asynchronous flow. The inbound message exists on the heap for a short time, between the time the adapter reads and translates the incoming message and until the first downstream service engine receives and persists the message. However, the threading model (for example, BPEL worker threads) for these downstream service engines decides how many instances can exist concurrently on the heap, which you need to factor back into a reasonable payloadSizeThreshold value.
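
The factors above can be turned into a back-of-the-envelope estimate. The following Python sketch is illustrative only: the function name, the 1 GiB heap budget, and the choice of the pessimistic expansion factor of 15 (taken from the 10-15 times figure given earlier) are assumptions made for this example, not values defined by the adapter framework.

```python
def suggest_payload_size_threshold(heap_budget_bytes, concurrent_instances,
                                   expansion_factor=15):
    """Estimate a payloadSizeThreshold (bytes) that keeps DOM instances within a budget.

    heap_budget_bytes    -- heap you are willing to give to inbound messages (assumption)
    concurrent_instances -- polling threads (synchronous flow) or downstream worker
                            threads (asynchronous flow) that can hold a message at once
    expansion_factor     -- native-to-DOM memory multiplier; 15 is the pessimistic end
    """
    if heap_budget_bytes <= 0 or concurrent_instances < 1 or expansion_factor < 1:
        raise ValueError("all inputs must be positive")
    return heap_budget_bytes // (concurrent_instances * expansion_factor)


# Hypothetical numbers: a 1 GiB budget shared by 8 polling threads.
threshold = suggest_payload_size_threshold(1024 ** 3, 8)
print(threshold)  # 8947848 bytes, roughly 8.5 MiB per native message
```

Doubling the number of concurrent instances halves the estimate, which matches the guidance that more polling threads call for a lower threshold. Treat the result as a starting point for empirical testing rather than a final setting.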

17.1.1.1 DOM or Scalable DOM

If streaming has been enabled, that is, the adapter is producing SDOM (Scalable DOM) instances, you can use higher values for payloadSizeThreshold because the Oracle XDK keeps most of the XML in paging storage. This depends on the DOM access pattern employed in the XPath, XSLT, and XQuery expressions executed on the instance.

How much less memory will be used by SDOM instances largely depends on your business logic, so empirical test data should guide your setting of this parameter.

17.1.1.2 Synchronous Consume

This use case relates to situations where the business logic invokes the adapter to perform a synchronous read operation, for example, to read a file or select data from a table. If, for example, there is a risk that a SQL select operation would return too many (nested) rows, setting a reasonable value for payloadSizeThreshold is highly recommended. If the size is exceeded, a business fault is thrown and it is up to the business logic to decide on a compensatory action; in other words, there is no automatic rejection involved.

17.1.1.3 Symptoms if payloadSizeThreshold is Not Properly Tuned

You must choose a well considered value for payloadSizeThreshold, which is balanced against the competing concerns of:

Protecting the JVM heap from becoming swamped with over-sized messages, versus
The problem of having too many messages being rejected.

Consequently, leaving the value at the default -1 (unlimited) or setting it too high can lead to fatal runtime failures when the JVM runs out of available heap space, if there are too many polling or processing threads allowed to work in parallel on separate inbound messages.

Setting the value too low causes too many messages to be rejected, that is, doing so unnecessarily disrupts normal business flow, and requires either scheduled or manual recovery of messages which likely are of an acceptable size.

17.1.1.4 Downside of Tuning

The cost involved in calculating the size of the native message is the potential downside of configuring this property. Setting a threshold value can lead to a measurable performance impact especially in the case of the Database Adapter.

17.1.1.5 Recommendations if Symptoms Occur

If out-of-memory (OOM) errors start occurring in SOA (as seen in the logs), an overall system slowdown is detected, or overly frequent garbage collection starts occurring, and you can correlate any of these back to the inflow of over-sized native adapter messages, a message threshold should likely be imposed immediately.

You can impose a new threshold dynamically, either globally by using the configuration MBean (which would apply to all endpoints) or by changing the Binding Property payloadSizeThreshold of one or more inbound JCA endpoints (the endpoints suspected of receiving very large messages).

17.1.2 minimumDelayBetweenMessages

minimumDelayBetweenMessages is a simple mechanism to delay the execution of an adapter polling thread.

It adds a trivial thread sleep as part of the instance execution, that is, on a per polling thread basis. The setting is measured in milliseconds. The delay is added immediately before invoking the next downstream component.

If the adapter initiates a transaction as part of the poller thread, the transaction timeout must accommodate the maximum of the worst case acceptable instance execution, plus the value of minimumDelayBetweenMessages.

If the instance execution itself (the time between the two most recent message publications) exceeds the value of minimumDelayBetweenMessages, no delay will be added (hence the name "minimumDelay...").
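
The two rules above can be expressed as simple arithmetic. This Python sketch is a reading of the text, not adapter code; the function names are hypothetical, and the behavior when the execution time exactly equals the setting is not stated in this guide, so the comparison used here is an assumption.

```python
def minimum_transaction_timeout_ms(worst_case_execution_ms, minimum_delay_ms):
    """Smallest transaction timeout that covers the worst acceptable instance
    execution plus the configured minimumDelayBetweenMessages."""
    return worst_case_execution_ms + minimum_delay_ms


def delay_is_added(instance_execution_ms, minimum_delay_ms):
    """No delay is added once the instance execution exceeds the setting.
    (Treating the exactly-equal case as 'delay added' is an assumption.)"""
    return instance_execution_ms <= minimum_delay_ms


# Hypothetical values: 20 s worst-case execution, 5 s minimum delay.
print(minimum_transaction_timeout_ms(20_000, 5_000))  # 25000
print(delay_is_added(7_000, 5_000))                   # False
```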

17.1.2.1 Symptoms if Not Properly Tuned

If not configured, there is no delay between messages coming from the inbound JCA adapter endpoint.

If the business process has an asynchronous structure/signature, this situation can lead to overloading of the downstream business logic if it is not able to keep up with the speed of inbound messages arriving from the adapter endpoint (where messages pile up in the dispatcher queue).

If, alternatively, the business process flow is synchronous, the posting adapter polling thread will not acquire a new incoming message before the instance execution of the previous message is complete (this only pertains to the scope leading up to the first downstream dehydration point).

However, configuring a high value for minimumDelayBetweenMessages can make the downstream business process unnecessarily idle, or lead to JTA/XA transaction timeouts.

17.1.2.2 Downside of Tuning

Setting a value for minimumDelayBetweenMessages imposes a fixed delay, which does not adjust to changing performance characteristics of the downstream business logic. That is, if the downstream process flow performance improves, the minimumDelayBetweenMessages is still applied unconditionally.

17.1.2.3 Recommendations if Symptoms Occur

If the downstream business logic becomes increasingly backlogged, for example, in terms of the number of unprocessed messages in the BPEL dispatcher queue, this backlog indicates the logic cannot keep up with the rate of incoming adapter messages.

You can then apply this property to the JCA endpoint, choosing a value so the aggregate delay across all polling threads for the endpoint matches the execution speed of the asynchronous business process.

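The rule is that the effective delay equals minimumDelayBetweenMessages divided by the number of polling threads. For example, with 10 polling threads and a minimumDelayBetweenMessages of 1000 milliseconds, the effective delay between messages for the endpoint is 100 milliseconds.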
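
The rule can be applied in both directions: to see what a given setting yields, or to find the setting needed for a target rate. The following Python sketch uses hypothetical function names and example numbers; only the division rule itself comes from the text.

```python
def effective_delay_ms(minimum_delay_ms, polling_threads):
    """Aggregate delay between messages across all polling threads of an endpoint."""
    return minimum_delay_ms / polling_threads


def required_setting_ms(target_effective_delay_ms, polling_threads):
    """minimumDelayBetweenMessages needed to reach a target aggregate delay."""
    return target_effective_delay_ms * polling_threads


print(effective_delay_ms(1000, 10))   # 100.0
# If the downstream process handles one message every 250 ms and
# the endpoint has 4 polling threads:
print(required_setting_ms(250, 4))    # 1000
```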

17.2 JMS Adapter

Get an overview of the most important tuning practices and parameters for the JMS Adapter.

17.2.1 adapter.jms.receive.threads

This parameter indicates the number of poller threads that are created when an adapter endpoint is activated. The default is 1. Each poller thread receives its own message that is processed independently and thus allows for increased throughput.

17.2.1.1 Symptoms if Not Properly Tuned

If the message processing is slow, resulting in an increase in the queue backlog, one way to speed up processing is to increase the number of poller threads.

17.2.1.2 Downside of Tuning

An increased number of polling threads increases the system resources required for concurrent processing.

17.2.1.3 Recommendations if Symptoms Occur

If symptoms occur, increase the number of threads for as long as performance continues to scale linearly.

17.2.2 EnableStreaming

The default for EnableStreaming is false. Its importance is use-case specific, but when the property is set to true, the payload from the JMS destination is streamed (that is, the adapter generates an instance of SDOM) instead of being processed as an in-memory DOM. You must use the SDOM feature when handling large payloads.

17.2.2.1 Symptoms if Not Properly Tuned

Large payloads from the JMS destination can result in an out-of-memory condition.

17.2.2.2 Downside of Tuning

Setting this parameter to true incurs an extra cost to process SDOM.

17.2.2.3 Recommendations if Symptoms Occur

Turn streaming on, because it enables a larger value of the payloadSizeThreshold property to be used with the JMS Adapter.

17.2.3 adapter.jms.receive.timeout

The default value for this parameter is 1 second. This timeout value is used for the synchronous receive call, and is the time after which the receive() API times out if a message is not received on the inbound JMS queue.

17.2.3.1 Symptoms if Not Properly Tuned

Symptoms can include high CPU usage during the idle cycle (when there are no messages in the queue for an extended time).

17.2.3.2 Downside of Tuning

A large value for this parameter can lead to transaction timeouts.

17.2.3.3 Recommendations if Symptoms Occur

Increase the timeout value, taking into consideration the total transaction timeout value that is set for the SOA environment.

17.3 AQ Adapter

Get an overview of the most important tuning practices and parameters for the AQ Adapter.

17.3.1 adapter.aq.dequeue.threads

This is the number of poller threads that are created when an AQ adapter endpoint is activated. The default is 1. Each poller thread receives its own message that is processed independently, which allows for increased throughput.

17.3.1.1 Symptoms if Not Properly Tuned

Not necessarily a symptom of improper tuning, but if message processing is slow, resulting in an increase in the AQ queue backlog, one way to increase message processing speed is to increase the number of threads.

17.3.1.2 Downside of Tuning

Increased threads will probably increase system resources that are required for concurrent processing.

17.3.1.3 Recommendations if Symptoms Occur

Increase poller threads for as long as performance continues to scale linearly.

17.3.2 EnableStreaming

This AQ parameter's default is false and the parameter is use-case specific. You must use this feature when handling large payloads.

When the property is set to true, the payload from the inbound queue is streamed (the adapter generates an instance of SDOM) instead of being processed as an in-memory DOM.

17.3.2.1 Symptoms if Not Properly Tuned

Large payloads will most likely result in an out-of-memory condition.

17.3.2.2 Downside of Tuning

Setting this parameter to true incurs an extra cost to process SDOM.

17.3.2.3 Recommendations if Symptoms Arise

If out-of-memory conditions occur, turn streaming on, because it enables you to use a larger value of the payloadSizeThreshold property with the AQ Adapter.

17.3.3 DequeueTimeOut

DequeueTimeOut is the interval after which the dequeue() API times out if a message is not received on the inbound queue. The default value is 1 second.

17.3.3.1 Symptoms if Not Properly Tuned

If this parameter is not properly tuned, you might see high CPU usage during the idle cycle (when there are no messages in the queue for an extended time).

17.3.3.2 Downside of Tuning

A large value can lead to transaction timeouts.

17.3.3.3 Recommendations if Symptoms Arise

Increase this timeout parameter value, taking into consideration the total transaction timeout set for the SOA environment.

17.4 File/FTP Adapter

Get an overview of the most important tuning practices and parameters for the File/FTP Adapter.

17.4.1 Thread Count and Single Thread Model

The inbound design for the File and FTP Adapters is based on a producer/consumer model where each deployment of a composite application (with an inbound File/FTP Adapter service) results in the creation of a single poller thread and an optional number of processor threads.

The following steps highlight the default threading behavior of the File and FTP Adapters:

The poller thread periodically scans for files in the input directory based on a configured polling frequency.
For each new file that the poller detects in the inbound directory, the poller enqueues information such as file name, file directory, modified time, and file size into an internal system-wide in-memory queue.
A global pool of processor threads (four in number) waits to process items from this in-memory queue.
The items in the in-memory queue are dequeued and processed by these threads; processing includes translation of file content into XML and publishing to SOA infrastructure, followed by post-processing, for example, deletion/archival of the file.
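
The producer/consumer pattern in these steps can be sketched in a few lines of Python. This is a simplified model for illustration, not the adapter's implementation: the function and variable names are hypothetical, one call stands in for a single polling cycle, and the `process` callback stands in for translation, publishing, and post-processing.

```python
import queue
import threading

PROCESSOR_THREADS = 4   # size of the shared processor pool described in the steps
_STOP = object()        # sentinel telling a processor thread to exit


def run_polling_cycle(detected_files, process):
    """Enqueue file metadata found by one poller scan; processor threads drain it."""
    work_queue = queue.Queue()      # stands in for the in-memory queue
    results = []
    results_lock = threading.Lock()

    def processor():
        while True:
            item = work_queue.get()
            if item is _STOP:
                break
            outcome = process(item)  # translate, publish, then post-process
            with results_lock:
                results.append(outcome)

    workers = [threading.Thread(target=processor) for _ in range(PROCESSOR_THREADS)]
    for worker in workers:
        worker.start()
    for file_info in detected_files:  # the poller's role: enqueue metadata only
        work_queue.put(file_info)
    for _ in workers:                 # one sentinel per processor thread
        work_queue.put(_STOP)
    for worker in workers:
        worker.join()
    return results


files = [{"name": f"invoice_{i}.csv", "size": 1024 * i} for i in range(1, 7)]
published = run_polling_cycle(files, lambda f: f["name"])
print(sorted(published))
```

Because several threads consume from one queue, completion order is not guaranteed, which is one reason a single-threaded model is needed when files must be processed in a specific order.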

This is the default threading behavior for both the File and FTP Adapters. However, you can change this behavior by configuring either a ThreadCount or a SingleThreadModel property in the inbound JCA configuration file.

Specifying a finite value for ThreadCount causes that particular File/FTP Adapter inbound service to switch to a partitioned threading model, where each service receives its own in-memory queue and its own set of dedicated processor threads.

Because the processor threads themselves come from the SOA Work Manager, they consume memory and CPU time, so this property must be configured carefully. The recommendation is to start with a value of 2 and slowly increase the value as required; the system, however, sets the upper bound for ThreadCount at 40.

Setting SingleThreadModel as true in the JCA configuration results in the poller assuming the role of the processor. In other words, the poller scans the inbound directory and processes the files in the same thread (one after another).

This parameter is particularly useful if you want to throttle the system. You can also use this parameter if you want to process files in a specific order (for example, process files in descending order of last modified time).

17.4.1.1 Symptoms if Not Properly Tuned

Performance is low in high-volume situations when multiple inbound services are running, because they share a single in-memory queue and a single set of processor threads.

17.4.1.2 Downside of Tuning

An increased number of processor threads increases the amount of system resources required. A dedicated in-memory queue for each service also increases the memory requirement.

17.4.1.3 Recommendations if Symptoms Arise

Start with a ThreadCount of 2 and increase it slowly for as long as performance continues to increase linearly.

17.4.2 MaxRaiseSize

The MaxRaiseSize parameter is used by the inbound File/FTP Adapter to decide how many files the inbound adapter poller thread raises to the in-memory queue in a single polling cycle.

For example, if you have configured a Polling Frequency of 1 minute and a MaxRaiseSize of 10, the poller thread raises a maximum of 10 files every 1 minute into the in-memory queue.
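
The cap that these two settings impose can be worked out with simple arithmetic. The Python sketch below uses hypothetical function names and assumes a single poller thread that raises the full MaxRaiseSize on every cycle; in a cluster, each managed server's poller would contribute its own share.

```python
def max_files_raised(max_raise_size, polling_frequency_s, window_s):
    """Upper bound on files one poller can raise within a time window."""
    cycles = window_s // polling_frequency_s
    return cycles * max_raise_size


def cycles_to_drain(backlog_files, max_raise_size):
    """Polling cycles one poller needs to raise an existing backlog (ceiling division)."""
    return (backlog_files + max_raise_size - 1) // max_raise_size


# Polling frequency of 1 minute and MaxRaiseSize of 10, as in the example above.
print(max_files_raised(10, 60, 3600))  # 600 files per hour at most
print(cycles_to_drain(95, 10))         # 10 cycles, that is, about 10 minutes
```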

This parameter is useful when the File/FTP Adapter is configured to execute in an active/active cluster. In this case, there is more than one poller thread (one per managed server) polling the same directory in the cluster; this parameter ensures that the files are distributed more or less evenly between the nodes and that no single managed server monopolizes the processing of files.

17.4.2.1 Symptoms if Not Properly Tuned

If this parameter is not correctly tuned, balancing is not correct; for example, one managed server in an active-active cluster will be processing most of the messages.

17.4.2.2 Downside of Tuning

A low value of MaxRaiseSize can result in slower performance because fewer files are processed in each polling cycle and the poller thread spends additional time sleeping.

17.4.2.3 Recommendations if Symptoms Arise

Based on the expected load of your application, you can tune MaxRaiseSize and the polling frequency together to reduce the sleep time within each polling cycle while keeping the load properly balanced across poller threads.

17.4.3 PublishSize

The debatching use case within the File/FTP Adapter occurs when a single inbound file can be broken down into multiple batches and each batch is processed individually. For example, if you have a single file that has millions of invoices, you cannot process the entire file in one attempt.

With debatching enabled, the File/FTP Adapter reads from the underlying stream and translates the first 1000 (given a PublishSize of 1000) invoices into XML and publishes to SOA; this process continues until the entire file has been processed. Additionally, the adapter saves enough information between iterations so it can restart the debatching from the point it left off during crash recovery.

You should specify a sufficiently large value for the PublishSize for better performance. For example, if your file has 1,000,000 logical records, setting the PublishSize to 10,000 results in the entire file being processed in 100 iterations.

Each publish from the adapter results in many database activities, for example, saving to dehydration store and re-hydration at a later point of time. Reducing the number of publishes (by having a larger value for the PublishSize) improves performance.
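
The relationship between PublishSize and the number of publishes is a ceiling division, shown in the Python sketch below. The function name is hypothetical; the record counts mirror the example above.

```python
def publishes_needed(total_records, publish_size):
    """Number of publish calls needed to debatch a file (ceiling division)."""
    if publish_size < 1:
        raise ValueError("publish_size must be at least 1")
    return (total_records + publish_size - 1) // publish_size


for size in (1_000, 10_000, 100_000):
    print(size, publishes_needed(1_000_000, size))
# 1000 1000
# 10000 100
# 100000 10
```

Each tenfold increase in PublishSize cuts the number of publishes, and the database activity that accompanies each one, by a factor of ten, at the cost of a larger payload per publish.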

17.4.3.1 Symptoms if Not Properly Tuned

Low performance due to a higher number of publish calls and many database activities.

17.4.3.2 Downside of Tuning

A larger value of PublishSize requires additional system resources.

17.4.3.3 Recommendations if Symptoms Arise

Increase the PublishSize for as long as performance continues to improve linearly.

17.4.4 ChunkSize

Chunked Read is analogous to debatching, but it applies to the outbound File/FTP Adapter and, more importantly, it is supported only in BPEL.

In this use case, a BPEL process that employs Chunked Read uses an <invoke> activity within a <while> BPEL construct to process a huge file, one logical chunk at a time.

At the end of each <invoke>, one logical chunk is returned back to BPEL as XML data (that is, materialized in memory).

The ChunkSize parameter defines how many logical records are returned to BPEL during each <invoke>. When setting this parameter, be aware that a larger number of iterations of the <while> loop in BPEL can result in out-of-memory errors, because it increases the amount of audit information being held in memory.

While setting the value of ChunkSize, you need to ensure that you do not return too large a chunk of records while keeping the number of iterations to a minimum. For example, if you have a huge file with one million records, you should start with a large value, say 1000.
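
The loop structure and the trade-off it creates can be modeled outside BPEL. The following Python sketch is an analogy only: the generator stands in for the repeated <invoke> inside the <while> loop, and all names are hypothetical.

```python
from itertools import islice


def read_in_chunks(records, chunk_size):
    """Yield successive lists of at most chunk_size records, one per 'invoke'."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))  # materialized in memory
        if not chunk:
            break
        yield chunk


# One million records with a chunk size of 1000, as in the example above.
iterations = 0
largest_chunk = 0
for chunk in read_in_chunks(range(1_000_000), 1000):
    iterations += 1
    largest_chunk = max(largest_chunk, len(chunk))

print(iterations, largest_chunk)  # 1000 1000
```

A smaller chunk size lowers the memory held per iteration but raises the iteration count, and with it the amount of audit information; a larger chunk size does the opposite.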

17.4.4.1 Symptoms if Not Properly Tuned

Large payloads might result in out-of-memory issues if the property is not configured, or if its value is too high or too low.

17.4.4.2 Downside of Tuning

You must find a balanced ChunkSize based on the expected input size to ensure that you do not return too large a chunk while keeping the number of iterations to a minimum.

17.4.4.3 Recommendations if Symptoms Arise

Based on the expected input file size and the available system resources, start with a balanced ChunkSize and then increase its value for as long as the memory requirements remain acceptable.

17.5 Database Adapter

Get an overview of the most important practices and parameters related to Database Adapter performance and tuning.

17.5.1 Use Indexes

The following information relates to using indexes with the Database Adapter.

17.5.1.1 Symptoms if Not Properly Tuned

Polling can be extremely slow. The delete polling strategy with a one-to-many (1-M) relationship can deadlock if the foreign key column is not indexed.

17.5.1.2 Downside of Tuning

All indexes have a cost. Using pure delete from a flat table and configuring no primary key (using rowid instead) is a viable alternative to needing indexes. Indexes may not be needed if the adapter can poll at a speed where there are never more than a few unprocessed rows (again for DeletePollingStrategy).

17.5.1.3 Recommendations if Symptoms Arise

Create an index or remove all indexes/constraints and use rowid (advanced, delete polling only).

Additionally, "Use Indexes" is not a parameter that defaults to false. It is something you do separately from the Database Adapter itself, by altering the relational schema you are polling against so that the SQL the adapter executes performs optimally. It is not a "tuning knob" per se, but it is still highly important.

Note that with the Rowid option, you get the fast selects, while avoiding the traditional costs of maintaining an index. For more information, see the section on Rowid in the Database Adapter Chapter.

17.5.2 MaxTransactionSize and MaxRaiseSize

Following are the considerations for MaxTransactionSize and MaxRaiseSize.

17.5.2.1 Symptoms if Not Properly Tuned

These parameters are a good way to scale by reducing the overhead associated with the number of transactions/fetches (MaxTransactionSize) and the number of downstream instances (MaxRaiseSize). If you are moving data in bulk, passing multiple rows through the system as one payload can improve performance.

17.5.2.2 Downside of Tuning

MaxRaiseSize=1 is often set because of business constraints; for instance, the BPEL process is designed to process a single record. Processing multiple records in a batch can be slower overall if messages frequently need to be rejected (the bulk transaction must be rolled back and the rows retried individually). Also, if the end-to-end process is synchronous, a high MaxTransactionSize combined with a low MaxRaiseSize means that many downstream processes participate in a single global transaction, which can cause transaction timeouts.

17.5.2.3 Recommendations if Symptoms Arise

In case of transaction timeouts, make the post asynchronous (to BPEL) or reduce the ratio MaxTransactionSize/MaxRaiseSize.

Note that BPEL has a high per-instance overhead, on the order of 10 times the overhead of OSB. Some of this overhead is dehydration and instance tracking. However, this cost is something of a fixed cost, whether each instance represents 10 rows in a single XML or 1 row in an XML. Consequently, one way to work around this overhead is for the Database Adapter to send and receive slightly larger payloads representing multiple rows. A symptom of such overhead may be slowness compared to a pure JDBC program.
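
The ratio MaxTransactionSize/MaxRaiseSize discussed in this section can be read as the number of downstream instances raised within one transaction. The Python sketch below makes that reading explicit; the function name and the sample values are hypothetical, and the interpretation of the ratio is inferred from the text rather than taken from adapter internals.

```python
def instances_per_transaction(max_transaction_size, max_raise_size):
    """Downstream instances raised per transaction when a full batch of rows is fetched."""
    if max_transaction_size < 1 or max_raise_size < 1:
        raise ValueError("both sizes must be at least 1")
    return (max_transaction_size + max_raise_size - 1) // max_raise_size


# With a synchronous end-to-end flow, all of these instances share one global transaction.
print(instances_per_transaction(100, 1))   # 100 instances: timeouts are more likely
print(instances_per_transaction(100, 10))  # 10 instances of 10 rows each
print(instances_per_transaction(10, 10))   # 1 instance carrying the whole batch
```

Raising MaxRaiseSize or lowering MaxTransactionSize both reduce the ratio; the former also spreads the fixed per-instance cost over more rows.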

17.5.3 Do not use RowsPerPollingInterval

You need to be very careful in your use of this tuning parameter.

17.5.3.1 Symptoms if not Properly Tuned

RowsPerPollingInterval is an explicit cap on throughput, designed to reduce burst load on downstream components, and must be used very carefully so as not to cap performance too much.

17.5.3.2 Downside of Tuning

The value may be set much lower than the burst throughput that the system can actually handle.

17.5.3.3 Recommendations if Symptoms Arise

Increase or eliminate RowsPerPollingInterval.

17.5.4 Enable Skip Locking true (Use Parameter usesSkipLocking)

Enable Skip Locking true is an important parameter.

17.5.4.1 Symptoms if not Properly Tuned

Polling will not scale with the number of threads as the exclusive locks used will block other threads from picking up any rows.

17.5.4.2 Downside of Tuning

None, though this parameter is only supported for Oracle Database and SQL Server platforms. Ensure the data source has enough capacity; if you do not have enough connections, adding more threads simply creates more waiting threads.

17.5.4.3 Recommendations if Symptoms Arise

Ensure usesSkipLocking = true (true by default).

17.5.5 Increase NumberOfThreads

This section discusses increasing the value of NumberOfThreads.

17.5.5.1 Symptoms if not Properly Tuned

Polling will not scale. The definition of scaling is approximately that performance increases linearly with the number of threads/agents. Consequently, increasing threads is a precondition to scaling. The other tuning parameters make sure the performance gains are linear (for example, skip locking lets each thread work without being slowed down by other similar threads).

17.5.5.2 Downside of Tuning

Unless a distributed polling strategy like usesSkipLocking=true is used, increasing threads might not increase concurrency. Make sure the underlying data source has a maximum connection capacity at least equal to the number of threads.

17.5.5.3 Recommendations if Symptoms Arise

Increase threads so long as performance continues to scale linearly.