To optimize and sustain business process throughput on highly scalable systems, the BPEL Service Engine supports clustering and failover. Clustering distributes processing across multiple BPEL Service Engines via multiple BPEL service units. Failover keeps processing from being interrupted: business processes from a failed system are picked up and processed to completion.
Clustering
When a business process needs to be scaled to meet heavier processing needs, you can distribute it across multiple service engines, running on multiple processors or systems, to increase throughput. The BPEL Service Engine's clustering algorithm automatically distributes processing across multiple engines.
For details about setting up a cluster of application servers with BPEL Service Engines, see the documentation for Sun Java System (GlassFish) Application Server.
Failover
When your business process is configured for clustering, the BPEL Service Engine's failover capabilities ensure throughput of running business process instances. When an engine fails, its suspended business process instances are picked up by the next available BPEL Service Engine in the cluster.
To configure failover, set the BPEL Service Engine property EngineExpiryInterval so that each engine registers itself as alive frequently enough to meet the demands of your system. Optimizing this property setting might require some testing. The default setting is 15.
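The role of an expiry interval can be illustrated with a minimal sketch of heartbeat-based liveness tracking. This is not the BPEL Service Engine's actual implementation: the class HeartbeatRegistry, its methods, and the engine IDs are hypothetical, and time is passed in as a plain number in the same unit as the interval so the behavior is easy to follow.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the BPEL Service Engine's code.
class HeartbeatRegistry {
    // Maximum allowed gap between heartbeats, in the same unit as the timestamps.
    private final long expiryInterval;
    // Last time each engine registered itself as alive.
    private final Map<String, Long> lastHeartbeat = new LinkedHashMap<>();

    HeartbeatRegistry(long expiryInterval) {
        this.expiryInterval = expiryInterval;
    }

    // Record that an engine reported itself alive at time 'now'.
    void heartbeat(String engineId, long now) {
        lastHeartbeat.put(engineId, now);
    }

    // An engine counts as alive while its last heartbeat is within the interval.
    boolean isAlive(String engineId, long now) {
        Long last = lastHeartbeat.get(engineId);
        return last != null && now - last <= expiryInterval;
    }

    // Engines whose last heartbeat is older than the interval are presumed failed.
    List<String> expiredEngines(long now) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (now - entry.getValue() > expiryInterval) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }
}
```

In this sketch, a shorter interval detects a failed engine sooner but requires every engine to register more often. That trade-off is one reason the setting might need testing against a real workload.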
To configure a cluster of BPEL Service Engines, adhere to the following guidelines:
Persistence must be enabled for both clustering and failover.
To run persistence, all BPEL Service Engines must be restarted.
Service assemblies must be deployed manually across all clustered JBI environments.
Clustering/failover is implemented consistently for the specific protocol and binding components involved in a given business process.
Only a single database can be used for all BPEL Service Engines when implementing clustering/failover.
The database must be highly available; should the database fail, clustering/failover will fail.
When a BPEL Service Engine fails, a single BPEL Service Engine picks up all of its instances rather than distributing them across the cluster. Consequently, a large number of failed-over instances can overload an entire cluster, one service engine at a time, in a domino effect.
All BPEL Service Engines in a cluster must reside in the same time zone.
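The domino effect described in the guidelines above can be made concrete with a small simulation. This is a sketch under stated assumptions, not the engine's failover logic: the class FailoverSketch, the method failOver, and the rule of choosing the first surviving engine as the target are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the BPEL Service Engine's code.
class FailoverSketch {
    // Moves every instance owned by the failed engine to ONE surviving engine
    // (no redistribution across the cluster) and returns that engine's ID.
    // Returns null if the engine is unknown or no other engine survives.
    static String failOver(Map<String, Integer> instancesPerEngine, String failedEngine) {
        if (!instancesPerEngine.containsKey(failedEngine) || instancesPerEngine.size() < 2) {
            return null;
        }
        int orphaned = instancesPerEngine.remove(failedEngine);
        // Assumption: the "next available" engine is the first one remaining.
        String target = instancesPerEngine.keySet().iterator().next();
        instancesPerEngine.merge(target, orphaned, Integer::sum);
        return target;
    }

    public static void main(String[] args) {
        Map<String, Integer> load = new LinkedHashMap<>();
        load.put("engine-1", 100);
        load.put("engine-2", 100);
        load.put("engine-3", 100);

        // engine-1 fails: all 100 of its instances land on a single engine.
        System.out.println(failOver(load, "engine-1") + " now holds " + load);
        // If that engine is overloaded and fails too, the next one inherits everything.
        System.out.println(failOver(load, "engine-2") + " now holds " + load);
    }
}
```

With three engines holding 100 instances each, two successive failures leave the last engine with all 300 instances. Sizing each engine with headroom for a peer's full load is one way to reduce this risk.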
For more information on clustering and failover support for the BPEL Service Engine, see Practical Guide for Testing Clustering Support for the BPEL Service Engine.