To sustain business process throughput on highly scalable systems, the BPEL Service Engine supports load balancing and failover. Load balancing distributes processing over multiple BPEL Service Engines via multiple BPEL service units. Failover keeps processing from being interrupted: business process instances from a failed system are picked up by another engine and processed to completion.
Load Balancing
When a business process needs to scale to meet heavier processing demands, you can distribute it across multiple service engines, running on multiple processors or systems, to increase throughput. The BPEL Service Engine's load-balancing algorithm automatically distributes processing across the engines.
Failover
When your business process is configured for load balancing, the BPEL Service Engine's failover capabilities ensure throughput of running business process instances. If an engine fails, the instances it left suspended are picked up by the next available BPEL Service Engine.
To configure failover, set the BPEL Service Engine property EngineExpiryInterval so that each engine registers itself as alive frequently enough to meet the demands of your system. Optimizing this property setting might require some testing. The default setting is 15.
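The role of the expiry interval can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the BPEL Service Engine's actual implementation: the `EngineRegistry` class, its method names, and the engine and instance identifiers are all hypothetical, and the time unit of the interval is assumed, since only the default value of 15 is given above.

```python
from dataclasses import dataclass, field


@dataclass
class EngineRegistry:
    """Toy stand-in for the shared persistence database (hypothetical)."""
    # Mirrors the EngineExpiryInterval default of 15; the time unit is an assumption.
    expiry_interval: float = 15.0
    last_heartbeat: dict = field(default_factory=dict)  # engine id -> last heartbeat time
    owner: dict = field(default_factory=dict)           # instance id -> owning engine id

    def heartbeat(self, engine_id, now):
        """Record that engine_id registered itself as alive at time `now`."""
        self.last_heartbeat[engine_id] = now

    def expired_engines(self, now):
        """Return engines whose last heartbeat is older than the expiry interval."""
        return sorted(e for e, t in self.last_heartbeat.items()
                      if now - t > self.expiry_interval)

    def take_over(self, survivor, now):
        """Reassign every instance owned by an expired engine to `survivor`."""
        dead = set(self.expired_engines(now))
        dead.discard(survivor)
        moved = sorted(i for i, e in self.owner.items() if e in dead)
        for instance_id in moved:
            self.owner[instance_id] = survivor
        return moved


registry = EngineRegistry()
registry.heartbeat("engine-1", now=0)
registry.heartbeat("engine-2", now=0)
registry.owner.update({"proc-a": "engine-1", "proc-b": "engine-2"})

registry.heartbeat("engine-2", now=10)  # engine-2 keeps reporting in
registry.heartbeat("engine-2", now=20)  # engine-1 has been silent since time 0
moved = registry.take_over("engine-2", now=20)
print(moved)  # instances that were reassigned to engine-2
```

In a model like this, a shorter interval lets a failed engine be detected sooner, at the cost of more frequent heartbeat writes to the shared store, which is one reason tuning the property might require testing.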
To configure failover for BPEL Service Engines, adhere to the following guidelines:
Persistence must be enabled for both load balancing and failover.
For persistence to take effect, all BPEL Service Engines must be restarted.
Service assemblies must be deployed manually across all clustered JBI environments.
Failover is implemented consistently for the specific protocol and binding components involved in a given business process.
Only a single database can be used for all BPEL Service Engines when implementing failover.
The database must be highly available; should the database fail, failover will fail.
When a BPEL Service Engine fails, a single BPEL Service Engine picks up its instances without distributing them across the balanced system. Consequently, a large number of failed-over instances can overload an entire system, one service engine at a time, in a domino effect.
All BPEL Service Engines in a load-balanced system must reside in the same time zone.
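The domino effect described in the guidelines above can be sketched with a small Python simulation. It is illustrative only: the function `simulate_cascade`, the per-engine `capacity` limit, and the rule that the least-loaded survivor takes over are assumptions made for the example, not documented engine behavior.

```python
def simulate_cascade(loads, capacity, failed):
    """Return the order in which engines fail when one survivor inherits all orphaned instances.

    loads:    dict mapping engine id -> number of running instances
    capacity: hypothetical number of instances an engine can hold before it fails
    failed:   id of the engine that fails first
    """
    loads = dict(loads)          # work on a copy
    order = [failed]
    orphaned = loads.pop(failed)
    while loads:
        # A single surviving engine takes all orphaned instances (no redistribution).
        # Picking the least-loaded engine is an assumption of this sketch.
        survivor = min(loads, key=loads.get)
        loads[survivor] += orphaned
        if loads[survivor] <= capacity:
            break                # the survivor absorbs the load and the cascade stops
        order.append(survivor)   # the survivor is overloaded and fails in turn
        orphaned = loads.pop(survivor)
    return order


cluster = {"e1": 60, "e2": 50, "e3": 50}
print(simulate_cascade(cluster, capacity=100, failed="e1"))
print(simulate_cascade(cluster, capacity=120, failed="e1"))
```

With the tighter capacity, each engine that inherits the orphaned instances is pushed over its limit and fails in turn; with more headroom, the first survivor absorbs the load and the cascade stops. Leaving spare capacity on each engine is one way to reduce this risk.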