A clustered Oracle WebLogic Integration application provides scalability and high availability. A highly available deployment has recovery provisions in the event of hardware or network failures, and provides for the transfer of control to a backup component when a failure occurs.
For recommendations and database-specific requirements for configuring highly available Oracle WebLogic Integration applications, see Maintaining Availability.
For a cluster to provide high availability, it must be able to recover from service failures. Oracle WebLogic Server supports failover for replicated HTTP session states, clustered objects, and services pinned to servers in a clustered environment.
For information about how Oracle WebLogic Server handles such failover scenarios, see Communications in a Cluster in Using WebLogic Server Clusters.
The basic components of a highly available Oracle WebLogic Integration environment include the following:
Note: For information about availability and performance considerations associated with the various types of JDBC drivers, see Configure JDBC Data Sources in Oracle WebLogic Server Administration Console Online Help.
A full discussion of how to plan the network topology of your clustered system is beyond the scope of this section. For information about how to fully utilize load balancing and failover features for your Web application by organizing one or more Oracle WebLogic Server clusters in relation to load balancers, firewalls, and Web servers, see Cluster Architectures in Using WebLogic Server Clusters.
For a simplified view of a cluster, showing the HTTP load balancer, highly available database and multi-ported file system, see the following figure.
The default Oracle WebLogic Integration domain configuration uses a JDBC store for JMS servers. A file store can be used for JMS persistence in cases where a highly available multi-ported disk can be shared between managed servers, as shown in the preceding figure. A file store typically performs better than a JDBC store.
For information about configuring JMS file stores, see Oracle WebLogic Server Administration Console Online Help.
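As a rough sketch of such a configuration, a file store on the shared disk might be defined in the domain's config.xml along the following lines. All names, targets, and the directory path here are hypothetical examples, and exact element names can vary by WebLogic Server release; use the Administration Console or the product documentation for the authoritative settings.

```xml
<!-- Hypothetical config.xml fragment: store name, server name,
     target, and path are illustrative only. -->
<file-store>
  <name>WLIFileStore</name>
  <!-- Directory on the highly available, multi-ported shared disk -->
  <directory>/shared/wli/filestore</directory>
  <target>ManagedServer1</target>
</file-store>
<jms-server>
  <name>WLIJMSServer</name>
  <persistent-store>WLIFileStore</persistent-store>
  <target>ManagedServer1</target>
</jms-server>
```

Because the directory resides on a disk that both managed servers can reach, a migrated JMS server can reopen the same store and recover persistent messages.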
A server can fail due to either software or hardware problems. The following sections describe the processes that occur automatically in each case and the manual steps that must be taken in these situations.
If a software fault occurs, the Node Manager (if configured to do so) will restart the Oracle WebLogic Server. For information about the Node Manager, see Overview of Node Manager. For information about the steps to take to prepare for recovering a secure installation, see Avoiding and Recovering From Server Failure.
If a hardware fault occurs, the physical machine may need to be repaired and could be out of operation for an extended period. In this case, the following events occur:
In the case of a failure of extended duration, it may be necessary to migrate the failed server's services to another, operational managed server. When manually migrating a failed server to another managed server:
For detailed information regarding Oracle WebLogic Server migration, see the following topics in the Oracle WebLogic Server documentation set:
In addition to the high availability features of Oracle WebLogic Server, Oracle WebLogic Integration has failure and recovery characteristics that are based on the implementation and configuration of your Oracle WebLogic Integration solution.
For more information about Oracle WebLogic Integration failure and recovery topics, see WebLogic Integration Application Recovery in the WebLogic Integration Solutions Best Practices FAQ.
RosettaNet and ebXML handle failure and recovery differently because of differences in the business protocols. However, both protocols send messages that fail to be delivered after the configured number of retries to wli.b2b.failedmessage.queue. If you require additional processing of failed messages, you can implement custom message listeners for this queue.
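A custom listener for this queue would normally implement javax.jms.MessageListener and be registered against the queue via JNDI. To keep the sketch below self-contained and compilable without the WebLogic client libraries, the Message and MessageListener types are minimal stand-ins for the corresponding javax.jms interfaces, and the "ConversationId" property name is a hypothetical example; inspect the headers your installation actually places on failed messages.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for javax.jms.Message (illustration only).
interface Message {
    String getStringProperty(String name);
}

// Minimal stand-in for javax.jms.MessageListener (illustration only).
interface MessageListener {
    void onMessage(Message message);
}

/**
 * Example listener for wli.b2b.failedmessage.queue: logs each
 * undeliverable B2B message and records it for later reprocessing.
 */
class FailedMessageListener implements MessageListener {
    private final List<String> failedConversations = new ArrayList<>();

    @Override
    public void onMessage(Message message) {
        // "ConversationId" is a hypothetical property name used
        // here purely for illustration.
        String conversationId = message.getStringProperty("ConversationId");
        System.out.println("Delivery failed for conversation: " + conversationId);
        failedConversations.add(conversationId);
    }

    public List<String> getFailedConversations() {
        return failedConversations;
    }
}
```

In a real deployment you would attach such a listener to the queue through a JMS session (or package it as a message-driven bean) rather than invoking onMessage directly.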
When message delivery fails in the case of RosettaNet messages, the Oracle WebLogic Integration protocol layer does not retry messages. Instead, it returns an HTTP status code to the workflow layer. RosettaNet workflows are usually designed to handle retries.
The Oracle WebLogic Integration Administration Console enables you to specify retry intervals, retry counts, and process timeouts for various trading partners based on the PIP(s) being used. For example, RosettaNet typically supports three retries at two-hour intervals with an overall 24-hour limit on the life of the actual PIP exchange. For information about changing these settings, see “Viewing and Changing Bindings” in Trading Partner Management in Using the WebLogic Integration Administration Console.
If one instance of Oracle WebLogic Integration sends a message to another instance and the destination instance has failed, you may see one or more error messages followed by a stack trace in the server console.
You can specify ebXML message retries using the Oracle WebLogic Integration Administration Console, the Trading Partner Management Bulk Loader, or third-party Trading Partner Management message beans. If you set ebXML Delivery Semantics to OnceAndOnlyOnce or AtLeastOnce, messages are retried according to the values you specify for Retry Count and Retry Interval. For information about using the Oracle WebLogic Integration Administration Console to set ebXML message retries, see “Defining Protocol Bindings” in Trading Partner Management in Using the WebLogic Integration Administration Console.
For ebXML processes, set the action mode value to non-default to guarantee recovery and high availability. For information about setting the action mode, see “ebXML Business Processes” in Introducing ebXML Solutions in Introducing Trading Partner Integration.