3 Middle-tier High Availability

This chapter describes solutions that are available to protect the Oracle Application Server middle tier from failures. It is organized into the following main sections:

  • Section 3.1, "Redundancy"

  • Section 3.2, "Highly Available Middle-tier Configuration Management Concepts"

  • Section 3.3, "Middle-tier Backup and Recovery Considerations"

  • Section 3.4, "Middle-tier Applications"

3.1 Redundancy

The OracleAS middle tier can be configured to provide two types of redundancy:

  • Active-active redundancy, using OracleAS Cluster (Middle-Tier) (see Section 3.1.1, "Active-Active")

  • Active-passive redundancy, using OracleAS Cold Failover Cluster (Middle-Tier) (see Section 3.1.2, "Active-Passive")

The ensuing sections provide details on how the middle tier achieves these redundancy types.

3.1.1 Active-Active

An Oracle Application Server middle tier can be made redundant in an active-active configuration with Oracle Application Server Cluster (Middle-Tier). An OracleAS Cluster (Middle-Tier) is a set of application server middle-tier instances configured to act in an active-active configuration to deliver greater scalability and availability than a single instance. Using middle-tier OracleAS Clusters removes the single point of failure that a single instance poses. While a single application server instance leverages the operating resources of a single host, a cluster can span multiple hosts, distributing application execution over a greater number of CPUs. A single application server instance is vulnerable to the failure of its host and operating system, but a cluster continues to function despite the loss of an operating system or a host, hiding any such failure from clients.

Figure 3-1 presents the various sub-tiers of the OracleAS middle tier in a redundant active-active configuration. Each sub-tier is configured with redundant processes so that the failure of any one of these processes is handled by the sub-tier above it, and the failure has no effect on the incoming requests from clients.

Figure 3-1 Overall active-active architecture of OracleAS middle-tier


The following sub-sections describe features that characterize each sub-tier's active-active configuration:

3.1.1.1 Oracle Application Server Web Cache Sub-Tier

Two or more OracleAS Web Cache instances can be clustered together to create a single logical cache. Physically, the cache can be distributed amongst several nodes. If one node fails, a remaining node in the same cluster can fulfill the requests serviced by the failed node. The failure is detected by the remaining nodes in the cluster, which take over ownership of the cacheable content of the failed member. The load balancing mechanism in front of the OracleAS Web Cache cluster, for example, a hardware load balancing appliance, redirects the requests to the live OracleAS Web Cache nodes.

OracleAS Web Cache clusters also add to the availability of OracleAS instances. By caching static and dynamic content in front of the OracleAS instances, requests can be serviced by OracleAS Web Cache, reducing the need for them to be fulfilled by OracleAS instances, particularly Oracle HTTP Servers. The load and stress on OracleAS instances is reduced, thereby increasing the availability of the components in the instances.

Oracle Application Server Web Cache can also perform a stateless or stateful load balancing role for Oracle HTTP Servers. Load balancing is done based on the percentage of the available capacity of each Oracle HTTP Server, or, in other words, the weighted available capacity of each Oracle HTTP Server. If the weighted available capacity is equal for several Oracle HTTP Servers, OracleAS Web Cache uses round robin to distribute the load. Refer to Oracle Application Server Web Cache Administrator's Guide for the formula to calculate weighted available capacity.

Table 3-1 provides a summary of the high availability characteristics of OracleAS Web Cache.

Table 3-1 OracleAS Web Cache high availability characteristics

Component: OracleAS Web Cache

Protection from Node Failure: OracleAS Web Cache cluster protects from single point of failure. An external load balancer should be deployed in front of this cluster to route requests to live OracleAS Web Cache nodes.

Protection from Service Failure: In an OracleAS Web Cache cluster, pings are made to a specific URL in each cluster member to ensure that the URL is still serviceable.

Protection from Process Failure: OPMN monitors OracleAS Web Cache processes and restarts them upon process failure.

Automatic Re-routing: OracleAS Web Cache members in a cluster ping each other to verify that peer members are alive or have failed. External load balancers provide failover capabilities for requests routed to OracleAS Web Cache components.

State Replication: OracleAS Web Cache clustering manages cached contents that need to be transferred between OracleAS Web Cache nodes.

Configuration Cloning: OracleAS Web Cache cluster maintains uniform configuration across the cluster.

In the case of failure of an Oracle HTTP Server, OracleAS Web Cache redistributes the load to the remaining Oracle HTTP Servers and polls the failed server intermittently until it comes back online. Thereafter, OracleAS Web Cache recalculates the load distribution with the revived Oracle HTTP Server in scope.

3.1.1.2 Oracle HTTP Server Sub-Tier

Oracle HTTP Server and Oracle Application Server Web Cache provide HTTP and HTTPS request handling for Oracle Application Server requests. Each HTTP request is met by a response from Oracle HTTP Server or from OracleAS Web Cache, if the content requested is cached.

Oracle HTTP Server routes a request to different plug-in modules depending on the type of request received. These modules in turn delegate the request to different types of processes. The most common modules are mod_oc4j for J2EE applications and mod_plsql for PL/SQL applications. mod_oc4j delegates requests to OC4J processes. mod_plsql delegates requests to database processes. For all these types of requests, no state needs to be maintained in the Oracle HTTP Server processes.

This section covers the following topics:

3.1.1.2.1 Oracle HTTP Server High Availability Summary

Table 3-2 summarizes some of the Oracle Application Server high availability features for Oracle HTTP Server.

Table 3-2 Oracle HTTP Server high availability characteristics

Component: Oracle HTTP Server

Protection from Node Failure: OracleAS Cluster protects from single point of failure. A load balancer should be deployed in front of the Oracle HTTP Server instances; this can be an external load balancer or OracleAS Web Cache.

Protection from Service Failure: The load balancer in front of Oracle HTTP Server sends requests to another Oracle HTTP Server if the first one does not respond or is deemed failed through URL pings. The load balancer can be either OracleAS Web Cache or a hardware appliance.

Protection from Process Failure: OPMN monitors Oracle HTTP Server processes and restarts them upon process failure. Each Oracle HTTP Server is also notified by OPMN when another Oracle HTTP Server process in the OracleAS Cluster fails.

Automatic Re-routing: The load balancer in front of Oracle HTTP Server automatically re-routes requests to another Oracle HTTP Server if the first does not respond.

State Replication: None.

Configuration Cloning: OracleAS Cluster allows configuration to be replicated to the other Oracle HTTP Servers in the cluster through DCM.

3.1.1.2.2 OC4J Load Balancing Using mod_oc4j

The Oracle HTTP Server module, mod_oc4j provides routing for HTTP requests that are handled by OC4J. Whenever a request is received for a URL that matches one of the mount points specified in mod_oc4j.conf, the request is routed to one of the available destinations specified for that URL. A destination can be a single OC4J process, or a set of OC4J instances. If an OC4J process fails, OPMN detects the failure and mod_oc4j does not send requests to the failed OC4J process until the OC4J process is restarted.

Using mod_oc4j configuration options, you can specify different load balancing routing algorithms depending on the type and complexity of routing you need. Stateless requests are routed to any destination available based on the algorithm specified in mod_oc4j.conf. Stateful HTTP requests are forwarded to the OC4J process that served the previous request using session identifiers, unless mod_oc4j determines through communication with OPMN that the process is not available. In this case, mod_oc4j forwards the request to an available OC4J process following the specified load balancing protocol.

Table 3-3 summarizes the routing styles that mod_oc4j provides. For each routing style, Table 3-3 lists the different algorithms that you can configure to modify the routing behavior. These mod_oc4j configuration options determine the OC4J process where mod_oc4j sends incoming HTTP requests to be handled.


Table 3-3 mod_oc4j Routing Algorithms Summary

Round Robin: Using the simple round robin configuration, all OC4J processes, remote and local to the application server instance running the Oracle HTTP Server, are placed in an ordered list. Oracle HTTP Server chooses an OC4J process at random for the first request. For each subsequent request, Oracle HTTP Server forwards requests to another OC4J process in round robin style. The round robin configuration supports local affinity and weighted routing options.

Random: Using the simple random configuration, all OC4J processes, remote and local to the application server instance running the Oracle HTTP Server, are placed in an ordered list. For every request, Oracle HTTP Server chooses an OC4J process at random and forwards the request to that instance. The random configuration supports local affinity and weighted routing options.

Metric-Based: Using the metric-based configuration, OC4J processes, remote and local to the application server instance running the Oracle HTTP Server, are placed in an ordered list. OC4J processes then regularly communicate to Oracle HTTP Server how busy they are, and Oracle HTTP Server uses this information to send requests to the OC4J processes that are less busy. The overhead in each OC4J node is measured using the runtime performance metrics of OC4J processes. When there are no local OC4J processes available, mod_oc4j routes requests to OC4J processes on different hosts based on their performance metrics only. The metric-based configuration supports a local affinity option.


OC4J Load Balancing Using Local Affinity and Weighted Routing Options

Using mod_oc4j options, you can select a routing method for routing OC4J requests. If you select either round robin or random routing, you can also use local affinity or weighted routing options. If you select metric-based routing, you can also use the local affinity option.

Using the weighted routing option, a weight is associated with the OC4J processes on each node, as configured in mod_oc4j. During request routing, mod_oc4j uses the routing weight to calculate which OC4J process to assign requests to. Thus, OC4J processes running on different nodes can be assigned different weights.

Using the local affinity option, mod_oc4j keeps two lists of available OC4J processes to handle requests: a local list and a remote list. If processes are available in the local list, requests are assigned locally, using the random routing method for the round robin and random methods, or using metric-based routing for the metric-based method. If no processes are available in the local list, mod_oc4j selects processes from the remote list: randomly for the random method, using round robin for the round robin method, or using metric-based routing for the metric-based method.
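
The following fragment illustrates how these options might be combined in mod_oc4j.conf. It is a minimal sketch: the mount point, instance name, hostnames, and weight values are examples only, and the directives should be verified against the mod_oc4j documentation for your release.

    # Route requests for /myapp to an OC4J instance ("home" is the default
    # instance name; the path and destination are examples)
    Oc4jMount /myapp/* home

    # Weighted round robin: nodes with a higher weight receive proportionally
    # more requests (hostnames and weights are examples)
    Oc4jSelectMethod roundrobin:weighted
    Oc4jRoutingWeight node1.mycompany.com 4
    Oc4jRoutingWeight node2.mycompany.com 1

    # Alternative: prefer local OC4J processes, fall back to remote ones
    # Oc4jSelectMethod metric:local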

Choosing a mod_oc4j Routing Algorithm

Table 3-3 summarizes the available routing options. To select a routing algorithm to configure with mod_oc4j, you need to consider the type of environment where Oracle HTTP Server runs. Use the following guidelines to help determine which configuration options to use with mod_oc4j:

  • For an Oracle Application Server cluster setup, with multiple identical machines running Oracle HTTP Server and OC4J on the same node, the round robin with local affinity algorithm is preferred. Using this configuration, an external router distributes requests to the multiple machines running Oracle HTTP Server and OC4J. In this case Oracle HTTP Server gains little by using mod_oc4j to route requests to other machines, except in the extreme case that all OC4J processes on the same machine are not available.

  • For a tiered deployment, where one tier of machines contains Oracle HTTP Server and another contains OC4J instances that handle requests, the preferred algorithms are simple round robin and simple metric-based. To determine which of these two is best in a specific setup, you may need to experiment with each and compare the results. This is required because the results are dependent on system behavior and incoming request distribution.

  • For a heterogeneous deployment, where the different application server instances run on nodes that have different characteristics, the weighted round robin algorithm is preferred. Tuning the number of OC4J processes running on each application server instance may allow you to achieve the maximum benefit. For example, a machine with a weight of 4 gets four times as many requests as a machine with a weight of 1, even though the system with a weight of 4 may not be running four times as many OC4J processes.

  • Metric-based load balancing is useful when there are only a few metrics that dominate the performance of an application, for example, CPU usage or the number of database connections.


3.1.1.2.3 Database Load Balancing with mod_plsql

mod_plsql maintains a pool of connections to the database and reuses established database connections for subsequent requests. If there is no response from a database connection in a connection pool, mod_plsql detects this, discards the dead connection, and creates a fresh database connection for subsequent requests.

The dead database connection detection feature of mod_plsql eliminates the occurrence of errors when a database node or instance goes down. This feature is also extremely useful in high availability configurations like Real Application Clusters. If a node in a Real Application Clusters database fails, mod_plsql detects this and immediately starts servicing requests using the other Real Application Clusters nodes. mod_plsql provides different configuration options to satisfy maximum protection or maximum performance needs. By default, mod_plsql tests all pooled database connections which were created prior to the detection of a failure, but it also allows constant validation of all pooled database connections prior to issuing a request.
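
As a rough illustration, a DAD for mod_plsql is defined along the following lines. The location, schema, password, and connect string are placeholders, and the PlsqlConnectionValidation value shown is an assumption about the default; consult the mod_plsql documentation for the exact parameters and values supported in your release.

    <Location /pls/myapp>
        SetHandler pls_handler
        Order deny,allow
        Allow from all
        PlsqlDatabaseUsername      myschema
        PlsqlDatabasePassword      mypassword
        # Point the DAD at a connect string (or TNS alias) that resolves to the
        # Real Application Clusters service; host, port, and SID are examples
        PlsqlDatabaseConnectString dbhost.mycompany.com:1521:orcl
        # Controls how aggressively pooled connections are validated before use
        # (the value shown is an assumption; check the mod_plsql documentation)
        PlsqlConnectionValidation  Automatic
    </Location>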

3.1.1.3 Oracle Application Server Containers for J2EE Sub-Tier

The Oracle Application Server Containers for J2EE tier consists of the Oracle Application Server implementation of the J2EE container. This section discusses how the various OC4J components can be made highly available and consists of the following topics:

3.1.1.3.1 OracleAS Cluster (OC4J)

Oracle Application Server provides several strategies for ensuring high availability with OC4J instances, both within an application server instance and across a cluster that includes multiple application server instances.

Besides the high availability features described in this section, other Oracle Application Server features enable OC4J processes to be highly available, including the load balancing feature in Oracle HTTP Server and the Oracle Process Manager and Notification Server system that automatically monitors and restarts processes.

The following sections explain the strategies for ensuring high availability for stateful applications in OC4J instances. Overall, there are two strategies:

Web Application Session State Replication with OracleAS Cluster (OC4J)

When a stateful Web application is deployed to OC4J, multiple HTTP requests from the same client may need to access the application. However, if the application running on the OC4J server experiences a problem where the OC4J process fails, the state associated with a client request may be lost. Using Oracle Application Server, there are two ways to guard against such failures:

  • State safe applications save their state in a database or other persistent storage system, avoiding the loss of state when the server goes down. Obviously, there is a performance cost for continually writing the application state to persistent storage.


    Note:

    In this release of Oracle Application Server, 10g (10.1.2), saving application state to persistent storage is the application developer's responsibility.

  • Stateful applications can use OC4J session state replication, with OracleAS Cluster (OC4J), to automatically replicate the session state across multiple processes in an application server instance, and in a cluster, across multiple application instances which may run on different nodes.

An OC4J instance is the entity to which J2EE applications are deployed and configured. An OC4J instance is characterized by a specific set of binaries and configuration files. Several OC4J processes can be started for each OC4J instance. The OC4J process is what executes the J2EE applications for the OC4J instance. Within the application server instance, you can configure multiple OC4J instances, each with its own number of OC4J processes. The advantage of this is in configuration management and application deployment for separate OC4J processes in a cluster.

OC4J processes can be grouped into OracleAS Cluster (OC4J) to support session state replication for the high availability of Web applications. Using an OracleAS Cluster (OC4J) together with mod_oc4j request routing provides stateful failover in the event of a software or hardware problem. For example, if an OC4J process that is part of an OracleAS Cluster (OC4J) fails, mod_oc4j is notified of the failure by OPMN and routes requests to another OC4J process in the same cluster.

Each OC4J instance in a cluster has the following features:

  • The configuration of the OC4J instance is valid for one or more OC4J executable processes. This way, you can duplicate the configuration for multiple OC4J processes by managing these processes in the OC4J instance construct. When you modify the cluster-wide configuration within the OC4J instance, the modifications are valid for all OC4J processes.

  • Each OC4J instance can be configured with one or more OC4J processes.

  • When you deploy an application to an OC4J instance, all OC4J processes share the same application properties and configuration defined in the OC4J instance. The OC4J instance is also responsible for replicating the state of its applications.

  • The number of OC4J processes is specific to each OC4J instance. This must be configured for each application server instance in the cluster. The OC4J process configuration provides flexibility to tune according to the specific hardware capabilities of the host. By default, each OC4J instance is instantiated with a single OC4J process.

Web Application Session State Replication Protecting Against Software Problems To guard against software problems, such as OC4J process failure or hang, you can configure an OC4J instance to run multiple OC4J processes in the same OracleAS Cluster (OC4J). The processes in the OracleAS Cluster (OC4J) communicate their session state between each other. Using this configuration provides failover and high availability by replicating state across multiple OC4J processes running on an application server instance.

In the event of a failure, Oracle HTTP Server forwards requests to an active (alive) OC4J process within the OracleAS Cluster (OC4J). In this case, the Web application state for the client is preserved and the client does not notice any loss of service.

Figure 3-2 shows this type of software failure within an application server instance and the failover to the surviving process.

Figure 3-2 Web Application Session State Failover Within an OracleAS Cluster (OC4J) in an OC4J instance


Web Application Session State Replication Protecting Against Hardware Problems To guard against hardware problems, such as the failure of the node where an application server instance runs, you can configure OracleAS Cluster (OC4J) across application server instances that are in more than one node in an OracleAS Cluster. By configuring an OracleAS Cluster (OC4J) that uses the same name across multiple application server instances, the OC4J processes can share session state information across the OracleAS Cluster (OC4J). When an application server instance fails or is not available, for example, when the node it runs on goes down, Oracle HTTP Server forwards requests to an OC4J process in an application server instance that is available. Thus, Oracle HTTP Server forwards requests only to active (alive) OC4J processes within the cluster.

In this case, the Web application state for the client is preserved and the client does not notice any irregularity.

Figure 3-3 depicts an OracleAS Cluster (OC4J) configured across two Oracle Application Server instances. This configuration allows for web application session state replication failover within an OracleAS Cluster (OC4J).

Figure 3-3 Web Application Session State Failover Within an OracleAS Cluster (OC4J)


Configuring OracleAS Cluster (OC4J) for Web Application Session State Replication To protect against software or hardware failure while maintaining state with the least number of OC4J processes, you need to configure at least two OC4J processes in the same cluster. For example, if you have two application server instances, instance 1 and instance 2, you can configure two OC4J processes in the default_island on each application server instance. With this configuration, stateful session applications are protected against hardware and software failures, and the client maintains state if either of the following types of failures occurs:

  • If one of the OC4J processes fails, then the client request is redirected to the other OC4J process in the default_island on the same application server instance. State is preserved and the client does not notice any irregularity.

  • If application server instance 1 terminates abnormally, then the client is redirected to the OC4J process in the default_island on application server instance 2. The state is preserved and the client does not notice any irregularity.
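
For a Web application's session state to be eligible for replication, the application is marked as "distributable" in its standard web.xml deployment descriptor (as noted in Section 3.2.1.1, replication must also be enabled for the OC4J instance). The following fragment is a minimal sketch of that flag; the session timeout value is only an example.

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
        "http://java.sun.com/dtd/web-app_2_3.dtd">
    <web-app>
        <!-- Marks the application's HTTP session state as eligible for
             replication across the OC4J processes in the cluster -->
        <distributable/>
        <session-config>
            <session-timeout>30</session-timeout>
        </session-config>
    </web-app>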

Stateful Session EJB State Replication with OracleAS Cluster (OC4J)

Using OC4J, stateful session EJBs can be configured to provide state replication across the OC4J processes associated with an application server instance or across an OracleAS Cluster. This EJB replication configuration provides high availability for stateful session EJBs by using multiple OC4J processes to run instances of the same stateful session EJB.


Note:

Use of an EJB replication cluster, OracleAS Cluster (OC4J-EJB), for high availability is independent of middle-tier OracleAS Clusters and can involve multiple application server instances installed across nodes that may or may not be part of middle-tier OracleAS Clusters.

OracleAS Clusters (OC4J-EJB) provide high availability for stateful session EJBs. They allow for failover of these EJBs across multiple OC4J processes that communicate over the same multicast address. Thus, when stateful session EJBs use replication, this can protect against process and node failures and can provide high availability for stateful session EJBs running on Oracle Application Server.

JNDI Namespace Replication When EJB clustering is enabled, JNDI namespace replication is also enabled between the OC4J instances in a middle-tier OracleAS Cluster. New bindings to the JNDI namespace in one OC4J instance are propagated to other OC4J instances in the middle-tier OracleAS Cluster. Re-bindings and unbindings are not replicated.

The replication is done outside the scope of each OracleAS Cluster (OC4J). In other words, multiple OracleAS Clusters (OC4J) in an OC4J instance have visibility into the same replicated JNDI namespace.

EJB Client Routing In EJB client routing, EJB classes take on the routing functionality that mod_oc4j provides between Oracle HTTP Server and servlets/JSPs. Clients invoke EJBs using the Remote Method Invocation (RMI) protocol. The RMI protocol listener is set up in the RMI configuration file, rmi.xml, for each OC4J instance. It is separate from the Web site configuration. EJB clients and the OC4J tools access the OC4J server through a configured RMI port. OPMN designates a range of ports that the RMI listener could be using.

When you use the "opmn:ormi://" prefix string in the EJB lookup, the client retrieves the assigned RMI port automatically. Load balancing and client request routing are provided by OPMN, which selects among the different OC4J processes available. The algorithm used for this load balancing is the random algorithm. Multiple OPMN URLs separated by commas can be used for higher availability.
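
The following Java fragment sketches such a lookup using the opmn:ormi prefix with two OPMN URLs. The context factory class is the one used by OC4J clients; the hostnames, OPMN request port, instance name, application name, JNDI name, and credentials are examples only.

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.InitialContext;

    public class EjbLookupSketch {
        public static void main(String[] args) throws Exception {
            Hashtable env = new Hashtable();
            // OC4J initial context factory
            env.put(Context.INITIAL_CONTEXT_FACTORY,
                    "com.evermind.server.rmi.RMIInitialContextFactory");
            // Two OPMN URLs separated by a comma; OPMN resolves the assigned RMI
            // port and picks an available OC4J process using the random algorithm
            env.put(Context.PROVIDER_URL,
                    "opmn:ormi://node1:6003:OC4J_home/myapp," +
                    "opmn:ormi://node2:6003:OC4J_home/myapp");
            env.put(Context.SECURITY_PRINCIPAL, "admin");      // example credentials
            env.put(Context.SECURITY_CREDENTIALS, "welcome1");

            Context ctx = new InitialContext(env);
            Object home = ctx.lookup("MyStatefulSessionEJB");  // example JNDI name
            // ... narrow the home interface and invoke the EJB as usual
        }
    }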


See Also:

The EJB primer section in Oracle Application Server Containers for J2EE User's Guide.

3.1.1.3.2 OC4J Distributed Caching Using Java Object Cache

Oracle Application Server Java Object Cache provides a distributed cache that can serve as a high availability solution for applications deployed to OC4J. The Java Object Cache is an in-process cache of Java objects that can be used on any Java platform by any Java application. It allows applications to share objects across requests and across users, and coordinates the life cycle of the objects across processes.

Java Object Cache enables data replication among OC4J processes even if they do not belong to the same OracleAS Cluster (OC4J), application server instance, or overall Oracle Application Server Cluster.

By using Java Object Cache, performance can be improved since shared Java objects are cached locally, regardless of which application produces the objects. This also improves availability; in the event that the source for an object becomes unavailable, the locally cached version will still be available.
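
As a rough sketch of the programming model, an application defines a named cache region and then puts and gets objects through a CacheAccess handle. The class and method names below reflect the oracle.ias.cache API as commonly documented; the region and key names are examples.

    import oracle.ias.cache.CacheAccess;

    public class ObjectCacheSketch {
        public static void main(String[] args) throws Exception {
            // Define a named region, then obtain an access handle to it
            CacheAccess.defineRegion("myRegion");
            CacheAccess cache = CacheAccess.getAccess("myRegion");

            // Cache an object locally; other processes sharing the region
            // can retrieve it without re-creating it from the original source
            cache.put("greeting", "Hello from the Java Object Cache");
            String value = (String) cache.get("greeting");
            System.out.println(value);

            cache.close();
        }
    }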


See Also:

The Java Object Cache chapter in the Oracle Application Server Web Services Developer's Guide for complete information on using Java Object Cache

3.1.1.3.3 JMS High Availability

Two JMS providers are available with Oracle Application Server. Due to differing implementations, each provider achieves high availability in different ways. As such, they are discussed in two sections:

Oracle Application Server JMS (OracleAS JMS) is implemented in OC4J. Hence, it utilizes OPMN for process monitoring and restart.

Oracle JMS (OJMS) is implemented through Oracle Streams Advanced Queuing (AQ). It requires the Oracle database and can have active-active availability through the database Real Application Clusters and Transparent Application Failover (TAF) features.

Table 3-4 provides an overview of high availability and configuration characteristics of the two JMS providers. The sections following the table discuss each provider in more detail.

Table 3-4 Summary of high availability and configuration characteristics of OJMS and OracleAS JMS

Process-Level High Availability
  OJMS: OPMN (JMS application)
  OracleAS JMS: OPMN

Node-Level High Availability
  OJMS: Real Application Clusters (AQ, TAF)
  OracleAS JMS: OracleAS Cold Failover Cluster (Middle-Tier)

Configuration
  OJMS: Real Application Clusters configuration, resource provider configuration
  OracleAS JMS: dedicated JMS server, jms.xml configuration, opmn.xml configuration

Message Store
  OJMS: in Real Application Clusters database
  OracleAS JMS: in dedicated JMS server/persistence files

Failover
  OJMS: same or different machine (depending on Real Application Clusters setup)
  OracleAS JMS: same or different machine, only in active-passive configuration with OracleAS Cold Failover Cluster (Middle-Tier) (see Section 3.1.2.1, "Oracle Application Server Cold Failover Cluster (Middle-Tier)")



Note:

The Oracle Application Server Containers for J2EE Services Guide provides detailed information and instructions on setting up OracleAS JMS and OJMS to be highly available. High availability for third party JMS providers is not discussed as it is provider-specific.

The following sections provide details on how each JMS provider achieves high availability.

Oracle Application Server JMS High Availability

High availability for OracleAS JMS can be achieved through grouping multiple instances of OC4J together in one cluster. This cluster is called Oracle Application Server Cluster (OC4J-JMS). OPMN can be used to monitor and restart OC4J processes in the event of process failure.

Oracle Application Server Cluster (OC4J-JMS) provides an environment wherein JMS applications deployed in this environment can load balance JMS requests across multiple OC4J instances or processes. Redundancy is also achieved as the failure of an OC4J instance with a JMS server does not impact the availability of the JMS service since at least one other OC4J instance is available with a JMS server.

Both the JMS client and the JMS server contain state about each other, which includes information about connections, sessions, and durable subscriptions. Application developers can configure their environment and use a few simple techniques when writing their applications to make them cluster-friendly.

OracleAS Cluster (OC4J-JMS) allows for two configurations:

  • OracleAS JMS Server Distributed Destinations

    This configuration requires multiple OC4J instances. Each instance contains a JMS server, queue destination, and application. There is no inter-process communication between JMS servers, queues, and applications in other OC4J instances. The sender and receiver of each application must be deployed together in an OC4J instance. A message enqueued to the JMS server in one OC4J process can be dequeued only from that OC4J process.

    This configuration has the following advantages:

    • High throughput is achieved since applications and JMS servers are executing within the same JVMs and no inter-process communication is required.

    • There is no single point of failure. As long as one OC4J process is running, requests can be processed.

    • Destination objects can be persistent or in-memory. Persistence is file-based.

    The disadvantage of this configuration is that there is no failover from one JMS server to another.

  • Dedicated OracleAS JMS Server

    This configuration defines only one OC4J instance as having the dedicated JMS server in an OracleAS Cluster (OC4J-JMS). The OC4J instance with the JMS server handles all messages. Message ordering is always maintained due to the single JMS server. All JMS applications use this dedicated server to host their connection factories and destinations, and to service their enqueue and dequeue requests.

    Only one OC4J JVM acts as the dedicated JMS server for all JMS applications within the Oracle Application Server Cluster (OC4J-JMS). The single JVM ensures that other JVMs do not attempt to use the same set of persistent files. The other OC4J instances execute only applications. The single JMS server can be configured by limiting the JMS port range in the opmn.xml file to only one port for the dedicated OC4J instance (an illustrative opmn.xml sketch follows this list). The single port value ensures that OPMN always assigns the same port value to the dedicated JMS server. This port value is used to define the connection factory in the jms.xml file that other OC4J instances in the OracleAS Cluster (OC4J-JMS) use to connect to the dedicated JMS server.

    Refer to the JMS chapter in the Oracle Application Server Containers for J2EE Services Guide for more information on how to modify the opmn.xml file for this dedicated JMS server configuration.


    See Also:

    The section "Abnormal Termination" in the Java Message Service chapter of the Oracle Application Server Containers for J2EE Services Guide. This section describes how to manage persistence files when an unexpected failure occurs.

Oracle JMS High Availability

High availability for Oracle JMS (OJMS) can be achieved using a Real Application Clusters database. AQ queues and topics should be available in the Real Application Clusters environment.

Each JMS application in Oracle Application Server uses OC4J resource providers to point to the backend Real Application Clusters database. JMS operations invoked on objects derived from these resource providers are directed to the database.

An OJMS application that uses a Real Application Clusters database must be able to handle database failover scenarios. Two failover scenarios are possible:

  • Real Application Clusters Network Failover

    In the event of the failure of a database instance, a standalone OJMS application running against a Real Application Clusters database must have code to obtain the connection again, determine whether the connection object is invalid, and re-establish the connection if necessary. Use the API method com.evermind.sql.DbUtil.oracleFatalError() to determine if a connection object is invalid. If it is invalid, a good strategy is to aggressively roll back transactions and re-create the JMS state, such as connections, sessions, and messages, that was lost. Refer to the JMS chapter in Oracle Application Server Containers for J2EE Services Guide for a code example; an illustrative sketch also follows this list.

  • Transparent Application Failover

    For most cases when Transparent Application Failover (TAF) is configured, an OJMS application will not be aware of a failed database instance that it is connected to. Hence, the application code need not perform any tasks to handle the failure.

    However, in some cases, OC4J may throw an ORA error when a failure occurs. OJMS passes these errors to the application as a JMSException with a linked SQL exception. To handle these exceptions, the following can be done:

    • As in the previous point, "Real Application Clusters Network Failover", provide code that uses the method com.evermind.sql.DbUtil.oracleFatalError() to determine whether the error is a fatal error. If it is, follow the approach outlined in the previous point. If it is not, the client can recover by sleeping for a short period of time, then waking up and retrying the last operation.

    • Failback and transient errors caused by incomplete failover can be recovered from by attempting to use the JMS connection after a short time. Pausing allows the database failover to recover from the failure and reinstate itself.

    • In the case of transaction exceptions, such as "Transaction must roll back" (ORA-25402) or "Transaction status unknown" (ORA-25405), the current operation must be rolled back and all operations past the last commit must be retried. The connection is not usable until the cause of the exception is dealt with. If the retry fails, close and re-create all connections and retry all uncommitted operations.
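
The sketch below illustrates the general shape of this error handling. It is not the code example from the Services Guide: the classification call to com.evermind.sql.DbUtil.oracleFatalError() is only indicated by a placeholder method because its exact signature is not reproduced in this chapter, and the back-off interval is an arbitrary example.

    import javax.jms.JMSException;
    import java.sql.SQLException;

    public class OjmsFailoverSketch {
        // Called when a JMS operation fails during a RAC failover
        public static void handle(JMSException jmse) throws Exception {
            Exception linked = jmse.getLinkedException();
            if (linked instanceof SQLException) {
                SQLException sqle = (SQLException) linked;
                if (isFatal(sqle)) {
                    // Fatal: roll back open transactions, discard the JMS connection,
                    // sessions, producers, and consumers, then re-create them and
                    // retry the uncommitted operations
                } else {
                    // Transient (for example an incomplete TAF failover): back off
                    // briefly, then retry the last operation on the same connection
                    Thread.sleep(5000);
                }
            }
        }

        // Placeholder for the check based on com.evermind.sql.DbUtil.oracleFatalError()
        // described in the text; see the Services Guide for its actual usage
        private static boolean isFatal(SQLException sqle) {
            return false;
        }
    }
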

Clustering Best Practices

The following are best practice guidelines for working with clustered JMS servers:

  • Minimize JMS client-side state.

    • Perform work in transacted sessions.

    • Save/checkpoint intermediate program state in JMS queues/topics for full recoverability.

    • Do not depend on J2EE application state to be serializable or recoverable across JVM boundaries. Always use transient member variables for JMS objects, and write passivate/activate and serialize/deserialize functions that save and recover JMS state appropriately.

  • Do not use nondurable subscriptions on topics.

    • Nondurable topic subscriptions duplicate messages per active subscriber. Clustering and load balancing creates multiple application instances. If the application creates a nondurable subscriber, it causes the duplication of each message published to the topic. This is either inefficient or semantically invalid.

    • Use only durable subscriptions for topics. Use queues whenever possible.

  • Do not keep durable subscriptions alive for extended periods of time.

    • Only one instance of a durable subscription can be active at any given time. Clustering and load-balancing creates multiple application instances. If the application creates a durable subscription, only one instance of the application in the cluster succeeds. All other instances fail with a JMSException.

    • Create, use, and close a durable subscription in small time/code windows, minimizing the duration when the subscription is active.

    • Write application code that accommodates failure to create a durable subscription due to clustering (when some other instance of the application running in the cluster is currently in the same block of code) and program appropriate back-off strategies; a sketch of this pattern follows this list. Do not always treat the failure to create a durable subscription as a fatal error.
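
A minimal sketch of this pattern is shown below: the durable subscription is created, used, and closed inside a short window, and creation failures (for example, because another instance in the cluster currently owns the subscription) trigger a back-off and retry rather than being treated as fatal. The retry count and sleep intervals are arbitrary examples.

    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.Session;
    import javax.jms.Topic;
    import javax.jms.TopicConnection;
    import javax.jms.TopicConnectionFactory;
    import javax.jms.TopicSession;
    import javax.jms.TopicSubscriber;

    public class DurableSubscriberSketch {
        public static Message receiveOne(TopicConnectionFactory tcf, Topic topic,
                                         String subscriptionName) throws Exception {
            for (int attempt = 1; attempt <= 5; attempt++) {
                TopicConnection conn = null;
                try {
                    conn = tcf.createTopicConnection();
                    TopicSession session =
                            conn.createTopicSession(true, Session.AUTO_ACKNOWLEDGE);
                    // May fail if another instance currently holds the subscription
                    TopicSubscriber sub =
                            session.createDurableSubscriber(topic, subscriptionName);
                    conn.start();
                    Message msg = sub.receive(1000);
                    session.commit();
                    sub.close();   // keep the window in which the subscription is active short
                    return msg;
                } catch (JMSException e) {
                    // Back off and retry instead of treating the failure as fatal
                    Thread.sleep(2000L * attempt);
                } finally {
                    if (conn != null) {
                        try { conn.close(); } catch (JMSException ignored) { }
                    }
                }
            }
            return null;
        }
    }
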

3.1.2 Active-Passive

Active-passive high availability for the middle tier is achieved using a cold failover cluster. This is discussed in the following section.

3.1.2.1 Oracle Application Server Cold Failover Cluster (Middle-Tier)

A two-node OracleAS Cold Failover Cluster (Middle-Tier) can be used to achieve active-passive availability for Oracle Application Server middle-tier components. In an OracleAS Cold Failover Cluster (Middle-Tier), one node is active while the other is passive, on standby. In the event that the active node fails, the standby node is activated, and the middle-tier components continue servicing clients from that node. All middle-tier components are failed over to the new active node. No middle-tier components run on the failed node after the failover.

In the OracleAS Cold Failover Cluster (Middle-Tier) solution, a virtual hostname and a virtual IP are shared between the two nodes (the virtual hostname maps to the virtual IP in their subnet). However, only one node, the active node, can use these virtual settings at any one time. When the active node fails and the standby node is made active, the virtual IP is moved to the new active node. All requests to the virtual IP are now serviced by the new active node.

The OracleAS Cold Failover Cluster (Middle-Tier) can use the same machines as the OracleAS Cold Failover Cluster (Infrastructure) solution. In this scenario, two pairs of virtual hostnames and virtual IPs are used, one pair for the middle-tier cold failover cluster and one pair for the OracleAS Cold Failover Cluster (Infrastructure) solution. Figure 5-6 illustrates such a scenario. In this setup, the middle-tier components can fail over independently from the OracleAS Infrastructure.

No shared storage is required for the middle-tier cold failover cluster except in the case where Oracle Application Server JMS file-based persistence is used (a shared disk for storing the persistence files should be set up).

Each node must have an identical mount point for the middle-tier software. One installation for the middle-tier software must be done on each node, and both installations must have the same local Oracle home path.


Note:

For instructions on installing the OracleAS Cold Failover Cluster (Middle-Tier), see the Oracle Application Server Installation Guide. For instructions on managing it, see Section 4.7, "Managing OracleAS Cold Failover Cluster (Middle-Tier)" .

3.1.2.1.1 Managing Failover

The typical deployment expected for the solution is a two-node hardware cluster with one node running the OracleAS Infrastructure and the other running the Oracle Application Server middle tier. If either node needs to be brought down for hardware or software maintenance or crashes, the surviving node can be brought online, and the OracleAS Infrastructure or Oracle Application Server middle-tier service can be started on this node. However, since a typical middle-tier cold failover deployment does not require any shared storage (except for when OracleAS JMS file persistence is used), alternate deployments may include two standalone machines on a subnet, each with a local installation of the middle-tier software and a virtual IP which can fail over between them.

The overall steps for failing over to the standby node are as follows:

  1. Stop the middle-tier service on current primary node (if node is still available).

  2. Fail over the virtual IP to the new primary node.

  3. If OracleAS JMS file-based persistence is using a shared disk for the messages, the shared disk is failed over to the new primary node.

  4. Start the middle-tier service on the new primary node.

For failover management, two approaches can be employed:

  • Automated failover using a cluster manager facility

    The cluster manager offers services that allow the development of packages to monitor the state of a service. If the service or the node is found to be down, the cluster manager automatically fails over the service from one node to the other node. The package can be developed to try restarting the service on a given node before failing over.

  • Manual failover

    For this approach, the failover steps outlined above are executed manually. Since both the detection of the failure and the failover itself are manual, this method may result in a longer period of service unavailability.

3.1.2.1.2 OracleAS JMS in an OracleAS Cold Failover Cluster (Middle-Tier) Environment

OracleAS JMS can be deployed in an active-passive configuration by leveraging the two-node OracleAS Cold Failover Cluster (Middle-Tier) environment. In such an environment, the OC4J instances in the active node provide OracleAS JMS services, while OC4J instances in the passive node are inactive. OracleAS JMS file-based persistence data is stored in a shared disk.

Upon the failure of the active node, the entire middle-tier environment is failed over to the passive node, including the OracleAS JMS services and the shared disk used to persist messages. The OC4J instances in the passive node are started up together with the other processes required for the middle-tier environment to run. This node is now the new active node, and OracleAS JMS requests are serviced by it from then on.

3.2 Highly Available Middle-tier Configuration Management Concepts

This section describes how configuration management can improve high availability for the middle tier. It covers the following:

3.2.1 Oracle Application Server Clusters Managed Using DCM

Distributed Configuration Management (DCM) is a management framework that enables you to manage the configurations of multiple Oracle Application Server instances. When administering an OracleAS Cluster that is managed using DCM, an administrator uses either Application Server Control Console or dcmctl commands to manage and configure common configuration information on one Oracle Application Server instance. DCM replicates the common configuration information across all Oracle Application Server instances within the OracleAS Cluster. The common configuration information for the cluster is called the cluster-wide configuration.


Note:

There is configuration information that can be configured individually, per Oracle Application Server instance within a cluster (these configuration options are also called instance-specific parameters).

This section covers the following:

3.2.1.1 What is a DCM-Managed OracleAS Cluster?

A DCM-Managed OracleAS Cluster provides distributed configuration information and lets multiple Oracle Application Server instances be configured together.

The features of a DCM-Managed OracleAS Cluster include:

  • Synchronization of configuration across instances in the DCM-Managed OracleAS Cluster.

  • OC4J distributed application deployment – deploying to one OC4J triggers deployment to all OC4Js.

  • Distributed diagnostic logging – all members of a DCM-Managed OracleAS Cluster log to the same log file repository when the log loader is enabled.

  • A shared OC4J island is set up by default. (Replication is not enabled automatically for the applications deployed in the cluster; each application needs to be marked as "distributable" in its web.xml file, and multicast replication needs to be enabled in the replication properties for the OC4J instance.)

  • Load-balancing – Oracle HTTP Server is automatically configured to share load among all DCM-Managed OracleAS Cluster members.

  • Distributed process control – DCM-Managed OracleAS Cluster membership enables the opmnctl DCM-Managed OracleAS Cluster scope start, stop, and restart commands.

Each application server instance in a DCM-Managed OracleAS Cluster has the same base configuration. The base configuration contains the cluster-wide configuration and excludes instance-specific parameters.

For Oracle Application Server high availability, when a system in an Oracle Application Server cluster is down, there is no single point of failure for DCM. DCM remains available on all the available nodes in the cluster.

Using DCM helps reduce deployment and configuration errors in a cluster; these errors could, without using DCM, be a significant cause of system downtime.

Enterprise Manager uses DCM commands to perform application server configuration and deployment. You can also issue DCM commands manually using the dcmctl command.

DCM enables the following configuration commands:

  • Create or remove a cluster

  • Add or remove application server instances to or from a cluster

  • Synchronize configuration changes across application server instances

Note the following when making configuration changes to a cluster or deploying applications to a cluster:

  • If Enterprise Manager is up and managing the cluster, you can invoke the DCM command-line tool from any host where a clustered application server instance is running. The DCM daemon must be running on each node in the cluster.

  • If Enterprise Manager is not up and managing the cluster, and you want configuration changes to be applied dynamically across the cluster, the DCM daemon must be running on each node in the cluster. To start the DCM daemon, run the DCM command-line tool, dcmctl, on each application server instance in the cluster.

3.2.1.2 Oracle Application Server DCM Configuration Repository Types

Oracle Application Server supports two types of DCM configuration repositories: Database-based and File-based DCM configuration repositories. The DCM configuration repository stores configuration information and metadata related to the instances in an OracleAS Farm, and when the OracleAS Farm contains DCM-Managed OracleAS Clusters, stores both cluster-wide configuration information and instance-specific parameters for instances in DCM-Managed OracleAS Clusters.

  • An OracleAS Database-based Farm stores repository information and protects configuration information using an Oracle database.

  • An OracleAS File-based Farm stores repository information in the file system. When the Farm contains a DCM-Managed OracleAS Cluster, the DCM configuration repository stores both cluster-wide configuration information and instance-specific parameters. Using an OracleAS File-based Farm, cluster-wide configuration information and related metadata is stored on the file system of an Oracle Application Server instance that acts as the repository host. Oracle Application Server instances that are part of an OracleAS File-based Farm depend on the repository host to store cluster-wide configuration information.


See Also:

Distributed Configuration Management Administrator's Guide

3.2.2 Manually Managed Oracle Application Server Clusters

Using a Manually Managed OracleAS Cluster, it is the administrator's responsibility to synchronize the configuration of Oracle Application Server instances within the OracleAS Cluster. A full discussion of Manually Managed OracleAS Clusters can be found in Appendix B, "Manually Managed Oracle Application Server Cluster".

3.3 Middle-tier Backup and Recovery Considerations

Once a failure has occurred in your system, it is important to recover from that failure as quickly as possible. Depending on the type of failure, recovery of a middle-tier installation involves one or both of the following tasks:


Note:

The Oracle Application Server Administrator's Guide contains all required backup and recovery strategies and procedures.

The restoration of middle-tier files can be done from backups made using procedures described in the "Backup Strategy and Procedures" chapter of the Oracle Application Server Administrator's Guide. The backups encompass both the middle-tier and Infrastructure installations and are performed Oracle home by Oracle home. Thus, the restoration of the middle tier is also performed Oracle home by Oracle home. Each restoration can be done on the same node that the backup was taken from or on a new node. The "Recovery Strategies and Procedures" chapter in the Oracle Application Server Administrator's Guide provides details on using backups for recovery.

Restoration of a middle-tier installation on the same node restores the Oracle home, Oracle Application Server configuration files, and DCM repository. The backup of the Oracle home and configuration files is done when performing a complete Oracle Application Server environment backup, which is a backup of the entire Oracle Application Server system. Additionally, any time stamped backups of the configuration files should be restored, if required.

Restoration of a middle-tier installation on a new node requires the restoration of Oracle system files, the middle-tier Oracle home, and configuration files. Because the host is new, the DCM-managed and non-DCM-managed components have to be updated with the host information.

3.4 Middle-tier Applications

This section provides high availability information on middle-tier application components, which are:

3.4.1 Oracle Application Server Portal

An OracleAS Portal request's lifecycle is serviced by a number of OracleAS components. These are:

  • OracleAS Web Cache

  • Oracle HTTP Server and the following modules:

    • mod_oc4j (on middle and Infrastructure tiers)

    • mod_osso (on Infrastructure tier to access OracleAS Single Sign-On)

    • mod_plsql (on middle tier with OracleAS Portal DAD and Infrastructure tier with ORASSO DAD)

    • mod_oradav (on middle tier)

  • OC4J (the Portal Page Engine runs as a stateless servlet)

  • OracleAS Portal repository (contains OracleAS Portal schemas and also caches group memberships of users after their retrieval from Oracle Internet Directory)

  • OracleAS Single Sign-On

  • Oracle Internet Directory (including Oracle Delegated Administration Services and Oracle Directory Integration and Provisioning)

  • Various web and database portlet providers

In order for OracleAS Portal to be highly available, all these components must be highly available individually. Of particular importance is the availability of Oracle Identity Management because OracleAS Portal uses it for portlet security and management functions.

Refer to the following table to find out where you can find high availability information for each of the components mentioned above.

Table 3-5 High availability information for components involved in an OracleAS Portal request

Component Where to find information
OracleAS Web Cache
See Section 3.1.1.1, "Oracle Application Server Web Cache Sub-Tier".
Oracle HTTP Server
See:

Section 3.2, "Highly Available Middle-tier Configuration Management Concepts"

Section 3.1.1.2, "Oracle HTTP Server Sub-Tier"

OC4J
See:

Section 3.1.1.3, "Oracle Application Server Containers for J2EE Sub-Tier"

Note: The Portal Page Engine is stateless.

OracleAS Portal repository
See:

Chapter 5, "Oracle Application Server Infrastructure High Availability" and Chapter 6, "Managing and Operating Infrastructure High Availability" of this book.

Oracle Application Server Portal Configuration Guide

OracleAS Single Sign-On
See:

Chapter 5, "Oracle Application Server Infrastructure High Availability" and Chapter 6, "Managing and Operating Infrastructure High Availability" of this book.

Oracle Identity Management Concepts and Deployment Planning Guide

Oracle Internet Directory
See:

Chapter 5, "Oracle Application Server Infrastructure High Availability" and Chapter 6, "Managing and Operating Infrastructure High Availability" of this book.

Oracle Internet Directory Administrator's Guide

Oracle Identity Management Concepts and Deployment Planning Guide

Web Provider
See:

Section 3.2, "Highly Available Middle-tier Configuration Management Concepts"

Section 3.1.1.2, "Oracle HTTP Server Sub-Tier"

Section 3.1.1.3.1, "OracleAS Cluster (OC4J)"

Section 3.1.1.1, "Oracle Application Server Web Cache Sub-Tier" (OracleAS Web Cache could be providing access to the provider)

Database Providers
For providers using mod_plsql, Oracle HTTP Server high availability is relevant. See Section 3.1.1.2, "Oracle HTTP Server Sub-Tier" and Section 3.2, "Highly Available Middle-tier Configuration Management Concepts".

For database high availability, see Chapter 5.

See also: Section 3.1.1.1, "Oracle Application Server Web Cache Sub-Tier" (OracleAS Web Cache could be providing access to the provider)


The following are several considerations when deploying OracleAS Portal for high availability:

3.4.1.1 Enabling Redundancy for OracleAS Portal

For redundancy, you can set up OracleAS Portal in a multiple middle-tier environment, front-ended by a load balancing router (LBR) to access the same OracleAS Metadata Repository.

The purpose of a Load Balancing Router (LBR) is to provide a single published address to the client tier and to front-end a farm of servers that actually service the requests, based on the distribution of the requests done by the LBR. The LBR itself is a very fast network device that can distribute Web requests to a large number of physical servers.

For full details on how you can configure multiple middle tiers using a LBR, see the section "Configuring Multiple Middle Tiers with a Load Balancing Router" in the Oracle Application Server Portal Configuration Guide.

3.4.1.2 Configuring Load Balancer Routers for OracleAS Portal

The purpose of a Load Balancing Router (LBR) is to provide a single published address to the client browsers, and provide a "farm" of Web servers which actually service the requests, based on the distribution of the requests done by the LBR. The LBR itself is a very fast network device which can distribute Web requests to a large number of physical servers.

If you want to install multiple OracleAS middle-tier servers to handle a large load, you could configure OracleAS Portal as illustrated in Figure 3-4.

Figure 3-4 Redundant configuration for OracleAS Portal


This example shows that the LBR balances the load among the three instances in the OracleAS Web Cache cluster. Each of the OracleAS Web Cache instances can in turn load balance requests to any of the middle-tier servers, which communicate with the OracleAS Single Sign-On server and OracleAS Portal.

In this example, assume that the three OracleAS Web Cache instances are wc1, wc2, and wc3, and the OracleAS middle-tier servers are svr1, svr2, and svr3. Hence, in the above example, wc1 can load balance across svr1, svr2, as well as svr3. wc2 and wc3 can do the same.

All the OracleAS middle-tier servers must have Database Access Descriptor (DAD) entries for each of the databases. A good way to accomplish this is to have the middle-tier servers share a file system that contains the configuration information for the DADs, so that the OracleAS Portal instances can share cache files.

The important points to consider with this configuration include:

  • The Internet DNS maps the name www.myportal.com to the external IP address on the LBR.

  • The LBR performs load balancing of requests to www.myportal.com to svr1.company.com, svr2.company.com, and svr3.company.com, addressing the request to their IP addresses, but still containing www.myportal.com in the Host: field of the HTTP request.

  • Each of the middle-tier hosts accepts requests to www.myportal.com, and their httpd.conf files assert that name as the ServerName (see the sketch after this list). Hence, the names svr1, svr2, and so on are not used.

  • Unless your LBR does port mapping, you should configure the internal servers to use the same ports as the LBR.

  • Optimal cache utilization can be realized by mounting a shared file system on which to write the cache files. If you decide not to have the middle-tier servers share a cache directory, caching will still work, but with a lower hit ratio.
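
For example, the relevant httpd.conf entries on each middle-tier host might look like the following sketch, where the published site name and port are placeholders for the name and port configured on the LBR.

    # Each middle-tier host answers for the published site name, not its own
    ServerName www.myportal.com
    # Match the port published by the LBR unless the LBR performs port mapping
    Port 80
    Listen 80
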

3.4.1.3 Session Binding for Web Clipping Portlet

The session binding feature in OracleAS Web Cache is used to bind user sessions to a given origin server to maintain state for a period of time. Although almost all components running in a default OracleAS Portal middle-tier are stateless, session binding is required for two reasons:

  • The Web Clipping Studio, used by both the OracleAS Web Clipping Portlet and the Web Page Data Source of OmniPortlet, uses HTTP sessions to maintain state, so session binding must be enabled for it.

  • Enabling session binding forces all the user requests to go to a given OracleAS Portal middle-tier, resulting in a better cache hit ratio for the OracleAS Portal cache.

3.4.1.4 OracleAS Portal and OracleAS Web Cache

An OracleAS Portal installation includes an installation of OracleAS Web Cache. When deploying multiple OracleAS Portal middle-tiers, the OracleAS Web Cache installations should be configured as an Oracle Application Server Cluster (Web Cache).

This allows Invalidation Requests from an OracleAS Portal instance to be automatically sent to all OracleAS Web Cache instances in the OracleAS Cluster (Web Cache).


See Also:

Oracle Application Server Web Cache Administrator's Guide for information on configuring OracleAS Cluster (Web Cache).

3.4.2 Oracle Application Server Wireless

OracleAS Wireless runs as a native OC4J application. This means that availability and session replication are managed by OPMN and OracleAS Cluster (OC4J) state replication respectively. Additionally, the messaging servers run as standalone Java applications, which are managed by Application Server Control Console.

3.4.2.1 OracleAS Wireless Clustering Architecture

Each OracleAS Wireless server process runs in a single Java Virtual Machine (JVM) and is referred to as a node. Nodes within an OracleAS Cluster (OC4J) can serve the same wireless applications, because the session for each client is replicated among all the nodes within the OracleAS Cluster (OC4J) in preparation for failover.

By default, requests from the same client are always redirected to the same Wireless server process. If a process goes down, fault tolerance is provided for both stateless and stateful requests as follows:

  • Stateless Requests - Fault tolerance is achieved by redirecting the client to another working process.

  • Stateful Requests - The session state is propagated to the processes within the same OracleAS Cluster (OC4J), which enables another process in that OracleAS Cluster (OC4J) to pick up the request from a given client if a failover occurs. (A generic deployment-descriptor sketch follows this list.)
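Session state replication within an OracleAS Cluster (OC4J) generally requires that the web application be marked distributable in its J2EE deployment descriptor. The following is a minimal, generic sketch of that standard element, shown for illustration only; refer to the Oracle Application Server Wireless Administrator's Guide for the Wireless-specific configuration steps.

    <!-- web.xml: declare the application distributable so that HttpSession
         state can be replicated across the OracleAS Cluster (OC4J) -->
    <web-app>
        <display-name>SampleDistributableApp</display-name>
        <distributable/>
        <!-- servlet, servlet-mapping, and other elements follow -->
    </web-app>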

For detailed configuration steps, refer to the Oracle Application Server Wireless Administrator's Guide.

3.4.3 OracleAS Integration B2B

OracleAS Integration B2B employs several components from the Oracle Application Server stack at runtime. These include Oracle HTTP Server, Oracle Application Server Containers for J2EE, and OracleAS Metadata Repository. The OracleAS Integration B2B high availability configuration is depicted in Figure 3-5.

Figure 3-5 High availability configuration of OracleAS Integration B2B

Description of ashia022.gif follows
Description of the illustration ashia022.gif

In order for OracleAS Integration B2B services to be highly available, the following components must be highly available:

  • Oracle HTTP Server

  • OC4J transport servlet

  • OracleAS Integration B2B server JVM

  • OracleAS Infrastructure

For discussion purposes, the runtime architecture can be segmented into the following tiers:

  • Web server and OC4J tier

  • OracleAS Integration B2B tier

  • OracleAS Infrastructure tier

If each of these tiers has active-active availability, the OracleAS Integration B2B service has active-active availability. Otherwise, if one of the tiers is active-passive, the OracleAS Integration B2B service is active-passive. For example, if the OracleAS Infrastructure tier uses the OracleAS Cold Failover Cluster (Infrastructure) configuration, the OracleAS Integration B2B service has active-passive availability.

Web Server and OC4J Tier

This tier consists of Oracle HTTP Server and the OC4J transport servlet instances. The servlets are deployed in OC4J containers and can utilize the high availability properties of the containers. They can be grouped together into OracleAS Clusters (OC4J) and be synchronized by DCM for consistent configuration. The OC4J instances are load balanced by mod_oc4j.
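For example, routing from Oracle HTTP Server to the OC4J instances that host the transport servlets is controlled by mount directives in mod_oc4j.conf. The mount path and OC4J instance name below are hypothetical; the exact syntax for your release is described in the Oracle HTTP Server documentation.

    # mod_oc4j.conf (illustrative; path and instance name are hypothetical)
    # Requests matching /b2b/* are routed to the OC4J instance named OC4J_B2B.
    # mod_oc4j load balances across all JVM processes of that OC4J instance
    # known through OPMN, including processes on other cluster nodes.
    Oc4jMount /b2b/* OC4J_B2B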

For active-active availability, the web server and OC4J tier is front-ended by a load balancer router appliance and/or OracleAS Web Cache. If OracleAS Web Cache is used, it should be configured into an OracleAS Cluster (Web Cache). Monitoring and automatic restart of OracleAS Web Cache, Oracle HTTP Server, and OC4J processes are performed by OPMN.

The transport servlets forward requests to and receive responses from the OracleAS Integration B2B instances. The servlets do not maintain state for the requests they handle. They communicate with the OracleAS Integration B2B instances through Java RMI. Each instance of OracleAS Integration B2B is registered in the web.xml file of each of the OC4J containers hosting the transport servlets. The servlets forward requests to the OracleAS Integration B2B instances using a round-robin model. If any OracleAS Integration B2B instance fails, the servlets re-route requests to the next instance in the round-robin queue after a specified timeout period.

OracleAS Integration B2B Tier

The OracleAS Integration B2B tier consists of the OracleAS Integration B2B server runtime. This is a Java application, but its instances do not run in OC4J containers. They run in their own standalone JVM processes.

The OracleAS Integration B2B server has the following characteristics:

  • Its runtime is stateless for each request it processes. If a runtime process fails and a request message is not completely processed, the client is expected to retry the request. If the failure occurs after the initial message has been completely processed, all subsequent incomplete processing results are stored in the database, and any other runtime instances can resume processing. Each processing step is atomic.

  • It uses JDBC to access the OracleAS Metadata Repository to make changes to the OracleAS Integration B2B metadata schemas. High availability for JDBC connections is achieved through Oracle Net. (An illustrative Oracle Net connect descriptor follows this list.)

  • Only one runtime is instantiated from each OracleAS Integration B2B Oracle home.

  • Only one runtime instance exists for each OracleAS instance.
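Because JDBC connection availability relies on Oracle Net, one common approach is to use a connect descriptor that lists all Real Application Clusters nodes with failover and load balancing enabled. The alias, host, and service names below are hypothetical; this is a sketch, not a required OracleAS Integration B2B configuration.

    # tnsnames.ora (illustrative)
    B2BMR =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (LOAD_BALANCE = on)
          (FAILOVER = on)
          (ADDRESS = (PROTOCOL = TCP)(HOST = rac-node1)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = rac-node2)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SERVICE_NAME = b2bmr.company.com)
        )
      )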

To ensure that the server has active-active availability, multiple instances of its runtime should be instantiated. Ideally, these instances should be deployed on more than one node to protect against node failure. OPMN provides failure detection and automatic restart for each instance.

Inbound communication to the OracleAS Integration B2B instances is received by the load balancer fronting the Oracle HTTP Servers. The load balancer distributes requests to the Oracle HTTP Server instances, which forward the requests to the transport servlets via mod_oc4j load balancing. The transport servlets communicate the requests to the OracleAS Integration B2B instances using the RMI protocol.

Outbound communication from the OracleAS Integration B2B instances occurs as follows: the instances send responses to the Oracle HTTP Servers, which are configured as proxy servers. This configuration can be accomplished by specifying the proxy host and port properties in the tip.properties file.
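The property names below are placeholders only, intended to illustrate the kind of proxy host and port settings described above; the actual tip.properties keys are listed in the OracleAS Integration B2B documentation.

    # tip.properties (hypothetical key names, shown for illustration only)
    # Direct outbound delivery through the Oracle HTTP Server acting as proxy
    proxy_host=ohs.company.com
    proxy_port=7777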

OracleAS Infrastructure Tier

High availability in the Infrastructure tier can be enabled by any of the high availability configurations for the OracleAS Infrastructure explained in Chapter 5, "Oracle Application Server Infrastructure High Availability". These configurations ensure that the OracleAS Metadata Repository and Oracle Identity Management components are highly available for the web server and OC4J, and OracleAS Integration B2B tiers. For active-active availability, one of the configurations described in Section 5.3.1, "Active-Active High Availability Solutions", should be used. This allows the entire OracleAS Integration B2B service stack to have active-active availability.

3.4.4 Oracle Application Server Integration InterConnect

OracleAS Integration InterConnect has a hub and spoke architecture. Figure 3-6 provides an overview of the OracleAS Integration InterConnect components integrating two spoke applications as an example.

Figure 3-6 OracleAS Integration InterConnect runtime components with two spoke applications as an example

Description of ashia023.gif follows
Description of the illustration ashia023.gif

The OracleAS Integration InterConnect components are:

  • OracleAS Integration InterConnect Adapters

  • OracleAS Integration InterConnect Repository Server

  • OracleAS Integration InterConnect Hub database

For OracleAS Integration InterConnect to have high availability, all its components must be highly available. One additional requirement is for the data or message sources that provide information to the adapters to be highly available. These are the spoke applications. Because these applications are customer-dependent and not part of the Oracle Application Server product, their high availability discussion is outside the scope of this book.

For the purpose of high availability discussion, the OracleAS Integration InterConnect components can be segmented into the following tiers:

  • Adapter tier

  • Repository Server tier

  • Hub database tier

The following sections provide details on how high availability can be achieved for each tier.


See Also:

OracleAS Integration InterConnect documentation for detailed information about OracleAS Integration InterConnect components.

Adapter Tier

Except for the HTTP adapter, each adapter runs in a standalone JVM process (not OC4J) and is stateless. This JVM process can be configured as a custom OPMN application to achieve process failure detection and automatic restart. The custom application can be configured in the opmn.xml file. Refer to the Oracle Process Manager and Notification Server Administrator's Guide for instructions on how to do this. After the configuration, the adapter processes should be started using OPMN (opmnctl command).
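The following sketch shows the general shape of a custom-application entry in opmn.xml. The component identifiers and start command are hypothetical, and the exact element and attribute names must be taken from the Oracle Process Manager and Notification Server Administrator's Guide for your release.

    <!-- opmn.xml (illustrative sketch only; names and paths are hypothetical) -->
    <ias-component id="InterConnectAdapters">
      <process-type id="DBAdapter" module-id="CUSTOM">
        <process-set id="hr_db_adapter" numprocs="1">
          <module-data>
            <category id="start-parameters">
              <!-- command that starts the standalone adapter JVM process -->
              <data id="start-executable"
                    value="/u01/app/oracle/interconnect/adapters/DBApp/start"/>
            </category>
          </module-data>
          <start timeout="600"/>
        </process-set>
      </process-type>
    </ias-component>

With such an entry in place, the adapter process can be started and stopped through opmnctl, for example with a command such as opmnctl startproc ias-component=InterConnectAdapters (again, the component name here is hypothetical).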

OPMN only monitors and restarts individual processes. In order for the adapter tier to be fully redundant, multiple adapter processes are required. The adapters can be set up using an active-active or active-passive approach:

  • Active-Active

    Multiple active adapter processes can be deployed either on the same machine or on separate machines. The adapter processes concurrently handle incoming messages from the spoke application and deliver messages to the hub database. If one adapter process fails, messages are delivered to the surviving process or processes. The adapters coordinate with each other to balance the workload from the spoke application.

  • Active-Passive

    Two adapter processes can be deployed in a cold failover cluster configuration to achieve active-passive availability.

    In a cold failover cluster, two machines can be clustered together using clusterware such as HP MC/Service Guard or Sun Cluster. This type of clustering is a commonly used solution for making adapters highly available. One node of the cluster is "cold", passively waiting to take over in the event of a failure, while the other is "hot", or actively running the adapter software. When the "hot" or "active" node fails, the clusterware restarts the software on the cold node to bring the adapter back online. Figure 3-6 shows a cold failover cluster for the adapters.

If the hub database is a Real Application Clusters database, the adapters can work with the multiple database instances in the cluster. Real Application Clusters technology provides consistent and uninterrupted service without having to restart the adapters if a database instance fails. The adapters connect to the first available node listed in the adapter.ini or hub.ini files. If one of the Real Application Clusters nodes fails, a database connection is attempted with the next available node in the adapter.ini or hub.ini file, and so on, until a connection succeeds. Failover is transparent to the spoke application. Refer to the "Hub Database Tier" section below for more information on how the adapter process can be made aware of Real Application Clusters hub database instances.


See Also:

OracleAS Integration InterConnect adapters installation documentation for details on adapter.ini and hub.ini files associated with specific adapters.

High availability specific to each adapter type can be achieved as follows:

  • Database Adapter

    You can deploy multiple database adapter instances serving the same application (Figure 3-7 shows an example). Since the adapters are stateless, they share tasks coming from their spoke application, the customer database. A connection failure to a database is handled by rolling back unfinished transactions and retrying other data sources. The same failure handling mechanism is also used for JMS communication (JMS over JDBC) with the Advanced Queue in the hub database.

    Figure 3-7 Example of multiple database adapter instances (showing only single spoke)

    Description of ashia024.gif follows
    Description of the illustration ashia024.gif

    After an application is designed, it can be deployed to the database adapters when the adapters are first started. Upon initial startup, the adapters fetch metadata from the hub database through the OracleAS Integration InterConnect Repository Server, similar to the way OracleAS Integration InterConnect iStudio accesses the data at design time. Once the adapters retrieve this metadata, it is cached locally in a file-based cache, so subsequent adapter startups do not need to access the Repository Server.

    At run time, the database adapters access the customer database through JDBC. There can be multiple JDBC data sources, which the adapters iterate through should a connection fail. Tasks from the customer database are processed by all database adapter instances in coordination so that a task is only processed once. In order to have readily available data from the customer database, the database should be Real Application Clusters-enabled or be in a cold failover configuration (applies to the hub database as well). The adapters communicate with the Repository Server at run time to implement the Xref feature.

  • HTTP adapter:

    The HTTP adapter consists of a standalone Java application, an OC4J transport servlet, and Oracle HTTP Server. The Java application implements the HTTP adapter logic and communicates with a servlet in OC4J using the RMI protocol. Each Java application process communicates with only one OC4J process. The Oracle HTTP Server is required to communicate with spoke applications over HTTP.

    For high availability, more than one set of Oracle HTTP Server, OC4J, and Java application process should be deployed on redundant nodes. mod_oc4j load balancing can be used to distribute requests between Oracle HTTP Server and OC4J instances across nodes. But communication between OC4J instances and the Java application processes is one-to-one. A load balancer router can be deployed in front of the Oracle HTTP Server instances to distribute requests to these instances.

    The HTTP adapter Java application also communicates with the hub database and Repository Server. The Java application works with these components the same way as the database adapter as described above. The hub database should be made highly available through using a Real Application Clusters database or cold failover cluster configuration. The Repository Server can be made highly available using a cold failover cluster configuration.

    In the event an HTTP adapter process fails, an inbound message to the adapter process (servlet to adapter) can be lost if the message is still in the transport or RMI layer. Once the message arrives at the adapter agent layer, however, it is persisted and can be picked up later when the adapter is restarted. Also, the transport servlet cannot enqueue any other messages to the adapter process, because the RMI server fails along with the adapter process. Messages remaining in the OC4J process are not processed; the transport servlet's doPost() method responds with an error message stating that the RMI server is unavailable.

  • FTP/SMTP adapter:

    To achieve high availability for the FTP/SMTP adapter, multiple adapter instances can be deployed on separate machines with a load balancer routing requests to them. Since adapters process messages atomically and are stateless, if any one of the adapter instances fails, the redundant deployment keeps the failure transparent to senders and recipients of messages.

  • MQ/AQ adapter:

    High availability specifics of this adapter are similar to those of the database adapter. This is because access to the MQ/AQ database is also through JDBC (JMS). Refer to the database adapter description above.

  • File Adapter:

    For the file adapter to achieve high availability, multiple adapter instances are required to access a network file system. If one adapter instance fails, another instance can process the requests for the failed instance.

  • OEM Adapters:

    The OEM adapter model is similar to that of the HTTP adapter, that is, it has an OC4J transport servlet, Oracle HTTP Server, and a standalone Java application. The Java application implements the adapter logic and communicates with a servlet in OC4J using the RMI protocol. Each Java application process communicates with only one OC4J process. The Oracle HTTP Server is required to communicate with spoke applications over HTTP.

Repository Server Tier

This tier consists of the Repository Server instance. Only a single instance can be actively running at any one time. Hence, the Repository Server can be deployed in a two-node cold failover cluster configuration with the nodes using shared storage. This configuration provides for node-level failover.

For Repository Server process high availability, the process can be configured as a custom application for OPMN in the opmn.xml file. This allows OPMN to monitor and automatically restart the Repository Server process if it fails. After the modification, the Repository Server process should be started using OPMN (opmnctl command).

The Repository Server is used at run time only for the Xref feature. Otherwise, it is needed only at design time and at deployment time, when adapters are first started and fetch application metadata from the hub database.

Hub Database Tier

The hub database can be any database, including the OracleAS Metadata Repository database. It stores OracleAS Integration InterConnect metadata such as application view and common view formats. iStudio accesses the hub database at design time through RMI via the Repository Server, which is a JVM process. The Repository Server can communicate with multiple hub database instances as multiple JDBC data sources. Internally, the Repository Server iteratively retries each data source with timeouts.

The OracleAS Integration InterConnect hub database can be made highly available by using Real Application Clusters. The following are some guidelines:

  • Enable the Repository Server process to be aware of the Real Application Clusters hub database instances.

    The Repository Server process can be made aware of the Real Application Clusters database instances by specifying the list of available nodes hosting the database instances. Specifically, enter the host, port, and instance information of all the nodes in the repository.ini or hub.ini file. If a Real Application Clusters node connected to the Repository Server process fails, then the next node entry in the repository.ini or hub.ini file is used.

  • Enable the adapter processes to be aware of the Real Application Clusters hub database instances.

    The adapter processes can be made aware of the Real Application Clusters database instances by specifying the list of available nodes hosting the database instances. Specifically, enter the host, port, and instance information of all the nodes in the adapter.ini file. If a node connected to an adapter process fails, the next node entry in the adapter.ini file is used. (Illustrative entries are sketched after this list.)

    The hub connections of all the OracleAS Integration InterConnect adapters and the spoke connections of the database and AQ adapters support Real Application Clusters.
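As an illustration of the node entries described above, the hub connection section of hub.ini (and, analogously, adapter.ini) might take a form similar to the following. The key names shown are examples only; the exact keys for each adapter and release are documented in the adapter installation guides referenced below.

    # hub.ini (illustrative; key names and values are examples only)
    hub_host=rac-node1
    hub_instance=orcl1
    hub_port=1521
    num_nodes=2
    hub_host2=rac-node2
    hub_instance2=orcl2
    hub_port2=1521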


See Also:

OracleAS Integration InterConnect adapters installation documentation for details on adapter.ini and hub.ini files associated with specific adapters.

3.4.5 Oracle Business Intelligence Discoverer

Web connections to OracleBI Discoverer Server are managed through the Discoverer servlet. The servlet is responsible for brokering between the client and a Discoverer session component, which then manages the actual transactions. Discoverer session components are initiated and managed by the OAD (Object Activation Daemon). Each machine has an OAD that manages its own Discoverer sessions. The OAD and session component are both monitored and managed by OPMN.

Oracle Business Intelligence Discoverer can be configured for high availability in the following ways:


See Also:

The chapter on installing in a multi-machine environment in the Oracle Business Intelligence Discoverer Configuration Guide for multi-machine considerations and prerequisites for providing load balancing for OracleBI Discoverer.

3.4.5.1 Oracle Business Intelligence Discoverer Preferences Server

The OracleBI Discoverer Preferences Server stores individual user preferences across sessions. It is managed, like the session server, by the OAD. In a multiple machine environment, distributed session servers can be configured to access one centrally located OracleBI Discoverer Preferences Server. The Preferences Server is monitored and managed by OPMN.

The OracleBI Discoverer Preferences Server can be made highly available by deploying multiple instances that are fronted and serviced by a load balancer router and/or OracleAS Web Cache. Several considerations should be noted for managing session information for this scenario.

When deploying multiple OracleBI Discoverer middle-tiers behind a load balancer, there are two options for configuring the OracleBI Discoverer Preferences Server such that user preferences are consistent across a session:

  1. Enable session binding either in the load balancer router or in the OracleAS Web Cache tier. This will ensure that a particular user will always be directed to the machine where their local preferences are stored. For details on configuring session binding for your load balancer router, refer to instructions from your particular load balancer hardware vendor. For configuring OracleAS Web Cache session binding, see the Oracle Application Server Web Cache Administrator's Guide.

  2. Configure all the OracleBI Discoverer Servers to share a single preferences server. This ensures that all user preferences are centralized, although all preference information then depends on the availability of a single machine.


Note:

For instructions on how to configure a centralized OracleBI Discoverer Preferences Server, see the chapter on installing in a multi machine environment in the Oracle Business Intelligence Discoverer Configuration Guide.

Protecting the OracleBI Discoverer Preferences Server

Loss of either the machine that hosts the OracleBI Discoverer Preferences Server or the information stored on that server will not impact availability of OracleBI Discoverer. It does, however, mean that users will lose their stored preferences information. To limit the loss of preferences information, the data on the OracleBI Discoverer Preferences Server should be backed up regularly, in particular, the file <ORACLE_HOME>/discoverer/util/pref.txt. This file holds the actual preferences information.