Clustering Best Practices

The following topics recommend design and deployment practices that maximize the scalability, reliability, and performance of applications hosted by a WebLogic Server cluster:

General Design Considerations

The following sections describe general design guidelines for clustered applications.

Strive for Simplicity

Distributed systems are complicated by nature. For a variety of reasons, make simplicity a primary design goal. Minimize “moving parts” and do not distribute algorithms across multiple objects.

Minimize Remote Calls

You improve performance and reduce the effects of failures by minimizing remote calls.

Session Facades Reduce Remote Calls

Avoid accessing EJB entity beans from client or servlet code. Instead, use a session bean, referred to as a facade, to contain complex interactions and reduce calls from web applications to RMI objects. When a client application accesses an entity bean directly, each getter method is a remote call. A session facade bean can access the entity bean locally, collect the data in a structure, and return it by value.
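The difference in round trips can be sketched in plain Java, without an EJB container. Every name below (`Customer`, `RemoteCustomerStub`, `CustomerFacade`, `CustomerSummary`) is hypothetical, and a counter stands in for network calls; this is an illustration of the pattern under those assumptions, not WebLogic code.

```java
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical view of a customer record; stands in for an entity bean interface.
interface Customer {
    String getName();
    String getEmail();
    String getCity();
}

// Server-side object holding the data; calls on it are local and cheap.
class CustomerEntity implements Customer {
    private final String name, email, city;
    CustomerEntity(String name, String email, String city) {
        this.name = name;
        this.email = email;
        this.city = city;
    }
    public String getName()  { return name; }
    public String getEmail() { return email; }
    public String getCity()  { return city; }
}

// Simulated remote stub: every getter counts as one network round trip.
class RemoteCustomerStub implements Customer {
    private final Customer target;
    private final AtomicInteger remoteCalls;
    RemoteCustomerStub(Customer target, AtomicInteger remoteCalls) {
        this.target = target;
        this.remoteCalls = remoteCalls;
    }
    public String getName()  { remoteCalls.incrementAndGet(); return target.getName(); }
    public String getEmail() { remoteCalls.incrementAndGet(); return target.getEmail(); }
    public String getCity()  { remoteCalls.incrementAndGet(); return target.getCity(); }
}

// Serializable structure that the facade returns by value.
final class CustomerSummary implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name, email, city;
    CustomerSummary(String name, String email, String city) {
        this.name = name;
        this.email = email;
        this.city = city;
    }
}

// Simulated session facade: one remote call in, several local calls inside.
class CustomerFacade {
    private final Customer local;
    private final AtomicInteger remoteCalls;
    CustomerFacade(Customer local, AtomicInteger remoteCalls) {
        this.local = local;
        this.remoteCalls = remoteCalls;
    }
    CustomerSummary getSummary() {
        remoteCalls.incrementAndGet(); // the single client-to-facade round trip
        return new CustomerSummary(local.getName(), local.getEmail(), local.getCity());
    }
}
```

Reading three fields through `RemoteCustomerStub` costs three simulated remote calls, while `CustomerFacade.getSummary()` costs one regardless of how many fields the summary carries.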

Transfer Objects Reduce Remote Calls

EJBs consume significant system resources and network bandwidth to execute—they are unlikely to be the appropriate implementation for every object in an application.

Use EJBs to model logical groupings of information and associated business logic. For example, use an EJB to model a logical subset of the line items on an invoice, such as the items to which discounts, rebates, taxes, or other adjustments apply.

In contrast, an individual line item in an invoice is fine-grained, and implementing it as an EJB wastes network resources. Implement objects that simply represent a set of data fields, and that require only get and set functionality, as transfer objects.

Transfer objects (sometimes referred to as value objects or helper classes) are good for modeling entities that contain a group of attributes that are always accessed together. A transfer object is a serializable class within an EJB that groups related attributes, forming a composite value. This class is used as the return type of a remote business method.

Clients receive instances of this class by calling coarse-grained business methods, and then locally access the fine-grained values within the transfer object. Fetching multiple values in one server round-trip decreases network traffic and minimizes latency and server resource usage.
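A minimal sketch of a transfer object for the invoice line item example follows. The class name `LineItemTO`, its fields, and the `copyByValue` helper are assumptions made for illustration; the helper uses standard Java serialization to mimic what happens when a remote business method returns the object by value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.math.BigDecimal;

// Hypothetical transfer object: a serializable group of attributes
// that are always read together.
final class LineItemTO implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String sku;
    private final int quantity;
    private final BigDecimal unitPrice;

    LineItemTO(String sku, int quantity, BigDecimal unitPrice) {
        this.sku = sku;
        this.quantity = quantity;
        this.unitPrice = unitPrice;
    }

    String getSku()           { return sku; }
    int getQuantity()         { return quantity; }
    BigDecimal getUnitPrice() { return unitPrice; }

    // Derived value computed locally on the client, with no further remote call.
    BigDecimal getTotal() {
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }
}

class TransferObjectDemo {
    // Serializes and deserializes the object, imitating a by-value remote return.
    static LineItemTO copyByValue(LineItemTO item) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(item);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LineItemTO) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

The client receives an independent copy, so every getter call after the single fetch is local.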

Distributed Transactions Increase Remote Calls

Avoid transactions that span multiple server instances. Distributed transactions issue remote calls, consume network bandwidth, and add overhead for resource coordination.

Web Application Design Considerations

The following sections describe design considerations for clustered servlets and JSPs.

Configure In-Memory Replication

Design for Idempotence

Failures or impatient users can result in duplicate servlet requests. Design servlets to tolerate duplicate requests.
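One common way to tolerate duplicates is to attach a unique token to each logical request and process each token only once. The sketch below shows that idea in plain Java; `OrderHandler`, the token scheme, and the result format are all hypothetical, and in a real cluster the token map would have to live in replicated session state or a shared store rather than in one server's memory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical request handler that processes each request token at most once.
class OrderHandler {
    // Result of each processed request, keyed by the client-supplied token.
    private final Map<String, String> resultsByToken = new ConcurrentHashMap<>();
    private final AtomicInteger ordersPlaced = new AtomicInteger();

    // A duplicate submission with the same token returns the stored result
    // instead of placing a second order.
    String submit(String requestToken, String item) {
        return resultsByToken.computeIfAbsent(
            requestToken,
            token -> "order-" + ordersPlaced.incrementAndGet() + ":" + item);
    }

    int ordersPlaced() {
        return ordersPlaced.get();
    }
}
```

A user who clicks Submit twice sends the same token twice, so only one order is placed and both responses carry the same confirmation.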

Programming Considerations

EJB Design Considerations

The following sections describe design considerations for clustered RMI objects.

Design Idempotent Methods

It is not always possible to determine when a server instance failed with respect to the work it was doing at the time of failure. For instance, if a server instance fails after handling a client request but before returning the response, there is no way to tell that the request was handled. A user who does not get a response retries, resulting in an additional request.

Failover for RMI objects requires that methods be idempotent. An idempotent method is one that can be repeated with no negative side-effects.
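The following sketch contrasts an idempotent method with a non-idempotent one when a call is replayed after a lost response. `CartItem` and `RetryDemo` are hypothetical names, and the replay is simulated by simply running the call twice.

```java
// Hypothetical business object with one idempotent and one non-idempotent method.
class CartItem {
    private int quantity;

    // Not idempotent: repeating the call changes the result again.
    void addQuantity(int delta) {
        quantity += delta;
    }

    // Idempotent: repeating the call leaves the same final state.
    void setQuantity(int value) {
        quantity = value;
    }

    int getQuantity() {
        return quantity;
    }
}

class RetryDemo {
    // Simulates a failover replay: the first invocation completes, its
    // response is lost, and the same invocation is sent a second time.
    static void callWithReplay(Runnable call) {
        call.run();
        call.run();
    }
}
```

Replaying `setQuantity(3)` leaves the quantity at 3, while replaying `addQuantity(3)` on a fresh item leaves it at 6, which is why only methods of the first kind are safe to retry automatically.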

Follow Usage and Configuration Guidelines

The following table summarizes usage and configuration guidelines for EJBs. For a list of configurable cluster behaviors, see Table 11-2.

Object Type	Usage	Configuration
EJBs of all types	Use EJBs to model logical groupings of information and associated business logic. See Transfer Objects Reduce Remote Calls.	Configure clusterable homes.
Stateful session beans	Recommended for high volume, heavy-write transactions. Remove stateful session beans when finished to minimize EJB container overhead. A stateful session bean instance is associated with a particular client, and remains in the container until explicitly removed by the client, or removed by the container when it times out. Meanwhile, the container may passivate inactive instances to disk, which adds overhead and can affect performance. Note: Although unlikely, the current state of a stateful session bean can be lost. For example, if a client commits a transaction involving the bean and there is a failure of the primary server before the state change is replicated, the client will fail over to the previously stored state of the bean. If it is critical to preserve bean state in all possible failover scenarios, use an entity EJB rather than a stateful session EJB.	Configure clusterable homes. Configure in-memory replication for EJBs.
Stateless Session Beans	Scale better than stateful session beans, which are instantiated on a per-client basis and can multiply and consume resources rapidly. When a home creates a stateless bean, it returns a replica-aware stub that can route to any server where the bean is deployed. Because a stateless bean holds no state on behalf of the client, the stub is free to route any call to any server that hosts the bean.	Configure clusterable homes. Configure Cluster Address. Configure methods to be idempotent to support failover during method calls. (Failover is the default behavior if a failure occurs between method calls or if the method fails to connect to a server.) The methods on stateless session bean homes are automatically set to be idempotent. It is not necessary to explicitly specify them as idempotent.
Read-only Entity Beans	Recommended whenever stale data is tolerable; suitable for product catalogs and the majority of content within many applications. Reads are performed against a local cache that is invalidated on a timer basis. Read-only entities perform three to four times faster than transactional entities. Note: A client can successfully call setter methods on a read-only entity bean; however, the data will never be moved into the persistent store.	Configure clusterable homes. Configure Cluster Address. Methods are configured to be idempotent by default.
Read-Write Entity Beans	Best suited for shared persistent data that is not subject to heavy request and update load. If the access/update load is high, consider session beans and JDBC. Recommended for applications that require high data consistency, for instance, customer account maintenance. All reads and writes are performed against the database. Use the isModified method to reduce writes. For read-mostly applications, characterized by frequent reads and occasional updates (for instance, a catalog), a combination of read-only and read-write beans that extend the read-only beans is suitable. The read-only bean provides fast, weakly consistent reads, while the read-write bean provides strongly consistent writes.	Configure clusterable homes. Configure methods to be idempotent to support failover during method calls. (Failover is the default behavior if a failure occurs between method calls or if the method fails to connect to a server.) The methods on read-only entity beans are automatically set to be idempotent.
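The relationship between replica-aware routing and idempotence can be sketched as follows. This is a simplified model rather than WebLogic's stub implementation: `ReplicaAwareStubSketch`, its round-robin choice, and the use of any `RuntimeException` to represent a failure during a method call are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stub that picks a server round-robin and retries on another
// server only when the method has been declared idempotent.
class ReplicaAwareStubSketch {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    ReplicaAwareStubSketch(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    // Invokes the call on one server; on failure, moves to the next server
    // only if the call is idempotent and therefore safe to repeat.
    <R> R invoke(Function<String, R> call, boolean idempotent) {
        int attempts = idempotent ? servers.size() : 1;
        RuntimeException lastFailure = null;
        for (int i = 0; i < attempts; i++) {
            int index = Math.floorMod(next.getAndIncrement(), servers.size());
            try {
                return call.apply(servers.get(index));
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try the next replica
            }
        }
        throw lastFailure;
    }
}
```

With servers `s1` and `s2` where `s1` is down, an idempotent call succeeds on `s2`, while a non-idempotent call surfaces the failure to the caller because repeating it might apply the work twice.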

Cluster-Related Configuration Options

The following table lists key behaviors that you can configure for a cluster, and the associated method of configuration.

Table 11-2 Cluster-Related Configuration Options
Configurable Behavior or Resource	How to Configure
clusterable homes	Set `home-is-clusterable` in `weblogic-ejb-jar.xml` to “true”.
idempotence	At bean level, set `stateless-bean-methods-are-idempotent` in `weblogic-ejb-jar.xml` to “true”. At method level, set `idempotent-methods` in `weblogic-ejb-jar.xml`.
in-memory replication for EJBs	Set `replication-type` in `weblogic-ejb-jar.xml` to “InMemory”.
Cluster Address	The cluster address identifies the Managed Servers in the cluster. The cluster address is used in entity and stateless beans to construct the host name portion of URLs. The cluster address can be assigned explicitly, or generated automatically by WebLogic Server for each request. For more information, see Cluster Address.
`clients-on-same-server`	Set `clients-on-same-server` in `weblogic-ejb-jar.xml` to “True” if all clients that will access the EJB will do so from the same server on which the bean is deployed. If `clients-on-same-server` is “True”, the server instance does not multicast JNDI announcements for the EJB when it is deployed, which reduces the startup time for large clusters.
Load balancing algorithm for entity EJB homes	`home-load-algorithm` in `weblogic-ejb-jar.xml` specifies the algorithm to use for load balancing between replicas of the EJB home. If this element is not defined, WebLogic Server uses the algorithm specified by the `weblogic.cluster.defaultLoadAlgorithm` attribute in `config.xml`.
Custom load balancing for entity EJBs, stateful session EJBs, and stateless session EJBs	Use `home-call-router-class-name` in `weblogic-ejb-jar.xml` to specify the name of a custom class to use for routing bean method calls for these types of beans. This class must implement `weblogic.rmi.cluster.CallRouter`. For more information, see The WebLogic Cluster API.
Custom load balancing for stateless session beans	Use `stateless-bean-call-router-class-name` in `weblogic-ejb-jar.xml` to specify the name of a custom class to use for routing stateless session bean method calls. This class must implement `weblogic.rmi.cluster.CallRouter`. For more information, see The WebLogic Cluster API.
Configure stateless session bean as clusterable	Set `stateless-bean-is-clusterable` in `weblogic-ejb-jar.xml` to “true” to allow the EJB to be deployed to a cluster.
Load balancing algorithm for stateless session beans	Use `stateless-bean-load-algorithm` in `weblogic-ejb-jar.xml` to specify the algorithm to use for load balancing between replicas of the EJB home. If this property is not defined, WebLogic Server uses the algorithm specified by the `weblogic.cluster.defaultLoadAlgorithm` attribute in `config.xml`.
Machine	The WebLogic Server Machine resource associates server instances with the computer on which they run. For more information, see Configure Machine Names.
Replication groups	Replication groups allow you to control where HTTP session states are replicated. For more information, see Configure Replication Groups.

State Management in a Cluster

Different services in a WebLogic Server cluster provide varying types and degrees of state management. This list defines four categories of service that are distinguished by how they maintain state in memory or persistent storage:

Stateless services: A stateless service does not maintain state in memory between invocations.
Conversational services: A conversational service is dedicated to a particular client for the duration of a session. During the session, it serves all requests from the client, and only requests from that client. Throughout a session there is generally state information that the application server must maintain between requests. Conversational services typically maintain transient state in memory, which can be lost in the event of failure. If session state is written to a shared persistent store between invocations, the service is stateless. If persistent storage of state is not required, alternatives for improving performance and scalability include:

Session state can be sent back and forth between the client and server under the covers, again resulting in a stateless service. This approach is not always feasible or desirable, particularly with large amounts of data.
More commonly, session state may be retained in memory on the application server between requests. Session state can be paged out from memory as necessary to free up memory. Performance and scalability are still improved in this case because updates are not individually written to disk and the data is not expected to survive server failures.

Cached services: A cached service maintains state in memory and uses it to process requests from multiple clients. Implementations of cached services vary in the extent to which they keep the copies of cached data consistent with each other and with associated data in the backing store.
Singleton services: A singleton service is active on exactly one server in the cluster at a time and processes requests from multiple clients. A singleton service is generally backed by private, persistent data, which it caches in memory. It may also maintain transient state in memory, which is either regenerated or lost in the event of failure. Upon failure, a singleton service must be restarted on the same server or migrated to a new server.
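As one illustration of the cached category, the sketch below keeps values in memory for all callers and reloads an entry from a backing source once it is older than a fixed interval, so reads may be stale within that interval. `TimedCache`, its loader function, and the injectable clock are hypothetical names chosen for this example; it is not how any particular WebLogic service is implemented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical cached service: serves many clients from memory and reloads
// an entry from the backing source after a time-to-live has elapsed.
class TimedCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long loadedAtMillis;
        Entry(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // reads from the backing store
    private final long ttlMillis;          // how long a cached value may be served
    private final LongSupplier clock;      // injectable time source, useful in tests

    TimedCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            // Missing or expired: reload from the backing store and cache it.
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`; the trade-off is that a shorter interval gives fresher data at the cost of more reads against the backing store.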

Table 11-3 summarizes how Java EE and WebLogic support each of these categories of service.

Application Deployment Considerations

Deploy clusterable objects to the cluster, rather than to individual Managed Servers in the cluster. For information and recommendations, see Deploying Applications to WebLogic Server.

Architecture Considerations

For information about alternative cluster architectures, load balancing options, and security options, see Cluster Architectures.

Avoiding Problems

The following sections present considerations to keep in mind when planning and configuring a cluster.

Naming Considerations

Administration Server Considerations

To start up WebLogic Server instances that participate in a cluster, each Managed Server must be able to connect to the Administration Server that manages configuration information for the domain that contains the cluster. For security purposes, the Administration Server should reside within the same DMZ as the WebLogic Server cluster.

The Administration Server maintains the configuration information for all server instances that participate in the cluster. The config.xml file that resides on the Administration Server contains configuration data for all clustered and non-clustered servers in the Administration Server’s domain. You do not create a separate configuration file for each server in the cluster.

The Administration Server must be available in order for clustered WebLogic Server instances to start up. Note, however, that once a cluster is running, a failure of the Administration Server does not affect ongoing cluster operation.

The Administration Server should not participate in a cluster. The Administration Server should be dedicated to the process of administering servers: maintaining configuration data, starting and shutting down servers, and deploying and undeploying applications. If the Administration Server also handles client requests, there is a risk of delays in accomplishing administration tasks.

There is no benefit in clustering an Administration Server; the administrative objects are not clusterable, and will not fail over to another cluster member if the Administration Server fails. Deploying applications on an Administration Server can reduce the stability of the server and the administrative functions it provides. If an application you deploy on the Administration Server behaves unexpectedly, it could interrupt operation of the Administration Server.

For these reasons, make sure that the Administration Server’s IP address is not included in the cluster-wide DNS name.

Firewall Considerations

If your configuration includes a firewall, locate your proxy server or load balancer in your DMZ, and place the cluster, including both Web and EJB containers, behind the firewall. Placing Web containers in the DMZ is not recommended. See Basic Firewall for Proxy Architectures.

If you place a firewall between the servlet cluster and object cluster in a multi-tier architecture, bind all servers in the object cluster to public DNS names, rather than IP addresses. Binding those servers with IP addresses can cause address translation problems and prevent the servlet cluster from accessing individual server instances.

If the internal and external DNS names of a WebLogic Server instance are not identical, use the ExternalDNSName attribute for the server instance to define the server's external DNS name. Outside the firewall, the ExternalDNSName should translate to the external IP address of the server. Set this attribute in the Administration Console on the Server > Configuration > General tab. See Server > Configuration > General in Administration Console Online Help.

In any cluster architecture that utilizes one or more firewalls, it is critical to identify all WebLogic Server instances using publicly-available DNS names, rather than IP addresses. Using DNS names avoids problems associated with address translation policies used to mask internal IP addresses from untrusted clients.
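On the client side, this guideline amounts to building connection settings from a DNS name rather than an address literal. The sketch below prepares a JNDI environment for a `t3` URL and rejects IPv4 literals; the class `ClusterJndiEnvironment`, the host name in the usage note, and the rejection rule are assumptions for illustration, and no connection is opened.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper that builds JNDI settings from a DNS name only.
class ClusterJndiEnvironment {
    // Matches dotted-quad IPv4 literals such as 10.0.0.5.
    private static final String IPV4_LITERAL = "\\d{1,3}(\\.\\d{1,3}){3}";

    static Hashtable<String, String> forCluster(String clusterDnsName, int port) {
        if (clusterDnsName.matches(IPV4_LITERAL)) {
            // An address literal may not survive address translation at the firewall.
            throw new IllegalArgumentException(
                "use a DNS name, not an IP address: " + clusterDnsName);
        }
        Hashtable<String, String> env = new Hashtable<>();
        // Factory class name passed as a string; WebLogic client libraries
        // are needed only when a context is actually created.
        env.put(Context.INITIAL_CONTEXT_FACTORY, "weblogic.jndi.WLInitialContextFactory");
        env.put(Context.PROVIDER_URL, "t3://" + clusterDnsName + ":" + port);
        return env;
    }
}
```

For example, `ClusterJndiEnvironment.forCluster("mycluster.example.com", 7001)` yields a provider URL of `t3://mycluster.example.com:7001`, where the host name and port are placeholders.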

The following figure describes the potential problem with using IP addresses to identify WebLogic Server instances. In this figure, the firewall translates external IP requests for the subnet “xxx” to internal IP addresses having the subnet “yyy.”

The following steps describe the connection process and potential point of failure:

If there was no translation between external and internal IP addresses, the firewall would pose no problems to the client in the above scenario. However, most security policies involve hiding (and denying access to) internal IP addresses.

Evaluate Cluster Capacity Prior to Production Use

The architecture of your cluster will influence the capacity of your system. Before deploying applications for production use, evaluate performance to determine if and where you may need to add servers or server hardware to support real-world client loads. Testing software such as LoadRunner from Mercury Interactive allows you to simulate heavy client usage.
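For a first rough check before reaching for a dedicated load-testing tool, a small driver can simulate concurrent clients. The sketch below is a minimal, hypothetical harness (`LoadSketch` and its parameters are illustrative); the `request` argument is a placeholder for whatever call exercises the application, and a real evaluation would also record latencies and errors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical load driver: runs the same request from several simulated
// clients in parallel and reports how many requests completed.
class LoadSketch {
    static long run(int clients, int requestsPerClient, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        AtomicLong completed = new AtomicLong();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                // Each task represents one client issuing requests back to back.
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < requestsPerClient; i++) {
                        request.run();
                        completed.incrementAndGet();
                    }
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for every simulated client to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load run interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a simulated client failed", e);
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }
}
```

Wrapping the call in `System.nanoTime()` measurements gives a rough throughput figure to compare as servers are added to the cluster.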
