Sun Java System Application Server 9.1 Deployment Planning Guide

Chapter 2 Planning your Deployment

Before deploying the Application Server, first determine the performance and availability goals, and then make decisions about the hardware, network, and storage requirements accordingly.

This chapter contains the following sections:

Establishing Performance Goals

At its simplest, high performance means maximizing throughput and reducing response time. Beyond these basic goals, you can establish specific goals by determining the following:

What types of applications and services are deployed, and how do clients access them?
Which applications and services need to be highly available?
Do the applications have session state or are they stateless?
What request capacity or throughput must the system support?
How many concurrent users must the system support?
What is an acceptable average response time for user requests?
What is the average think time between requests?

You can calculate some of these metrics using a remote browser emulator (RBE) tool, or web site performance and benchmarking software that simulates expected application activity. Typically, RBE and benchmarking products generate concurrent HTTP requests and then report the response time for a given number of requests per minute. You can then use these figures to calculate server activity.

The results of the calculations described in this chapter are not absolute. Treat them as reference points to work against, as you fine-tune the performance of the Application Server and your applications.

This section discusses the following topics:

Estimating Throughput

In broad terms, throughput measures the amount of work performed by Application Server. For Application Server, throughput can be defined as the number of requests processed per minute per server instance. High availability applications also impose throughput requirements on HADB, since they save session state data periodically. For HADB, throughput can be defined as volume of session data stored per minute, which is the product of the number of HADB requests per minute, and the average session size per request.

As described in the next section, Application Server throughput is a function of many factors, including the nature and size of user requests, number of users, and performance of Application Server instances and back-end databases. You can estimate throughput on a single machine by benchmarking with simulated workloads.

High availability applications incur additional overhead because they periodically save data to HADB. The amount of overhead depends on the amount of data, how frequently it changes, and how often it is saved. The first two factors depend on the application in question; the latter is also affected by server settings.

HADB throughput can be defined as the number of HADB requests per minute multiplied by the average amount of data per request. Larger throughput to HADB implies that more HADB nodes are needed and a larger store size.

Estimating Load on Application Server Instances

Consider the following factors to estimate the load on Application Server instances:

Maximum Number of Concurrent Users

Users interact with an application through a client, such as a web browser or Java program. Based on the user’s actions, the client periodically sends requests to the Application Server. A user is considered active as long as the user’s session has neither expired nor been terminated. When estimating the number of concurrent users, include all active users.

The following figure illustrates a typical graph of requests processed per minute (throughput) versus number of users. Initially, as the number of users increases, throughput increases correspondingly. However, as the number of concurrent requests increases, server performance begins to saturate, and throughput begins to decline.

Identify the point at which adding concurrent users reduces the number of requests that can be processed per minute. This point indicates when optimal performance is reached and beyond which throughput start to degrade. Generally, strive to operate the system at optimal throughput as much as possible. You might need to add processing power to handle additional load and increase throughput.

Think Time

A user does not submit requests continuously. A user submits a request, the server receives and processes the request, and then returns a result, at which point the user spends some time before submitting a new request. The time between one request and the next is called think time.

Think times are dependent on the type of users. For example, machine-to-machine interaction such as for a web service typically has a lower think time than that of a human user. You may have to consider a mix of machine and human interactions to estimate think time.

Determining the average think time is important. You can use this duration to calculate the number of requests that need to be completed per minute, as well as the number of concurrent users the system can support.

Average Response Time

Response time refers to the amount of time Application Server takes to return the results of a request to the user. The response time is affected by factors such as network bandwidth, number of users, number and type of requests submitted, and average think time.

In this section, response time refers to the mean, or average, response time. Each type of request has its own minimal response time. However, when evaluating system performance, base the analysis on the average response time of all requests.

The faster the response time, the more requests per minute are being processed. However, as the number of users on the system increases, the response time starts to increase as well, even though the number of requests per minute declines, as the following diagram illustrates:

Sorry: graphics not available in this version of the manual.

A system performance graph similar to this figure indicates that after a certain point, requests per minute are inversely proportional to response time. The sharper the decline in requests per minute, the steeper the increase in response time (represented by the dotted line arrow).

In the figure, the point of the peak load is the point at which requests per minute start to decline. Prior to this point, response time calculations are not necessarily accurate because they do not use peak numbers in the formula. After this point, (because of the inversely proportional relationship between requests per minute and response time), the administrator can more accurately calculate response time using maximum number of users and requests per minute.

Use the following formula to determine T_response, the response time (in seconds) at peak load:

T_response = n/r - T_think

where

n is the number of concurrent users
r is the number requests per second the server receives
T_think is the average think time (in seconds)

To obtain an accurate response time result, always include think time in the equation.

Example 2–1 Calculation of Response Time

If the following conditions exist:

Maximum number of concurrent users, n, that the system can support at peak load is 5,000.
Maximum number of requests, r, the system can process at peak load is 1,000 per second.

Average think time, T_think, is three seconds per request.

Thus, the calculation of response time is:

T_response = n/r - T_think = (5000/ 1000) - 3 sec. = 5 - 3 sec.

Therefore, the response time is two seconds.

After the system’s response time has been calculated, particularly at peak load, compare it to the acceptable response time for the application. Response time, along with throughput, is one of the main factors critical to the Application Server performance.

Requests Per Minute

If you know the number of concurrent users at any given time, the response time of their requests, and the average user think time, then you can calculate the number of requests per minute. Typically, start by estimating the number of concurrent users that are on the system.

For example, after running web site performance software, the administrator concludes that the average number of concurrent users submitting requests on an online banking web site is 3,000. This number depends on the number of users who have signed up to be members of the online bank, their banking transaction behavior, the time of the day or week they choose to submit requests, and so on.

Therefore, knowing this information enables you to use the requests per minute formula described in this section to calculate how many requests per minute your system can handle for this user base. Since requests per minute and response time become inversely proportional at peak load, decide if fewer requests per minute is acceptable as a trade-off for better response time, or alternatively, if a slower response time is acceptable as a trade-off for more requests per minute.

Experiment with the requests per minute and response time thresholds that are acceptable as a starting point for fine-tuning system performance. Thereafter, decide which areas of the system require adjustment.

Solving for r in the equation in the previous section gives:

r = n/(T_response + T_think)

Example 2–2 Calculation of Requests Per Second

For the values:

n = 2,800 concurrent users
T_response = 1 (one second per request average response time)
T_think = 3, (three seconds average think time)

The calculation for the number of requests per second is:

r = 2800 / (1+3) = 700

Therefore, the number of requests per second is 700 and the number of requests per minute is 42000.

Estimating Load on the HADB

To calculate load on the HADB, consider the following factors:

For instructions on configuring session persistence, see Chapter 8, Configuring High Availability Session Persistence and Failover, in Sun Java System Application Server 9.1 High Availability Administration Guide.

HTTP Session Persistence Frequency

The number of requests per minute received by the HADB depends on the persistence frequency. Persistence Frequency determines how often Application Server saves HTTP session data to the HADB.

The persistence frequency options are:

web-method (default): the server stores session data with every HTTP response. This option guarantees that stored session information will be up to date, but leads to high traffic to HADB.
time-based: the session is stored at the specified time interval. This option reduces the traffic to HADB, but does not guarantee that the session information will be up to date.

The following table summarizes the advantages and disadvantages of persistence frequency options.

Table 2–1 Comparison of Persistence Frequency Options


Persistence Frequency Option	Advantages	Disadvantages
web-method	Guarantees that the most up-to-date session information is available.	Potentially increased response time and reduced throughput.
time-based	Better response time and potentially better throughput.	Less guarantee that the most updated session information is available after the failure of an application server instance.

HTTP Session Size and Scope

The session size per request depends on the amount of session information stored in the session.

Tip –

To improve overall performance, reduce the amount of information in the session as much as possible.

It is possible to fine-tune the session size per request through the persistence scope settings. Choose from the following options for HTTP session persistence scope:

session: The server serializes and saves the entire session object every time it saves session information to HADB.
modified-session: The server saves the session only if the session has been modified. It detects modification by intercepting calls to the bean’s setAttribute() method. This option will not detect direct modifications to inner objects, so in such cases the SFSB must be coded to call setAttribute() explicitly.
modified-attribute: The server saves only those attributes that have been modified (inserted, updated, or deleted) since the last time the session was stored. This has the same drawback as modified-session but can significantly reduce HADB write throughput requirements if properly applied.

To use this option, the application must:

Call setAttribute() or removeAttribute() every time it modifies session state.
Make sure there are no cross references between attributes.
Distribute the session state across multiple attributes, or at least between a read-only attribute and a modifiable attribute.

The following table summarizes the advantages and disadvantages of the persistence scope options.

Table 2–2 Comparison of Persistence Scope Options


Persistence Scope Option	Advantages	Disadvantages
modified-session	Provides improved response time for requests that do not modify session state.	During the execution of a web method, typically `doGet()` or `doPost(`), the application must call a session method: `setAttribute()` if the attribute was changed `removeAttribute()` if the attribute was removed.
session	No constraint on applications.	Potentially poorer throughput and response time as compared to the `modified-session` and the `modified-attribute` options.
modified-attribute	Better throughput and response time for requests in which the percentage of session state modified is low.	As the percentage of session state modified for a given request nears 60%, throughput and response time degrade. In such cases, the performance is worse than the other options because of the overhead of splitting the attributes into separate records.

Stateful Session Bean Checkpointing

For SFSB session persistence, the load on HADB depends on the following:

Number of SFSBs enabled for checkpointing.
Which SFSB methods are selected for checkpointing, and how often they are used.
Size of the session object.
Which methods are transactional.

Checkpointing generally occurs after any transaction involving the SFSB is completed (even if the transaction rolls back).

For better performance, specify a small set of methods for checkpointing. The size of the data that is being checkpointed and the frequency of checkpointing determine the additional overhead in response time for a given client interaction.

Planning the Network Configuration

When planning how to integrate the Application Server into the network, estimate the bandwidth requirements and plan the network in such a way that it can meet users’ performance requirements.

The following topics are covered in this section:

Estimating Bandwidth Requirements

To decide on the desired size and bandwidth of the network, first determine the network traffic and identify its peak. Check if there is a particular hour, day of the week, or day of the month when overall volume peaks, and then determine the duration of that peak.

During peak load times, the number of packets in the network is at its highest level. In general, if you design for peak load, scale your system with the goal of handling 100 percent of peak volume. Bear in mind, however, that any network behaves unpredictably and that despite your scaling efforts, it might not always be able handle 100 percent of peak volume.

For example, assume that at peak load, five percent of users occasionally do not have immediate network access when accessing applications deployed on Application Server. Of that five percent, estimate how many users retry access after the first attempt. Again, not all of those users might get through, and of that unsuccessful portion, another percentage will retry. As a result, the peak appears longer because peak use is spread out over time as users continue to attempt access.

Calculating Bandwidth Required

Based on the calculations made in Establishing Performance Goals, determine the additional bandwidth required for deploying the Application Server at your site.

Depending on the method of access (T-1 lines, ADSL, cable modem, and so on), calculate the amount of increased bandwidth required to handle your estimated load. For example, suppose your site uses T-1 or higher-speed T-3 lines. Given their bandwidth, estimate how many lines are needed on the network, based on the average number of requests generated per second at your site and the maximum peak load. Calculate these figures using a web site analysis and monitoring tool.

Example 2–3 Calculation of Bandwidth Required

A single T-1 line can handle 1.544 Mbps. Therefore, a network of four T-1 lines can handle approximately 6 Mbps of data. Assuming that the average HTML page sent back to a client is 30 kilobytes (KB), this network of four T-1 lines can handle the following traffic per second:

6,176,000 bits/8 bits = 772,000 bytes per second

772,000 bytes per second/30 KB = approximately 25 concurrent response pages per second.

With traffic of 25 pages per second, this system can handle 90,000 pages per hour (25 x 60 seconds x 60 minutes), and therefore 2,160,000 pages per day maximum, assuming an even load throughout the day. If the maximum peak load is greater than this, increase the bandwidth accordingly.

Estimating Peak Load

Having an even load throughout the day is probably not realistic. You need to determine when the peak load occurs, how long it lasts, and what percentage of the total load is the peak load.

Example 2–4 Calculation of Peak Load

If the peak load lasts for two hours and takes up 30 percent of the total load of 2,160,000 pages, this implies that 648,000 pages must be carried over the T-1 lines during two hours of the day.

Therefore, to accommodate peak load during those two hours, increase the number of T-1 lines according to the following calculations:

648,000 pages/120 minutes = 5,400 pages per minute

5,400 pages per minute/60 seconds = 90 pages per second

If four lines can handle 25 pages per second, then approximately four times that many pages requires four times that many lines, in this case 16 lines. The 16 lines are meant for handling the realistic maximum of a 30 percent peak load. Obviously, the other 70 percent of the load can be handled throughout the rest of the day by these many lines.

Configuring Subnets

If you use the separate tier topology, where the application server instances and HADB nodes are on separate host machines, you can improve performance by having all HADB nodes on a separate subnet. This is because HADB uses the User Datagram Protocol (UDP). Using a separate subnet reduces the UDP traffic on the machines outside of that subnet. Note, however, that all HADB nodes must be on the same subnet.

You can still run the management client from a different subnet as long as all the nodes and management agents are on the same subnet. All hosts and ports should be accessible within all node agents and node must not be blocked by firewalls, blocking of UDP, and so on.

HADB uses UDP multicast, so any subnet containing HADB nodes must be configured for multicast.

Choosing Network Cards

For greater bandwidth and optimal network performance, use at least 100 Mbps Ethernet cards or, preferably, 1 Gbps Ethernet cards between servers hosting the Application Server and the HADB nodes.

Network Settings for HADB

Note –

HADB uses UDP multicast and thus you must enable multicast on your system’s routers and host network interface cards. If HADB spans multiple sub-networks, you must also enable multicast on the routers between the sub-networks. For best results, put HADB nodes all on same network. Application server instances may be on a different sub network.

The following suggestions will enable HADB to work optimally in the network:

Use switched routers so that each network interface has a dedicated 100 Mbps or better Ethernet channel.
When running HADB on a multi-CPU machine hosting four or more HADB nodes, use 1 Gbps Ethernet cards. If the average session size is greater than 50 KB, use 1 Gbps Ethernet cards even if there are less than four HADB nodes per machine.
If you suspect network bottlenecks within HADB nodes:
- Run network monitoring software on your HADB servers to diagnose the problem.
- Consider replacing any 100 Mbps Ethernet cards in the network with 1 Gbps Ethernet cards.

Planning for Availability

This section contains the following topics:

Rightsizing Availability

To plan availability of systems and applications, assess the availability needs of the user groups that access different applications. For example, external fee-paying users and business partners often have higher quality of service (QoS) expectations than internal users. Thus, it may be more acceptable to internal users for an application feature, application, or server to be unavailable than it would be for paying external customers.

The following figure illustrates the increasing cost and complexity of mitigating against decreasingly probable events. At one end of the continuum, a simple load-balanced cluster can tolerate localized application, middleware, and hardware failures. At the other end of the scale, geographically distinct clusters can mitigate against major catastrophes affecting the entire data center.

To realize a good return on investment, it often makes sense identify availability requirements of features within an application. For example, it may not be acceptable for an insurance quotation system to be unavailable (potentially turning away new business), but brief unavailability of the account management function (where existing customers can view their current coverage) is unlikely to turn away existing customers.

Using Clusters to Improve Availability

At the most basic level, a cluster is a group of application server instances—often hosted on multiple physical servers—that appear to clients as a single instance. This provides horizontal scalability as well as higher availability than a single instance on a single machine. This basic level of clustering works in conjunction with the Application Server’s HTTP load balancer plug-in, which accepts HTTP and HTTPS requests and forwards them to one of the application server instances in the cluster. The ORB and integrated JMS brokers also perform load balancing to application server clusters. If an instance fails, become unavailable (due to network faults), or becomes unresponsive, requests are redirected only to existing, available machines. The load balancer can also recognize when an failed instance has recovered and redistribute load accordingly.

The HTTP load balancer also provides a health checker program that can monitor servers and specific URLs to determine whether they are available. You must carefully manage the overhead of health checking so that it does not become a large processing burden itself.

For stateless applications or applications that only involve low-value, simple user transactions, a simple load balanced cluster is often all that is required. For stateful, mission-critical applications, consider using HADB for session persistence. For an overview of HADB, see High-Availability Database in Chapter 1, Product Concepts Application Server Administration Guide.

To perform online upgrades of applications, it is best to group the application server instances into multiple clusters. The Application Server has the ability to quiesce both applications and instances. Quiescence is the ability to take an instance (or group of instances) or a specific application offline in a controlled manner without impacting the users currently being served by the instance or application. As one instance is quiesced, new users are served by the upgraded application on another instance. This type of application upgrade is called a rolling upgrade. For more information on upgrading live applications, see Upgrading Applications Without Loss of Availability in Sun Java System Application Server 9.1 High Availability Administration Guide.

Adding Redundancy to the System

One way to achieve high availability is to add hardware and software redundancy to the system. When one unit fails, the redundant unit takes over. This is also referred to as fault tolerance. In general, to maximize high availability, determine and remove every possible point of failure in the system.

Identifying Failure Classes

The level of redundancy is determined by the failure classes (types of failure) that the system needs to tolerate. Some examples of failure classes are:

System process
Machine
Power supply
Disk
Network failures
Building fires or other preventable disasters
Unpredictable natural catastrophes

Duplicated system processes tolerate single system process failures, as well as single machine failures. Attaching the duplicated mirrored (paired) machines to different power supplies tolerates single power failures. By keeping the mirrored machines in separate buildings, a single-building fire can be tolerated. By keeping them in separate geographical locations, natural catastrophes like earthquakes can be tolerated.

Using HADB Redundancy Units to Improve Availability

To improve availability, HADB nodes are always used in Data Redundancy Units (DRUs) as explained in Establishing Performance Goals.

Using HADB Spare Nodes to Improve Fault Tolerance

Using spare nodes improves fault tolerance. Although spare nodes are not mandatory, they provide maximum availability.

Planning Failover Capacity

Failover capacity planning implies deciding how many additional servers and processes you need to add to the Application Server deployment so that in the event of a server or process failure, the system can seamlessly recover data and continue processing. If your system gets overloaded, a process or server failure might result, causing response time degradation or even total loss of service. Preparing for such an occurrence is critical to successful deployment.

To maintain capacity, especially at peak loads, add spare machines running Application Server instances to the existing deployment.

For example, consider a system with two machines running one Application Server instance each. Together, these machines handle a peak load of 300 requests per second. If one of these machines becomes unavailable, the system will be able to handle only 150 requests, assuming an even load distribution between the machines. Therefore, half the requests during peak load will not be served.

Design Decisions

Design decisions include whether you are designing the system for peak or steady-state load, and the number of machines in various roles and their sizes.

Designing for Peak or Steady State Load

In a typical deployment, there is a difference between steady state and peak workloads:

If the system is designed to handle peak load, it can sustain the expected maximum load of users and requests without degrading response time. This implies that the system can handle extreme cases of expected system load. If the difference between peak load and steady state load is substantial, designing for peak loads can mean spending money on resources that are often idle.
If the system is designed to handle steady state load, it does not have all the resources required to handle the expected peak load. Thus, the system has a slower response time when peak load occurs.

How often the system is expected to handle peak load will determine whether you want to design for peak load or for steady state.

If peak load occurs often—say, several times per day—it may be worthwhile to expand capacity to handle it. If the system operates at steady state 90 percent of the time, and at peak only 10 percent of the time, then it may be preferable to deploy a system designed around steady state load. This implies that the system’s response time will be slower only 10 percent of the time. Decide if the frequency or duration of time that the system operates at peak justifies the need to add resources to the system.

System Sizing

Based on the load on the application server instances, the load on the HADB, and failover requirements, you can determine:

Number of Application Server Instances

To determine the number of applications server instances (hosts) needed, evaluate your environment on the basis of the factors explained in Estimating Load on Application Server Instances to each application server instance, although each instance can use more than one Central Processing Unit (CPU).

Number of HADB Nodes

As a general guideline, plan to have one HADB node for each CPU in the system. For example, use two HADB nodes for a machine that has two CPUs.

Note –

If you have more than one HADB node per machine (for example, if you are using bigger machines), then you must ensure that there is enough redundancy and scalability on the machines; for example multiple uninterruptible power supplies and independent disk controllers.

Alternatively, use the following procedure.

To determine the required number of HADB nodes

Determine the following parameters:
- Maximum number of concurrent users, n_users.
- Average BLOB size, s.
- Maximum transaction rate per user, referred to as NTPS.

Determine the size in Gigabytes of the maximum primary data volume, V _data.

Use the following formula:

V_data = n_users^. s

Determine the maximum HADB data transfer rate, R _dt.

This reflects the data volume shipped into HADB from the application side. Use the following formula:

R_dt = n_users^. s ^. NTPS

Determine the number of nodes, N _NODES,.

Use the following formula:

N_NODES = V_data /5GB

Round this value up to an even number, since nodes work in pairs.

Number of HADB Hosts

Determine the number of HADB hosts based on data transfer requirements. This calculation assumes all hosts have similar hardware configurations and operating systems, and have the necessary resources to accommodate the nodes they run.

To calculate the number of hosts

Determine the maximum host data transfer rate, R _max..

Determine this value empirically, because it depends on network and host hardware. Note this is different from the maximum HADB data transfer rate, R _dt, determined in the previous section.

Determine the number of hosts needed to accommodate this data

Updating a volume of data V distributed over a number of hosts N _HOSTS causes each host to receive approximately 4V/N _HOSTS of data. Determine the number of hosts needed to accommodate this volume of data with the following formula:

N_HOSTS = 4^. R_dt / R_max

Round this value up to the nearest even number to get the same number of hosts for each DRU.

Add one host on each DRU for spare nodes.

If each of the other hosts run N data nodes, let this host run N spare nodes. This allows for single-machine failure taking down N data nodes.

Each host needs to run at least one node, so if the number of nodes is less than the number of hosts (N_NODES < N_HOSTS), adjust N_NODES to be equal to N_HOSTS. If the number of nodes is greater than the number of hosts, (N_NODES \> N_HOSTS), several nodes can be run on the same host.

HADB Storage Capacity

The HADB provides near-linear scaling with the addition of more nodes until the network capacity is exceeded. Each node must be configured with storage devices on a dedicated disk or disks. All nodes must have equal space allocated on the storage devices. Make sure that the storage devices are allocated on local disks.

Suppose the expected size session data is x MB. HADB replicates data on mirror nodes, and therefore requires 2x MB of storage. Further, HADB uses indexes to enable fast access to data. The two nodes will require an additional 2x MB for indexes, for a total required storage capacity of 4x. Therefore, HADB’s expected storage capacity requirement is four times the expected data volume.

To account for future expansion without loss of data from HADB, you must provide additional storage capacity for online upgrades because you might want to refragment the data after adding new nodes. In this case, a similar amount (4x) of additional space on the data devices is required. Thus, the expected storage capacity is eight times the expected data volume.

Additionally, HADB uses disk space as follows:

Space for temporary storage of log buffer. This space is four times the log buffer size. The log buffer keeps track of operations related to data. The default value of the log buffer size is 48 MB.
Space for internal administration purpose. This space is one percent of the storage device size.

The following table summarizes the HADB storage space requirements for a session data of x MB.

Table 2–3 HADB Storage Space Requirement for Session Size of X MB


Condition	HADB Storage Space Required
Addition or removal of HADB nodes while online is not required.	4x MB + (4*log buffer size) + 1% of device size
Addition or removal of HADB nodes while online is required.	8x MB + (4*log buffer size) + 1% of device size

If the HADB runs out of device space, it will not accept client requests to insert or update data. However, it will accept delete operations. If the HADB runs out of device space, it returns error codes 4593 or 4592 and writes corresponding error messages to the history files. For more information on these messages, seeChapter 14, HADB Error Messages, in Sun Java System Application Server 9.1 Error Message Reference.

Planning Message Queue Broker Deployment

The Java Message Service (JMS) API is a messaging standard that allows J2EE applications and components to create, send, receive, and read messages. It enables distributed communication that is loosely coupled, reliable, and asynchronous. The Sun Java System Message Queue, which implements JMS, is integrated with Application Server, enabling you to create components such as message-driven beans (MDBs).

Sun Java System Message Queue (MQ) is integrated with Application Server using a connector module, also known as a resource adapter, as defined by the J2EE Connector Architecture Specification (JCA) 1.5. A connector module is a standardized way to add functionality to the Application Server. J2EE components deployed to the Application Server exchange JMS messages using the JMS provider integrated via the connector module. By default, the JMS provider is the Sun Java System Message Queue, but if you wish you can use a different JMS provider, as long as it implements JCA 1.5.

Creating a JMS resource in Application Server creates a connector resource in the background. So, each JMS operation invokes the connector runtime and uses the MQ resource adapter in the background.

In addition to using resource adapter APIs, Application Server uses additional MQ APIs to provide better integration with MQ. This tight integration enables features such as connector failover, load balancing of outbound connections, and load balancing of inbound messages to MDBs. These features enable you to make messaging traffic fault-tolerant and highly available.

Multi-Broker Clusters

MQ Enterprise Edition supports using multiple interconnected broker instances known as a broker cluster. With broker clusters, client connections are distributed across all the brokers in the cluster. Clustering provides horizontal scalability and improves availability.

A single message broker scales to about eight CPUs and provides sufficient throughput for typical applications. If a broker process fails, it is automatically restarted. However, as the number of clients connected to a broker increases, and as the number of messages being delivered increases, a broker will eventually exceed limitations such as number of file descriptors and memory.

Having multiple brokers in a cluster rather than a single broker enables you to:

Provide messaging services despite hardware failures on a single machine.
Minimize downtime while performing system maintenance.
Accommodate workgroups having different user repositories.
Deal with firewall restrictions.

However, having multiple brokers does not ensure that transactions in progress at the time of a broker failure will continue on the alternate broker. While MQ will re-establish a failed connection with a different broker in a cluster, it will lose transactional messaging and roll back transactions in progress. User applications will not be affected, except for transactions that could not be completed. Service failover is assured since connections continue to be usable.

Thus, MQ does not support high availability persistent messaging in a cluster. If a broker restarts after failure, it will automatically recover and complete delivery of persistent messages. Persistent messages may be stored in a database or on the file system. However if the machine hosting the broker does not recover from a hard failure, messages may be lost.

The Solaris platform with Sun Cluster Data Service for Sun Message Queue supports transparent failover of persistent messages. This configuration leverages Sun Cluster’s global file system and IP failover to deliver true high availability and is included with Java Enterprise System.

Master Broker and Client Synchronization

In a multi-broker configuration, each destination is replicated on all of the brokers in a cluster. Each broker knows about message consumers that are registered for destinations on all other brokers. Each broker can therefore route messages from its own directly-connected message producers to remote message consumers, and deliver messages from remote producers to its own directly-connected consumers.

In a cluster configuration, the broker to which each message producer is directly connected performs the routing for messages sent to it by that producer. Hence, a persistent message is both stored and routed by the message’s home broker.

Whenever an administrator creates or destroys a destination on a broker, this information is automatically propagated to all other brokers in a cluster. Similarly, whenever a message consumer is registered with its home broker, or whenever a consumer is disconnected from its home broker—either explicitly or because of a client or network failure, or because its home broker goes down—the relevant information about the consumer is propagated throughout the cluster. In a similar fashion, information about durable subscriptions is also propagated to all brokers in a cluster.

Configuring Application Server to Use Message Queue Brokers

The Application Server’s Java Message Service represents the connector module (resource adapter) for the Message Queue. You can manage the Java Message Service through the Admin Console or the asadmin command-line utility.

MQ brokers (JMS hosts) run in a separate JVM from the Application Server process. This allows multiple Application Server instances or clusters to share the same set of MQ brokers.

In Application Server, a JMS host refers to an MQ broker. The Application Server’s Java Message Service configuration contains a JMS Host List (also called AddressList) that contains all the JMS hosts that will be used.

Managing JMS with Admin Console

In the Admin Console, you can set JMS properties using the Java Message Service node for a particular configuration. You can set properties such as Reconnect Interval and Reconnect Attempts. For more information, see Chapter 4, Configuring Java Message Service Resources, in Sun Java System Application Server 9.1 Administration Guide.

The JMS Hosts node under the Java Message Service node contains a list of JMS hosts. You can add and remove hosts from the list. For each host, you can set the host name, port number, and the administration user name and password. By default, the JMS Hosts list contains one MQ broker, called “default_JMS_host,” that represents the local MQ broker integrated with the Application Server.

Configure the JMS Hosts list to contain all the MQ brokers in the cluster. For example, to set up a cluster containing three MQ brokers, add a JMS host within the Java Message Service for each one. Message Queue clients use the configuration information in the Java Message Service to communicate with MQ broker.

Managing JMS with asadmin

In addition to the Admin Console, you can use the asadmin command-line utility to manage the Java Message Service and JMS hosts. Use the following asadmin commands:

Configuring Java Message Service attributes: asadmin set
Managing JMS hosts:
- asadmin create-jms-host
- asadmin delete-jms-host
- asadmin list-jms-hosts
Managing JMS resources:
- asadmin create-jms-resource
- asadmin delete-jms-resource
- asadmin list-jms-resources
  
  For more information on these commands, see Sun Java System Application Server 9.1 Reference Manualor the corresponding man pages.

Java Message Service Type

There are two types of integration between Application Server and MQ brokers: local and remote. You can set this type attribute on the Admin Console’s Java Message Service page.

Local Java Message Service

If the Type attribute is LOCAL, the Application Server will start and stop the MQ broker. When Application Server starts up, it will start the MQ broker specified as the Default JMS host. Likewise, when the Application Server instance shuts down, it shuts down the MQ broker. LOCAL type is most suitable for standalone Application Server instances.

With LOCAL type, use the Start Arguments attribute to specify MQ broker startup parameters.

Remote Java Message Service

If the Type attribute is REMOTE, Application Server will use an externally configured broker or broker cluster. In this case, you must start and stop MQ brokers separately from Application Server, and use MQ tools to configure and tune the broker or broker cluster. REMOTE type is most suitable for Application Server clusters.

With REMOTE type, you must specify MQ broker startup parameters using MQ tools. The Start Arguments attribute is ignored.

Default JMS Host

You can specify the default JMS Host in the Admin Console Java Message Service page. If the Java Message Service type is LOCAL, then Application Server will start the default JMS host when the Application Server instance starts.

To use an MQ broker cluster, delete the default JMS host, then add all the MQ brokers in the cluster as JMS hosts. In this case, the default JMS host becomes the first JMS host in the JMS host list.

You can also explicitly set the default JMS host to one of the JMS hosts. When the Application Server uses a Message Queue cluster, the default JMS host executes MQ-specific commands. For example, when a physical destination is created for a MQ broker cluster, the default JMS host executes the command to create the physical destinations, but all brokers in the cluster use the physical destination.

Example Deployment Scenarios

To accommodate your messaging needs, modify the Java Message Service and JMS host list to suit your deployment, performance, and availability needs. The following sections describe some typical scenarios.

For best availability, deploy MQ brokers and Application Servers on different machines, if messaging needs are not just with Application Server. Another option is to run an Application Server instance and an MQ broker instance on each machine until there is sufficient messaging capacity.

Default Deployment

Installing the Application Server automatically creates a domain administration server (DAS). By default, the Java Message Service type for the DAS is LOCAL. So, starting DAS will also start its default MQ broker.

Creating a new domain will also create a new broker. By default, when you add a standalone server instance or a cluster to the domain, its Java Message Service will be configured as REMOTE and its default JMS host will be the broker started by DAS.

Default Deployment illustrates an example default deployment with an Application Server cluster containing three instances.

Using an MQ Broker Cluster with an Application Server Cluster

To configure an Application Server cluster to use an MQ broker cluster, add all the MQ brokers as JMS hosts in the Application Server’s Java Message Service. Any JMS connection factories created and MDBs deployed will then use the JMS configuration specified.

The following figure illustrates an example deployment with three MQ brokers in an broker cluster and three Application Server instances in a cluster.

Specifying an Application-Specific MQ Broker Cluster

In some cases, an application may need to use a different MQ broker cluster than the one used by the Application Server cluster. Specifying an Application-Specific MQ Broker Cluster illustrates an example of such a scenario. To do so, use the AddressList property of a JMS connection factory or the activation-config element in an MDB deployment descriptor to specify the MQ broker cluster.

For more information about configuring connection factories, see Admin Console Tasks for JMS Connection Factories in Sun Java System Application Server 9.1 Administration Guide. For more information about MDBs, see Using Message-Driven Beans in Sun Java System Application Server 9.1 Developer’s Guide.

Application Clients

When an application client or standalone application accesses a JMS administered object for the first time, the client JVM retrieves the Java Message Service configuration from the server. Further changes to the JMS service will not be available to the client JVM until it is restarted.