Sun Java System Application Server Standard and Enterprise Edition 7 2004Q2 Update 3 Administration Guide 

Chapter 19
Managing Clusters (Enterprise Edition)

The use of Sun Java™ System Application Server clusters provides high availability for your J2EE applications. This chapter explains how to configure and manage a group of load balanced clusters. It also provides important information about maintaining and operating clusters to fully utilize the failover capabilities of Sun Java™ System Application Server Enterprise Edition.

This chapter mainly discusses HTTP clusters. For information on working with RMI/IIOP clusters, see Configuring RMI/IIOP Failover.


About Application Server Clusters

A cluster is a group of Application Server instances that work together as one logical entity. A cluster provides a runtime environment for one or more J2EE applications.

The use of clusters in Sun Java System Application Server helps you to achieve:

A few important facts about Sun Java System Application Server clusters are listed here:

Defining a Cluster

To define a cluster, specify a name for the cluster in the loadbalancer.xml file. You also set up other parameters for the cluster.

Use the name attribute of the cluster element to define the name of the cluster (cluster is a subelement of the element loadbalancer in the loadbalancer.xml file).

To define a new cluster, do the following:

  1. In the loadbalancer.xml file, create a new cluster sub-element for the loadbalancer element.
  2. Give a unique name to the name attribute of cluster.

    Note

    You can create more than one cluster in the loadbalancer.xml file.

  3. Specify a URL for the health checker for the cluster.

    The health checker URL is appended to the URL of the listener for the Application Server instance to form the URL to which health check requests are sent. For example, if the listener URL of an instance is http://www.example.com:80 and the health checker URL is /fortune, the health check requests for that instance are sent to http://www.example.com:80/fortune.

    The health check requests are HTTP requests. The response from the instance should have a status code in the range of 100 to 500.

    You specify the health checker URL by using the url attribute of the health-checker element in the loadbalancer.xml file (the health-checker element is a subelement of the cluster element).

  4. Specify the time interval in seconds between health checks by using the interval-in-seconds attribute of the health-checker element. This is a numeric value. The default value is 30 seconds.
  5. Specify the timeout value in seconds for the health check request by using the timeout-in-seconds attribute of the health-checker element. This is the time period within which the Application Server instance must respond to a health check request sent by the load balancer. If the instance does not respond within this period, it is considered unavailable. The default value is 10 seconds.
Code Example 19-1  Defining a Cluster

    <cluster name="mycluster1">
      <health-checker url="/" interval-in-seconds="10" timeout-in-seconds="20"/>
    </cluster>

Requirements for Adding an Application Server Instance to a Cluster

The following conditions must be true before you add an Application Server instance to a cluster:

Adding Application Server Instances to a Cluster

To add Application Server instances to a cluster, use the instance sub-element of the cluster element in the loadbalancer.xml file.

For detailed information about the attributes of the instance sub-element, see Elements in the loadbalancer.xml File.

To add an Application Server instance to a cluster, do the following:

  1. In the loadbalancer.xml file, create a new instance sub-element for the cluster element.
  2. Provide a name for the Application Server instance using the name attribute of instance. This name need not be the same as the name that you specified when creating the Application Server instance. This name must be unique across all the clusters referenced in a single loadbalancer.xml file. That is, no two Application Server instances in the file can have the same name, even if they belong to different clusters.
  3. Specify whether the Application Server instance is enabled or disabled by specifying true or false for the enabled attribute of instance. The default value is true, that is, the Application Server instance is enabled.
  4. Specify the quiescing period in minutes by using the disable-timeout-in-minutes attribute of instance. This is a numeric value. The default value is 31, which is one minute more than the default session idle timeout of 30 minutes.

    If you disable an Application Server instance, this is the time interval after which the load balancer stops sending assigned requests to that instance. For more information on quiescing and the quiescing period, see Disabling and Quiescing an Application Server Instance in a Cluster.

  5. Specify the listeners for the Application Server instance by providing the URL of the HTTP listener for the Application Server instance.

    You provide this information using the listeners attribute of the instance element in the loadbalancer.xml file. You can assign one or more listeners to the Application Server instance. To specify multiple listeners for an Application Server instance, separate the URLs of the listeners with a space. For example, https://www.example.com:41 https://www.example.com:42.

    Note

    After you add an HTTP listener to an Application Server instance, you must restart the SNMP monitoring subagent. If you do not do this, it is possible that information about the new listener will not be available through the SNMP monitoring subagent.


Here is an example where an Application Server instance named myinstance1 has been added to the cluster named mycluster1. Two listeners have been specified for the Application Server instance. The health checker for this cluster has also been configured.

<cluster name="mycluster1">
  <instance name="myinstance1" enabled="true" disable-timeout-in-minutes="36" listeners="http://www.example.com:41 https://www.example.com:42">
  </instance>
  <health-checker url="/" interval-in-seconds="10" timeout-in-seconds="15" />
</cluster>

Defining Multiple Clusters

You can set up a single load balancer to serve multiple clusters. To do this, set up each cluster as explained in Defining a Cluster and Adding Application Server Instances to a Cluster.


Note

When you configure the session persistence settings, create a different session store for each cluster. For more information on creating a session store for a cluster, see Managing the Session Store for Clusters.
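
As a rough sketch of the setup described in this section, a single loadbalancer.xml file that serves two clusters might look like the following. The cluster names, instance names, host names, ports, and context roots are placeholders chosen for illustration. Only the element and attribute names are taken from this chapter, and a complete file is likely to contain other entries as well.

```xml
<!-- Sketch only: all names, hosts, ports, and context roots are hypothetical. -->
<loadbalancer name="loadbalancer1">

  <!-- First cluster served by this load balancer -->
  <cluster name="cluster1">
    <instance name="instance1" enabled="true" listeners="http://host1.example.com:80">
    </instance>
    <instance name="instance2" enabled="true" listeners="http://host2.example.com:80">
    </instance>
    <web-module context-root="/fortune" enabled="true" />
    <health-checker url="/" interval-in-seconds="10" />
  </cluster>

  <!-- Second cluster; instance names stay unique across the whole file -->
  <cluster name="cluster2">
    <instance name="instance3" enabled="true" listeners="http://host3.example.com:80">
    </instance>
    <instance name="instance4" enabled="true" listeners="http://host4.example.com:80">
    </instance>
    <web-module context-root="/shopping" enabled="true" />
    <health-checker url="/" interval-in-seconds="10" />
  </cluster>

</loadbalancer>
```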



Administering Application Server Clusters


Note

The procedures mentioned in this chapter contain UNIX-specific examples. The same commands and examples, with appropriate modifications, are applicable for Microsoft Windows platforms.


Starting a Cluster

To start a cluster, do the following:

  1. Start all Application Server instances in the cluster.

    To start all instances in the cluster, use the cladmin command as follows:

    ./cladmin [--instancefile instance_file_location] [--passwordfile password_file_location] start-instance

    where:

    instance_file_location is the location of the clinstance.conf file.

    password_file_location is the location of the clpassword.conf file.

    For more information on the cladmin command, see Appendix F, "Using the cladmin Command for Administration (Enterprise Edition)."

  2. Enable all the Application Server instances in the cluster.

    Set the value of the enabled attribute of the instance element to true in the loadbalancer.xml file.

Deploying a Web Application

To deploy a web application on a cluster, do the following:

  1. Deploy the web application on all Application Server instances in the cluster.

    To deploy the web application to all instances in the cluster, use the cladmin command as follows:

    ./cladmin deploy [--instancefile instance_file_location] [--passwordfile password_file_location] [--secure | -s] [--virtualservers virtual_servers] [--type application|ejb|web|connector] [--contextroot contextroot] [--force=true|false] [--precompilejsp=true|false] [--name component_name] [--upload=true|false] [--retrieve local_dirpath] [--instance instance_name] filepath

    where:

    instance_file_location is the location of the clinstance.conf file.

    password_file_location is the location of the clpassword.conf file.

    For more information on the cladmin command, see Appendix F, "Using the cladmin Command for Administration (Enterprise Edition)." For more information on deploying an application to an Application Server instance, see Tools for Deployment.

  2. In the loadbalancer.xml file, create a new web-module sub-element (within the cluster element) for the cluster to which you want to deploy the application.
  3. Provide the context root URL for the web module that you want to deploy to the cluster by using the context-root attribute of the web-module element.

    Note

    One cluster cannot have more than one web-module element with the same context-root.

  4. Specify whether the web application is enabled or disabled by specifying true or false for the enabled attribute of web-module. The default value is true, that is, the web application is enabled.

    Typically, when you deploy applications to a cluster, you enable the application. Enabling an application in a cluster means that the load balancer will route requests to this application.

  5. Specify the quiescing period in minutes by specifying a value for the disable-timeout-in-minutes attribute of web-module. This is a numeric value. The default value is 31, which is one minute more than the default session idle timeout of 30 minutes.

    If you disable the web application, this is the time interval after which the load balancer stops sending assigned requests to the web application. For more information on quiescing, see Disabling and Quiescing a Web Application in a Cluster.

Here is an example in which a web application with the context root /fortune has been added to a cluster called cluster1.

<cluster name="cluster1">

<web-module context-root="/fortune" enabled="true" />

</cluster>
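
The example above relies on the default quiescing period. As a hedged variation, the fragment below sets disable-timeout-in-minutes explicitly on one web module and adds a second web module with a different context root. The value 45 and the /shopping context root are illustrative assumptions, not recommended settings.

```xml
<!-- Sketch only: the timeout value and the second context root are hypothetical. -->
<cluster name="cluster1">
  <!-- Explicit quiescing period of 45 minutes for this web application -->
  <web-module context-root="/fortune" enabled="true" disable-timeout-in-minutes="45" />
  <!-- A second web module must use a different context root within the same cluster -->
  <web-module context-root="/shopping" enabled="true" />
</cluster>
```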

Undeploying a Web Application

To undeploy a web application from a cluster, do the following:

  1. Disable the web application in the cluster by setting the value (in the loadbalancer.xml file) of the enabled attribute of the web-module element to false.
  2. After the quiescing period is over, undeploy the web application from all the Application Server instances in the cluster.

    To undeploy the web application from all instances in the cluster, use the cladmin command as follows:

    ./cladmin undeploy [--instancefile instance_file_location] [--passwordfile password_file_location] [--secure | -s] [--type application|ejb|web|connector] [--instance instance_name] component_name

    where:

    instance_file_location is the location of the clinstance.conf file.

    password_file_location is the location of the clpassword.conf file.

  3. In the loadbalancer.xml file, remove the web-module sub-element (within the cluster element) that references the undeployed application.

Note

After an application is undeployed, the session state information for that application is not immediately removed from the HADB. This session state information is removed by the Sun Java System Application Server in the subsequent cycle when the timed-out sessions are removed.
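
The loadbalancer.xml edits in this procedure can be sketched as two successive states of the same cluster entry. The cluster name and context roots are placeholders borrowed from the other examples in this chapter. First, the web application to be undeployed is disabled so that it starts quiescing:

```xml
<!-- State after step 1: /fortune is disabled and starts quiescing at the next poll. -->
<cluster name="cluster1">
  <web-module context-root="/fortune" enabled="false" />
  <web-module context-root="/shopping" enabled="true" />
</cluster>
```

Then, once the application has been undeployed from all instances, its web-module entry is removed:

```xml
<!-- State after step 3: the web-module entry for /fortune has been removed. -->
<cluster name="cluster1">
  <web-module context-root="/shopping" enabled="true" />
</cluster>
```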


Redeploying an Existing Web Application on a Cluster

There may be cases when you want to make changes to a deployed web application that require you to undeploy the web application and then deploy it again. For example, you may want to deploy an upgraded version of a web application. In such cases, follow these steps:

  1. Disable the web application by setting the value (in the loadbalancer.xml file) of the enabled attribute of the corresponding web-module element to false.
  2. After the quiescing period is over, redeploy the web application on the Application Server instances in the cluster.
  3. Enable the web application in the cluster by setting the value (in the loadbalancer.xml file) of the enabled attribute of the corresponding web-module element to true.

Stopping an Application Server Instance in a Cluster

To stop an Application Server instance in a cluster, do the following:

  1. Disable the Application Server instance by setting the value (in the loadbalancer.xml file) of the enabled attribute of the instance element to false.
  2. After the quiescing period is over, plus a sufficient time to allow the requests being served by the Application Server instance to complete, stop the Application Server instance.
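
A minimal sketch of the edit in step 1, assuming a cluster with two instances whose names, hosts, and ports are placeholders: only the instance that is about to be stopped is marked as disabled, and the quiescing period is stated explicitly here for clarity.

```xml
<!-- Sketch only: instance names, hosts, and ports are hypothetical. -->
<cluster name="cluster1">
  <!-- This instance is being taken out of service; it quiesces for 31 minutes. -->
  <instance name="instance1" enabled="false" disable-timeout-in-minutes="31" listeners="http://host1.example.com:80">
  </instance>
  <!-- This instance keeps receiving requests. -->
  <instance name="instance2" enabled="true" listeners="http://host2.example.com:80">
  </instance>
</cluster>
```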

Removing an Application Server Instance from a Cluster

To remove an Application Server instance from a cluster, do the following:

  1. Disable the Application Server instance by setting the value (in the loadbalancer.xml file) of the enabled attribute of the corresponding instance element to false.
  2. After the quiescing period, plus a sufficient time to allow the requests being served by the Application Server instance to complete, remove the entries corresponding to the Application Server instance in the loadbalancer.xml file.

Once you remove an Application Server instance from a cluster, any assigned request for that instance is routed by the load balancer to other instances as part of the round-robin algorithm. If the Application Server instance to which the request is routed is part of the same cluster and the session has not been invalidated, the session is failed over. Otherwise, the Application Server instance to which the request is routed treats the request as an unassigned request.

Stopping a Cluster

To stop a cluster, do the following:

  1. Disable all Application Server instances in the cluster by setting the value (in the loadbalancer.xml file) of the enabled attribute of the corresponding instance elements to false.
  2. After the quiescing period is over, plus a sufficient time to allow the requests being served by the Application Server instances to complete, stop all Application Server instances that belong to the cluster.

    To stop all instances in the cluster, use the cladmin command as follows:

    ./cladmin [--instancefile instance_file_location] [--passwordfile password_file_location] stop-instance

Removing a Cluster

To remove a cluster from a load balancer, do the following:

  1. Disable each web application deployed to the Application Server instances in the cluster by setting the value (in the loadbalancer.xml file) of the enabled attribute of the corresponding web-module element to false.
  2. (Optional) After all the web applications have been quiesced, undeploy the web applications from all the individual Application Server instances in the cluster. To undeploy each application from all the instances in the cluster, use the cladmin command. For information on how to do this, see Undeploying a Web Application.
  3. Remove all entries for the cluster (including entries for the Application Server instances and the web applications in the cluster) from the loadbalancer.xml file.

Sample Cluster Configuration in a loadbalancer.xml File

Here are sample entries for a cluster in the loadbalancer.xml file. Note that a complete loadbalancer.xml file contains other entries, for example, entries for the load balancer polling interval and for enabling HTTPS routing. For more information about the other possible entries, see Known Issues in Load Balancing Requests.

<loadbalancer name="loadbalancer1">
  <cluster name="cluster1">
    <instance name="myinstance1" enabled="true" listeners="http://www.example.com:41 https://www.example.com:42">
    </instance>
    <instance name="myinstance2" enabled="true" listeners="http://www.example.com:43 https://www.example.com:44">
    </instance>
    <web-module context-root="/fortune" enabled="true" />
    <web-module context-root="/shopping" enabled="true" />
    <health-checker url="/" interval-in-seconds="10" />
  </cluster>
</loadbalancer>


Online Reconfiguration of HTTP Clusters

Suppose an Application Server instance in a cluster has received requests and is in the process of serving them. If you now want to stop this instance for any reason (for example, to add a JDBC resource), you would want this instance to complete serving these requests before you stop it. Similarly, if you want to undeploy a web application from a cluster, you would want all instances in the cluster that are serving requests for this application to complete serving the requests before you undeploy the web application. Quiescing is a process that helps achieve this.

Disabling and Quiescing an Application Server Instance in a Cluster

Quiescing an Application Server instance is the process of shutting it down in a phased manner. First, the load balancer stops sending any unassigned requests to the instance and instead diverts these unassigned requests to other available instances in the cluster.

However, for a time interval called the quiescing period, the load balancer still sends to the Application Server instance the assigned requests that it was serving. For information about specifying the quiescing period for an Application Server instance, see Adding Application Server Instances to a Cluster.

Next, after the quiescing period is over, the load balancer stops sending to the Application Server instance the assigned requests that the instance was serving and fails over these assigned requests to other available instances in the cluster.


Note

These assigned requests are honored only if high availability has been enabled for the corresponding web applications. For more information on enabling high availability for web applications, see Making a Web Application Distributable.


Even after the quiescing period is over, the load balancer allows the instance to serve all the requests that the load balancer dispatched to the instance before the quiescing period ended. For such requests, the load balancer returns the results to the client even after the quiescing period is over. Therefore, you should wait for the quiescing period plus a sufficient time to allow the requests being served by the Application Server instance to complete before you shut down the Application Server instance.

In some cases, however, these requests may be very long running, and it is possible that some such requests will not be served when you shut down the Application Server instance.

Quiescing does not start as soon as you make changes to the loadbalancer.xml file and save the changes. The load balancer polls the loadbalancer.xml file periodically to check whether the time stamp of the file has changed since the last poll. For more information, see Monitoring the Load Balancer Plug-in. If the load balancer detects that the time stamp of the file has changed, it reloads the entire configuration from the loadbalancer.xml file. Therefore, if you disable an Application Server instance, the load balancer will start quiescing the Application Server instance when it next polls the loadbalancer.xml file. Similarly, if you mark an instance as enabled in the loadbalancer.xml file, the instance will be enabled when the load balancer next polls the loadbalancer.xml file.

As an example, let us take an Application Server instance whose quiescing period is 31 minutes. At the beginning of the quiescing period, the load balancer stops sending all unassigned requests to the Application Server instance but continues to send the assigned requests to the Application Server instance. After 31 minutes, the load balancer stops sending even the assigned requests to the Application Server instance and fails over the assigned requests to other available Application Server instances in the cluster. However, if at 30 minutes and 50 seconds the load balancer had sent an assigned request to the Application Server instance, the Application Server instance will be allowed to serve this assigned request even after 31 minutes are over.

To enable quiescing, you use the enabled attribute of the instance element in the loadbalancer.xml file. In a running cluster, setting the value of enabled for an Application Server instance in that cluster to false in the corresponding loadbalancer.xml file(s) initiates quiescing for that Application Server instance (when the load balancer next polls the loadbalancer.xml file).

You specify the quiescing period by specifying an appropriate value for the disable-timeout-in-minutes attribute of the instance element in the loadbalancer.xml file. The default value of disable-timeout-in-minutes is 31 minutes, which is one minute more than the default session-idle-timeout value of 30 minutes.

You can use the load balancer logs to ascertain if the Application Server instance has been quiesced. Once an instance has been quiesced, the log will contain the following entry:

Instance: instance_name quiesced successfully over the cluster: cluster_name.

instance_name is the name specified for the instance in the loadbalancer.xml file. cluster_name is the name of the cluster, in which the instance exists, as specified in the loadbalancer.xml file.

Disabling and Quiescing a Web Application in a Cluster

If you disable a web application on a cluster, the load balancer stops sending any unassigned request for the web application to Application Server instances in the cluster. If this web application is deployed on other clusters that the load balancer is serving, these unassigned requests are sent to Application Server instances in the other clusters.

However, assigned requests for this web application that are being serviced by Application Server instances in the cluster will continue to be serviced till the quiescing period ends. After the quiescing period is over, the load balancer stops sending even these assigned requests to Application Server instances in the cluster.

You use the enabled attribute of the web-module element in the loadbalancer.xml file to disable a web application in a cluster. In a running cluster, setting the value of enabled for a web application in that cluster to false disables the web application on the cluster. As in the case of Application Server instances, the quiescing does not start immediately after you make the changes and save them. The quiescing starts after the next load balancer polling occurs.

Specify the quiescing period by specifying an appropriate value for the disable-timeout-in-minutes attribute of the web-module element in the loadbalancer.xml file. The default value of disable-timeout-in-minutes is 31 minutes, which is one minute more than the default session-idle-timeout value of 30 minutes.

You can use the load balancer logs to ascertain if the web application has been quiesced. Once a web application has been quiesced, the log will contain the following entry:

Application: application_name quiesced successfully over the cluster cluster_name

Modifying the Quiescing Period While Quiescing is On

Suppose an Application Server instance or a web application is being quiesced and that the quiescing period is more than the load balancer polling interval.

You now make some other changes to the loadbalancer.xml file before the next load balancer polling interval. When the load balancer next polls the loadbalancer.xml file, it detects a change in the time stamp from the last time it polled the file. Because the quiescing period is more than the load balancer polling period, the quiescing is not over yet. However, the load balancer reloads the entire configuration and starts the quiescing process again.

You can use this feature to alter the quiescing period of an Application Server instance or a web application that is being quiesced. As an example, consider the following scenario.

The quiescing period of an Application Server instance is 30 minutes and the load balancer polling interval is 1 second. You now disable the Application Server instance. When the load balancer polling occurs again, the load balancer starts to quiesce the instance.

Suppose after 10 minutes, you want to reduce the overall quiescing period from 30 minutes to 15 minutes, that is, you want the quiescing period to be over in the next 5 minutes. For this, you change the quiescing period of the Application Server instance to 5 minutes. When the load balancer polling occurs again, the load balancer quiesces the instance once more (because the quiescing is not over yet), with a new quiescing period of 5 minutes.

The overall quiescing period is therefore 15 minutes (10 minutes during the first cycle and 5 minutes during the second cycle).
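
The two edits in this scenario can be sketched as follows. The instance name and listener URL are placeholders, and the fragments show only the instance element being changed. The first edit disables the instance with a quiescing period of 30 minutes:

```xml
<!-- First edit: disable the instance; quiescing starts at the next poll. -->
<instance name="instance1" enabled="false" disable-timeout-in-minutes="30" listeners="http://host1.example.com:80">
</instance>
```

About 10 minutes later, the second edit shortens the quiescing period. Because the load balancer reloads the file and restarts quiescing, the new value counts from that reload:

```xml
<!-- Second edit, made after about 10 minutes: quiescing restarts with 5 minutes. -->
<instance name="instance1" enabled="false" disable-timeout-in-minutes="5" listeners="http://host1.example.com:80">
</instance>
```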

Obviously, while using this feature, you cannot reduce the quiescing period to a value less than the load balancer polling interval.

For information about setting the load balancer polling interval, see Monitoring the Load Balancer Plug-in.


Caution

Unless you want to use this feature, if the quiescing period is more than the load balancer polling interval, you must ensure that you do not make any other changes to the loadbalancer.xml file before the Application Server instance or the web application has been quiesced. Otherwise, the load balancer will quiesce the Application Server instance or the web application more than once as explained earlier in the section.


Using Multiple Clusters for Online Upgrades Without Loss of Service

You can use the load balancer and multiple clusters to upgrade components within the Sun Java System Application Server without any loss of service. A component can, for example, be a JVM, the Sun Java System Application Server, or a web application.

To achieve this, do the following:

  1. Stop one of the clusters. For more information on stopping a cluster, see Stopping a Cluster.
  2. Upgrade the component in that cluster.
  3. Start the cluster. For more information on starting a cluster, see Starting a Cluster.
  4. Repeat the process with the other clusters, one by one.

Because sessions in one cluster never fail over to Application Server instances in another cluster, there is no risk of version mismatch caused by a session failing over from an Application Server instance that is running one version of the component to an Application Server instance (in a different cluster) that is running a different version of the component. In this way, a cluster acts as a safe boundary for session failover for the Application Server instances within it.


Note

This approach is not possible in the following cases:

  • When you change the schema of the high-availability database (HADB). For more information, see Chapter 21, "Administering the High-Availability Database on Unix (Enterprise Edition)."
  • When you perform an application upgrade that involves a change to the application database schema.


Caution

You must upgrade all instances in a cluster together. Otherwise, there is a risk of version mismatch caused by a session failing over from one Application Server instance to another where the Application Server instances have different versions of components running.


Reconfiguring an Application Server Instance in a Running Cluster

There may be cases where you want to make changes to the configuration of an (enabled) Application Server instance that require restarting the Application Server instance. For example, you may want to add a JDBC resource to the Application Server instance, which will necessitate the restarting of the Application Server instance. In such cases, follow these steps:

  1. Split the cluster into two clusters with equally distributed Application Server instances by editing the loadbalancer.xml file. There should be a minimum of two listeners configured for each cluster (for failover).
  2. Disable an Application Server instance in one of the clusters by setting the value of the enabled attribute of the instance element to false in the loadbalancer.xml file.
  3. After the quiescing period is over, plus a sufficient time to allow the requests being served by the Application Server instance to complete, reconfigure or upgrade the Application Server instance.
  4. Restart the Application Server instance.
  5. Enable the Application Server instance by setting the value (in the loadbalancer.xml file) of the enabled attribute of the instance element to true.
  6. Follow steps 2 to 5 for all the Application Server instances in both clusters until all instances are reconfigured.
  7. Join the two clusters back to form a single cluster to return to your original configuration.
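
As an illustration of steps 1 and 2, suppose the original cluster held four instances. A hedged sketch of the loadbalancer.xml file after the split, with the first instance of the second cluster disabled, might look like this. All cluster names, instance names, host names, and ports are assumptions made for the example.

```xml
<!-- Sketch only: the original cluster1 held instance1 to instance4 (hypothetical names). -->
<loadbalancer name="loadbalancer1">

  <!-- First half of the original cluster; two listeners remain for failover. -->
  <cluster name="cluster1a">
    <instance name="instance1" enabled="true" listeners="http://host1.example.com:80">
    </instance>
    <instance name="instance2" enabled="true" listeners="http://host2.example.com:80">
    </instance>
    <web-module context-root="/fortune" enabled="true" />
    <health-checker url="/" interval-in-seconds="10" />
  </cluster>

  <!-- Second half; instance3 is disabled first (step 2) and quiesces. -->
  <cluster name="cluster1b">
    <instance name="instance3" enabled="false" listeners="http://host3.example.com:80">
    </instance>
    <instance name="instance4" enabled="true" listeners="http://host4.example.com:80">
    </instance>
    <web-module context-root="/fortune" enabled="true" />
    <health-checker url="/" interval-in-seconds="10" />
  </cluster>

</loadbalancer>
```

Once instance3 has been reconfigured and restarted, its enabled attribute goes back to true and the same edit is repeated for the next instance.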


Using Multiple Load Balancers

To improve fault tolerance and scalability, you can configure a set of clusters using multiple load balancers. You can also have multiple load balancers (on multiple web servers) serving a single Application Server cluster.

All the load balancers must have identical configurations and the loadbalancer.xml files for all the load balancers must be identical. It is recommended that you have a master copy of the loadbalancer.xml file and use a script to distribute this master copy to all the load balancers in your system. This ensures that the configuration change is done simultaneously to all the configured load balancers. This is particularly useful when enabling or disabling clusters, instances, and applications.

Typically, requests are sent to these load balancers through a single source, for example a hardware load balancer or a DNS-based load distribution mechanism.

Multiple load balancers can also be used for serving two entirely different sets of clusters. For example, you might want your secure (HTTPS) applications to be handled by one load balancer and the rest of the applications by another load balancer. In this case, the loadbalancer.xml files for these load balancers will be different.





Copyright 2005 Sun Microsystems, Inc. All rights reserved.