Oracle® Containers for J2EE Configuration and Administration Guide 10g (10.1.3.1.0) Part Number B28950-01
This chapter discusses the application clustering framework provided in OC4J 10g (10.1.3.1.0).
OC4J provides a flexible framework for creating a clustered environment for development and production purposes. An application cluster consists of the same set of applications hosted by two or more OC4J instances. The OC4J application clustering framework supports:
Replication of objects and values contained in an HTTP session or a stateful session Enterprise JavaBean (SFSB) instance.
In-memory replication using multicast or peer-to-peer communication, or persistence of state data to a database.
Load balancing of incoming requests across OC4J instances.
Transparent failover across applications within an application cluster.
Configuration within an OC4J instance at either the global server or application level.
A new <cluster> element, which contains a number of new subelements, has been added to the XML schema definition for these files to provide a single mechanism for management of application clustering. See "Overview of the <cluster> Element" for descriptions of this element and its subelements.
The following features are no longer included in the application clustering framework in OC4J 10g (10.1.3).
The notion of islands, part of the clustering framework in previous OC4J releases, is no longer supported in OC4J.
In previous releases, an island was essentially a group of OC4J instances within an Oracle Application Server cluster across which HTTP session data was replicated. Although islands reduced overhead by not replicating data across the entire cluster, they increased configuration and management overhead. In addition, islands were applicable only to Web applications; EJB applications could not utilize the island configuration.
In OC4J 10g (10.1.3), you can still effectively limit the number of nodes to which data is replicated by using the write-quota attribute of the <cluster> element. This attribute makes it possible to control the extent of state replication. See "Managing the Number of JVMs to Which Application State Data Is Replicated" and "Overview of the <cluster> Element" for details on the write-quota attribute.
The loadbalancer.jar archive, which provided load-balancing functionality in previous OC4J releases, was deprecated in the previous release of OC4J and has been removed from the current release.
The following XML elements are deprecated in OC4J 10g (10.1.3.1.0) and should no longer be used to configure clustering:
The <cluster-config> element in server.xml, the OC4J configuration file
The cluster-island attribute of the <web-site> element in a *-web-site.xml Web site configuration file
The new <cluster> element is now used for all application cluster management.
Application clustering is enabled by adding the <cluster> element to the orion-application.xml file of each application to be clustered in an OC4J instance. For deployed applications, this file is located in the ORACLE_HOME/j2ee/instance/application-deployments/applicationName directory. See "Overview of the <cluster> Element" for descriptions of this element and its subelements.
Application clustering can be enabled globally for all applications running within an OC4J instance, as well as on a per-application basis.
Enabling clustering for all applications
Application clustering can be enabled by default for all applications deployed to an OC4J instance, through ORACLE_HOME/j2ee/instance/config/application.xml, the configuration file for the default application. All other applications deployed into the OC4J instance inherit default properties from this application, including the application clustering configuration.
Enabling clustering for a specific application
Application clustering is defined in the application-specific ORACLE_HOME/j2ee/instance/application-deployments/app_name/orion-application.xml file. Settings in this file override the global configuration, as well as the configuration inherited from a parent application.
Note: Application clustering can also be configured at the time the application is deployed by using Oracle Enterprise Manager 10g Application Server Control Console, through either the deployment tasks or the deployment plan editor. See the Oracle Containers for J2EE Deployment Guide for details.
Any changes made to a particular application's orion-application.xml file in one OC4J instance must be replicated to the corresponding XML files in other OC4J instances for all applications within an Oracle Application Server cluster. For more information, see "Replicating Changes Across a Cluster".
At the application level, application clustering can be configured at the time the application is deployed into an OC4J instance by using the deployment plan editor, which sets values in each application's orion-application.xml file. See the Oracle Containers for J2EE Deployment Guide for details on using the deployment plan editor.
Important: An empty <distributable /> tag must be added to the web.xml file for all Web modules that are part of an application configured to use application clustering. After deployment, this J2EE standard Web module descriptor is in the ORACLE_HOME/j2ee/instance/applications/app_name/web_module/WEB-INF directory within OC4J.
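For reference, a minimal web.xml for a distributable Web module might look like the following sketch; the display-name value is illustrative:

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/j2ee" version="2.4">
    <display-name>clusteredApp</display-name>
    <!-- Marks this Web module as distributable so that its HTTP sessions can be replicated -->
    <distributable />
</web-app>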
A replication policy defines when replication of HTTP session or stateful session bean state occurs, and whether all attribute and variable values or only changed values are replicated. Replication can be an expensive process: replicating data too frequently can affect server performance, while replicating too infrequently can result in lost data in the event of a server failure.
The replication policy applied to all Web modules and EJB components within an application is specified in the <replication-policy> element within the application's orion-application.xml configuration file. The syntax of this element is as follows:
<replication-policy trigger="onSetAttribute|onRequestEnd|onShutdown" scope="modifiedAttributes|allAttributes" />
The trigger attribute specifies when replication occurs. By default, the onRequestEnd policy is applied, as it provides frequent replication of data while ensuring that data is not lost if the JVM terminates unexpectedly. See Table 9-1 for an overview of trigger attribute values.
The scope attribute defines what data is replicated: either all attribute or variable values, or only changed values. By default, only modified HTTP session attributes are replicated; for stateful session beans, all member variables are replicated. See Table 9-2 for an overview of scope attribute values.
Table 9-1 <replication-policy> trigger Attribute Values

trigger Value | HttpSession | Stateful Session Bean |
---|---|---|
onSetAttribute | Replicate each change made to an HTTP session attribute at the time the value is modified. From a programmatic standpoint, replication occurs each time setAttribute() is called on the HttpSession object. This option can be resource intensive in cases where the session is being extensively modified. | Not applicable. |
onRequestEnd | Queue all changes made to HTTP session attributes, then replicate all changes just before the HTTP response is sent. | Replicate the current state of the bean after each EJB method call. The state is replicated frequently, but offers higher reliability. |
onShutdown | Replicate the current state of the HTTP session whenever the JVM is terminated gracefully, such as with Ctrl-C. State is not replicated if the host is terminated unexpectedly, as in the case of a system crash. Because session state was not previously replicated, all session data is sent across the network at once upon JVM termination, which can impact network performance. This option can also significantly increase the amount of time needed for the JVM to shut down. | Replicate the current state of the bean whenever the JVM is terminated gracefully. State is not replicated if the host is terminated unexpectedly, as in the case of a system crash. Because bean state was not previously replicated, all state data is sent across the network at once upon JVM termination, which can impact network performance. This option may also significantly increase the amount of time needed for the JVM to shut down. |
Table 9-2 <replication-policy> scope Attribute Values

scope Value | HttpSession | Stateful Session Bean |
---|---|---|
modifiedAttributes | Replicate only modified HTTP session attributes; that is, values changed by calling setAttribute() on the HttpSession object. | Not applicable. |
allAttributes | Replicate all attribute values set on the HTTP session. | Replicate all member variable values set on the stateful session bean. |
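For example, the following orion-application.xml fragment queues changes during each request and replicates all session attribute values just before the response is sent:

<orion-application ...>
    ...
    <cluster>
        <replication-policy trigger="onRequestEnd" scope="allAttributes" />
    </cluster>
</orion-application>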
The <replication-policy> element in orion-application.xml does not allow you to distinguish between Web and EJB components within an application. However, you can specify a different replication policy for an EJB component in the replication attribute of the <session-deployment> element within the component-specific orion-ejb-jar.xml configuration file.
See Table 9-3 for valid values for the replication attribute. For example:
<session-deployment name="MyStatefulVM" replication="onShutdown" />
<session-deployment name="MyEntity2" replication="onRequestEnd" />
The values in this file override the corresponding settings in orion-application.xml, effectively enabling you to set the replication policy for an EJB component in orion-ejb-jar.xml and the policy for Web components in orion-application.xml.
Table 9-3 Stateful Session EJB Replication Policy Configuration

replication Value | Description |
---|---|
onRequestEnd | Replicate the current state of the bean after each EJB method call. The state is replicated more frequently, but offers higher reliability in the event of host failure. This is the default value. |
onShutdown | Replicate the current state of the bean whenever the JVM is terminated gracefully. State is not replicated if the host is terminated unexpectedly, as in the case of a system crash or a "kill -9" command. |
none | Do not replicate data. |
You can effectively limit the number of JVMs to which state data is replicated by using the write-quota attribute of the <cluster> element. This functionality makes it possible to reduce network traffic and related overhead by controlling the extent of state replication.

The default value for write-quota is 1, indicating that state will be replicated to one other JVM within an Oracle Application Server cluster.
An application group member actually runs on a JVM, not an Oracle Application Server node. It is possible to construct architectures and configurations in which multiple JVMs are running per node as components of the cluster.
To force state replicas to be stored on separate physical nodes, which provides failover protection against hardware outages, set the allow-colocation attribute to false. This requires the state replication manager to select a peer (or peers, if write-quota is greater than 1) running on a separate physical node (or nodes) to store its state replicas.
To replicate state to all JVMs within the Oracle Application Server cluster, you must specify the total number of JVMs within the cluster as the value of write-quota.
By default, OC4J instances replicate data to other instances asynchronously. However, you can enable synchronous replication by including the <synchronous-replication> subelement within the <cluster> element. This forces a replicating OC4J instance to wait for an acknowledgement from at least one other peer instance that the data was received before continuing with replication.
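For example, the following fragment combines these settings, replicating state synchronously to two other JVMs, each on a separate physical host (the attribute values are illustrative):

<orion-application ...>
    ...
    <cluster write-quota="2" allow-colocation="false">
        <synchronous-replication />
    </cluster>
</orion-application>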
Multicast IP replication is the default replication protocol used in a standalone OC4J installation. In this mode, OC4J uses multicast packets to send and receive HTTP session and stateful session bean state changes. These packets are sent over the network to be picked up by other OC4J processes using the same multicast address and port. Lost messages are identified and retransmitted, providing a reliable transmission service.

The configuration must specify the same multicast address and port on all OC4J instances. The default values used by OC4J multicast are 230.230.0.1 for the address and 45566 for the port. These values can be changed in the appropriate XML configuration file, if necessary.
Multicast replication can be enabled between multiple application instances simply by adding an empty <cluster> element to the orion-application.xml file for each instance:
<orion-application ...>
    ...
    <cluster/>
</orion-application>
The next example specifies a new multicast address and port, using the ip and port attributes.

The optional bind_addr attribute can be used to specify which Network Interface Card (NIC) to bind to. This is useful if you have OC4J host machines with multiple network cards, each with a specific IP address, and you wish to define which NIC is used to send and receive the multicast messages.
<orion-application ...>
    ...
    <cluster allow-colocation="false">
        <replication-policy trigger="onShutdown" scope="allAttributes" />
        <protocol>
            <multicast ip="225.130.0.0" port="45577" bind_addr="226.83.24.10" />
        </protocol>
    </cluster>
</orion-application>
The multicast-based and peer-to-peer-based replication mechanisms provided by OC4J are built on the JavaGroups communication protocol stack. Ideally, you should use one of these OC4J mechanisms to provide in-memory replication of state data, as they utilize OC4J-specific configurations.
However, you do have the option of utilizing your own JavaGroups configuration within the OC4J clustering framework. This feature is enabled by specifying one of the following items in the <property-config> subelement within the <cluster> element:
A string containing the JavaGroups configuration properties
A URL to an XML configuration file containing this information
See "Overview of the <cluster> Element" for details.
OC4J supports replication in a peer-to-peer (P2P) topology, using TCP to establish connections between instances within an Oracle Application Server cluster. The state data held in each application instance is then unicast to each OC4J instance.
Two peer-to-peer configurations are supported:
Dynamic peer-to-peer, in which Oracle Process Manager and Notification Server (OPMN) is used to enable peer nodes to dynamically discover and communicate with one another. This configuration is the default used in an Oracle Application Server environment where OPMN is used to manage the various components, including OC4J.
See "Configuring Dynamic OPMN-Managed Peer-to-Peer Replication" for details.
Static peer-to-peer, in which each node in the cluster is explicitly configured to recognize at least one other peer node. This configuration is supported only in a standalone OC4J environment, with a relatively small number of standalone OC4J instances clustered together.
See "Configuring Static Peer-to-Peer Replication" for details.
In an Oracle Application Server environment, Oracle Process Manager and Notification Server (OPMN) is utilized to provide dynamic peer-to-peer replication. In this replication model, each Oracle Application Server node registers itself with OPMN. The node then queries OPMN for the list of available nodes, enabling it to dynamically discover and communicate with other nodes within the cluster.
Note: To use this feature, all nodes hosting the application must first be members of a cluster utilizing either the OPMN dynamic multicast discovery or static discovery server mechanism. See "Supported Clustering Models" for details.
Each node sends periodic ONS (heartbeat) messages to OPMN to report its current status, enabling OPMN to maintain a real-time list of available peer nodes and to notify nodes when one has failed. In the event that a node is lost, another node is able to service its requests.

Dynamic peer-to-peer replication is enabled by adding an empty <opmn-discovery> element within the <peer> element:

<orion-application ...>
    ...
    <cluster>
        <protocol>
            <peer>
                <opmn-discovery />
            </peer>
        </protocol>
    </cluster>
</orion-application>
In this configuration, the host address and port of at least one other peer node are supplied to enable peer-to-peer communication. As a node becomes aware of each of its peers, it also becomes aware of each peer's peers, with the end result that all of the nodes in the cluster become aware of one another.
The key challenge in this configuration is ensuring that host and port definitions are kept up to date, which may require a significant management effort. The following elements and attributes affect the configuration:
The start-port attribute of the <peer> element specifies the initial port on the host that the local OC4J process will try to bind to for peer communication. If this port is not available, OC4J will continue to increment the port number until an available port is found.
The <node> element specifies a peer node. The host and port attributes of the element define the host name of the node and the port that will be used for peer communication.
The range attribute of the <peer> element applies to the ports specified in each <node> element, not to the value of the start-port attribute. The range attribute defines the number of times to increment the port value if the specified port is not available on a node.
The following example illustrates static peer-to-peer configurations as specified in the orion-application.xml application deployment descriptor deployed with the sample application to three cluster nodes.
In this configuration, each node specifies one other node as its peer. The result is that all of the nodes within the cluster are able to establish connections with one another. This scenario will work only if each node is started in succession; that is, www1.company.com must be started before www2.company.com. Otherwise, www2.company.com will not be able to "see" www1.company.com.
First, www1.company.com specifies www2.company.com as its peer:
<orion-application ...>
    ...
    <cluster>
        <protocol>
            <peer start-port="7900" range="10" timeout="6000">
                <node host="www2.company.com" port="7900" />
            </peer>
        </protocol>
    </cluster>
</orion-application>
Next, www2.company.com specifies www3.company.com as its peer:
<orion-application ...>
    ...
    <cluster>
        <protocol>
            <peer start-port="7900" range="10" timeout="6000">
                <node host="www3.company.com" port="7900" />
            </peer>
        </protocol>
    </cluster>
</orion-application>
Finally, www3.company.com specifies www1.company.com as its peer:
<orion-application ...>
    ...
    <cluster>
        <protocol>
            <peer start-port="7900" range="10" timeout="6000">
                <node host="www1.company.com" port="7900" />
            </peer>
        </protocol>
    </cluster>
</orion-application>
An alternative configuration could have all of the nodes specifying the same node as a peer. For example, you could have the www1.company.com and www3.company.com nodes both specify www2.company.com as a peer. In this configuration, www2.company.com would have to be the first node started; the other nodes would then connect to this node, and establish connections with one another.
The new clustering framework provides the ability to replicate HTTP session and stateful session bean state to a database. Data is persisted outside of the clustered OC4J framework, enabling the entire session to be recovered in the event of a catastrophic failure of all of the OC4J instances within the cluster. The full HTTP session or stateful session bean object is replicated to the database.
The connection to the database is created using a data source, which is specified in the data-source attribute of the <database> subelement of <protocol>. Set the value of the data-source attribute to the data source's jndi-name as specified in data-sources.xml.
The data source specified must already exist within the OC4J instance. See the Oracle Containers for J2EE Services Guide for details on creating and using data sources.
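For illustration, a data source with the JNDI name jdbc/MyOracleDS might be defined in data-sources.xml along the lines of the following sketch; the pool name, connection URL, and credentials shown here are hypothetical:

<data-sources ...>
    <connection-pool name="MyConnectionPool">
        <!-- Connection details for the database that will hold the session state tables -->
        <connection-factory factory-class="oracle.jdbc.pool.OracleDataSource"
            user="scott" password="tiger"
            url="jdbc:oracle:thin:@//dbhost:1521/ORCL" />
    </connection-pool>
    <managed-data-source name="MyOracleDS" jndi-name="jdbc/MyOracleDS"
        connection-pool-name="MyConnectionPool" />
</data-sources>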
The following example configures the application to replicate data to the database accessed through the MyOracleDS data source.
<orion-application ...>
    ...
    <cluster>
        <protocol>
            <database data-source="jdbc/MyOracleDS"/>
        </protocol>
    </cluster>
</orion-application>
Session data is persisted to the following tables in the database:
OC4J_HTTP_SESSION, which stores metadata for an HTTP session
OC4J_HTTP_SESSION_VALUE, which stores the values set by the application user on the HTTP session
OC4J_EJB_SESSION, which stores the current state of a stateful session bean
The tables are created by OC4J the first time database replication is invoked. See Appendix C, "Overview of the Session State Tables" for details on the table schema.
The length of time session data is stored in the database is based on the session's time-to-live (TTL) value. A session is considered expired when the difference between the current database time and the time the session was last accessed is greater than the session timeout value. The actual equation for determining whether a session has expired is:
(Current Database Time - Last Accessed Time) > Max Inactive Time
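For example, if a session's maximum inactive time is 1800 seconds (30 minutes) and the session was last accessed at database time 10:00:00, the session is considered expired at any database time later than 10:30:00.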
Expired sessions are removed from the database on the next execution of the OC4J task manager. See "Configuring the OC4J Task Manager" for instructions on setting the task manager interval.
In the event that the OC4J server terminates without proper session termination, orphan records will be created in the database. These records will also be deleted the next time the task manager runs.
Clustering can be disabled globally or for a specific application using the Boolean enabled attribute of the <cluster> element. Setting this attribute to false in an application's orion-application.xml file effectively removes the application from the cluster.
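For example, the following fragment in orion-application.xml takes the application out of the cluster:

<orion-application ...>
    ...
    <cluster enabled="false" />
</orion-application>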
The <cluster> element serves as the single mechanism for application clustering configuration. It is used exclusively in the ORACLE_HOME/j2ee/instance/config/application.xml file to configure application clustering at the global level, and in application-specific orion-application.xml files for application-level clustering configuration.
<cluster>
Contains the application clustering configuration for an enterprise application running within an OC4J instance.
Subelements of <cluster>:
<property-config>
<flow-control-policy>
<replication-policy>
<protocol>
<synchronous-replication>
Attributes:
enabled: Whether clustering is enabled for the application. The default is true. Setting this value at the application level overrides the value inherited from the parent application, including the default application.
group-name: The name to use when establishing the replication group channels. If not supplied, the application name as defined in server.xml, the OC4J server configuration file, is used by default, and new group channels are created for each enterprise application. If a value is specified, the application and all child applications will use the channels associated with this group name.
This attribute is ignored if the <database> tag is included.
allow-colocation: Whether to allow application state to be replicated to a node residing on the same host machine. The default is true. However, this attribute should be set to false if multiple hosts are available.
If multiple OC4J instances are instantiated on the same machine, different listener ports must be specified for each instance in the default-web-site.xml, jms.xml, and rmi.xml configuration files.
write-quota: The number of other application group members (JVMs) to which the application state should be replicated. This attribute makes it possible to reduce overhead by limiting the number of JVMs to which state is written, similar to the islands concept used in previous OC4J releases. The default is 1 JVM.
This attribute is ignored if the <database> tag is included.
cache-miss-delay: The length of time, in milliseconds, to wait in-process for another group member to respond with a session if the session cannot be found locally. If the session cannot be found, the request will pause for the entire length of time specified. The default is 1000 milliseconds. In installations where heavy request loads are expected, this value should be increased, for example to 5000. Setting this value higher also prevents the OC4J instance from creating a replica of session data within itself if allow-colocation is set to true.
This attribute is ignored if the <database> tag is included.
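Taken together, these attributes might be used as in the following sketch; the group name and attribute values are illustrative:

<cluster group-name="accounting-apps" allow-colocation="false"
    write-quota="2" cache-miss-delay="5000">
    ...
</cluster>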
<property-config>
Contains data required to use the JavaGroups group communication protocol to replicate session state across nodes in the cluster.
Attributes:
url: A link to a JavaGroups XML configuration file.
property-string: A string containing the properties that define how the JavaGroups JChannel should be created.
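For example, to point the replication channel at a custom JavaGroups XML stack definition (the file path here is hypothetical):

<cluster>
    <property-config url="file:/opt/oc4j/config/jgroups-stack.xml" />
    ...
</cluster>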
<replication-policy>
The replication policy to apply, which defines when replication of data occurs and what data is replicated.
Attributes:
trigger: The frequency at which replication occurs. See Table 9-1 for the values for this attribute.
scope: What data is replicated. See Table 9-2 for the values for this attribute.
<protocol>
Defines the mechanism to use for data replication. Only one mechanism can be specified.
Subelements:
<multicast>
<peer>
<database>
<multicast>
Contains the configuration required to use multicast communication for replication. This is the default protocol used.
Attributes:
ip: The multicast address to use. The OC4J default is 230.230.0.1.
port: The multicast port to use. The OC4J default is port 45566.
bind_addr: The Network Interface Card (NIC) to bind to. This is useful if you have OC4J host machines with multiple network cards, each with a specific IP address.
<peer>
Contains the configuration required to use peer-to-peer (P2P) communication for replication.
Subelements:
<opmn-discovery>
<node>
Attributes:
start-port: The initial port on the node to attempt to allocate for peer communication. OC4J will continue to increment this value until an available port is found. The default is port 7800. Valid only for configuring static peer-to-peer replication in a standalone OC4J installation.
range: The number of times to increment the port value specified in each <node> subelement while looking for a potential peer node. The default is 5 increments. Valid only for configuring static peer-to-peer replication in a standalone OC4J installation.
timeout: The length of time, in milliseconds, to wait for a response from a peer while looking for a potential peer node. The default is 3000 milliseconds. Valid only for configuring static peer-to-peer replication in a standalone OC4J installation.
bind_addr: The Network Interface Card (NIC) to bind to. This is useful if you have OC4J host machines with multiple network cards, each with a specific IP address.
<opmn-discovery>
Configures OC4J to use dynamic peer-to-peer replication in an Oracle Application Server environment.
<node>
Contains the host name and port of a node to poll if using static peer-to-peer communication. One or more instances of this element can be supplied within a <peer> element.
Attributes:
host: The host name of the peer node as a URL.
port: The port on the node to use for peer-to-peer communication. The default is port 7800.
<database>
Contains the connection information required to persist state data to a database.
Attributes:
data-source: The name of a data source containing the database connection information. This must be the value of the data source's jndi-name as specified in data-sources.xml.
<flow-control-policy>
Controls the amount of memory to allocate to the handling of clustering messages during replication. This element is intended to prevent out-of-memory errors by gating the amount of data (bytes) sent from one node to another during replication.
Attributes:
enabled: Whether flow control is enabled. The default is true.
max-bytes: The maximum number of bytes the receiving node can accept. After this value is reached, the sending node must wait for an acknowledgement from the receiver before sending additional messages. The default value is 500000.
min-bytes: The minimum number of bytes the receiving node can accept without triggering an acknowledgement that more bytes should be sent. If the number of bytes received falls below this value, the receiver acknowledges that it can accept more bytes from the sender. The default is 0.
threshold: If min-bytes is not specified, this factor value is applied to incoming requests to determine the value of that attribute. The default value is 0.25.
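A configuration sketch with illustrative values, capping unacknowledged replication data at 500 KB and resuming once the backlog drops below 125 KB (0.25 of max-bytes, matching the default threshold):

<cluster>
    <flow-control-policy enabled="true" max-bytes="500000" min-bytes="125000" />
    ...
</cluster>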
<synchronous-replication>
If included, a node replicating application data will wait for an acknowledgement that the data update was received from at least one other peer node before continuing with replication. This element is optional; the default behavior is for nodes to continue replicating data to other nodes asynchronously.
Attributes:
timeout: The length of time, in milliseconds, to wait for a response from a peer node. If this value is exceeded, replication continues, although no acknowledgement will be sent. The default value is 10000 milliseconds (10 seconds).
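For example, the following fragment (the timeout value is illustrative) enables synchronous replication and halves the default acknowledgement wait:

<cluster>
    <synchronous-replication timeout="5000" />
</cluster>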