Oracle® Coherence Developer's Guide
Release 3.7

Part Number E18677-01

6 Setting Up a Cluster

This chapter provides instructions for completing common tasks that are associated with setting up a cluster.

6.1 Overview of Setting Up Clusters

Coherence provides a default out-of-box cluster configuration that is used for demonstration purposes. It allows clusters to be quickly created and often requires little or no configuration changes. However, beyond demonstration, the default setup should not be used. Instead, unique clusters should be set up based on the network environment in which they run and based on the requirements of the applications that use them. A cluster that runs in single-server mode can be configured for unit testing and trivial development.

At a minimum, setting up a cluster includes defining the cluster's name and the cluster's multicast address. If multicast is undesirable or unavailable in an environment, then setting up the Well Known Addresses (WKA) feature is required. The rest of the tasks presented in this chapter are typically used when setting up a cluster and are completed when the default settings must be changed.

Clusters are set up within an operational override file (tangosol-coherence-override.xml). Each cluster member uses an override file to specify unique values that override the default configuration that is defined in the operational deployment descriptor. See "Specifying an Operational Configuration File" for detailed information on using an operational override file. In addition, refer to Appendix A, "Operational Configuration Elements," for descriptions and usage information for all the operational elements that are discussed in this chapter.

6.2 Specifying a Cluster's Name

A cluster name is a user-defined name that distinguishes a cluster from other clusters that run on the network. All members of a cluster must specify the same cluster name to join and form that cluster. A cluster member does not start if the wrong name is specified when attempting to join an existing cluster. A unique cluster name is often used with a unique multicast port to create distinct clusters on the same network.

Note:

A cluster member uses a system generated cluster name if a name is not explicitly specified. Using the system generated name (and the out-of-box multicast defaults) increases the chance of having overlapping cluster configurations on the network. This can lead to cluster members accidentally joining an unexpected cluster.

To specify a cluster name, edit the operational override file and add a <cluster-name> element, within the <member-identity> element, that includes the cluster name. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <member-identity>
         <cluster-name system-property="tangosol.coherence.cluster">MyCluster
         </cluster-name>
      </member-identity>
   </cluster-config>
</coherence>

The tangosol.coherence.cluster system property is used to specify the cluster name instead of using the operational override file. For example:

-Dtangosol.coherence.cluster=name
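
The relationship between the two settings can be pictured with a minimal Java sketch. It assumes, as the system-property attribute in the XML example indicates, that a value passed with -D replaces the element's text, and it additionally assumes that surrounding whitespace is ignored. The class and method names (OverrideValue, resolve) are hypothetical and are not part of Coherence.

```java
final class OverrideValue {
    /**
     * Returns the system property value if it is set and non-blank;
     * otherwise returns the value taken from the override file.
     */
    static String resolve(String propertyName, String xmlValue) {
        String fromProperty = System.getProperty(propertyName);
        if (fromProperty != null && !fromProperty.trim().isEmpty()) {
            return fromProperty.trim();
        }
        // Element text may be wrapped across lines in the file, so trim it
        // (trimming is an assumption of this sketch).
        return xmlValue == null ? null : xmlValue.trim();
    }

    public static void main(String[] args) {
        // Run with -Dtangosol.coherence.cluster=name to see the override win.
        System.out.println(resolve("tangosol.coherence.cluster", "MyCluster\n         "));
    }
}
```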

6.3 Specifying a Cluster Member's Identity

A set of identifiers is used to give a cluster member an identity within the cluster. The identity information differentiates cluster members and conveys each member's role within the cluster. Some identifiers are also used by the cluster service when performing cluster tasks. Lastly, the identity information is valuable when displaying management information (for example, JMX) and facilitates interpreting log entries. The identifiers are the site name, rack name, machine name, process name, member name, and role name.

To specify member identity information, edit the operational override file and add the member identity elements within the <member-identity> element as demonstrated below:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <member-identity>
         <site-name system-property="tangosol.coherence.site">pa-1</site-name>
         <rack-name system-property="tangosol.coherence.rack">100A</rack-name>
         <machine-name system-property="tangosol.coherence.machine">prod001
         </machine-name>
         <process-name system-property="tangosol.coherence.process">JVM1
         </process-name>
         <member-name system-property="tangosol.coherence.member">C1</member-name>
         <role-name system-property="tangosol.coherence.role">Server</role-name>
      </member-identity>
   </cluster-config>
</coherence>

The following system properties are used to specify a cluster member's identity information instead of using the operational override file. For example:

-Dtangosol.coherence.site=pa-1 -Dtangosol.coherence.rack=100A -Dtangosol.coherence.machine=prod001 -Dtangosol.coherence.process=JVM1 -Dtangosol.coherence.member=C1 -Dtangosol.coherence.role=Server

6.4 Configuring Multicast Communication

Cluster members use multicast communication to discover other cluster members and when a message must be communicated to multiple members of the cluster. The cluster protocol makes very judicious use of multicast and avoids things such as multicast storms. By default, data is only transmitted over multicast if it is intended for more than 25% of the cluster members. The vast majority of traffic is transmitted using unicast even when multicast is enabled. For typical partitioned cache based clusters, most transmissions are point-to-point and only cluster membership and partition ownership are broadcast to the entire cluster.

Multicast communication is configured in an operational override file within the <multicast-listener> node. Many system properties are also available to configure multicast communication when starting a cluster member.

6.4.1 Specifying a Cluster's Multicast Address

A multicast address (IP address and port) can be specified for a cluster member. All members of a cluster must use the same multicast address and port to join and form that cluster. Distinct clusters on the same network must use different multicast addresses.

A cluster member uses a default multicast address if an address is not explicitly specified. The default value depends on the release version and follows the convention of {build}.{major version}.{minor version}.{patch} for the address and {major version}.{minor version}.{patch} for the port.

Note:

Using the default multicast address and port (and the system generated cluster name) increases the chance of having overlapping cluster configurations on the network. This can lead to cluster members accidentally joining an unexpected cluster. Always use a unique port value to create a distinct cluster.

To specify a cluster multicast address, edit the operational override file and add both an <address> and <port> element and specify the address and port to be used by the cluster member. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <multicast-listener>
         <address system-property="tangosol.coherence.clusteraddress">224.3.6.0
         </address>
         <port system-property="tangosol.coherence.clusterport">3059</port>
      </multicast-listener>
   </cluster-config>
</coherence>

The tangosol.coherence.clusteraddress and tangosol.coherence.clusterport system properties are used to specify the cluster multicast address instead of using the operational override file. For example:

-Dtangosol.coherence.clusteraddress=224.3.6.0 -Dtangosol.coherence.clusterport=3059
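
Before placing an address and port in the override file, it can help to confirm that the address really is a multicast address and that the port is in the valid range. The following is a minimal sketch that uses only the standard java.net API; the class and method names are hypothetical, and the check is a convenience for the person editing the configuration, not something Coherence requires.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class MulticastAddressCheck {
    /** Returns true if the address is an IP multicast address and the port is 1 to 65535. */
    static boolean isUsable(String address, int port) {
        if (port < 1 || port > 65535) {
            return false;
        }
        try {
            // For a literal IP address, getByName only validates the format.
            return InetAddress.getByName(address).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsable("224.3.6.0", 3059));   // true
        System.out.println(isUsable("192.168.0.1", 3059)); // false: unicast address
    }
}
```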

6.4.1.1 Changing the Multicast Socket Interface

The multicast socket is bound to the same network interface (NIC) as the unicast listener IP address. A different NIC for multicast can be configured but, with rare exception, it is strongly discouraged as it can lead to partial failure of the cluster.

With two NICs, the interface (and thus network) used for multicast traffic is different from the interface (and thus network) used for unicast (UDP/IP) and TCP-ring (TCP/IP) traffic. Communication on one interface (or network) continues to succeed even if the other interface has failed; this scenario prolongs failure detection and failover. Since the clustering protocol handles member (and thus interface) failure, it is preferable to have all communication fail so that a failed member is quickly detected and removed from the cluster.

To change the default multicast network interface, edit the operational override file and add an <interface> element that specifies the IP address to which the multicast socket binds. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <multicast-listener>
         <interface>192.168.0.1</interface>
      </multicast-listener>
   </cluster-config>
</coherence>

6.4.2 Disabling Multicast Communication

Multicast traffic may be undesirable or may be disallowed in some network environments. In this case, use the Well Known Addresses feature to prevent Coherence from using multicast. This disables multicast discovery and also disables multicast for all data transfers; unicast (point-to-point) is used instead. Coherence is designed to use point-to-point communication as much as possible, so most application profiles do not see a substantial performance impact. See "Using Well Known Addresses".

Note:

Disabling multicast puts a higher strain on the network. However, this only becomes an issue for large clusters with greater than 100 members.

6.4.3 Specifying the Multicast Time-to-Live

The time-to-live value (TTL) setting designates how far multicast UDP/IP packets can travel on a network. The TTL is expressed in terms of how many hops a packet survives; each network interface, router, and managed switch is considered one hop.

The TTL value should be set to the lowest integer value that works. Setting the value too high can use unnecessary bandwidth on other LAN segments and can even cause the operating system or network devices to disable multicast traffic. Typically, setting the TTL value to 1 works on a simple switched backbone. A value of 2 or more may be required on an advanced backbone with intelligent switching. A value of 0 is used for single server clusters that are used for development and testing. See "Enabling Single-Server Mode" for more information on single server clusters.

To specify the TTL, edit the operational override file and add a <time-to-live> element that includes the TTL value. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <multicast-listener>
         <time-to-live system-property="tangosol.coherence.ttl">3</time-to-live>
      </multicast-listener>
   </cluster-config>
</coherence>

The tangosol.coherence.ttl system property is used to specify the TTL value instead of using the operational override file. For example:

-Dtangosol.coherence.ttl=3

6.4.4 Specifying the Multicast Join Timeout

The multicast join timeout defines how much time a cluster member waits to join a cluster. If the timeout is reached and an existing cluster is not detected, then the cluster member starts its own cluster and elects itself as the senior cluster member. A short timeout can be specified during development and testing. A timeout of 30 seconds is generally adequate for production environments.

Note:

The first member of the cluster waits the full duration of the join timeout before it assumes the role of the senior member. If the cluster startup timeout is less than the join timeout, then the first member of the cluster fails during cluster startup. The cluster member timeout is specified using the packet publisher timeout (<timeout-milliseconds>). See "packet-delivery".

To specify the join timeout, edit the operational override file and add a <join-timeout-milliseconds> element that includes the timeout value. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <multicast-listener>
         <join-timeout-milliseconds>6000</join-timeout-milliseconds>
      </multicast-listener>
   </cluster-config>
</coherence>

6.4.5 Changing the Multicast Threshold

Cluster members use both multicast and unicast communication when sending cluster packets. The multicast threshold value is used to determine whether to use multicast for packet delivery or unicast. Setting the threshold higher or lower can force a cluster to favor one style of communication over the other. The threshold setting is not used if multicast communication is disabled.

The multicast threshold is a percentage value and is in the range of 1% to 100%. In a cluster of n members, a cluster member that is sending a packet to a set of destination nodes (not counting itself) of size d (in the range of 0 to n-1) sends a packet using multicast only if the following hold true:

  • The packet is being sent over the network to multiple nodes (d > 1).

  • The number of nodes is greater than the specified threshold (d > (n-1) * (threshold/100)).

    For example, in a 25 member cluster with a multicast threshold of 25%, a cluster member only uses multicast if the packet is destined for more than 6 members, that is, 7 or more (24 * .25 = 6).

Setting this value to 1 allows the cluster to use multicast for basically all multi-point traffic. Setting this value to 100 forces the cluster to use unicast for all multi-point traffic except for explicit broadcast traffic (for example, cluster heartbeat and discovery) because the 100% threshold is never exceeded. With the setting of 25 (the default) a cluster member sends the packet using unicast if it is destined for one-fourth or fewer of the other nodes, and sends the packet using multicast if it is destined for more than one-fourth of the other nodes.
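
The two conditions listed above can be checked with a few lines of Java. This is a minimal sketch of the rule as the conditions state it (d > 1 and d > (n-1) * (threshold/100), both strict), not Coherence's internal implementation; the class and method names are hypothetical.

```java
final class MulticastThreshold {
    /**
     * @param n                cluster size
     * @param d                number of destination members, not counting the sender (0 to n-1)
     * @param thresholdPercent multicast threshold (1 to 100)
     * @return true if the packet would be sent using multicast
     */
    static boolean usesMulticast(int n, int d, int thresholdPercent) {
        if (d < 0 || d > n - 1) {
            throw new IllegalArgumentException("d must be in the range 0 to n-1");
        }
        if (thresholdPercent < 1 || thresholdPercent > 100) {
            throw new IllegalArgumentException("threshold must be in the range 1 to 100");
        }
        // d > (n-1) * (threshold/100), rearranged to avoid floating point.
        return d > 1 && (long) d * 100 > (long) (n - 1) * thresholdPercent;
    }

    public static void main(String[] args) {
        System.out.println(usesMulticast(25, 12, 25));  // true: 12 is more than 24 * .25 = 6
        System.out.println(usesMulticast(25, 3, 25));   // false: 3 is below the threshold
        System.out.println(usesMulticast(25, 24, 100)); // false: a 100% threshold is never exceeded
    }
}
```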

To specify the multicast threshold, edit the operational override file and add a <multicast-threshold-percent> element that includes the threshold value. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <multicast-listener>
         <multicast-threshold-percent>40</multicast-threshold-percent>
      </multicast-listener>
   </cluster-config>
</coherence>

6.5 Specifying a Cluster Member's Unicast Address

Cluster members use unicast for direct member-to-member (point-to-point) communication, which makes up the majority of communication on the cluster. A default unicast address (IP address and ports) is used but can be specified as required within the <unicast-listener> element.

The unicast listener, as configured out-of-box, selects the unicast address as follows:

To specify a cluster member's unicast address, edit the operational override file and add both an <address> and <port> element (and optionally a <port-auto-adjust> element) and specify the address and port to be used by the cluster member. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <address system-property="tangosol.coherence.localhost">192.168.0.1
         </address>
         <port system-property="tangosol.coherence.localport">8090</port>
         <port-auto-adjust system-property="tangosol.coherence.localport.adjust">
            true
         </port-auto-adjust>
      </unicast-listener>
   </cluster-config>
</coherence>

The tangosol.coherence.localhost, tangosol.coherence.localport, and tangosol.coherence.localport.adjust system properties are used to specify the unicast address instead of using the operational override file. For example:

-Dtangosol.coherence.localhost=192.168.0.1 -Dtangosol.coherence.localport=8090 -Dtangosol.coherence.localport.adjust=true
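
The effect of the <port-auto-adjust> element can be pictured as a bind-and-retry loop. The sketch below assumes that auto-adjust means trying the next higher port when the configured port is already in use. It uses plain java.net sockets, covers UDP only, and the class and method names are hypothetical rather than Coherence code.

```java
import java.net.DatagramSocket;
import java.net.SocketException;

final class PortAutoAdjust {
    /**
     * Binds a UDP socket to the given port. If the port is in use and
     * autoAdjust is true, the next higher ports are tried in turn.
     */
    static DatagramSocket bind(int port, boolean autoAdjust) throws SocketException {
        for (int candidate = port; candidate <= 65535; candidate++) {
            try {
                return new DatagramSocket(candidate);
            } catch (SocketException e) {
                if (!autoAdjust) {
                    throw e; // fail immediately when adjustment is disabled
                }
                // Port is unavailable; fall through and try the next one.
            }
        }
        throw new SocketException("No free port at or above " + port);
    }

    public static void main(String[] args) throws SocketException {
        try (DatagramSocket socket = bind(8090, true)) {
            System.out.println("Bound to port " + socket.getLocalPort());
        }
    }
}
```

With auto-adjust enabled, several members started on the same computer with the same configured port would each end up on a distinct port.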

6.6 Using Well Known Addresses

The Well Known Addresses (WKA) feature is a mechanism that allows cluster members to discover and join a cluster using unicast instead of multicast. WKA is most often used when multicast networking is undesirable or unavailable in an environment or when an environment is not properly configured to support multicast. All cluster multicast communication is disabled if WKA is enabled.

WKA is enabled by specifying a small subset of cluster members (referred to as WKA members) that are able to start a cluster. The optimal number of WKA members varies based on the cluster size. Generally, WKA members should be about 10% of the cluster. One or two WKA members for each switch is recommended.

WKA members are expected to remain available over the lifetime of the cluster but are not required to be simultaneously active at any point in time. Only one WKA member must be operational for cluster members to discover and join the cluster. In addition, after a cluster member has joined the cluster, it receives the addresses of all cluster members and then broadcasts are performed by individually sending messages to each cluster member. This allows a cluster to operate even if all WKA members are stopped. However, new cluster members are not able to join the cluster unless they themselves are a WKA member or until a WKA member is started. In this case, the senior-most member of the cluster polls the WKA member list and allows the WKA member to rejoin the existing cluster.

There are two ways to specify WKA members. The first method explicitly defines a list of addresses. The second method uses an address provider implementation to get a list of WKA addresses. Both methods are configured in an operational override file within the <well-known-addresses> subelement of the <unicast-listener> element.

6.6.1 Specifying WKA Member Addresses

WKA members are explicitly specified within the <socket-address> element. Any number of <socket-address> elements can be specified and each must define both the address and port of a WKA member by using the <address> and <port> elements. If a cluster member specifies its own address, then the cluster member is a WKA member when it is started. The list of WKA members must be the same on every cluster member to ensure that different cluster members do not operate independently from the rest of the cluster. The following example specifies two WKA members:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <well-known-addresses>
            <socket-address id="1">
               <address>192.168.0.100</address>
               <port>8088</port>
            </socket-address>
            <socket-address id="2">
               <address>192.168.0.101</address>
               <port>8088</port>
            </socket-address>
         </well-known-addresses>
      </unicast-listener>
   </cluster-config>
</coherence>

Note:

When setting up a WKA member, the port value must match the port value that is specified for the member's unicast listener port. See "Specifying a Cluster Member's Unicast Address" for more information on setting the unicast port.

Using WKA System Properties

A single WKA member can be specified using the tangosol.coherence.wka and tangosol.coherence.wka.port system properties instead of specifying the address in an operational override file. The system properties are intended for demonstration and testing scenarios to quickly specify a single WKA member. For example:

-Dtangosol.coherence.wka=192.168.0.100 -Dtangosol.coherence.wka.port=8088

To specify multiple WKA member addresses with system properties, define the addresses in an operational override file and add a system-property attribute to each WKA member address element. Each attribute names the system property that overrides the element. The following example defines two addresses with system properties:

Note:

Defining additional system properties to specify a list of WKA members can be used during testing or in controlled production environments. However, the best practice is to exclusively use an operational override file to specify WKA members in production environments. This ensures the same list of WKA members exists on each cluster member.

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <well-known-addresses>
            <socket-address id="1">
               <address system-property="tangosol.coherence.wka"></address>
               <port system-property="tangosol.coherence.wka.port"></port>
            </socket-address>
            <socket-address id="2">
               <address system-property="tangosol.coherence.wka2"></address>
               <port system-property="tangosol.coherence.wka2.port"></port>
            </socket-address>
         </well-known-addresses>
      </unicast-listener>
   </cluster-config>
</coherence>

For the above example, the WKA member addresses are specified using the system properties as follows:

-Dtangosol.coherence.wka=192.168.0.102 -Dtangosol.coherence.wka.port=8090 -Dtangosol.coherence.wka2=192.168.0.103 -Dtangosol.coherence.wka2.port=8094

See "Creating Custom System Properties" for more information on defining system properties.

6.6.2 Specifying a WKA Address Provider

A WKA address provider offers a programmatic way to define WKA members. A WKA address provider must implement the com.tangosol.net.AddressProvider interface. Implementations may be as simple as a static list or as complex as using dynamic discovery protocols. The address provider must return a terminating null address to indicate that all available addresses have been returned. The address provider implementation is called when the cluster member starts.

Note:

Implementations must exercise extreme caution because any delay in returning an address, or any unhandled exception, causes a discovery delay and may cause a complete shutdown of the cluster service on the member. Implementations that involve more expensive operations (for example, a network fetch) may choose to do so asynchronously by extending the com.tangosol.net.RefreshableAddressProvider class.
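
As an illustration of the simplest case, a static list, the following sketch returns two addresses and then the terminating null. It is written without a dependency on the Coherence library so that it compiles on its own. A deployable version would be a public class that declares implements com.tangosol.net.AddressProvider. The method names getNextAddress, accept, and reject are given here as recalled from that interface and should be verified against the Coherence Javadoc; the class name and the addresses are hypothetical.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical example; with Coherence on the classpath this class would
// be declared as: implements com.tangosol.net.AddressProvider
final class StaticListAddressProvider {
    private final List<InetSocketAddress> addresses = Arrays.asList(
            new InetSocketAddress("192.168.0.100", 8088),
            new InetSocketAddress("192.168.0.101", 8088));

    private Iterator<InetSocketAddress> iterator = addresses.iterator();

    /** Returns the next WKA address, or null after the last address was returned. */
    public synchronized InetSocketAddress getNextAddress() {
        if (iterator.hasNext()) {
            return iterator.next();
        }
        // Terminating null; reset so that a later pass starts from the beginning.
        iterator = addresses.iterator();
        return null;
    }

    /** Called when the last returned address was used successfully; nothing to do here. */
    public void accept() {
    }

    /** Called when the last returned address could not be used; nothing to do here. */
    public void reject(Throwable cause) {
    }
}
```

Because the list is fixed in memory, getNextAddress never blocks or throws, which is the property the note above asks implementations to preserve.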

To use a WKA address provider implementation, add an <address-provider> element and specify the fully qualified name of the implementation class within the <class-name> element. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <well-known-addresses>
            <address-provider>
               <class-name>package.MyAddressProvider</class-name>
            </address-provider>
         </well-known-addresses>
      </unicast-listener>
   </cluster-config>
</coherence>

As an alternative, the <address-provider> element supports the use of a <class-factory-name> element that is used to specify a factory class for creating AddressProvider instances, and a <method-name> element to specify the static factory method on the factory class that performs object instantiation. The following example gets an address provider instance using the getAddressProvider method on the MyAddressProviderFactory class.

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <well-known-addresses>
            <address-provider>
               <class-factory-name>package.MyAddressProviderFactory
               </class-factory-name>
               <method-name>getAddressProvider</method-name>
            </address-provider>
         </well-known-addresses>
      </unicast-listener>
   </cluster-config>
</coherence>

Any initialization parameters that are required for a class or class factory implementation can be specified using the <init-params> element. Initialization parameters are accessible by implementations that support the com.tangosol.run.xml.XmlConfigurable interface or implementations that include a public constructor with a matching signature. The following example sets the iMaxTime parameter to 2000.

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <well-known-addresses>
            <address-provider>
               <class-name>package.MyAddressProvider</class-name>
               <init-params>
                  <init-param>
                     <param-name>iMaxTime</param-name>
                     <param-value>2000</param-value>
                  </init-param>
               </init-params>
            </address-provider>
         </well-known-addresses>
      </unicast-listener>
   </cluster-config>
</coherence>

6.7 Enabling Single-Server Mode

Single-Server mode is a cluster that is constrained to run on a single computer and does not access the network. Single-Server mode offers a quick way to start and stop a cluster and is typically used during unit testing or development.

To enable single-server mode, edit the operational override file and add a <time-to-live> element that is set to 0 and a unicast <address> element that is set to an address that is routed to loopback. On most computers, setting the address to 127.0.0.1 works. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <address system-property="tangosol.coherence.localhost">127.0.0.1
         </address>
      </unicast-listener>
      <multicast-listener>
         <time-to-live system-property="tangosol.coherence.ttl">0</time-to-live>
      </multicast-listener>
   </cluster-config>
</coherence>

The tangosol.coherence.ttl and tangosol.coherence.localhost system properties are used to enable single-server mode instead of using the operational override file. For example:

-Dtangosol.coherence.ttl=0 -Dtangosol.coherence.localhost=127.0.0.1
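
A quick way to confirm that a chosen address is routed to loopback is the standard InetAddress.isLoopbackAddress() call. The following minimal sketch combines that check with the TTL requirement described above; the class and method names are hypothetical and the check is not performed by Coherence itself.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class SingleServerCheck {
    /** Returns true if the settings match single-server mode: TTL of 0 and a loopback unicast address. */
    static boolean isSingleServer(String unicastAddress, int ttl) throws UnknownHostException {
        // For a literal IP address, getByName only validates the format.
        return ttl == 0 && InetAddress.getByName(unicastAddress).isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(isSingleServer("127.0.0.1", 0));   // true
        System.out.println(isSingleServer("192.168.0.1", 0)); // false: not a loopback address
    }
}
```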

6.8 Configuring Death Detection

Death detection is a cluster mechanism that quickly detects when a cluster member has failed. Failed cluster members are removed from the cluster and all other cluster members are notified about the departed member. Death detection allows the cluster to differentiate between actual member failure and an unresponsive member, such as the case when a JVM conducts a full garbage collection.

Death detection works by creating a ring of TCP connections between all cluster members. TCP communication is sent on the same port that is used for cluster UDP communication. Each cluster member issues a unicast heartbeat, and the most senior cluster member issues the cluster heartbeat, which is a broadcast message. Each cluster member uses the TCP connection to detect the death of another node within the heartbeat interval. Death detection is enabled by default and is configured within the <tcp-ring-listener> element.

6.8.1 Changing TCP-Ring Settings

Several settings are used to change the default behavior of the TCP-ring listener. This includes changing the number of attempts and the amount of time before determining that a computer that is hosting cluster members has become unreachable. These default to 3 and 15 seconds, respectively. The TCP/IP server socket backlog queue can also be set and defaults to the value used by the operating system.

To change the TCP-ring settings, edit the operational override file and add the following TCP-Ring elements:

Note:

The values of the <ip-timeout> and <ip-attempts> elements should be high enough to insulate against allowable temporary network outages.

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/
   coherence-operational-config coherence-operational-config.xsd">
   <cluster-config>
      <tcp-ring-listener>
         <ip-timeout system-property="tangosol.coherence.ipmonitor.pingtimeout">
         10s
         </ip-timeout>
         <ip-attempts>2</ip-attempts>
         <listen-backlog>10</listen-backlog>
      </tcp-ring-listener>
   </cluster-config>
</coherence>

The tangosol.coherence.ipmonitor.pingtimeout system property is used to specify a timeout instead of using the operational override file. For example:

-Dtangosol.coherence.ipmonitor.pingtimeout=20s

6.8.2 Changing the Heartbeat Interval

The death detection heartbeat interval can be changed. A higher interval alleviates network traffic but also prolongs detection of failed members. The default heartbeat value is 1 second.

To change the death detection heartbeat interval, edit the operational override file and add a <heartbeat-milliseconds> element that includes the heartbeat value. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
   <cluster-config>
      <packet-publisher>
         <packet-delivery>
            <heartbeat-milliseconds>5000</heartbeat-milliseconds>
         </packet-delivery>
      </packet-publisher>
   </cluster-config>
</coherence>
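
Because each cluster member reads a single operational override file, the TCP-ring settings and the heartbeat interval would typically sit under the same <cluster-config> node rather than in separate files. The following is a minimal sketch that reuses the illustrative values from the two preceding examples. The ordering of the child elements is an assumption and should be checked against coherence-operational-config.xsd:

```xml
<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
   <cluster-config>
      <!-- TCP-ring listener: timeout and number of attempts (illustrative values) -->
      <tcp-ring-listener>
         <ip-timeout>10s</ip-timeout>
         <ip-attempts>2</ip-attempts>
      </tcp-ring-listener>
      <!-- Heartbeat interval in milliseconds (5 seconds here, illustrative) -->
      <packet-publisher>
         <packet-delivery>
            <heartbeat-milliseconds>5000</heartbeat-milliseconds>
         </packet-delivery>
      </packet-publisher>
   </cluster-config>
</coherence>
```

Elements that are omitted from the override file keep the defaults from the operational deployment descriptor.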

6.8.3 Disabling Death Detection

Death detection is enabled by default and must be explicitly disabled. Disabling death detection can reduce network traffic but also delays the detection of failed members. If death detection is disabled, a cluster member uses the packet publisher's resend timeout interval to determine that another member has stopped responding to UDP packets. By default, the timeout interval is set to 5 minutes. See "Changing the Packet Resend Timeout" for more details.

To disable death detection, edit the operational override file and add an <enabled> element, within the <tcp-ring-listener> node, that is set to false. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
   <cluster-config>
      <tcp-ring-listener>
         <enabled>false</enabled>
      </tcp-ring-listener>
   </cluster-config>
</coherence>
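
With death detection disabled, the packet publisher's resend timeout is the remaining mechanism for noticing an unresponsive member, so the two settings are usually reviewed together. The following sketch assumes that the resend timeout is set with the <timeout-milliseconds> element within <packet-delivery>; confirm this against "Changing the Packet Resend Timeout". The timeout value is illustrative only and is not a recommendation:

```xml
<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
   <cluster-config>
      <!-- Death detection turned off, as in the preceding example -->
      <tcp-ring-listener>
         <enabled>false</enabled>
      </tcp-ring-listener>
      <!-- Assumed element for the resend timeout; 120000 ms (2 minutes) is a placeholder -->
      <packet-publisher>
         <packet-delivery>
            <timeout-milliseconds>120000</timeout-milliseconds>
         </packet-delivery>
      </packet-publisher>
   </cluster-config>
</coherence>
```

A shorter timeout detects failed members sooner but raises the chance that a member is treated as failed during a temporary network outage.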

6.9 Specifying Cluster Priorities

The cluster priority mechanism allows a priority value to be assigned to a cluster member and to different threads running within a member.

The following topics are included in this section:

6.9.1 Specifying a Cluster Member's Priority

A cluster member's priority is used as a tie-breaker between members. If a condition occurs in which one of two members must be ejected from the cluster, and it is not possible to objectively determine which of the two is at fault, then the member with the lower priority is ejected. Such cases are rare.

To specify a cluster member's priority, edit the operational override file and add a <priority> element, within the <member-identity> node, that includes a priority value between 1 and 10, where 1 is the highest priority. For example:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
   <cluster-config>
      <member-identity>
         <priority system-property="tangosol.coherence.priority">1</priority>
      </member-identity>
   </cluster-config>
</coherence>

The tangosol.coherence.priority system property can also be used to specify a cluster member's priority instead of using the operational override file. For example:

-Dtangosol.coherence.priority=1

6.9.2 Specifying Thread Priority

Multiple cluster components support thread priority. The priority is used as the basis for determining Java thread execution importance. The components are the multicast listener, the unicast listener, the TCP-ring listener, the packet speaker, the packet publisher, and the incoming message handler. The default priority setup gives the packet publisher the highest priority, followed by the incoming message handler, followed by the remaining components.

Thread priority is specified within each component's configuration element (the <multicast-listener>, <unicast-listener>, <tcp-ring-listener>, <packet-speaker>, <packet-publisher>, and <incoming-message-handler> elements, respectively). For example, to specify a thread priority for the unicast listener, edit the operational override file and add a <priority> element, within the <unicast-listener> node, that includes a priority value between 1 and 10, where 1 is the highest priority:

<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
   <cluster-config>
      <unicast-listener>
         <priority>5</priority>
      </unicast-listener>
   </cluster-config>
</coherence>
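
The same pattern applies to the other components. The following sketch sets a thread priority for the packet publisher and the incoming message handler in one override file. The priority values are placeholders within the documented range of 1 to 10, and the ordering of the child elements is an assumption that should be checked against coherence-operational-config.xsd:

```xml
<?xml version='1.0'?>

<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
   xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
   coherence-operational-config.xsd">
   <cluster-config>
      <!-- Placeholder priority for the packet publisher thread -->
      <packet-publisher>
         <priority>2</priority>
      </packet-publisher>
      <!-- Placeholder priority for the incoming message handler thread -->
      <incoming-message-handler>
         <priority>3</priority>
      </incoming-message-handler>
   </cluster-config>
</coherence>
```

Components that do not appear in the override file keep their default thread priorities.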