15 Troubleshooting Multicast Configuration

This chapter provides suggestions for troubleshooting IP multicast configuration problems. Using IP multicasting, WebLogic Server instances in a cluster can share a single IP address and port number. This capability enables all members of a cluster to be treated as a single entity and enables members of the cluster to communicate among themselves.

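This chapter includes the following sections:

  • Verifying Multicast Address and Port Configuration

  • Identifying Network Configuration Problems

  • Using the MulticastTest Utility

  • Tuning Multicast Features

  • Debugging Multicast

  • Miscellaneous Issues

  • Other Resources for Troubleshooting Multicast Configuration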

For general information on using and configuring multicast within a cluster, see Cluster Configuration and config.xml.

For information on configuring a multicast address from the Console, see "Clusters: Configuration: Multicast" in the Oracle WebLogic Server Administration Console Online Help.

For general cluster troubleshooting suggestions, see Chapter 14, "Troubleshooting Common Problems."

Verifying Multicast Address and Port Configuration

The first step in troubleshooting multicast problems is to verify that you have configured the multicast address and port correctly. A multicast address must be correctly configured for each cluster.

Multicast address and port configuration problems are among the most common reasons why a cluster does not start or a server fails to join a cluster. The following considerations apply to multicast addresses:

  • The multicast address must be an IP address between 224.0.0.0 and 239.255.255.255 or a host name with an IP address in this range.

  • The default multicast address used by WebLogic Server is 239.192.0.0.

  • Do not use any x.0.0.1 multicast address where x is between 0 and 9, inclusive.
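
The address rules above can be checked before a value is entered in the cluster configuration. The following is a minimal sketch in plain Python 3 (run outside WebLogic Server), not part of any WebLogic tooling. The function name check_multicast_address is hypothetical, and the sketch handles only literal IPv4 addresses, not host names.

```python
import ipaddress

# Range stated in this chapter for cluster multicast addresses.
MULTICAST_LOW = ipaddress.IPv4Address("224.0.0.0")
MULTICAST_HIGH = ipaddress.IPv4Address("239.255.255.255")


def check_multicast_address(address):
    """Return a list of problems found with a dotted-quad multicast address.

    Hypothetical helper: an empty list means no problem was detected.
    """
    try:
        ip = ipaddress.IPv4Address(address)
    except ipaddress.AddressValueError:
        return [f"not a valid IPv4 address: {address!r}"]

    problems = []
    if not (MULTICAST_LOW <= ip <= MULTICAST_HIGH):
        problems.append("outside 224.0.0.0 - 239.255.255.255")

    # x.0.0.1 with x from 0 to 9 already falls outside the range above;
    # it is reported separately because the chapter calls it out.
    octets = [int(part) for part in str(ip).split(".")]
    if 0 <= octets[0] <= 9 and octets[1:] == [0, 0, 1]:
        problems.append("x.0.0.1 addresses with x from 0 to 9 are not allowed")
    return problems
```

For example, check_multicast_address("239.192.0.0") returns an empty list, while a unicast address such as "192.168.1.10" is reported as outside the multicast range.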

Possible Errors

The following types of errors commonly occur due to multicast configuration problems:

  • Unable to create a multicast socket for clustering

  • Multicast socket send error

  • Multicast socket receive error

Checking the Multicast Address and Port

To check the multicast address and port, do one of the following:

  • Check the cluster multicast address and port through the WebLogic Server Administration Console.

  • Check the multicast information of the <cluster> element in config.xml.
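
When several domains need to be checked, the config.xml values can also be read with a short script. The following plain Python 3 sketch assumes that each top-level <cluster> element carries <name>, <multicast-address>, and <multicast-port> child elements; verify these element names against your own config.xml. The function name and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down config.xml used only for illustration.
SAMPLE_CONFIG = """<domain xmlns="http://xmlns.oracle.com/weblogic/domain">
  <server><name>ms1</name><cluster>MyCluster</cluster></server>
  <cluster>
    <name>MyCluster</name>
    <multicast-address>239.192.0.0</multicast-address>
    <multicast-port>7001</multicast-port>
  </cluster>
</domain>"""


def local_name(tag):
    """Strip an XML namespace, for example '{uri}cluster' -> 'cluster'."""
    return tag.rsplit("}", 1)[-1]


def cluster_multicast_settings(config_xml_text):
    """Return {cluster name: (multicast address, multicast port)}.

    Only direct children of the root element are inspected, so the
    <cluster> reference inside a <server> element is ignored.
    Values that are not present are reported as None.
    """
    root = ET.fromstring(config_xml_text)
    settings = {}
    for element in root:
        if local_name(element.tag) != "cluster":
            continue
        values = {local_name(child.tag): (child.text or "").strip()
                  for child in element}
        settings[values.get("name")] = (
            values.get("multicast-address"),
            values.get("multicast-port"),
        )
    return settings
```

A value of None for the address or port means the element is absent from the file, in which case the server presumably falls back to its default.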

Identifying Network Configuration Problems

After you verify that the multicast address and port are configured correctly, determine whether network problems are interfering with multicast communication.

Physical Connections

Ensure that no physical problems exist in your network.

  • Verify the network connection for each machine that hosts servers within the cluster.

  • Verify that all components of the network, including routers and DNS servers, are connected and functioning correctly.

Address Conflicts

Address conflicts within a network can disrupt multicast communications.

  • Use the netstat utility to verify that no other network resources are using the cluster multicast address.

  • Verify that each machine has a unique IP address.

nsswitch.conf Settings on UNIX Systems

On UNIX systems, you may encounter UnknownHostException errors. These errors can occur at random times, even when the server is not under a heavy load. To avoid them, check /etc/nsswitch.conf and change the lookup order to 'files,DNS,NIS'.

For more information, see the nsswitch.conf man page for your system.
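
The lookup order can be inspected with a few lines of script. The following plain Python 3 sketch assumes the common nsswitch.conf layout in which the hosts entry lists space-separated sources (for example, hosts: files dns nis); source names and syntax vary by system, so treat this as illustrative. Both function names are hypothetical.

```python
import re


def hosts_lookup_order(nsswitch_text):
    """Return the sources on the 'hosts:' line, in order.

    Comments and bracketed actions such as [NOTFOUND=return] are skipped.
    An empty list means no hosts entry was found.
    """
    for raw_line in nsswitch_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line.lower().startswith("hosts:"):
            continue
        rest = line.split(":", 1)[1]
        rest = re.sub(r"\[[^\]]*\]", " ", rest)   # drop [STATUS=action] items
        return [source.lower() for source in rest.split()]
    return []


def files_first(nsswitch_text):
    """Return True when local files are consulted before any other source."""
    order = hosts_lookup_order(nsswitch_text)
    return bool(order) and order[0] == "files"
```

To check a live system, pass the contents of /etc/nsswitch.conf to files_first; a result of False suggests reordering the hosts entry as described above.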

Using the MulticastTest Utility

After you verify that the multicast address and port are configured correctly and there are no physical or configuration problems with your network, you can use utils.MulticastTest to verify that multicast is working and to determine if unwanted traffic is occurring between different clusters.

For instructions on using the MulticastTest utility, see MulticastTest in "Using the Oracle WebLogic Server Java Utilities" in Command Reference for Oracle WebLogic Server.

If MulticastTest fails and the machine is multihomed, ensure that the primary address is being used. See Multicast and Multihomed Machines.

Note:

You should set -Djava.net.preferIPv4Stack=true when specifying an IPv4 format address for the multicast address on Linux machines running dual IPv4/IPv6 stacks.

Tuning Multicast Features

The following sections describe how to tune various features of WebLogic Server to work with multicasting.

Multicast Timeouts

Multicast timeouts can occur during a Network Interface Card (NIC) failover. Timeouts can result in an error message like the following:

<Error><Cluster><Multicast socket receive error:
java.io.InterruptedIOException: Receive timed out>

When this error occurs, you can:

  • Disable the NIC failover.

  • Disable the IGMP snooping option on the managed switch. This option relies on the Internet Group Management Protocol (IGMP) and is used to prevent multicast flood problems on the switch.

  • On Windows 2000, check the IGMP level to ensure that multicast packets are supported.

  • Set the Multicast Time-To-Live to the following:

    MulticastTTL=32
    

    For more information, see Configure Multicast Time-To-Live (TTL).

Cluster Heartbeats

Each WebLogic Server instance in a cluster uses multicast to broadcast regular heartbeat messages that advertise its availability. By monitoring heartbeat messages, server instances in a cluster determine when a server instance has failed.

The following sections describe possible solutions when cluster heartbeat problems occur.

Multicast Send Delay

Multicast Send Delay specifies the amount of time the server waits to send message fragments through multicast. This delay helps to avoid OS-level buffer overflow. This can be set via the MulticastSendDelay attribute of the Cluster MBean. For more information, see the MBean Reference for Oracle WebLogic Server.

Operating System Parameters

If problems still occur after setting the Multicast Send Delay, you may need to set the following operating system parameters related to UDP settings:

  • udp_xmit_hiwat

  • udp_recv_hiwat

If these parameters are set to a low value (8K, for example), problems can occur when the multicast packet size is set to the maximum allowed (32K). Try setting these parameters to 64K.
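
The size comparison described above is simple arithmetic: a buffer smaller than one multicast packet cannot hold that packet. The following plain Python 3 sketch illustrates the check. The function names are hypothetical, the buffer values are supplied by hand rather than read from the operating system, and K is taken to mean 1024 bytes.

```python
def parse_size(text):
    """Convert '8K', '64k', or '32768' to a byte count (K = 1024)."""
    text = text.strip().upper()
    if text.endswith("K"):
        return int(text[:-1]) * 1024
    return int(text)


def undersized_buffers(buffers, packet_size="32K"):
    """Return the names of buffers smaller than one multicast packet.

    buffers maps a parameter name to its configured size, for example
    {"udp_xmit_hiwat": "8K"}.
    """
    needed = parse_size(packet_size)
    return sorted(name for name, size in buffers.items()
                  if parse_size(size) < needed)
```

With the values from the text, a setting of 8K is flagged against a 32K packet, while 64K is not.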

Multicast Storms

A multicast storm is the repeated transmission of multicast packets on a network. Multicast storms can stress the network and attached stations, potentially causing end-stations to hang or fail.

Increasing the size of the multicast buffers can improve the rate at which announcements are transmitted and received, and prevent multicast storms. See Configure Multicast Buffer Size.

Multicast and Multihomed Machines

The following considerations apply when using multicast in a multihomed environment:

  • Ensure that you have configured a UnixMachine instance from the WebLogic Server Administration Console and have specified an InterfaceAddress for each server instance to handle multicast traffic.

  • Run /usr/sbin/ifconfig -a to check the MAC address of each machine in the multihomed environment. Ensure that each machine has a unique MAC address. If machines use the same MAC address, this can cause multicast problems.
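
Once the MAC addresses have been collected from each machine, duplicates can be found with a short script. The following plain Python 3 sketch assumes the addresses were copied by hand from the ifconfig output into a dictionary; it does not parse ifconfig output itself. The function names and interface labels are hypothetical.

```python
from collections import defaultdict


def normalize_mac(mac):
    """Return a MAC address as six zero-padded, lowercase hex groups.

    Accepts ':' or '-' separators and unpadded groups such as '0:3:ba'.
    """
    parts = mac.strip().lower().replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"expected six groups in MAC address: {mac!r}")
    values = [int(part, 16) for part in parts]
    if any(value > 0xFF for value in values):
        raise ValueError(f"group out of range in MAC address: {mac!r}")
    return ":".join(f"{value:02x}" for value in values)


def duplicate_macs(interfaces):
    """Return {mac: [labels]} for every MAC address seen more than once.

    interfaces maps a label such as 'hostA:eth0' to its MAC address.
    """
    seen = defaultdict(list)
    for label, mac in interfaces.items():
        seen[normalize_mac(mac)].append(label)
    return {mac: sorted(labels) for mac, labels in seen.items()
            if len(labels) > 1}
```

An empty result means every listed interface has a unique MAC address.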

Multicast in Different Subnets

If multicast problems occur when cluster members are in different subnets, configure the Multicast Time-To-Live (TTL) parameter. The TTL value for the cluster must be high enough to ensure that routers do not discard multicast packets before they reach their final destination.

The Multicast TTL parameter sets the number of network hops a multicast message makes before the packet can be discarded. Configuring the Multicast TTL parameter appropriately reduces the risk of losing the multicast messages that are transmitted among server instances in the cluster.

For more information, see Configure Multicast Time-To-Live (TTL).
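
As a rough guide to choosing a value, the following plain Python 3 sketch estimates the smallest TTL that lets a packet cross a given number of routers. It relies on general IP behavior rather than anything specific to WebLogic Server: each router decrements the TTL by one and discards the packet when the TTL reaches zero, so a TTL of 1 keeps traffic on the local subnet. The function names are hypothetical.

```python
def minimum_ttl(router_hops):
    """Return the smallest TTL that survives the given number of routers."""
    if router_hops < 0:
        raise ValueError("router_hops must be zero or greater")
    # One unit is consumed per router, and at least 1 must remain on arrival.
    return router_hops + 1


def ttl_reaches(ttl, router_hops):
    """Return True when a packet sent with this TTL crosses all routers."""
    return ttl >= minimum_ttl(router_hops)
```

For example, cluster members separated by three routers need a TTL of at least 4. Some routers also apply a configured TTL threshold to multicast traffic, so treat the result as a lower bound.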

Debugging Multicast

If you still have problems with the multicast address after following the troubleshooting tips above, gather debugging information for multicast.

Debugging Utilities

The following utilities can help you debug multicast configuration problems.

MulticastMonitor

MulticastMonitor is a standalone Java command line utility that monitors multicast traffic on a specific multicast address and port. The syntax for this command is:

java weblogic.cluster.MulticastMonitor <multicast_address> <multicast_port> <domain_name> <cluster_name>

MulticastTest

The MulticastTest utility helps you debug multicast problems when you configure a WebLogic cluster. The utility sends out multicast packets and returns information about how effectively multicast is working on your network.

Debugging Flags

The following debug flags are specific to multicast:

  • DebugCluster

  • DebugClusterHeartBeats

  • DebugClusterFragments

Setting Debug Flags on the Command Line

Set these flags from the command line during server startup by adding the following options:

  • -Dweblogic.debug.DebugCluster=true

  • -Dweblogic.debug.DebugClusterHeartBeats=true

  • -Dweblogic.debug.DebugClusterFragments=true
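
When start scripts are generated by tooling, the options above can be assembled programmatically. The following plain Python 3 sketch only builds the option strings from the flag names listed in this chapter. The function name is hypothetical, and how the resulting string reaches the JVM depends on your start script.

```python
# Flag names exactly as listed in this chapter.
MULTICAST_DEBUG_FLAGS = (
    "DebugCluster",
    "DebugClusterHeartBeats",
    "DebugClusterFragments",
)


def debug_options(flags=MULTICAST_DEBUG_FLAGS, enabled=True):
    """Return one -Dweblogic.debug.<flag>=<value> option per flag."""
    value = "true" if enabled else "false"
    return [f"-Dweblogic.debug.{flag}={value}" for flag in flags]
```

Joining the list with spaces, as in " ".join(debug_options()), yields a single string that can be appended to the server's Java options.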

Setting Debug Attributes Using WLST

Set debug attributes using these WLST commands:

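# Connect to the Administration Server (prompts for credentials and URL).
connect()
edit()
startEdit()
servers=cmo.getServers()
for s in servers:
  # Enable DebugCluster on each server's ServerDebug MBean.
  d=s.getServerDebug()
  d.setDebugCluster(true)
# Save the pending changes, then activate them.
save()
activate()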

Miscellaneous Issues

The following sections describe miscellaneous multicast issues you may encounter.

Multicast on AIX

AIX version 5.1 does not support IPv4 mapped multicast addresses. If you are using an IPv4 multicast address, you cannot join a multicast group even if you are switching to IPv6. When running MulticastTest on AIX, use the order on the command line specified in the following example:

java -Djava.net.preferIPv4Stack=true utils.MulticastTest <options>

Additionally, verify the following settings on AIX to properly configure cluster operations:

  • Set the MTU size to 1500 by executing the following command and rebooting the machine:

    chdev -l lo0 -a mtu=1500 -P
    
  • Ensure that the following has been added to /etc/netsvc.conf:

    hosts=local,bind4
    

    This line is required to ensure that only IPv4 addresses are sent to name services for IP resolution.

File Descriptor Problems

Depending on the operating system, there may be problems with the number of open file descriptors. On UNIX, you can use lsof to determine how many files on disk a process has open. If a problem occurs, you may need to increase the number of file descriptors on the machine.
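
Where lsof is not available, a rough count can be obtained with a script. The following plain Python 3 sketch assumes a UNIX system; the descriptor count relies on the Linux-style /proc/<pid>/fd directory and returns None elsewhere, and the limits reported are those of the Python process itself, not of the server JVM. All function names are hypothetical.

```python
import os
import resource


def count_open_descriptors(pid=None):
    """Return the number of open descriptors for a process, or None.

    Reads /proc/<pid>/fd, which exists on Linux. None is returned when
    the directory is missing or cannot be read.
    """
    if pid is None:
        pid = os.getpid()
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


def descriptor_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def near_limit(open_count, soft_limit, threshold=0.8):
    """Return True when usage is at or above the given share of the limit."""
    if soft_limit == resource.RLIM_INFINITY or soft_limit <= 0:
        return False
    return open_count >= threshold * soft_limit
```

Passing the server's process ID to count_open_descriptors and comparing the result with the limit configured for that process gives an early indication that descriptors are running out.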

Other Resources for Troubleshooting Multicast Configuration

The following resources may be helpful in resolving multicast problems: