System Administration Guide: Network Interfaces and Network Virtualization

Configuring for Probe-Based Failure Detection

Probe-based failure detection involves the use of target systems, as explained in Probe-Based Failure Detection. In identifying targets for probe-based failure detection, the in.mpathd daemon operates in two modes: router target mode or multicast target mode. In the router target mode, the multipathing daemon probes targets that are defined in the routing table. If no targets are defined, then the daemon operates in multicast target mode, where multicast packets are sent out to probe neighbor hosts on the LAN.

Preferably, you should set up host targets for the in.mpathd daemon to probe. For some IPMP groups, the default router is sufficient as a target. However, for some IPMP groups, you might want to configure specific targets for probe-based failure detection. To specify the targets, set up host routes in the routing table as probe targets. Any host routes that are configured in the routing table are listed before the default router. IPMP uses the explicitly defined host routes for target selection. Thus, you should set up host routes to configure specific probe targets rather than use the default router.

To set up host routes in the routing table, you use the route command. You can use the -p option with this command to add persistent routes. For example, route -p add adds a route that remains in the routing table even after you reboot the system. The -p option thus allows you to add persistent routes without needing any special scripts to recreate these routes at every system startup. To optimally use probe-based failure detection, make sure that you set up multiple targets to receive probes.

The sample procedure that follows shows the exact syntax to add persistent routes to targets for probe-based failure detection. For more information about the options for the route command, refer to the route(1M) man page.

Consider the following criteria when evaluating which hosts on your network might make good targets.

- Make sure that the prospective targets are available and running. Make a list of their IP addresses.
- Ensure that the target interfaces are on the same network as the IPMP group that you are configuring.
- The netmask and broadcast address of the target systems must be the same as the addresses in the IPMP group.
- The target host must be able to answer ICMP requests from the interface that is using probe-based failure detection.
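
The same-network criterion above can be checked before you commit to a target. The following is a minimal sketch, assuming a bash shell. The function names ip_to_int and same_subnet are hypothetical helpers, not part of IPMP or the route command. The sketch compares only IPv4 network prefixes and does not test whether the target answers ICMP requests.

```shell
#!/bin/bash
# Hypothetical helper: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    # Split the address on dots into the positional parameters $1..$4.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed (exit status 0) when the two addresses fall
# in the same network for the given netmask.
# Usage: same_subnet ADDRESS_A ADDRESS_B NETMASK
same_subnet() {
    local a b m
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}
```

For example, same_subnet 192.168.10.137 192.168.10.20 255.255.255.0 succeeds, where 192.168.10.20 stands in for an address in the IPMP group and is an assumed value.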

How to Manually Specify Target Systems for Probe-Based Failure Detection

Add a route to a particular host to be used as a target in probe-based failure detection.
# route -p add -host destination-IP gateway-IP -static
where destination-IP and gateway-IP are IPv4 addresses of the host to be used as a target. Changing the routing table requires superuser privileges. For example, you would type the following to specify the target system 192.168.10.137, which is on the same subnet as the interfaces in IPMP group itops0:
# route -p add -host 192.168.10.137 192.168.10.137 -static
This new route will be automatically configured every time the system is restarted. If you want to define only a temporary route to a target system for probe-based failure detection, then do not use the -p option.

Add routes to additional hosts on the network to be used as target systems.
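
When you add several targets, generating the commands from a list reduces typing errors. The following is a minimal sketch, assuming a bash shell. The function name print_target_routes is hypothetical, and every address other than 192.168.10.137 is a placeholder. The function only prints the route commands so that you can review them. It does not change the routing table.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) one persistent host-route command
# per probe target. The gateway is the target itself, matching the example
# in this procedure.
print_target_routes() {
    local target
    for target in "$@"; do
        printf 'route -p add -host %s %s -static\n' "$target" "$target"
    done
}

# 192.168.10.137 comes from the example in this procedure.
# The other two addresses are placeholders for additional targets.
print_target_routes 192.168.10.137 192.168.10.138 192.168.10.139
```

After you review the printed commands, run them as superuser, one per target.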

How to Configure the Behavior of the IPMP Daemon

Use the IPMP configuration file /etc/default/mpathd to configure the following system-wide parameters for IPMP groups.

FAILURE_DETECTION_TIME
TRACK_INTERFACES_ONLY_WITH_GROUPS
FAILBACK

On the system with the IPMP group configuration, assume the Primary Administrator role or become superuser.

The Primary Administrator role includes the Primary Administrator profile. To create the role and assign the role to a user, see Chapter 2, Working With the Solaris Management Console (Tasks), in System Administration Guide: Basic Administration.

Edit the /etc/default/mpathd file.

Change the default value of one or more of the three parameters.
1. Type the new value for the FAILURE_DETECTION_TIME parameter.
  FAILURE_DETECTION_TIME=n
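  where n is the amount of time that ICMP probes take to detect whether an interface failure has occurred. The default is 10 seconds. The file as shipped typically expresses this value in milliseconds (for example, 10000 for 10 seconds), so confirm the unit against the comments in the file or the in.mpathd(1M) man page before you change it.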
2. Type the new value for the FAILBACK parameter.
  FAILBACK=[yes | no]
  - yes – The yes value is the default for the failback behavior of IPMP. When the repair of a failed interface is detected, network access fails back to the repaired interface, as described in Detecting Physical Interface Repairs.
  - no – The no value indicates that data traffic does not move back to a repaired interface. When a failed interface is detected as repaired, the INACTIVE flag is set for that interface. This flag indicates that the interface is currently not to be used for data traffic. The interface can still be used for probe traffic.
    
    For example, the IPMP group ipmp0 consists of two interfaces, ce0 and ce1. In the /etc/default/mpathd file, the FAILBACK=no parameter is set. If ce0 fails, then it is flagged as FAILED and becomes unusable. After repair, the interface is flagged as INACTIVE and remains unusable because of the FAILBACK=no setting.
    
    If ce1 fails and only ce0 is in the INACTIVE state, then ce0's INACTIVE flag is cleared and the interface becomes usable. If the IPMP group has other interfaces that are also in the INACTIVE state, then any one of these INACTIVE interfaces, and not necessarily ce0, can be cleared and become usable when ce1 fails.
3. Type the new value for the TRACK_INTERFACES_ONLY_WITH_GROUPS parameter.
  TRACK_INTERFACES_ONLY_WITH_GROUPS=[yes | no]
  Note – For information about this parameter and the anonymous group feature, see Failure Detection and the Anonymous Group Feature.
  - yes – The yes value is the default for the behavior of IPMP. This value causes IPMP to ignore network interfaces that are not configured into an IPMP group.
  - no – The no value sets failure and repair detection for all network interfaces, regardless of whether they are configured into an IPMP group. However, when a failure or repair is detected on an interface that is not configured into an IPMP group, no action is triggered in IPMP to maintain the networking functions of that interface. Therefore, the no value is only useful for reporting failures and does not directly improve network availability.

Restart the in.mpathd daemon.
# pkill -HUP in.mpathd
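
The edit step in this procedure can also be scripted. The following is a minimal sketch, assuming a bash shell and that each parameter appears in the file as an uncommented KEY=value line. The function name set_param is hypothetical and is not part of IPMP.

```shell
#!/bin/bash
# Hypothetical helper: set KEY=value in a defaults-style file.
# Replaces an existing uncommented KEY= line, or appends one if none exists.
# Usage: set_param FILE KEY VALUE
set_param() {
    local file=$1 key=$2 value=$3 tmp
    tmp=$(mktemp) || return 1
    if grep -q "^${key}=" "$file"; then
        # Rewrite the matching line and copy every other line unchanged.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        printf '%s=%s\n' "$key" "$value" >> "$tmp"
    fi
    # Write the result back into the original file rather than moving the
    # temporary file over it, so the file keeps its ownership and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

One cautious way to use the helper is to apply it to a copy of /etc/default/mpathd, for example set_param /tmp/mpathd.new FAILBACK no, where /tmp/mpathd.new is an assumed scratch path. Compare the copy with the original, install it as superuser, and then restart the daemon as shown in the last step.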