IP Network Multipathing Administration Guide

Multipathing Daemon

The in.mpathd multipathing daemon detects failures and repairs in two ways: by sending out probes on all the interfaces that are part of a group, and by monitoring the RUNNING flag on each interface in the group. When an interface is part of a group and has a test address, the daemon starts sending out probes to detect failures on that interface. The interface is considered to have failed if the daemon does not receive any replies to five consecutive probes, or if the RUNNING flag is not set.

The probing rate depends on the failure detection time. By default, the failure detection time is 10 seconds. Thus, the probing rate is one probe every two seconds (10 seconds divided by five probes). To avoid synchronization in the network, probing is not periodic.

When in.mpathd determines that an interface has failed, the daemon performs a failover of the network access from the failed interface to another functional interface in the group. If a standby interface is configured, the standby interface is chosen for failover of the IP addresses, broadcasts, and multicast memberships. If no standby interface exists, the interface with the fewest IP addresses is chosen. Refer to the in.mpathd(1M) man page for more information.
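The probing arithmetic and the choice of failover target described above can be sketched in a few lines of Python. This is an illustrative model only, not the in.mpathd implementation: the names probe_interval, Interface, and choose_failover_target are hypothetical, and the sketch ignores the deliberate irregularity that the daemon adds to the probe timing.

```python
from dataclasses import dataclass

# Five consecutive unanswered probes mark an interface as failed.
PROBES_PER_DETECTION_WINDOW = 5


def probe_interval(failure_detection_time):
    """Return the nominal number of seconds between probes."""
    return failure_detection_time / PROBES_PER_DETECTION_WINDOW


@dataclass
class Interface:
    # Hypothetical model of one interface in a multipathing group.
    name: str
    address_count: int
    standby: bool = False
    failed: bool = False


def choose_failover_target(group, failed_name):
    """Pick a functional interface to take over from the failed one.

    A standby interface is preferred. Otherwise the interface that
    hosts the fewest IP addresses is chosen. Returns None when no
    functional interface remains in the group.
    """
    candidates = [
        nic for nic in group if nic.name != failed_name and not nic.failed
    ]
    if not candidates:
        return None
    standbys = [nic for nic in candidates if nic.standby]
    if standbys:
        return standbys[0]
    return min(candidates, key=lambda nic: nic.address_count)
```

With the default failure detection time, probe_interval(10) yields the two-second probing rate stated above. How the real daemon breaks ties between several standby interfaces, or between interfaces with equal address counts, is not covered here.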

The following two examples show a typical configuration and how the configuration automatically changes when an interface fails. When the hme0 interface fails, notice that the addresses that are not marked NOFAILOVER move from hme0 to hme1.

Example 1–1 Interface Configuration Before an Interface Failure

hme0: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
     inet 19.16.85.19 netmask ffffff00 broadcast 19.16.85.255
     groupname test
hme0:1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 
     index 2 inet 19.16.85.21 netmask ffffff00 broadcast 129.146.85.255
hme1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
     inet 19.16.85.20 netmask ffffff00 broadcast 19.16.85.255
     groupname test
hme1:1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 
     index 2 inet 19.16.85.22 netmask ffffff00 broadcast 129.146.85.255
hme0: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER> mtu 1500 index 2
     inet6 fe80::a00:20ff:feb9:19fa/10
     groupname test
hme1: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER> mtu 1500 index 2
     inet6 fe80::a00:20ff:feb9:1bfc/10
     groupname test

Example 1–2 Interface Configuration After an Interface Failure

hme0: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 2
        inet 0.0.0.0 netmask 0 
        groupname test
hme0:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> 
        mtu 1500 index 2 inet 19.16.85.21 netmask ffffff00 broadcast 129.146.85.255
hme1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 19.16.85.20 netmask ffffff00 broadcast 19.16.85.255
        groupname test
hme1:1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 
        index 2 inet 19.16.85.22 netmask ffffff00 broadcast 129.146.85.255
hme1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 6
        inet 19.16.85.19 netmask ffffff00 broadcast 19.16.18.255
hme0: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER,FAILED> mtu 1500 index 2
        inet6 fe80::a00:20ff:feb9:19fa/10 
        groupname test
hme1: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER> mtu 1500 index 2
        inet6 fe80::a00:20ff:feb9:1bfc/10 
        groupname test

You can see that the FAILED flag is set on hme0 to indicate that hme0 has failed. You can also see that hme1:2 is now created. hme1:2 hosts the address that was originally configured on hme0, so the address 19.16.85.19 is now accessible through hme1. Multicast memberships that are associated with 19.16.85.19 can still receive packets, but now through hme1. When the failover of address 19.16.85.19 from hme0 to hme1 occurred, a dummy address 0.0.0.0 was created on hme0. The dummy address is created so that hme0 can still be accessed, because hme0:1 cannot exist without hme0. The dummy address is removed when a subsequent failback takes place.
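Checking for the FAILED flag can also be done programmatically. The following Python sketch scans text in the format of the examples above and lists the logical interfaces whose flag list contains FAILED. The function name failed_interfaces and the regular expression are illustrative assumptions based only on the output shown in this section, and other ifconfig output variants might need a different pattern.

```python
import re

# Matches a header line such as:
#   hme0:1: flags=19040843<UP,BROADCAST,...,FAILED>
# Continuation lines start with white space and therefore do not match.
HEADER = re.compile(r"^(?P<name>\S+): flags=[0-9a-f]+<(?P<flags>[^>]*)>")


def failed_interfaces(ifconfig_output):
    """Return the names of interfaces that carry the FAILED flag.

    Names are returned in order of first appearance. An interface that
    appears twice (once for IPv4, once for IPv6) is listed only once.
    """
    failed = []
    for line in ifconfig_output.splitlines():
        match = HEADER.match(line)
        if match is None:
            continue
        flags = match.group("flags").split(",")
        name = match.group("name")
        if "FAILED" in flags and name not in failed:
            failed.append(name)
    return failed
```

Applied to the output in Example 1–2, the function would report hme0 and hme0:1, because both the physical interface and its test address carry the FAILED flag.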

Similarly, failover of the IPv6 address from hme0 to hme1 occurred. In IPv6, multicast memberships are associated with interface indexes. Multicast memberships also fail over from hme0 to hme1. All the addresses that in.ndpd configures also move. This action is not shown in the examples.

The in.mpathd daemon continues to probe through the failed NIC, hme0. With the default failure detection time of 10 seconds, the daemon determines that the interface is repaired after the daemon receives 10 consecutive replies. Then, the daemon invokes the failback. After failback, the original configuration is reestablished.
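The failure and repair thresholds can be pictured as a small state machine. The following Python sketch is a simplified model, not the daemon's actual logic: the class name ProbeTracker and the event strings are hypothetical, and the repair threshold of 10 replies is assumed to apply only to the default failure detection time of 10 seconds.

```python
class ProbeTracker:
    """Track probe results for one interface (illustrative model only)."""

    FAILURE_THRESHOLD = 5   # consecutive lost probes before failover
    REPAIR_THRESHOLD = 10   # consecutive replies before failback (assumed default)

    def __init__(self):
        self.failed = False
        self._misses = 0
        self._replies = 0

    def record(self, reply_received):
        """Record one probe result.

        Returns "failover" or "failback" when the interface changes
        state, and None otherwise.
        """
        if reply_received:
            self._misses = 0
            self._replies += 1
            if self.failed and self._replies >= self.REPAIR_THRESHOLD:
                self.failed = False
                return "failback"
        else:
            self._replies = 0
            self._misses += 1
            if not self.failed and self._misses >= self.FAILURE_THRESHOLD:
                self.failed = True
                return "failover"
        return None
```

A single lost probe during recovery resets the reply count in this model, so the interface must answer 10 probes in a row before failback is triggered.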

See the in.mpathd(1M) man page for a description of all error messages that are logged on the console during failures and repairs.