Enclosure Failure and Recovery

In the enclosure failure scenario, we assume that the IPFE is co-located with the application servers in its target set. In this case, IPFE-A1 is in an enclosure with Server1, Server2, and Server3.

When the enclosure containing IPFE-A1, Server1, Server2, and Server3 fails:
  • All connections to all servers in the enclosure fail.
  • IPFE-A2 detects that IPFE-A1 is down and starts servicing TSA1.
  • Clients with existing connections to TSA1 detect that TSA1 is unavailable and send traffic to TSA2.
  • Depending on configuration, IPFE-A2 can send a TCP RST (for TCP connections) or a configured ICMP message in response to client connection requests to TSA1.
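
The failure handling above can be sketched as a small model. This is only an illustration under stated assumptions, not the IPFE implementation: the Ipfe class, the active_owner and reject_response functions, and the option names "tcp_rst" and "icmp" are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Ipfe:
    """Hypothetical model of one IPFE in a mated pair."""
    name: str
    primary_tsa: str  # TSA this IPFE services when it is healthy
    up: bool = True


def active_owner(tsa, pair):
    """Return the name of the IPFE currently servicing `tsa`, or None."""
    primary = next(i for i in pair if i.primary_tsa == tsa)
    mate = next(i for i in pair if i is not primary)
    if primary.up:
        return primary.name  # normal operation, or after recovery
    if mate.up:
        return mate.name     # mate services the TSA while primary is down
    return None


def reject_response(protocol, option):
    """Pick the reply to a connection request that cannot be served.

    `option` is a hypothetical configuration value: "tcp_rst", "icmp",
    or anything else for no reply.
    """
    if option == "tcp_rst" and protocol == "tcp":
        return "TCP RST"
    if option == "icmp":
        return "ICMP message"
    return None


pair = [Ipfe("IPFE-A1", "TSA1"), Ipfe("IPFE-A2", "TSA2")]
pair[0].up = False                       # enclosure with IPFE-A1 fails
print(active_owner("TSA1", pair))        # IPFE-A2
print(reject_response("tcp", "tcp_rst"))  # TCP RST
```

In this model, setting pair[0].up back to True makes active_owner return IPFE-A1 for TSA1 again, which mirrors the mate relinquishing control when the enclosure recovers.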

When the enclosure recovers:
  • IPFE-A2 detects that IPFE-A1 has recovered and relinquishes control of TSA1.
  • IPFE-A1 takes over control of TSA1.
  • Since TSA1 did not have any existing connections during the failure, no special handling of existing connections is required.
  • Over time, clients are expected to route new connections to TSA1, resulting in connections to recovered servers in the associated target set.
  • In the interim, there is a substantial imbalance between the two IPFEs as well as between the servers in the two TSAs. The IPFEs monitor the traffic for imbalances and distribute new connections to reduce the imbalance.
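
One way to picture how distributing new connections reduces an imbalance is a least-loaded selection within a target set. The sketch below is an assumption made for illustration; this section does not specify the algorithm the IPFE actually uses, and the function names and starting connection counts are hypothetical.

```python
def pick_server(active_connections):
    """Return the server with the fewest active connections.

    Ties are broken by server name so the result is deterministic.
    """
    return min(sorted(active_connections), key=active_connections.get)


def distribute(active_connections, new_connections):
    """Assign each new connection to the least-loaded server.

    Returns a new dict of connection counts; the input is not modified.
    """
    counts = dict(active_connections)
    for _ in range(new_connections):
        counts[pick_server(counts)] += 1
    return counts


# Hypothetical starting point: Server1 has no connections yet.
before = {"Server1": 0, "Server2": 4, "Server3": 2}
after = distribute(before, 6)
print(after)  # {'Server1': 4, 'Server2': 4, 'Server3': 4}
```

Existing connections are never moved in this sketch. Only new connections are steered toward the less loaded servers, so the imbalance shrinks gradually as new connections arrive.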