Business Services Server Fault Tolerance

When a component or machine in the system goes down or is brought down, the other components and machines in the system should degrade gracefully and reconnect once that component or machine is back up. A system is considered to be fault-tolerant when these conditions exist:

  • Error messages to the user and administrator are meaningful when a component of the system cannot be contacted.

  • When a component of the system is restarted, connections to it can be reestablished without administrative interaction on other components of the system.

For the business services server, the components that matter for fault tolerance are the enterprise server and the security server.

The connection to the enterprise server is fault-tolerant. If the enterprise server is down, the exceptions that are returned from a called web service are descriptive and indicate the problem. When the enterprise server comes back up, subsequent web service calls connect correctly without restarting the business services server or performing any further administration on it. If connections to the enterprise server time out, the connections are reestablished.
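The reconnect behavior described above can be illustrated with a minimal sketch. This is not the product's implementation; the names ReconnectingClient, EnterpriseServerUnavailable, and the connect callable are hypothetical and chosen only for illustration. The sketch assumes a connection object with an invoke method and shows two things: a descriptive exception when the server cannot be reached, and a fresh connection on the next call once the server is back.

```python
class EnterpriseServerUnavailable(Exception):
    """Descriptive error returned to the caller when the server is unreachable."""


class ReconnectingClient:
    """Hypothetical wrapper that reconnects lazily instead of requiring a restart."""

    def __init__(self, connect):
        # `connect` is an assumed callable that returns a live connection
        # or raises ConnectionError / TimeoutError.
        self._connect = connect
        self._conn = None

    def call(self, operation, *args):
        last_error = None
        # Two attempts: the second one covers a stale or timed-out connection.
        for _ in range(2):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                return self._conn.invoke(operation, *args)
            except (ConnectionError, TimeoutError) as exc:
                self._conn = None  # discard the broken connection
                last_error = exc
        raise EnterpriseServerUnavailable(
            f"Enterprise server could not be contacted while calling "
            f"{operation!r}: {last_error}"
        ) from last_error
```

Because the broken connection is discarded rather than cached, the first call made after the server returns opens a new connection on its own. Note that the single retry re-issues the operation, so a design like this is only appropriate for calls that are safe to repeat.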

The connection from the business services server to the security server is based on a token. If the security server is down or cannot be contacted, the exception message that is returned to the web service caller indicates that the server login failed. When the security server comes back up, the token is validated without administrator interaction.
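The token-based behavior can be sketched in the same spirit. Everything here is an assumption made for illustration: SecurityServer is a simulated stand-in, and ServerLoginFailed and call_web_service are hypothetical names, not part of any product API. The point of the sketch is that the caller keeps the same token, so validation succeeds again as soon as the security server is reachable.

```python
class ServerLoginFailed(Exception):
    """Error reported to the web service caller when login cannot be verified."""


class SecurityServer:
    """Simulated security server (stand-in for illustration only)."""

    def __init__(self, valid_tokens):
        self.up = True
        self._valid = set(valid_tokens)

    def validate(self, token):
        if not self.up:
            raise ConnectionError("security server unreachable")
        return token in self._valid


def call_web_service(security_server, token, operation):
    """Validate the token on every call; no state is cached across failures."""
    try:
        accepted = security_server.validate(token)
    except ConnectionError as exc:
        raise ServerLoginFailed(f"Server login failed: {exc}") from exc
    if not accepted:
        raise ServerLoginFailed("Server login failed: token rejected")
    return f"{operation}: ok"
```

With the simulated server marked as down, call_web_service raises ServerLoginFailed; once it is marked as up again, the same token is accepted without any other change.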