N1 Provisioning Server 3.1, Blades Edition, Control Center Management Guide

Overview

As with any complex system, when farms transition from state to state, errors can occur. You must be able to remedy these errors quickly. Use the following general strategy to resolve an error state:

  1. Determine that the farm request failed.

  2. Diagnose the problem by determining the error state.

  3. Fix the problem, for example, by replacing a failed server, freeing farm resources, or resolving a networking issue. Then run the farm -af command to activate the farm.

  4. Alternatively, bypass the problem, for example, by deleting the request and returning the farm to its prior condition, or by deleting the farm and starting over.
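
The strategy above can be read as a small decision procedure. The following Python sketch is illustrative only and is not part of the N1 Provisioning Server software: the names Resolution and plan_recovery, and the placeholder error state, are hypothetical. The sketch does not run any command. It only returns the steps an administrator would follow, and it quotes the farm -af command exactly as named in step 3, without assuming any arguments.

```python
from enum import Enum
from typing import List, Optional


class Resolution(Enum):
    """Hypothetical labels for the ways out of an error state (steps 3 and 4)."""
    FIX_AND_REACTIVATE = "fix"
    DELETE_REQUEST = "delete_request"
    DELETE_FARM = "delete_farm"


def plan_recovery(request_failed: bool,
                  error_state: Optional[str],
                  resolution: Resolution) -> List[str]:
    """Return the manual steps to take, following the four-step strategy.

    This is a sketch of the decision flow only; it performs no actions.
    """
    # Step 1: if the farm request did not fail, there is nothing to recover.
    if not request_failed:
        return []

    # Step 2: the error state must be known before choosing a remedy.
    if error_state is None:
        return ["Diagnose the problem by determining the error state."]

    # Step 3: fix the underlying problem, then reactivate the farm.
    if resolution is Resolution.FIX_AND_REACTIVATE:
        return [
            f"Fix the cause of error state {error_state!r}.",
            "Run the farm -af command to activate the farm.",
        ]

    # Step 4: bypass the problem instead of fixing it.
    if resolution is Resolution.DELETE_REQUEST:
        return ["Delete the request and return the farm to its prior condition."]
    return ["Delete the farm and start over."]


if __name__ == "__main__":
    # "EXAMPLE_ERROR" is a placeholder, not a real error state name.
    for step in plan_recovery(True, "EXAMPLE_ERROR", Resolution.FIX_AND_REACTIVATE):
        print(step)
```

Keeping the plan as plain text avoids guessing at command syntax that this section does not document.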

Every device in a logical server farm is continuously monitored for availability. If a device fails, the monitoring facility raises an alert, and the N1 Provisioning Server software automatically brings up another identically configured physical device to replace the failed device. In these cases, failover is expected behavior and no error message is generated.


Note –

Most error states can be diagnosed and resolved by the administrator. In rare cases, however, error states must be resolved by a Sun Service provider.


At a high level, failures fall into three types: resource layer failures (device and networking failures, configuration errors, or insufficient available resources), software configuration errors, and software or control plane errors. The following list describes potential failure points in farm activation:

Other points of failure exist. Given the variety of devices and systems involved, there are a number of failure points to investigate. However, you know you have a problem if the following situations occur: