6.7 Reprovisioning a Compute Node when Provisioning Fails

Compute node provisioning is a complex, orchestrated process that involves various configuration and installation steps and several reboots. Due to connectivity fluctuations, timing issues, or other unexpected events, a compute node may become stuck in an intermediate state or go into error status. The solution is to reprovision the compute node.

Warning

Reprovisioning is to be applied only to compute nodes that fail to complete provisioning.

For correctly provisioned and running compute nodes, reprovisioning functionality is blocked in order to prevent incorrect use that could lock compute nodes out of the environment permanently or otherwise cause loss of functionality or data corruption.

Reprovisioning a Compute Node when Provisioning Fails

  1. Log in to the Oracle Private Cloud Appliance Dashboard.

  2. Go to the Hardware View tab.

  3. Hover over each compute node that is in Error status or has become stuck in the provisioning process.

    A pop-up window displays a summary of configuration and status information.

    Figure 6.1 Compute Node Information and Reprovision Button in Hardware View

    Screenshot showing the Hardware View tab of the Oracle Private Cloud Appliance Dashboard. The pop-up window displays details of a compute node and has a Reprovision button.

  4. If the compute node provisioning is incomplete and the server is in error status or has been stuck in an intermediate state for several hours, click the Reprovision button in the pop-up window.

  5. When the confirmation dialog box appears, click OK to start reprovisioning the compute node.
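The condition in step 4 can be sketched as a small decision rule. This is only an illustration, not part of the appliance software: the NodeStatus type, its fields, and the three-hour threshold are assumptions (the procedure says "several hours" without giving a number), and the sketch reads the condition so that error status qualifies on its own while a stuck state must have lasted past the threshold.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed value: the procedure only says "several hours".
STUCK_THRESHOLD = timedelta(hours=3)


@dataclass
class NodeStatus:
    """Hypothetical snapshot of what the Hardware View pop-up shows."""
    name: str
    provisioning_complete: bool
    in_error: bool
    state_since: datetime  # when the node entered its current state


def should_reprovision(node: NodeStatus, now: datetime,
                       threshold: timedelta = STUCK_THRESHOLD) -> bool:
    """Return True if reprovisioning is appropriate for this node."""
    # Correctly provisioned nodes must never be reprovisioned.
    if node.provisioning_complete:
        return False
    # A node counts as stuck once it has stayed in one state past the threshold.
    stuck = (now - node.state_since) >= threshold
    return node.in_error or stuck
```

A node that has completed provisioning always yields False, which mirrors the way the Dashboard blocks reprovisioning for running compute nodes.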

If compute node provisioning fails after the server has been added to the Oracle VM server pool, additional recovery steps may be required. The cleanup mechanism associated with reprovisioning may be unable to remove the compute node from the Oracle VM configuration. For example, when a server is in a locked state or holds the server pool master role, it must be unconfigured manually. In this case you need to perform operations in Oracle VM Manager that are otherwise not permitted. You may also need to power on the compute node manually.

Removing a Compute Node from the Oracle VM Configuration

  1. Log in to the Oracle VM Manager user interface.

    For detailed instructions, see Section 4.2, “Logging in to the Oracle VM Manager Web UI”.

  2. Go to the Servers and VMs tab and verify that the server pool named Rack1_ServerPool contains the compute node that failed to provision correctly.

  3. If the compute node is locked due to a running job, abort the job in the Jobs tab of Oracle VM Manager.

    Detailed information about the use of jobs in Oracle VM can be found in the Oracle VM Manager User's Guide. Refer to the section entitled Jobs Tab.

  4. Remove the compute node from the Oracle VM server pool.

    Refer to the section entitled Edit Server Pool in the Oracle VM Manager User's Guide. When editing the server pool, move the compute node out of the list of selected servers. The compute node is moved to the Unassigned Servers folder.

  5. Delete the compute node from Oracle VM Manager.

    Refer to the Oracle VM Manager User's Guide and follow the instructions in the section entitled Delete Server.
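The order of the manual cleanup can be summarized in a short sketch. It does not call Oracle VM Manager; the function name and its two boolean inputs are hypothetical, and it only returns the list of manual actions implied by the procedure above for a given node state.

```python
from typing import List


def cleanup_steps(locked_by_job: bool, in_server_pool: bool) -> List[str]:
    """Return the manual Oracle VM Manager actions, in the order to perform them."""
    steps: List[str] = []
    # A lock held by a running job blocks the later steps, so clear it first.
    if locked_by_job:
        steps.append("Abort the running job in the Jobs tab")
    # The server must leave the pool before it can be deleted.
    if in_server_pool:
        steps.append("Remove the compute node from the server pool")
    # Deleting the server is always the final step.
    steps.append("Delete the compute node from Oracle VM Manager")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(cleanup_steps(True, True), start=1):
        print(f"{number}. {step}")
```

The ordering is the point of the sketch: the job lock is cleared first, pool membership second, and deletion last.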

When the failing compute node has been removed from the Oracle VM configuration, return to the Oracle Private Cloud Appliance Dashboard to reprovision it. If the compute node is powered off and reprovisioning cannot be started, power on the server manually.