Replace a Faulty ESXi Host from your Oracle Cloud VMware Solution Cluster on Oracle Cloud Infrastructure

Introduction

This guide explains how to replace a malfunctioning ESXi host within your Oracle Cloud VMware Solution cluster. You initiate the replacement from the Oracle Cloud Infrastructure (OCI) Console and complete the remaining configuration steps in vCenter and NSX Manager.

Objectives

Prerequisites

Before replacing a faulty ESXi host in your Oracle Cloud VMware Solution cluster, ensure you meet the following requirements:

Task 1: Initiate Host Replacement from OCI Console

In this task, we will initiate the host replacement process within the OCI Console for your Oracle Cloud VMware Solution cluster.

  1. Log in to the OCI Console and navigate to the specific VMware SDDC that contains the faulty ESXi host requiring replacement.

    SDDC Details page

  2. Locate the cluster within your SDDC that contains the malfunctioning ESXi host. For this tutorial, assume the faulty host is Cls2-Standard3-1, which is located in the Cls2-Standard3 cluster.

    Cluster Details page

  3. Click the three dots displayed to the right of the faulty ESXi host and select Replace Host.

    Initiate Host Replacement

    Note: Before proceeding, ensure your tenancy has been allowlisted through a Support Request. For more information, see the prerequisites section.

  4. In the Replace Host window, enter the following information.

    • Release Name: Select a compatible release name from the drop-down menu. This version should align with the currently used build number in your vCenter cluster to ensure compatibility.

      Select VMware Minor Version

      Note: The Replace Host workflow displays only the minor versions within your SDDC major ESXi version (such as ESXi 7 or 8) that Oracle Cloud VMware Solution has officially made available. This lets you choose the exact version that matches your existing setup.

      Example: Suppose your SDDC was created with ESXi 8 Update 1c-build 22088125-1. The release name drop-down menu shows all updates for ESXi 8 that Oracle Cloud VMware Solution has officially made available, from Update 1c-build 22088125-1 to the latest ESXi 8 version (for example, Update 2-build 22380479-1). Versions not offered by Oracle Cloud VMware Solution are not listed, which prevents you from selecting an unsupported build.

      VMware Minor Versions Available

    • Billing Grace Period: Review the information message regarding the replacement host creation. The process creates a new loaner host with a 24-hour grace period for billing purposes. This replacement host is billed after the grace period on an hourly basis. Once you complete the required configuration steps in your VMware environment (vCenter Server, NSX Manager and so on) and terminate the original faulty host, the billing is automatically switched to the new permanent host.

      Note: If you do not terminate the original faulty host within 24 hours, you are charged for both hosts: the original host with its existing billing commitment and the new replacement host with its hourly billing.

  5. Review the settings and potential billing implications and click Confirm to initiate the host replacement process.

  6. (Optional) If you accidentally initiated the host replacement and want to cancel it, locate the warning banner displayed at the top of the cluster details page. Click Cancel Replacement to stop the process.

    Cancel Replacement

  7. Navigate to the Work Requests section within your chosen cluster.

    Monitor Work Request for Create ESXi Host

    This section allows you to monitor the progress of the create ESXi host task, which is part of the replacement process.

    Work Request Details Page

    After about 20 to 25 minutes, the work request should complete successfully.

    Work Request Successful Completion

  8. Verify replacement status.

    • Cluster Details Page: After the replacement process completes successfully, the newly added host should display the Active state, and the original faulty host should be marked as Updating. A banner on the cluster details page reminds you to terminate the faulty host within a specific time frame to prevent double billing.

      Cluster Details Page

    • Original Faulty Host: On the details page for the faulty host, a similar banner will appear, reminding you to terminate the host to avoid double billing.

      Faulty host details page and status

    • New Replacement Host: Unlike the faulty host, the new replacement host will not have a Pricing Interval End Date. This value will be inherited from the faulty host once it is terminated. However, the replacement host does have a Grace Period End Date. If the faulty host remains unterminated after this date, you will be charged hourly for the replacement host.

      Replacement Host - Details page
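
The billing rule described in this task can be sketched in a few lines of Python. This is only an illustration of the 24-hour grace period logic stated above; the function names and the sample timestamps are hypothetical, and the authoritative Grace Period End Date is always the one shown on the replacement host details page.

```python
from datetime import datetime, timedelta, timezone

# Grace period length as stated in this tutorial.
GRACE_PERIOD = timedelta(hours=24)


def grace_period_end(replacement_created_at: datetime) -> datetime:
    """Return the moment the replacement host's billing grace period ends."""
    return replacement_created_at + GRACE_PERIOD


def double_billing_applies(replacement_created_at: datetime,
                           now: datetime,
                           faulty_host_terminated: bool) -> bool:
    """Return True if both hosts would be charged at `now`.

    Both hosts are charged only when the faulty host still exists
    and the grace period of the replacement host has passed.
    """
    if faulty_host_terminated:
        return False
    return now > grace_period_end(replacement_created_at)


if __name__ == "__main__":
    # Hypothetical creation time of the replacement host.
    created = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)
    print("Terminate the faulty host before:", grace_period_end(created))
```

A check like this can be useful as a reminder in a runbook, but it does not replace the banner and dates shown in the OCI Console.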

Task 2: Get ESXi Host Information and Default vCenter Password

In this task, we will gather essential details from the OCI Console, including the newly created ESXi host information and the vCenter default password.

  1. Open the OCI Console, navigate to Compute and Instances. Identify and note down the host information.

    1. From the list of instances, select the newly added ESXi host.

    2. Note down the Private IPv4 address and Internal FQDN details for later use.

      Gathering new ESXi host details

    3. Get Attached Block Volumes iSCSI target server details.

      • Access iSCSI attachment information.

        Attached Block Volume - Information

      • Access iSCSI target server details.

        Attached Block Volume - iSCSI Target Server Details

      • Note down the same iSCSI target information for all the attached block volumes.

  2. Access the SDDC details page within the OCI Console. Locate and securely store the vCenter default password. You will need this password when adding the ESXi host to vCenter in a later task.

    Access SDDC Details

    Note: Ensure you store the vCenter password securely. Avoid sharing it in plain text or storing it in unencrypted locations.
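
If you prefer to keep the details gathered in this task in a structured form, the following Python sketch shows one way to record the iSCSI attachment information and reduce it to the unique target servers needed later. The class name, function name, IP addresses, and IQN values are hypothetical placeholders; port 3260 is the standard iSCSI port.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class IscsiTarget:
    """iSCSI details copied from one block volume attachment."""
    ip: str
    port: int
    iqn: str


def unique_target_servers(targets: Iterable[IscsiTarget]) -> List[str]:
    """Return the distinct 'ip:port' target servers, sorted as strings."""
    return sorted({f"{t.ip}:{t.port}" for t in targets})


if __name__ == "__main__":
    # Placeholder values; copy the real ones from the OCI Console.
    noted = [
        IscsiTarget("169.254.2.2", 3260, "iqn.example:volume-a"),
        IscsiTarget("169.254.2.3", 3260, "iqn.example:volume-b"),
        IscsiTarget("169.254.2.2", 3260, "iqn.example:volume-a"),
    ]
    for server in unique_target_servers(noted):
        print(server)
```

Do not store the vCenter password in a script like this; keep it in a secrets manager or vault.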

Task 3: Add and Configure the New ESXi Host in vCenter

In this task, we will add a newly created ESXi host to your vCenter cluster and configure its network settings.

  1. To add the ESXi host to vCenter, open vCenter Server and locate the desired data center where you want to add the ESXi host. You can find this data center in the inventory pane.

    vCenter Inventory Page

  2. Right-click on the chosen data center and select Add Host.

    vCenter - Add Host to Datacenter

  3. In the Add Host wizard, enter the following information.

    • Host Name or IP Address: Enter the FQDN for the new ESXi host noted in Task 2 and click Next.

      vCenter - Add Host - Provide Name

    • Connection Settings: Enter the login credentials for the ESXi host. The username is root and the password is the default vCenter password obtained from the OCI Console SDDC details page. Click Next.

      vCenter - Add Host - Connection Settings

    • Host Summary: Review the summarized information about the host and click Next.

      vCenter - Add Host - Host Summary

    • Host Lifecycle: Deselect Manage host with an image and click Next.

      vCenter - Add Host - Host Lifecycle

    • Assign License: Select an existing vSphere license from the available options to assign a license to the new ESXi host and click Next.

      vCenter - Add Host - Assign License

    • Lockdown Mode: Select Normal lockdown mode, which is the standard setting used with Oracle Cloud VMware Solution deployments. Adjust this setting if your environment requires it, then click Next.

      vCenter - Add Host - Lockdown Mode

    • VM Location: Keep the default settings for VM placement and click Next.

      vCenter - Add Host - VM Location

    • Review and Finish: Review all the configuration details one last time and click Finish to submit the task and add the ESXi host to your vCenter cluster.

      vCenter - Add Host - Review and Finish

  4. Set ESXi host to maintenance mode.

    Once the ESXi host is successfully added, right-click on it within the vCenter inventory and select Enter Maintenance Mode. This takes the host offline, allowing you to configure its network settings.

    vCenter - Enter Maintenance Mode

    Validate that the host has successfully entered maintenance mode.

    vCenter - Enter Maintenance Mode Successful

  5. Verify host status in NSX Manager (Optional).

    In NSX Manager, the new ESXi host should be listed under Other Nodes with an NSX Configuration status of Not Configured.

    NSX Manager - Verify Host addition

  6. Add the ESXi host to the Distributed Switch.

    1. Navigate to the Networking view within vCenter Server.

    2. Select the Distributed Switch (DSwitch) associated with the cluster where the ESXi host will reside.

    3. Right-click on the DSwitch or click Actions and select Add and Manage Hosts.

      vDS - Add and Manage Hosts

    4. In the Add and Manage Hosts window, enter the following information.

      • Add Hosts: Select Add Hosts and click Next.

        vDS - Add and Manage Hosts - Add Hosts

      • Select Hosts: Select the newly added ESXi host from the list and ensure it is currently in maintenance mode. Click Next.

        vDS - Add and Manage Hosts - Select Host

      • Manage Physical Adapters: Select vmnic0 and vmnic1 from the drop-down menu.

        vDS - Add and Manage Hosts - Select Physical Adapters

      • Manage VMkernel Adapters: Assign each VMkernel adapter (vmk) to a specific port group as shown.

        VMkernel Adapter    Port Group
        vmk0                Management Networking
        vmk1                vMotion
        vmk2                vSAN
        vmk3                Replication
        vmk4                Provisioning

        vDS - Add and Manage Hosts - Assign vmk adapters to Port groups

      • Migrate VM Networking: Keep the default values for migrating VM networking.

        vDS - Add and Manage Hosts - Migrate VM Networking

    5. Review all configuration details and click Finish to submit the changes and add the ESXi host to the Distributed Switch.

      vDS - Add and Manage Hosts - Review and Finish

  7. Move the ESXi host to the vCenter cluster.

    1. Once the network configuration is complete, you can move the ESXi host to the intended vCenter cluster. Right-click on the host and select Move to.

      vCenter - Move Host to Cluster

    2. In the Move To window, select the cluster and click OK.

      vCenter - Move Host to Cluster - Move To

    3. In the Move Host into Cluster window, keep the default selection Put all of this host’s virtual machines in the cluster’s root resource pool and click OK to complete the move.

      vCenter - Move Host in to Cluster - Options
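
The VMkernel adapter assignments from step 6 are easy to get wrong, so it can help to compare what you configured against the expected mapping. The following Python sketch encodes the mapping from this task; the function name and the way the actual assignments are collected (typed in by hand here) are assumptions for illustration only.

```python
from typing import Dict, Optional, Tuple

# Expected assignments, taken from the table in this task.
EXPECTED_PORT_GROUPS: Dict[str, str] = {
    "vmk0": "Management Networking",
    "vmk1": "vMotion",
    "vmk2": "vSAN",
    "vmk3": "Replication",
    "vmk4": "Provisioning",
}


def find_mismatches(actual: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {adapter: (expected, actual)} for every wrong or missing adapter."""
    mismatches = {}
    for vmk, expected in EXPECTED_PORT_GROUPS.items():
        got = actual.get(vmk)  # None when the adapter was not assigned at all
        if got != expected:
            mismatches[vmk] = (expected, got)
    return mismatches


if __name__ == "__main__":
    # Hypothetical assignments read from the wizard's review page.
    configured = dict(EXPECTED_PORT_GROUPS, vmk1="vSAN", vmk2="vMotion")
    for vmk, (expected, got) in find_mismatches(configured).items():
        print(f"{vmk}: expected {expected!r}, found {got!r}")
```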

Task 4: Verify NSX Configuration

Within NSX Manager, you can now observe the configuration status of the newly added ESXi host. NSX automatically pushes the configuration to the host and integrates it into the cluster.

Monitor the NSX configuration until it completes. This process typically takes at least 5 minutes. The NSX Configuration status first changes to Success while the Node Status shows Unknown. After a few minutes, the Node Status changes to Down and then to Up.

NSX - Monitor NSX Configuration Status

Once the configuration finishes, verify in NSX Manager that the NSX Configuration status displays Success and the Node Status displays Up. This confirms that the ESXi host has been successfully configured for NSX.

NSX - NSX Configuration Complete
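
The status progression described above (Success with Unknown, then Down, then Up) lends itself to a simple polling loop. The following Python sketch shows the general pattern only. The `get_status` callable is a hypothetical stand-in for however you read the two status values (for example, by checking NSX Manager manually or through your own tooling); no NSX API call is shown or implied.

```python
import time
from typing import Callable, Tuple


def wait_for_nsx_ready(get_status: Callable[[], Tuple[str, str]],
                       timeout_s: float = 900.0,
                       interval_s: float = 30.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until NSX Configuration is 'Success' and Node Status is 'Up'.

    `get_status` must return (nsx_configuration, node_status).
    Returns True when the host is ready, False when the timeout expires.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        config_state, node_status = get_status()
        if config_state == "Success" and node_status == "Up":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated progression matching the sequence described in this task.
    observed = iter([("Success", "Unknown"), ("Success", "Down"), ("Success", "Up")])
    ready = wait_for_nsx_ready(lambda: next(observed), sleep=lambda _s: None)
    print("NSX ready:", ready)
```

The intermediate Unknown and Down states are treated as "keep waiting" rather than as failures, which matches the behavior described in this task.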

Task 5: Configure the Datastores

This task covers configuring datastores for your newly added ESXi host. The specific steps depend on whether you are using Virtual Machine File System (VMFS) datastores backed by OCI Block Storage or a vSAN datastore with Dense shaped instances.

Scenario 1: Configure Standard Shaped Instances (VMFS Datastores)

Follow these steps to configure VMFS datastores using OCI Block Storage.

  1. Ensure all the OCI Block Volumes attached to the other ESXi hosts in the cluster are also attached to the newly added host.

  2. Copy the iSCSI attachment information for all the block volumes you attached in step 1. You will need this information later.

  3. Access iSCSI storage adapters.

    1. In vCenter Server, select the newly added ESXi host.

    2. Navigate to Configure and Storage Adapters.

  4. Configure iSCSI target servers.

    1. From the right-hand pane, select the iSCSI storage adapter.

    2. Select the Dynamic Discovery tab and click Add to add an iSCSI target server.

    vCenter - Configure VMFS Storage

  5. Add all the iSCSI target server IPs you gathered in step 2.

    vCenter - Configure iSCSI Target Servers

  6. Once all iSCSI servers are added, select the iSCSI adapter again and click Rescan Adapters to refresh the connection.

    vCenter - Rescan iSCSI Adapter

    vCenter - Rescan iSCSI Adapter - details

  7. Verify block volume attachments. After the rescan completes, you should see all the block volumes attached as Oracle iSCSI disks.

    vCenter - Validate New Devices

  8. Validate datastore availability from Datastores tab for the newly added host. You should see all the datastores mounted, matching the configuration of the other hosts in the cluster.

    vCenter - Validate Datastores

  9. To confirm datastore presence, navigate to the Storage view and select the datastore cluster. Verify that the newly added host appears under the Hosts section.

  10. Once all configurations are complete, remove the ESXi host from maintenance mode.

    vCenter - ESXi Host - Exit Maintenance Mode

  11. After exiting maintenance mode, confirm that your virtual environment remains stable and healthy as expected.

    vCenter - Validate Host Health
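
Step 8 asks you to confirm that the new host sees the same datastores as the other hosts in the cluster. That comparison amounts to a set difference, sketched below in Python. The function name, host names, and datastore names are hypothetical; the lists would come from the Datastores tab of each host.

```python
from typing import Dict, Iterable, List


def missing_datastores(new_host_datastores: Iterable[str],
                       peer_datastores: Dict[str, Iterable[str]]) -> List[str]:
    """Return datastores mounted on any peer host but not on the new host."""
    expected = set()
    for names in peer_datastores.values():
        expected.update(names)
    return sorted(expected - set(new_host_datastores))


if __name__ == "__main__":
    # Placeholder names for illustration.
    peers = {
        "host-a": ["Datastore-Mgmt", "Datastore-Workload-1"],
        "host-b": ["Datastore-Mgmt", "Datastore-Workload-1"],
    }
    new_host = ["Datastore-Mgmt"]
    print("Missing on new host:", missing_datastores(new_host, peers))
```

An empty result suggests the new host matches its peers. A non-empty result usually means a block volume attachment or iSCSI target was missed, so revisit steps 1 to 6 before exiting maintenance mode.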

Scenario 2: Configure Dense Shaped Instances (vSAN Datastore)

Note: These steps are only applicable if you are using Dense shaped instances with vSAN.

Before configuring the vSAN datastore, take the ESXi host out of maintenance mode and monitor the task until it completes.

vCenter - Dense Host - Exit Maintenance Mode

  1. Access vSAN disk management.

    1. Select Dense Cluster under the data center.

    2. Navigate to Configure, vSAN and Disk Management.

  2. To claim unused disks, click Claim Unused Disks to incorporate available disks into vSAN storage.

    vCenter - Dense Host - Claim Unused Disks

  3. Configure vSAN disks: A vSAN cluster typically requires at least one high-performance cache disk and one or more capacity disks per host for data storage. Select the first disk as the cache and the remaining disks for capacity (usually 7 for Dense shapes). You can adjust this configuration based on your specific environment. Submit the task and wait for successful completion.

    vCenter - Dense Host - Configure vSAN Disks

  4. From the right-hand pane, confirm that all available disks on the host are listed and healthy.

    vCenter - Dense Host - Configure vSAN Disks

  5. To verify vSAN datastore capacity, navigate to the Storage view and select the vSAN datastore. The summary page should now reflect the increased total capacity due to the added capacity drives.

    vCenter - Dense Host - Configure vSAN Disks

  6. To confirm host status in vSAN, go to the Hosts tab within the datastore. You should see the newly added host listed with a Normal status.

    vCenter - Dense Host - Configure vSAN Disks

  7. Configure vSAN fault domain.

    • A single OCI region typically has 3 fault domains, and vSAN fault domains should mirror these. Oracle Cloud VMware Solution provisioning usually distributes ESXi hosts across all fault domains for optimal balance. Because this host replaces the existing faulty host, the provisioning service deploys it in the same OCI fault domain. Place the new host in the vSAN fault domain that the original host used.

    • Under vSAN, click Fault Domains. Select the newly added host and move it to the same fault domain as the original host (for example, Fault-Domain-1).

    vCenter - Dense Host - Configure vSAN Disks

  8. Verify fault domain placement and confirm that the new host now resides within the desired fault domain.

    vCenter - Dense Host - Configure vSAN Disks
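
The disk selection rule in step 3 (first disk as cache, remaining disks as capacity) can be expressed as a small helper. This Python sketch only illustrates that rule; the function name and disk identifiers are hypothetical, and the actual claiming is still done in the Claim Unused Disks dialog.

```python
from typing import Dict, List, Sequence, Union


def plan_disk_group(disks: Sequence[str]) -> Dict[str, Union[str, List[str]]]:
    """Split a host's unused disks into one cache disk and capacity disks.

    Follows the rule from this task: the first disk is the cache tier and
    every remaining disk is capacity. At least two disks are required.
    """
    if len(disks) < 2:
        raise ValueError("need at least one cache disk and one capacity disk")
    return {"cache": disks[0], "capacity": list(disks[1:])}


if __name__ == "__main__":
    # Eight placeholder disk names: one cache disk plus seven capacity disks.
    unused = [f"disk-{i}" for i in range(8)]
    plan = plan_disk_group(unused)
    print("Cache:", plan["cache"])
    print("Capacity:", ", ".join(plan["capacity"]))
```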

Task 6: Test the New ESXi Host

This task ensures the newly added ESXi host functions correctly by deploying or migrating a test virtual machine (VM) to it.

  1. Deploy or migrate a test VM. You can either deploy a new test VM directly on the newly added ESXi host or migrate/clone an existing test VM from another host in the cluster to the new host.

  2. Verify VM functionality. Once the VM is deployed or migrated, power it on and perform basic tests to confirm it works as expected. This could involve:

    • Logging in to the VM operating system.
    • Verifying network connectivity.
    • Checking resource availability (CPU, memory and storage).
    • Testing application functionality (if applicable).

    If the test VM operates successfully on the new ESXi host, you can proceed with confidence that the host has been configured correctly.
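
The checks listed in step 2 can be tracked as a simple pass or fail checklist. The following Python sketch shows one way to do that. The function name and the check callables are hypothetical; each callable is a placeholder for whatever test you actually run (a login attempt, a ping, an application health probe).

```python
from typing import Callable, Dict, List


def failed_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every named check and return the names of those that failed.

    A check fails if it returns a falsy value or raises an exception.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Placeholder checks; replace each lambda with a real test.
    results = failed_checks({
        "os login": lambda: True,
        "network connectivity": lambda: True,
        "resource availability": lambda: True,
        "application": lambda: True,
    })
    print("All checks passed" if not results else f"Failed: {results}")
```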

Task 7: Remove the Faulty Host from vCenter and NSX Manager

In this task, we will remove the ESXi host from your vCenter cluster and NSX Manager.

  1. Prepare the ESXi host for removal.

    1. Log in to the vCenter Server and locate the ESXi host you want to retire.

    2. If the host is already in a Disconnected state and you only want to remove it from vCenter, skip steps 2 to 5 and go to step 6 (Disconnect and remove host from vCenter Inventory).

    3. Ensure all virtual machines on the target host are either powered off or migrated to the new host or other hosts within the cluster. A host with running VMs cannot enter maintenance mode.

      Migrate off VMs from Faulty Host

  2. To enter maintenance mode, right-click on the ESXi host and select Maintenance Mode and Enter Maintenance Mode.

    Faulty Host - Enter Maintenance Mode

    Data Migration Options (Based on Host Type):

    • Standard Shapes: By default, powered off and suspended VMs are migrated to other hosts. Accept the defaults and submit the task.

      Faulty Host - Enter Maintenance Mode Standard Shape

    • Dense Shapes: In addition to the default migration, also select Full data migration from the vSAN data migration drop-down menu. This ensures complete data evacuation from the host.

      Faulty Host - Enter Maintenance Mode Dense Shape

      Note: Click PRE-CHECK to validate the vSAN migration process before proceeding to maintenance mode.

      Faulty Host - Enter Maintenance Mode Dense Shape Options

  3. Verify successful maintenance mode entry.

    • Standard Shapes: Due to minimal data movement, this should be quick.

      Faulty Host - Successful Maintenance Mode - Standard Shape

    • Dense Shapes: vSAN data evacuation can take time depending on the environment. Monitor the progress.

      Faulty Host - Maintenance Mode in Progress- Dense Shape

      Note: Ensure successful entry to maintenance mode before continuing to avoid data loss or downtime.

      Faulty Host - Successful Maintenance Mode - Dense Shape

  4. Move faulty host out of the cluster.

    1. To isolate the host from the cluster, right-click on the host and click Move.

      Move host out of Cluster

    2. Select the datacenter.

      Move host out of Cluster - Select Datacenter

    3. Verify the faulty host is not in the vCenter cluster.

      Verify faulty host is out of Cluster

  5. Monitor NSX configuration removal.

    1. To monitor NSX configuration removal, log in to NSX Manager and observe the automatic removal of NSX configuration on the host.

    2. Verify NSX configuration removal completion. In NSX Manager, confirm the host shows Not Configured under Other Nodes.

      NSX Configuration removed

  6. Disconnect and remove host from vCenter Inventory.

    1. To disconnect the ESXi host, right-click the host in vCenter Server and select Connection, then Disconnect.

      Disconnect Faulty Host from vCenter

    2. Verify the connection status. The host should now appear as Disconnected in vCenter Server.

      Faulty Host Disconnected from vCenter

    3. To remove the host from inventory, right-click on the host and select Remove from Inventory. This permanently removes the host from your vCenter inventory (proceed with caution).

      Remove Faulty Host from vCenter Inventory

    4. Verify the health of your environment in both vCenter Server and NSX Manager.

      vCenter Health Status

      NSX Health Status
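
The order of operations in this task matters, including the shortcut for a host that is already disconnected. The following Python sketch captures that decision. The step titles mirror this task; the function name and the connection state strings are assumptions for illustration.

```python
from typing import List, Tuple

# Top-level steps of this task, in order.
REMOVAL_STEPS: List[Tuple[int, str]] = [
    (1, "Prepare the ESXi host for removal"),
    (2, "Enter maintenance mode"),
    (3, "Verify successful maintenance mode entry"),
    (4, "Move faulty host out of the cluster"),
    (5, "Monitor NSX configuration removal"),
    (6, "Disconnect and remove host from vCenter Inventory"),
]


def steps_to_run(connection_state: str) -> List[int]:
    """Return the step numbers to perform for a host in the given state.

    A host that is already 'Disconnected' skips steps 2 to 5, as noted in
    step 1 of this task. Any other state runs every step.
    """
    skipped = {2, 3, 4, 5} if connection_state == "Disconnected" else set()
    return [number for number, _title in REMOVAL_STEPS if number not in skipped]


if __name__ == "__main__":
    for state in ("Connected", "Disconnected"):
        print(state, "->", steps_to_run(state))
```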

Task 8: Remove the Faulty Host in OCI Console

This task guides you through terminating the faulty ESXi host within the OCI Console.

  1. Open the OCI Console and navigate to the cluster containing the ESXi host you want to remove.

  2. Identify the host that you previously marked for replacement (indicated with Updating status).

  3. To terminate the host, click the Remove failed Host option associated with the faulty host. This option appears in the top banner or in the host details section.

    Terminate Faulty Host

  4. The faulty host changes to the Terminating state.

    Faulty Host in Terminating State

    The OCI Console will start a task to delete the ESXi host. Monitor the progress of this task until it reaches successful completion.

    Notice the pricing interval end times have switched between the hosts.

    Billing Interval Commitments swapped to new host

  5. Once the termination task finishes successfully, the Replace Host activity is considered complete. Validate that the status of your SDDC is healthy and that the host count matches the count before you started the replace host activity.

    SDDC Details page and Cluster summary
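
The final validation in step 5 comes down to two conditions: the host count matches the original count, and no host is left in a transitional state. The following Python sketch expresses that check. The function name is hypothetical, the host names other than Cls2-Standard3-1 are placeholders, and the state strings follow the states mentioned in this tutorial (Active, Updating, Terminating).

```python
from typing import Dict


def replacement_complete(host_states: Dict[str, str], original_host_count: int) -> bool:
    """Return True when the cluster is back to its original, healthy shape.

    `host_states` maps each host name currently listed in the cluster to
    its lifecycle state as shown in the OCI Console.
    """
    same_count = len(host_states) == original_host_count
    all_active = all(state == "Active" for state in host_states.values())
    return same_count and all_active


if __name__ == "__main__":
    # Placeholder cluster view after the faulty host has been terminated.
    after = {
        "Cls2-Standard3-2": "Active",
        "Cls2-Standard3-3": "Active",
        "Cls2-Standard3-4": "Active",
    }
    print("Replacement complete:", replacement_complete(after, 3))
```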

For further configuration options tailored to your specific VMware environment, consult the relevant vCenter documentation. For any Oracle Cloud VMware Solution related questions, see Oracle Cloud VMware Solution.

Acknowledgments

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.