Servers

These sections describe the management and compute nodes employed by Private Cloud Appliance.

Management Nodes

At the heart of each Private Cloud Appliance installation are three management nodes. They are installed in rack units 5, 6 and 7 and form a cluster for high availability: all three servers are capable of running the same controller software and system-level services, have equal access to the system configuration, and manage the system as a fully active cluster. For details about the management node components, refer to Server Components.

The management nodes, running the controller software, provide a foundation for the collection of services responsible for operating and administering Private Cloud Appliance. Responsibilities of the management cluster include monitoring and maintaining the system hardware, ensuring system availability, upgrading software and firmware, backing up and restoring the appliance, and managing disaster recovery. For an overview of management node services, refer to Appliance Administration Overview. See the Oracle Private Cloud Appliance Administrator Guide for instructions about performing management functions.

The part of the system where the appliance infrastructure is controlled is called the Service Enclave, which runs on the management node cluster and can be accessed through the Service CLI or the Service Web UI. Access is closely monitored and restricted to privileged administrators. For more information, see Administrator Access. Also refer to the chapter Working in the Service Enclave in the Oracle Private Cloud Appliance Administrator Guide.
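
Administrators often script routine Service Enclave queries instead of typing them interactively. The minimal Python sketch below opens an SSH session to the Service CLI endpoint; the host name, the port (30006 is typical for Private Cloud Appliance 3.x deployments), the account name, and the list ComputeNode command are assumptions used for illustration, so verify the exact values and command syntax in the Oracle Private Cloud Appliance Administrator Guide.

import getpass

import paramiko  # third-party SSH library, used here only for illustration

# Assumed connection details: adjust to your appliance.
SERVICE_CLI_HOST = "pca-mgmt-vip.example.com"  # hypothetical management node virtual IP or hostname
SERVICE_CLI_PORT = 30006                       # Service CLI SSH port in typical 3.x deployments
ADMIN_USER = "admin"                           # hypothetical privileged administrator account

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(
    SERVICE_CLI_HOST,
    port=SERVICE_CLI_PORT,
    username=ADMIN_USER,
    password=getpass.getpass("Service CLI password: "),
)

# "list ComputeNode" is an illustrative Service CLI command; check the
# Administrator Guide for the commands available in your software release.
stdin, stdout, stderr = client.exec_command("list ComputeNode")
print(stdout.read().decode())
client.close()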

Compute Nodes

The compute nodes in the Private Cloud Appliance are part of the hardware layer and provide the processing power and memory capacity to host compute instances. Management of the hardware layer is provided by the platform and services layer of the appliance. For more details about the Layered Architecture approach, see Architecture and Design.

When a system is initialized, compute nodes are automatically discovered by the admin service and put in the Ready-to-Provision state. Administrators can then provision the compute nodes through the Service Enclave, after which they are ready for use. When additional compute nodes are installed at a later stage, they are powered on and discovered automatically by the same mechanism.
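
The lifecycle described above can be summarized as a simple state model. The sketch below is purely conceptual and does not call any appliance API; the state names other than Ready-to-Provision are assumptions used to illustrate the order of events (discovery, provisioning by an administrator, ready for use).

from enum import Enum

class ComputeNodeState(Enum):
    DISCOVERED = "Discovered"                   # node detected by the admin service
    READY_TO_PROVISION = "Ready to Provision"   # eligible for provisioning
    PROVISIONING = "Provisioning"               # assumed intermediate state
    PROVISIONED = "Provisioned"                 # assumed final state: ready for use

# Transitions in the order the text describes: a node is discovered and powered on,
# reaches Ready to Provision, and is then provisioned through the Service Enclave.
TRANSITIONS = {
    ComputeNodeState.DISCOVERED: {ComputeNodeState.READY_TO_PROVISION},
    ComputeNodeState.READY_TO_PROVISION: {ComputeNodeState.PROVISIONING},
    ComputeNodeState.PROVISIONING: {ComputeNodeState.PROVISIONED},
    ComputeNodeState.PROVISIONED: set(),
}

def can_transition(current: ComputeNodeState, target: ComputeNodeState) -> bool:
    """Return True if the conceptual lifecycle allows this transition."""
    return target in TRANSITIONS[current]

assert can_transition(ComputeNodeState.READY_TO_PROVISION, ComputeNodeState.PROVISIONING)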

As an administrator, you can monitor the health and status of compute nodes from the Private Cloud Appliance Service Web UI or Service CLI, as well as perform other operations such as assigning compute nodes to tenancies and upgrading compute node components. For general administration information, see Appliance Administration Overview. For more information about managing compute resources, see Compute Instance Concepts.

The minimum configuration of the base rack contains three compute nodes, but it can be expanded by three nodes at a time up to 18 compute nodes, if you choose to use all of your flex bay space for compute nodes. The system can support up to 20 compute nodes total, with two slots reserved for spares that you can request through an exception process. Contact your Oracle representative about expanding the compute node capacity in your system. For hardware configuration details of the compute nodes, refer to Server Components.

GPU Expansion Nodes

GPU capacity is added to a Private Cloud Appliance by installing GPU expansion nodes. These are X10-2c GPU L40S Compute Server nodes installed in an X10-2c GPU expansion rack, which has a minimum configuration of one factory-installed GPU node. More nodes can be installed at the factory or after deployment. Cabling is preinstalled for a full rack configuration, regardless of the number of factory-installed nodes. A single expansion rack contains up to six GPU nodes. Two expansion racks can be connected to the base rack, for a maximum of 12 GPU nodes.

The X10-2c GPU L40S Compute Server is a 3 RU server with two Intel Xeon Platinum 8480+ processors, high-speed Ethernet connectivity, and four NVIDIA L40S GPUs, each with 48 GB GDDR6 memory and 1466 peak FP8 TFLOPS.

For detailed component specifications, refer to the manufacturer website.

Unlike standard compute nodes, the GPU nodes are not automatically discovered. After physical installation, and provided that an active high-performance storage pool is present in the appliance configuration, the GPU expansion rack is activated by running a script from one of the management nodes. When the script has completed its tasks, the node installation process prepares the nodes for provisioning.

The new nodes are provisioned from the Service Enclave, using the Service Web UI or Service CLI. When the GPU nodes are up and running, appliance administrators can manage and monitor them in the same way as standard compute nodes. Users must deploy compute instances with a dedicated shape to take advantage of the GPUs.
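
Because the Compute Enclave exposes OCI-compatible interfaces, a GPU instance can be launched with standard OCI tooling once the appropriate shape is available. The following minimal Python sketch uses the oci SDK; the profile name, service endpoint, all OCIDs, and the shape name VM.GPU.L40S.4 are placeholders and assumptions, so substitute the GPU shape and endpoint values published for your appliance and software release.

import oci

# Load an API signing profile; on Private Cloud Appliance this profile and the
# service endpoint typically point at the appliance's Compute Enclave rather
# than Oracle Cloud Infrastructure (the endpoint value below is a placeholder).
config = oci.config.from_file(profile_name="DEFAULT")
compute = oci.core.ComputeClient(
    config, service_endpoint="https://iaas.<appliance-domain>"
)

launch_details = oci.core.models.LaunchInstanceDetails(
    compartment_id="ocid1.compartment.....",    # placeholder OCID
    availability_domain="AD-1",                 # placeholder availability domain
    display_name="gpu-workload-1",
    shape="VM.GPU.L40S.4",                      # assumed name of a dedicated GPU shape
    source_details=oci.core.models.InstanceSourceViaImageDetails(
        image_id="ocid1.image....."             # placeholder OCID
    ),
    create_vnic_details=oci.core.models.CreateVnicDetails(
        subnet_id="ocid1.subnet....."           # placeholder OCID
    ),
)

instance = compute.launch_instance(launch_details).data
print(instance.id, instance.lifecycle_state)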

For more information, refer to Integrating GPU Expansion Nodes in the Oracle Private Cloud Appliance Administrator Guide.

Server Components

The server models that can be installed in a Private Cloud Appliance rack, and that are supported by the current software release, are listed below with their components.

Oracle Server X9-2 Management Node:
- 2x Intel Xeon Gold 5318Y CPU, 24 core, 2.10GHz, 165W
- 16x 64 GB DDR4-3200 DIMMs (1 TB total)
- 2x 240 GB M.2 SATA boot devices configured as RAID1 mirror
- 2x 3.84 TB NVMe SSD storage devices configured as RAID1 mirror
- 1x Ethernet port for remote management
- 1x Dual-port 100Gbit Ethernet NIC module in OCPv3 form factor
- 2x Redundant power supplies and fans

Oracle Server X9-2 Compute Node:
- 2x Intel Xeon Platinum 8358 CPU, 32 core, 2.60GHz, 250W
- 16x 64 GB DDR4-3200 DIMMs (1 TB total)
- 2x 240 GB M.2 SATA boot devices configured as RAID1 mirror
- 1x Ethernet port for remote management
- 1x Dual-port 100Gbit Ethernet NIC module in OCPv3 form factor
- 2x Redundant power supplies and fans

Oracle Server X10 Compute Node:
- 2x AMD EPYC 9J14 CPU, 96 core, 2.60GHz, 400W
- 24x 96 GB DDR5-4800 DIMMs (2.25 TB total)
- 2x 480 GB M.2 NVMe SSD boot devices configured as RAID1 mirror
- 1x Ethernet port for remote management
- 2x Dual-port 100Gbit Ethernet NIC modules
- 2x Redundant power supplies and fans

X10-2c GPU L40S Compute Server:
- 2x Intel Xeon Platinum 8480+ CPU, 56 core, 2.00GHz, 350W
- 16x 64 GB DDR5-4800 DIMMs (1 TB total)
- 2x 3.84 TB NVMe SSD storage devices configured as RAID1 mirror
- 2x Ethernet ports for remote management
- 2x Dual-port 200Gbit Ethernet NIC modules
- 4x NVIDIA L40S GPU, 48 GB GDDR6, 350W
- 4x Redundant power supplies and fans

Oracle Server X11 Compute Node:
- 2x AMD EPYC 9J25 CPU, 96 core, 2.60GHz, 400W
- 24x 96 GB DDR5-6400 DIMMs (2.25 TB total)
- 2x 480 GB M.2 NVMe SSD boot devices configured as RAID1 mirror
- 2x Ethernet ports for remote management
- 2x Dual-port 200Gbit Ethernet NIC modules
- 2x Redundant power supplies and fans

Oracle Server X11 Management Node:
- 2x AMD EPYC 9J15 CPU, 32 core, 2.95GHz, 210W
- 24x 64 GB DDR5-6400 DIMMs (1.5 TB total)
- 2x 480 GB M.2 NVMe SSD boot devices configured as RAID1 mirror
- 2x 3.84 TB NVMe SSD storage devices configured as RAID1 mirror
- 2x Ethernet ports for remote management
- 1x Dual-port 200Gbit Ethernet NIC module
- 2x Redundant power supplies and fans