Optional GPU Expansion
To enable GPU-accelerated workloads, Oracle Private Cloud Appliance can be expanded with server nodes that have GPUs installed. An X10-2c GPU node is a 3 RU server with Intel Xeon Platinum 8480+ architecture, high-speed Ethernet connectivity, and four NVIDIA L40S GPUs, each with 48GB GDDR6 memory and 1466 peak FP8 TFLOPS.
X10-2c GPU nodes are delivered in an expansion rack containing power distribution units (PDUs) and networking components to integrate the additional physical resources with the base rack. A GPU expansion rack contains at least 1 and a maximum of 6 factory-installed GPU nodes. More nodes can be installed after initial deployment. Cabling is preinstalled for a full rack configuration, regardless of the number of factory-installed nodes. Up to two expansion racks can be connected to a base rack, for a maximum of 12 GPU nodes.
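The rack and node limits above translate into simple capacity arithmetic. The following is a minimal sketch, not an Oracle tool: the function and constant names are hypothetical, and it assumes the 48GB memory figure applies to each of the four GPUs in a node.

```python
# Figures taken from the text above; names are illustrative only.
GPUS_PER_NODE = 4
GPU_MEMORY_GB = 48          # assumed to be per GPU
MIN_NODES_PER_RACK = 1
MAX_NODES_PER_RACK = 6
MAX_EXPANSION_RACKS = 2


def gpu_capacity(nodes_per_rack):
    """Return (nodes, gpus, gpu_memory_gb) for a list of per-rack node counts."""
    if len(nodes_per_rack) > MAX_EXPANSION_RACKS:
        raise ValueError("at most 2 expansion racks can be connected to a base rack")
    for count in nodes_per_rack:
        if not MIN_NODES_PER_RACK <= count <= MAX_NODES_PER_RACK:
            raise ValueError("each expansion rack holds 1 to 6 GPU nodes")
    nodes = sum(nodes_per_rack)
    gpus = nodes * GPUS_PER_NODE
    return nodes, gpus, gpus * GPU_MEMORY_GB


print(gpu_capacity([6, 6]))  # (12, 48, 2304)
```

With two fully populated expansion racks, this gives 12 nodes, 48 GPUs, and 2304GB of aggregate GPU memory.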
Installation Requirements
- Site Preparation
If you plan to expand your Oracle Private Cloud Appliance environment with GPU nodes, plan the installation of the additional hardware carefully in advance. The X10-2c GPU expansion rack has the same external dimensions as the base rack and contains the same type of hardware, so the base rack site requirements also apply to the expansion rack. They are described in detail in the chapter Site Requirements.
- Rack Cabling
The cable connections between the base rack and the GPU expansion rack must not exceed 25 meters. Allocate a space for the expansion rack near the base rack, ensuring that the inter-rack cabling is within the specified maximum length when routed through the floor or ceiling. The required cable length must be specified with the order. The default length of cables included in the shipment is 10 meters.
- High-Performance Storage
The GPU compute shapes are optimized for high speed and low latency. They use high-performance storage exclusively, which means the system's ZFS Storage Appliance must provide a high-performance storage pool consisting of one or more performance disk trays. If the existing installation has no performance tray, one is added to the GPU expansion order. If the base rack has no rack units available for the performance tray, it is installed in a storage expansion rack. The high-performance storage pool must be configured before the GPU expansion rack is activated.
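The cable length rule under Rack Cabling can be checked during site planning once the route through the floor or ceiling has been measured. This is a minimal sketch under stated assumptions: the function name and the returned messages are hypothetical, and only the 25 meter limit and the 10 meter default come from the text.

```python
MAX_CABLE_LENGTH_M = 25      # maximum inter-rack cable length stated above
DEFAULT_CABLE_LENGTH_M = 10  # default length of the cables in the shipment


def check_cable_route(route_length_m):
    """Classify a measured inter-rack cable route length, in meters."""
    if route_length_m <= 0:
        raise ValueError("route length must be positive")
    if route_length_m > MAX_CABLE_LENGTH_M:
        return "too long: place the expansion rack closer to the base rack"
    if route_length_m > DEFAULT_CABLE_LENGTH_M:
        return "ok: specify the required cable length with the order"
    return "ok: default cables are sufficient"


print(check_cable_route(18))
```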
Connection to the Base Rack
When the X10-2c GPU expansion rack is in its allocated space, it must be connected to the base rack. The expansion rack leaf switches are cross-connected to the base rack spine switches to extend the data network into the expansion rack. Similarly, the expansion rack components are added to the internal management network through a cable connection between the management switches in the racks. The ports required for this setup have been reserved on all connected switches.
- Connect the Management Switches
A GPU expansion rack management switch must be connected to the base rack management switch using two cable links with 25Gbps SFP28 transceivers.
  - On the side of the GPU expansion rack, use management switch ports 49 and 50.
  - Connect the cables from the first GPU expansion rack to base rack management switch ports 49 and 50. Connect the cables from the second GPU expansion rack to base rack management switch ports 51 and 52.
- Connect the Leaf Switches
Each GPU expansion rack leaf switch has redundant cross-connections to the two spine switches of the base rack. There are 8 total cable links between a GPU expansion rack and the base rack, using 100Gbps QSFP28 transceivers.
  - On the side of the GPU expansion rack, use ports 33-36 in each leaf switch.
  - Connect the cables from the first GPU expansion rack to the base rack spine switches using this pattern:
    - Leaf1 (RU18/pcaswlf03) ports 33 and 34 to Spine1 (RU29/pcaswsp01) ports 11 and 12
    - Leaf1 (RU18/pcaswlf03) ports 35 and 36 to Spine2 (RU30/pcaswsp02) ports 11 and 12
    - Leaf2 (RU19/pcaswlf04) ports 33 and 34 to Spine1 (RU29/pcaswsp01) ports 13 and 14
    - Leaf2 (RU19/pcaswlf04) ports 35 and 36 to Spine2 (RU30/pcaswsp02) ports 13 and 14
  - Connect the cables from the second GPU expansion rack to the base rack spine switches using this pattern:
    - Leaf1 (RU18/pcaswlf05) ports 33 and 34 to Spine1 (RU29/pcaswsp01) ports 15 and 16
    - Leaf1 (RU18/pcaswlf05) ports 35 and 36 to Spine2 (RU30/pcaswsp02) ports 15 and 16
    - Leaf2 (RU19/pcaswlf06) ports 33 and 34 to Spine1 (RU29/pcaswsp01) ports 17 and 18
    - Leaf2 (RU19/pcaswlf06) ports 35 and 36 to Spine2 (RU30/pcaswsp02) ports 17 and 18
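The management and leaf-to-spine port assignments listed above follow a regular pattern, so a cabling checklist can be generated and cross-checked before the cables are run. The sketch below is illustrative only. The function names are hypothetical, and it assumes that each listed port pair maps in order (for example, leaf port 33 to spine port 11 and leaf port 34 to spine port 12), which the text does not state explicitly.

```python
def management_links(rack):
    """Return (expansion_port, base_port) pairs for expansion rack 1 or 2."""
    if rack not in (1, 2):
        raise ValueError("rack must be 1 or 2")
    # Expansion side always uses ports 49 and 50; base side uses 49-50 or 51-52.
    return [(49 + i, 49 + 2 * (rack - 1) + i) for i in (0, 1)]


def leaf_spine_links(rack):
    """Return (leaf, leaf_port, spine, spine_port) tuples for expansion rack 1 or 2."""
    if rack not in (1, 2):
        raise ValueError("rack must be 1 or 2")
    links = []
    for leaf in (1, 2):
        leaf_name = f"pcaswlf{2 * rack + leaf:02d}"      # pcaswlf03 .. pcaswlf06
        first_spine_port = 11 + 4 * (rack - 1) + 2 * (leaf - 1)
        for spine in (1, 2):
            spine_name = f"pcaswsp{spine:02d}"
            first_leaf_port = 33 + 2 * (spine - 1)       # 33-34 to Spine1, 35-36 to Spine2
            for offset in (0, 1):                        # assumed in-order pairing
                links.append((leaf_name, first_leaf_port + offset,
                              spine_name, first_spine_port + offset))
    return links


for link in leaf_spine_links(1):
    print("%s port %d -> %s port %d" % link)
```

Each call to `leaf_spine_links` yields the 8 links for one expansion rack. Generating both racks and checking that no spine port appears twice is a quick way to catch a transcription error in a site-specific cabling sheet.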
When the physical connections are in place, the X10-2c GPU expansion rack must be activated. For more information, refer to Integrating GPU Expansion Nodes in the "Oracle Private Cloud Appliance Administrator Guide".