2 Hardware Overview

This chapter provides an overview of the hardware components that the Oracle Private Cloud Appliance comprises: a base rack with servers, switches and storage. Different sections describe the role of each component. Special attention is given to the appliance network infrastructure, and the way it integrates with the data center environment and, optionally, an Oracle Exadata system.

Base Rack Components

The current Private Cloud Appliance hardware platform, with factory-installed software release 3.0.1, consists of an Oracle Rack Cabinet 1242 base, populated with the hardware components identified in the figure below. The figure shows a full configuration; however, you can customize your system to include different storage or compute capacity as needed.

Starting with release 3.0.1, the flex bay concept is available on Private Cloud Appliance. Flex bays are dedicated 4 rack unit sections within the rack that can be used for flexible expansion of your system by adding either storage or compute resources. For each flex bay, you can choose to add 1-4 compute nodes, 1 Oracle Storage Drive Enclosure DE3-24C, or 1-2 Oracle Storage Drive Enclosure DE3-24P enclosures. A flex bay can house either storage or compute resources, but not both in the same bay.


Figure showing the components installed in a base rack.

Figure Legend

Callout   Quantity   Description
A         1 - 4      Flex bay: can accommodate 1-4 compute nodes, or 1-2 storage enclosures
B         2          Leaf switch
C         1          Oracle Storage Drive Enclosure DE3-24C disk shelf
D         1          Management switch
E         2          Spine switch
F         1 - 5      Can accommodate 1 - 5 compute nodes
G         3          Compute nodes (3 required for minimum configuration)
H         3          Management nodes
I         2          Storage controllers

Servers

These sections describe the management and compute nodes employed by Private Cloud Appliance.

Management Nodes

At the heart of each Private Cloud Appliance installation are three management nodes. They are installed in rack units 5, 6 and 7 and form a cluster for high availability: all servers are capable of running the same controller software and system-level services, have equal access to the system configuration, and all three servers manage the system as a fully active cluster. For details about the management node components, refer to Server Components.

The management nodes, running the controller software, provide a foundation for the collection of services responsible for operating and administering Private Cloud Appliance. Responsibilities of the management cluster include monitoring and maintaining the system hardware, ensuring system availability, upgrading software and firmware, backing up and restoring the appliance, and managing disaster recovery. For an overview of management node services, refer to Appliance Administration Overview. See the Oracle Private Cloud Appliance Administrator Guide for instructions about performing management functions.

The part of the system where the appliance infrastructure is controlled is called the Service Enclave, which runs on the management node cluster and can be accessed through the Service CLI or the Service Web UI. Access is closely monitored and restricted to privileged administrators. For more information, see Administrator Access. Also refer to the chapter Working in the Service Enclave in the Oracle Private Cloud Appliance Administrator Guide.

Compute Nodes

The compute nodes in the Private Cloud Appliance are part of the hardware layer and provide the processing power and memory capacity to host compute instances. Management of the hardware layer is provided by the platform and services layer of the appliance. For more details about the Layered Architecture approach, see Architecture and Design.

When a system is initialized, compute nodes are automatically discovered by the admin service and placed in the Ready-to-Provision state. Administrators can then provision the compute nodes through the Service Enclave, after which they are ready for use. When additional compute nodes are installed at a later stage, the new nodes are discovered and powered on automatically by the same mechanism.

As an administrator, you can monitor the health and status of compute nodes from the Private Cloud Appliance Service Web UI or Service CLI, as well as perform other operations such as assigning compute nodes to tenancies and upgrading compute node components. For general administration information, see Appliance Administration Overview. For more information about managing compute resources, see Compute Instance Concepts.

The minimum configuration of the base rack contains three compute nodes, but it can be expanded by three nodes at a time up to 18 compute nodes, if you choose to use all of your flex bay space for compute nodes. The system can support up to 20 compute nodes total, with two slots reserved for spares that you can request through an exception process. Contact your Oracle representative about expanding the compute node capacity in your system. For hardware configuration details of the compute nodes, refer to Server Components.

Server Components

The table below lists the components of the server models that may be installed in a Private Cloud Appliance rack and that are supported by the current software release.

Quantity   Oracle Server X9-2 Management Node                            Oracle Server X9-2 Compute Node
1          Oracle Server X9-2 base chassis                               Oracle Server X9-2 base chassis
2          Intel Xeon Gold 5318Y CPU, 24 core, 2.10GHz, 165W             Intel Xeon Platinum 8358 CPU, 32 core, 2.60GHz, 250W
16         64 GB DDR4-3200 DIMMs (1 TB total)                            64 GB DDR4-3200 DIMMs (1 TB total)
2          240GB M.2 SATA boot devices configured as RAID1 mirror        240GB M.2 SATA boot devices configured as RAID1 mirror
2          3.84 TB NVMe SSD storage devices configured as RAID1 mirror   (not present)
1          Ethernet port for remote management                           Ethernet port for remote management
1          Dual-port 100Gbit Ethernet NIC module in OCPv3 form factor    Dual-port 100Gbit Ethernet NIC module in OCPv3 form factor
2          Redundant power supplies and fans                             Redundant power supplies and fans

Network Infrastructure

For network connectivity, Private Cloud Appliance relies on a physical layer that provides the necessary high availability, bandwidth, and speed. On top of this, a distributed network fabric composed of software-defined switches, routers, gateways, and tunnels enables secure and segregated data traffic – both internally between cloud resources, and externally to and from resources outside the appliance.

Device Management Network

The device management network provides internal access to the management interfaces of all appliance components. These components have Ethernet connections to the 1Gbit management switch, and each receives an IP address from both of the following address ranges:

  • 100.96.0.0/23 – IP range for the Oracle Integrated Lights Out Manager (ILOM) service processors of all hardware components

  • 100.96.2.0/23 – IP range for the management interfaces of all hardware components

To access the device management network, you connect a workstation to port 2 of the 1Gbit management switch and statically assign the IP address 100.96.3.254 to its connected interface. Alternatively, you can set up a permanent connection to the Private Cloud Appliance device management network from a data center administration machine, which is also referred to as a bastion host. From the bastion host, or from the (temporarily) connected workstation, you can reach the ILOMs and management interfaces of all connected rack components. For information about configuring the bastion host, see the "Optional Bastion Host Uplink" section of the Oracle Private Cloud Appliance Installation Guide.

Note that port 1 of the 1Gbit management switch is reserved for use by support personnel only.
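
To make these ranges concrete, the short Python sketch below (illustrative only, not part of the appliance tooling; the sample addresses are hypothetical) classifies an address against the two reserved /23 blocks and shows that the workstation address 100.96.3.254 is simply the last usable host address in the 100.96.2.0/23 block.

```python
# Illustrative sketch only: classify an address against the two reserved
# device management ranges. The component addresses below are hypothetical.
import ipaddress

ILOM_NET = ipaddress.ip_network("100.96.0.0/23")   # ILOM service processors
MGMT_NET = ipaddress.ip_network("100.96.2.0/23")   # component management interfaces

def classify(addr: str) -> str:
    """Return which device management range an address belongs to, if any."""
    ip = ipaddress.ip_address(addr)
    if ip in ILOM_NET:
        return "ILOM range (100.96.0.0/23)"
    if ip in MGMT_NET:
        return "management interface range (100.96.2.0/23)"
    return "outside the device management network"

for addr in ["100.96.0.10", "100.96.2.10", "100.96.3.254", "10.0.0.5"]:
    print(addr, "->", classify(addr))

# The statically assigned workstation address is the last usable host
# address in the management interface /23 block.
print("last usable host in 100.96.2.0/23:", list(MGMT_NET.hosts())[-1])
```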

Data Network

The appliance data connectivity is built on redundant 100Gbit switches in a two-layer design similar to a leaf-spine topology. The leaf switches interconnect the rack hardware components, while the spine switches form the backbone of the network and provide a path for external traffic. Each leaf switch is connected to all the spine switches, which are also interconnected. The main benefits of this topology are extensibility and path optimization. A Private Cloud Appliance rack contains two leaf and two spine switches.

The data switches offer a maximum throughput of 100Gbit per port. The spine switches use 5 interlinks (500Gbit); the leaf switches use 2 interlinks (200Gbit) and 2x2 crosslinks to each spine. Each server node is connected to both leaf switches in the rack, through the bond0 interface that consists of two 100Gbit Ethernet ports in link aggregation mode. The two storage controllers are connected to the spine switches using 4x100Gbit connections.

For external connectivity, 5 ports are reserved on each spine switch. Four ports are available to establish the uplinks between the appliance and the data center network; one port is reserved to optionally segregate the administration network from the data traffic.

The connections between the Private Cloud Appliance and the customer data center are called uplinks. They are physical cable connections between the two spine switches in the appliance rack and one or – preferably – two next-level network devices in the data center infrastructure. Besides the physical aspect, there is also a logical aspect to the uplinks: how traffic is routed between the appliance and the external network it is connected to.

On each spine switch, ports 1-4 can be used for uplinks to the data center network. For speeds of 10Gbps or 25Gbps, the spine switch port must be split using an MPO-to-4xLC breakout cable. For speeds of 40Gbps or 100Gbps, each switch port uses a single MPO-to-MPO cable connection. The correct connection speed must be specified during initial setup so that the switch ports are configured with the appropriate breakout mode and transfer speed.

The uplinks are configured during system initialization, based on information you provide as part of the installation checklist. Unused spine switch uplink ports, including unused breakout ports, are disabled for security reasons. The table shows the supported uplink configurations by port count and speed, and the resulting total bandwidth.

Uplink Speed   Number of Uplinks per Spine Switch   Total Bandwidth
10 Gbps        1, 2, 4, 8, or 16                    20, 40, 80, 160, or 320 Gbps
25 Gbps        1, 2, 4, 8, or 16                    50, 100, 200, 400, or 800 Gbps
40 Gbps        1, 2, or 4                           80, 160, or 320 Gbps
100 Gbps       1, 2, or 4                           200, 400, or 800 Gbps
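
The totals in this table follow directly from the port count and speed: total bandwidth equals the uplink speed multiplied by the number of uplinks per spine switch and by the two spine switches. The sketch below is an illustrative calculation only, assuming both spine switches carry the same number of uplinks at the same speed.

```python
# Illustrative calculation only: reproduce the "Total Bandwidth" column as
# 2 spine switches x uplinks per spine x port speed.
SPINE_SWITCHES = 2

uplink_options = {
    10:  [1, 2, 4, 8, 16],   # breakout cabling required
    25:  [1, 2, 4, 8, 16],   # breakout cabling required
    40:  [1, 2, 4],
    100: [1, 2, 4],
}

for speed, counts in uplink_options.items():
    totals = [SPINE_SWITCHES * count * speed for count in counts]
    print(f"{speed} Gbps uplinks -> total bandwidth options: {totals} Gbps")
```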

Regardless of the number of ports and port speeds configured, you also select a topology for the uplinks between the spine switches and the data center network. This information is critical for the network administrator to configure link aggregation (port channels) on the data center switches. The table shows the available options.

Topology Description

Triangle

In a triangle topology, all cables from both spine switches are connected to a single data center switch.

Square

In a square topology, two data center switches are used. All outbound cables from a given spine switch are connected to the same data center switch.

Mesh

In a mesh topology, two data center switches are also used, but the uplinks are cabled in a cross pattern: outbound cables from each spine switch are connected in pairs, with one cable going to each data center switch.

The physical topology for the uplinks from the appliance to the data center network depends on bandwidth requirements and on the available data center switches and ports. Connecting to a single data center switch implies a triangle topology. To increase redundancy, you distribute the uplinks across a pair of data center switches, selecting either a square or mesh topology. Each topology allows you to start with a minimum bandwidth, which you can scale up as your needs grow. The maximum bandwidth is 800 Gbps, assuming the data center switches, transceivers, and cables allow it.

The diagrams below represent a subset of the supported topologies, and can be used as a reference to integrate the appliance into the data center network. Use the diagrams and the notes to determine the appropriate cabling and switch configuration for your installation.


Figure showing six examples of supported uplink topologies. The examples are explained in the diagram notes below.

The logical connection between the appliance and the data center is implemented entirely in layer 3. In the OSI model (Open Systems Interconnection model), layer 3 is known as the network layer, which uses the source and destination IP address fields in its header to route traffic between connected devices.

Private Cloud Appliance supports two logical connection options: you must choose between static routing and dynamic routing. Both routing options are supported by all three physical topologies.

Connection Type Description

Static Routing

When static routing is selected, all egress traffic goes through a single default gateway IP address configured on data center network devices. This gateway IP address must be in the same subnet as the appliance uplink IP addresses, so it is reachable from the spine switches. The data center network devices can use SVIs (Switch Virtual Interfaces) with VLAN IDs in the range of 2-3899.

All gateways configured within a virtual cloud network (VCN) automatically have a route rule that directs all traffic intended for external destinations to the IP address of the default gateway.

Dynamic Routing

When dynamic routing is selected, BGP (Border Gateway Protocol) is used to establish a TCP connection between two Autonomous Systems: the appliance network and the data center network. This configuration requires a registered or private ASN (Autonomous System Number) on each side of the connection. The Private Cloud Appliance BGP configuration uses ASN 136025 by default; this can be changed during initial configuration.

For BGP routing, two routing devices in the data center must be connected to the two spine switches in the appliance rack. Corresponding interfaces (port channels) between the spine switches and the data center network devices must be in the same subnet. It is considered good practice to use a dedicated /30 subnet for each point-to-point circuit, which is also known as a route hand-off network. This setup provides redundancy and multipathing.

Dynamic routing is also supported in a triangle topology, where both spine switches are physically connected to the same data center network device. In this configuration, two BGP sessions are still established: one from each spine switch. However, this approach reduces the level of redundancy.
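
As an illustration of the /30 route hand-off practice described above, the sketch below carves dedicated point-to-point subnets out of a parent block, one per spine-switch-to-data-center-router circuit. The 192.0.2.0/24 parent block and the circuit names are hypothetical examples, not appliance defaults.

```python
# Illustrative sketch only: allocate one /30 "route hand-off network" per
# point-to-point BGP circuit. The parent block and circuit names are hypothetical.
import ipaddress

parent = ipaddress.ip_network("192.0.2.0/24")
circuits = ["spine1 <-> dc-router1", "spine2 <-> dc-router2"]

for circuit, p2p in zip(circuits, parent.subnets(new_prefix=30)):
    appliance_ip, dc_ip = list(p2p.hosts())   # a /30 has exactly two usable hosts
    print(f"{circuit}: subnet {p2p}, appliance side {appliance_ip}, data center side {dc_ip}")
```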

Supported Routing Designs

The table below shows which routing designs are supported depending on the physical topology in your data center and the logical connection you choose to implement.

Note that link aggregation across multiple devices (vPC or MLAG) is only supported with static routing. When dynamic routing is selected, link aggregation is restricted to ports of the same switch.

When the uplinks are cabled in a mesh topology, a minimum of 2 physical connections per spine switch applies. To establish BGP peering, 2 subnets are required. If the uplink count changes, the port channels are reconfigured but the dedicated subnets remain the same.

The uplinks to the data center run a variety of protocols to provide redundancy and to reduce link failure detection and recovery times. These protocols work with the triangle, square, or mesh topologies.

The suite of uplink protocols includes:

  • Bidirectional Forwarding Detection (BFD)
  • Virtual Router Redundancy Protocol (VRRP)
  • Hot Standby Router Protocol (HSRP)
  • Equal Cost Multi-Path (ECMP)

Each is briefly described in the following sections of this topic.

BFD

In most router networks, connection failures are detected by loss of the “hello” packets sent by routing protocols. However, detection by this method often takes more than one second, during which high-speed links continue to route large numbers of packets toward a destination they cannot reach, which burdens link buffers. Increasing the “hello” packet rate instead burdens the router CPU.

Bidirectional Forwarding Detection (BFD) is a built-in mechanism that alerts routers at the end of a failed link that there is a problem more quickly than any other mechanism, reducing the load on buffers and CPUs. BFD works even in situations where there are switches or hubs between the routers.

BFD requires no configuration and has no user-settable parameters.

VRRPv3

The Virtual Router Redundancy Protocol version 3 (VRRPv3) is a networking protocol that uses the concept of a virtual router to group physical routers together and make them appear as one to participating hosts. This increases the availability and reliability of routing paths through automatic default gateway selections on an IP subnetwork.

With VRRPv3, the primary/active and secondary/standby routers act as one virtual router. This virtual router becomes the default gateway for any host on the subnet participating in VRRPv3. One physical router in the group becomes the primary/active router for packet forwarding. If this router fails, another physical router in the group takes over the forwarding role, adding redundancy to the router configuration. The VRRPv3 “network” is limited to the local subnet and does not advertise routes beyond the local subnet.

HSRP

Cisco routers often use a redundancy protocol called the Hot Standby Router Protocol (HSRP) to improve router availability. Similar to the methods of VRRP, HSRP groups physical routers into a single virtual router. If the physical default router fails, another router uses HSRP to take over the default forwarding of packets without stressing the host device.

ECMP

Equal Cost Multi-Path (ECMP) is a way to make better use of network bandwidth, especially in more complex router networks with many redundant links.

Normally, router networks with multiple paths to another destination network choose one active route to a gateway router as the “best” path and keep the other paths as standby in case of failure. The decision about which path to use is usually determined by its “cost” from the routing protocol perspective. In cases where the cost over several links to reach network gateways is equal, the router simply chooses one based on some tie-breaking criterion. This makes routing decisions easy but wastes network bandwidth, as network links on the paths not chosen sit idle.

ECMP is a way to send traffic on multiple path links with equal cost, making more efficient use of network bandwidth.
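
The sketch below illustrates the general idea with a simplified flow hash: packets belonging to the same flow always take the same link, while different flows spread across all equal-cost links. It is a conceptual example only; real ECMP hashing runs in the switch and router data plane, and the next-hop names are hypothetical.

```python
# Conceptual ECMP-style path selection: hash the flow 5-tuple and map it onto
# one of several equal-cost next hops. Next-hop names are hypothetical.
import hashlib

EQUAL_COST_NEXT_HOPS = ["uplink-1", "uplink-2", "uplink-3", "uplink-4"]

def pick_next_hop(src_ip, dst_ip, src_port, dst_port, proto="tcp"):
    """Hash the 5-tuple so one flow always maps to the same equal-cost link."""
    flow = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.sha256(flow).digest()
    index = int.from_bytes(digest[:4], "big") % len(EQUAL_COST_NEXT_HOPS)
    return EQUAL_COST_NEXT_HOPS[index]

# Different flows may land on different links; repeating the same flow
# always returns the same link, preserving per-flow packet ordering.
print(pick_next_hop("10.0.1.5", "203.0.113.10", 44321, 443))
print(pick_next_hop("10.0.1.6", "203.0.113.10", 51515, 443))
```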

Administration Network

In an environment with elevated security requirements, you can optionally segregate administrative appliance access from the data traffic. The administration network physically separates configuration and management traffic from the operational activity on the data network by providing dedicated secured network paths for appliance administration operations. In this configuration, the entire Service Enclave can be accessed only over the administration network. This also includes the monitoring, metrics collection and alerting services, the API service, and all component management interfaces.

Setting up the administration network requires additional Ethernet connections from the next-level data center network devices to port 5 on each of the spine switches in the appliance. Inside the administration network, the spine switches must each have one IP address and a virtual IP shared between the two. A default gateway is required to route traffic, and NTP and DNS services must be enabled. The management nodes must be assigned host names and IP addresses in the administration network – one each individually and one shared between all three.

A separate administration network can only be configured using static routing with vPC. The use of a VLAN is supported, but when combined with static routing the VLAN ID must be different from the one configured for the data network.

Reserved Network Resources

The network infrastructure and system components of Private Cloud Appliance need a large number of IP addresses and several VLANs for internal operation. It is critical to avoid conflicts with the addresses in use in the customer data center as well as the CIDR ranges configured in the virtual cloud networks (VCNs).

These IP address ranges are reserved for internal use by Private Cloud Appliance:

Reserved IP Addresses Description

CIDR blocks in Shared Address Space

The Shared Address Space, with IP range 100.64.0.0/10, was implemented to connect customer-premises equipment to the core routers of Internet service providers.

To allocate IP addresses to the management interfaces and ILOMs (Oracle Integrated Lights Out Manager) of hardware components, two CIDR blocks are reserved for internal use: 100.96.0.0/23 and 100.96.2.0/23.

CIDR blocks in Class E address range

Under the classful network addressing architecture, Class E is the part of the 32-bit IPv4 address space ranging from 240.0.0.0 to 255.255.255.255. It was reserved for future use and cannot be used on the public Internet.

To accommodate the addressing requirements of all infrastructure networking over the physical 100Gbit connections, the entire 253.255.0.0/16 subnet is reserved. It is further subdivided into multiple CIDR blocks in order to group IP addresses by network function or type.

The various CIDR blocks within the 253.255.0.0/16 range are used to allocate IP addresses for the Kubernetes containers running the microservices, the virtual switches, routers and gateways enabling the VCN data network, the hypervisors, the appliance chassis components, and so on.

Link Local CIDR block

A link-local address belongs to the 169.254.0.0/16 IP range, and is valid only for connectivity within a host's network segment, because the address is not guaranteed to be unique outside that network segment. Packets with link-local source or destination addresses are not forwarded by routers.

The link-local CIDR block 169.254.239.0/24, as well as the IP address 169.254.169.254, are reserved for functions such as DNS requests, compute instance metadata transfer, and cloud service endpoints.

All VCN traffic – from one VCN to another, as well as between a VCN and external resources – flows across the 100Gbit connections and is carried by VLAN 3900. Traffic related to server management is carried by VLAN 3901. All VLANs with higher IDs are also reserved for internal use, and VLAN 1 is the default for untagged traffic. The remaining VLAN range of 2-3899 is available for customer use.
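
When planning data center subnets and VLANs, it can help to check candidate values against these reserved resources before deployment. The sketch below is illustrative only, not appliance tooling; the candidate subnet and VLAN values are hypothetical.

```python
# Illustrative sketch only: flag conflicts between proposed data center values
# and the reserved Private Cloud Appliance resources listed above.
import ipaddress

RESERVED_NETWORKS = [
    ipaddress.ip_network("100.96.0.0/23"),    # ILOM addresses
    ipaddress.ip_network("100.96.2.0/23"),    # component management interfaces
    ipaddress.ip_network("253.255.0.0/16"),   # internal infrastructure networking
    ipaddress.ip_network("169.254.0.0/16"),   # link-local, incl. 169.254.239.0/24 and 169.254.169.254
]
CUSTOMER_VLAN_RANGE = range(2, 3900)          # VLAN 1 is the untagged default; 3900 and above are reserved

def check(cidr: str, vlan_id: int) -> list:
    """Return a list of conflicts for a proposed subnet/VLAN combination."""
    problems = []
    candidate = ipaddress.ip_network(cidr)
    for reserved in RESERVED_NETWORKS:
        if candidate.overlaps(reserved):
            problems.append(f"{cidr} overlaps reserved block {reserved}")
    if vlan_id not in CUSTOMER_VLAN_RANGE:
        problems.append(f"VLAN {vlan_id} is outside the customer range 2-3899")
    return problems

print(check("100.96.2.0/24", 3900))   # both values conflict
print(check("10.100.0.0/16", 200))    # no conflicts -> []
```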

Oracle Exadata Integration

Optionally, Private Cloud Appliance can be integrated with Oracle Exadata for a high-performance combination of compute capacity and database optimization. In this configuration, database nodes are directly connected to reserved ports on the spine switches of the Private Cloud Appliance. Four 100Gbit ports per spine switch are reserved and split into 4x25Gbit breakout ports, providing a maximum of 32 total cable connections. Each database node is cabled directly to both spine switches, meaning up to 16 database nodes can be connected to the appliance. Database nodes from different Oracle Exadata racks can be connected.
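
The maximum connection counts follow from simple arithmetic, shown below for illustration only.

```python
# Illustrative arithmetic: reserved breakout ports -> maximum directly
# connected Exadata database nodes.
SPINE_SWITCHES = 2
RESERVED_100G_PORTS_PER_SPINE = 4
BREAKOUTS_PER_PORT = 4          # each 100Gbit port splits into 4x 25Gbit
CONNECTIONS_PER_DB_NODE = 2     # each database node cables to both spine switches

total_connections = SPINE_SWITCHES * RESERVED_100G_PORTS_PER_SPINE * BREAKOUTS_PER_PORT
max_db_nodes = total_connections // CONNECTIONS_PER_DB_NODE
print(total_connections, max_db_nodes)   # 32 cable connections, up to 16 database nodes
```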

Once the cable connections are in place, the administrator configures an Exadata network, which enables traffic between the connected database nodes and a set of compute instances. These prerequisites apply:

  • The Exadata network must not overlap with any subnets in the on-premises network.

  • The VCNs containing compute instances that connect to the database nodes must have a dynamic routing gateway (DRG) configured.

  • The relevant subnet route tables must contain rules to allow traffic to and from the Exadata network.

The Exadata network configuration determines which Exadata clusters are exposed and which subnets have access to those clusters. Access can be enabled or disabled per Exadata cluster and per compute subnet. In addition, the Exadata network can be exposed through the appliance's external network, allowing other resources within the on-premises network to connect to the database nodes through the spine switches of the appliance. The Exadata network configuration is created and managed through the Service CLI.

Storage

Private Cloud Appliance includes an Oracle ZFS Storage Appliance ZS9-2 as the base storage option, with the capability to expand storage within the rack as needed.

Oracle ZFS Storage Appliance ZS9-2

The Oracle ZFS Storage Appliance ZS9-2, which consists of two controller servers installed at the bottom of the appliance rack and a disk shelf about halfway up, fulfills the role of 'system disk' for the entire appliance. It is crucial in providing storage space for the Private Cloud Appliance software.

The default disk shelf in the appliance provides more than 100 TB of customer-usable storage for public object storage, customer compute images, and customer block storage.

The hardware configuration of the Oracle ZFS Storage Appliance ZS9-2 is as follows:

  • Two clustered storage controller heads

  • One fully populated disk chassis with twenty 18TB hard disks

  • Four cache disks installed in the disk shelf: 2x 200GB SSD and 2x 7.68TB SSD

  • Mirrored configuration, for optimum data protection
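
As a rough arithmetic check only (assuming all twenty 18 TB data disks contribute to the mirrored pool; spares, ZFS overhead, and the appliance's own system storage, which account for the difference down to the 100+ TB customer-usable figure, are not quantified in this section):

```python
# Rough capacity arithmetic only; the exact customer-usable figure depends on
# spares, ZFS overhead, and space reserved for the appliance's system storage.
DISKS = 20
DISK_TB = 18

raw_tb = DISKS * DISK_TB      # 360 TB raw capacity
mirrored_tb = raw_tb / 2      # about 180 TB after mirroring
print(raw_tb, mirrored_tb)    # customer-usable capacity (100+ TB) is lower still
```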

The storage appliance is connected to the management subnet and the storage subnet. Both heads form a cluster in an active-active configuration to guarantee continuation of service if one storage head fails. The storage heads provide two IP addresses in the storage subnet: one for the default capacity storage pool, and one for access to the optional performance (SSD) storage pool. Four management IP addresses are provided: one local to each controller head, and one per storage pool that follows the pool resources between controllers during takeover or failback events, for convenient maintenance access. The primary mirrored capacity storage pool contains two projects, named PCA and private_ostore_project.

Additional Storage

You can optionally increase the storage in your system by adding disk shelves to a system flex bay. The available storage options for expansion are the Oracle Storage Drive Enclosure DE3-24C and the Oracle Storage Drive Enclosure DE3-24P.

The supported hardware configuration of the Oracle Storage Drive Enclosure DE3-24C is as follows:

  • Fully populated disk chassis with twenty 18TB hard disks

  • Four cache disks installed in the disk shelf: 2x 200GB SSD and 2x 7.68TB SSD

The supported hardware configuration of the Oracle Storage Drive Enclosure DE3-24P is as follows:

  • Twenty 7.68TB SSD

  • Two cache disks installed in the disk shelf: 2x 200GB SSD

  • Two drive bay fillers

Once an enclosure is installed and cabled to the storage controller, its drives are added to a storage pool automatically: devices in an additional Oracle Storage Drive Enclosure DE3-24C are added to the primary capacity storage pool, while devices in an Oracle Storage Drive Enclosure DE3-24P are added to the optional high performance storage pool.