Network Infrastructure

For network connectivity, Private Cloud Appliance relies on a physical layer that provides the necessary high availability, bandwidth, and speed. On top of this, a distributed network fabric composed of software-defined switches, routers, gateways and tunnels enables secure and segregated data traffic – both internally between cloud resources, and externally to and from resources outside the appliance.

Device Management Network

The device management network provides internal access to the management interfaces of all appliance components. These components have Ethernet connections to the 1Gbit management switch, and each receives an IP address from both of these address ranges:

  • 100.96.0.0/23 – IP range for the Oracle Integrated Lights Out Manager (ILOM) service processors of all hardware components

  • 100.96.2.0/23 – IP range for the management interfaces of all hardware components

To access the device management network, you connect a workstation to port 2 of the 1Gbit management switch and statically assign the IP address 100.96.3.254 to its connected interface. Alternatively, you can set up a permanent connection to the Private Cloud Appliance device management network from a data center administration machine, which is also referred to as a bastion host. From the bastion host, or from the (temporarily) connected workstation, you can reach the ILOMs and management interfaces of all connected rack components. For information about configuring the bastion host, see the "Optional Bastion Host Uplink" section of the Oracle Private Cloud Appliance Installation Guide.
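The two ranges above are adjacent /23 blocks, so it is easy to confuse them. As a minimal illustration (assuming a machine with Python available; the component addresses shown are made up), the ipaddress module can be used to check which internal range a given address belongs to:

  import ipaddress

  ILOM_NET = ipaddress.ip_network("100.96.0.0/23")   # ILOM service processors
  MGMT_NET = ipaddress.ip_network("100.96.2.0/23")   # component management interfaces

  def classify(address):
      ip = ipaddress.ip_address(address)
      if ip in ILOM_NET:
          return "ILOM"
      if ip in MGMT_NET:
          return "management interface"
      return "outside the device management ranges"

  print(classify("100.96.0.10"))    # ILOM (example address)
  print(classify("100.96.2.10"))    # management interface (example address)
  print(classify("100.96.3.254"))   # management interface range

Note that the workstation address 100.96.3.254 falls inside the 100.96.2.0/23 management range.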

Note that port 1 of the 1Gbit management switch is reserved for use by support personnel only.

Data Network

The appliance data connectivity is built on redundant 100Gbit switches in a two-layer design similar to a leaf-spine topology. The leaf switches interconnect the rack hardware components, while the spine switches form the backbone of the network and provide a path for external traffic. Each leaf switch is connected to all the spine switches, which are also interconnected. The main benefits of this topology are extensibility and path optimization. A Private Cloud Appliance rack contains two leaf and two spine switches.

The data switches offer a maximum throughput of 100Gbit per port. The spine switches use 5 interlinks (500Gbit); the leaf switches use 2 interlinks (200Gbit) and 2x2 crosslinks to each spine. Each server node is connected to both leaf switches in the rack, through the bond0 interface that consists of two 100Gbit Ethernet ports in link aggregation mode. The two storage controllers are connected to the spine switches using 4x100Gbit connections.
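As a quick back-of-the-envelope check (not a configuration artifact), the aggregate bandwidth of the link groups called out above follows directly from the 100Gbit port speed:

  PORT_GBPS = 100

  link_groups = {
      "spine interlinks": 5,                     # between the two spine switches
      "leaf interlinks": 2,                      # between the two leaf switches
      "server bond0 (per compute node)": 2,      # one port to each leaf switch
      "storage controller (per controller)": 4,  # connections to the spine switches
  }

  for name, ports in link_groups.items():
      print(f"{name}: {ports} x {PORT_GBPS} Gbps = {ports * PORT_GBPS} Gbps")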

For external connectivity, 5 ports are reserved on each spine switch. Four ports are available to establish the uplinks between the appliance and the data center network; one port is reserved to optionally segregate the administration network from the data traffic.

The connections between the Private Cloud Appliance and the customer data center are called uplinks. They are physical cable connections between the two spine switches in the appliance rack and one or – preferably – two next-level network devices in the data center infrastructure. Besides the physical aspect, there is also a logical aspect to the uplinks: how traffic is routed between the appliance and the external network it is connected to.

On each spine switch, ports 1-4 can be used for uplinks to the data center network. For speeds of 10Gbps or 25Gbps, the spine switch port must be split using an MPO-to-4xLC breakout cable. For speeds of 40Gbps or 100Gbps, each switch port uses a single MPO-to-MPO cable connection. The correct connection speed must be specified during initial setup so that the switch ports are configured with the appropriate breakout mode and transfer speed.

The uplinks are configured during system initialization, based on information you provide as part of the installation checklist. Unused spine switch uplink ports, including unused breakout ports, are disabled for security reasons. The table shows the supported uplink configurations by port count and speed, and the resulting total bandwidth.

Uplink Speed    Number of Uplinks per Spine Switch    Total Bandwidth
10 Gbps         1, 2, 4, 8, or 16                     20, 40, 80, 160, or 320 Gbps
25 Gbps         1, 2, 4, 8, or 16                     50, 100, 200, 400, or 800 Gbps
40 Gbps         1, 2, or 4                            80, 160, or 320 Gbps
100 Gbps        1, 2, or 4                            200, 400, or 800 Gbps
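The totals in the table correspond to both spine switches carrying the configured number of uplinks, so the total bandwidth is 2 x uplinks per spine x port speed. A small sketch (for illustration only, not part of the appliance software) reproduces the table and flags unsupported combinations:

  SUPPORTED_UPLINKS = {
      10:  [1, 2, 4, 8, 16],   # requires 4-way breakout cables
      25:  [1, 2, 4, 8, 16],   # requires 4-way breakout cables
      40:  [1, 2, 4],
      100: [1, 2, 4],
  }

  def total_bandwidth_gbps(speed, uplinks_per_spine, spine_switches=2):
      if uplinks_per_spine not in SUPPORTED_UPLINKS.get(speed, []):
          raise ValueError("unsupported combination of speed and uplink count")
      return spine_switches * uplinks_per_spine * speed

  print(total_bandwidth_gbps(25, 16))   # 800
  print(total_bandwidth_gbps(40, 4))    # 320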

Regardless of the number of ports and port speeds configured, you also select a topology for the uplinks between the spine switches and the data center network. This information is critical for the network administrator to configure link aggregation (port channels) on the data center switches. The table shows the available options.

Topology Description

Triangle

In a triangle topology, all cables from both spine switches are connected to a single data center switch.

Square

In a square topology, two data center switches are used. All outbound cables from a given spine switch are connected to the same data center switch.

Mesh

In a mesh topology, two data center switches are used as well. The difference from the square topology is that the uplinks are created in a cross pattern: the outbound cables from each spine switch are connected in pairs, one cable to each data center switch.

The physical topology for the uplinks from the appliance to the data center network depends on bandwidth requirements and on the available data center switches and ports. Connecting to a single data center switch implies a triangle topology. To increase redundancy, you distribute the uplinks across a pair of data center switches, selecting either a square or mesh topology. Each topology allows you to start with a minimum bandwidth, which you can scale up as your needs grow. The maximum bandwidth is 800 Gbps, assuming the data center switches, transceivers, and cables allow it.
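The difference between the three topologies is easiest to see as a list of cable runs. The sketch below (illustrative only; switch names and port numbering are made up) generates those runs for a given uplink count per spine switch:

  def uplink_cabling(topology, uplinks_per_spine):
      spines = ["spine1", "spine2"]
      dc_switches = ["dcsw1"] if topology == "triangle" else ["dcsw1", "dcsw2"]
      cables = []
      for s_index, spine in enumerate(spines):
          for u in range(uplinks_per_spine):
              if topology == "triangle":    # everything lands on one data center switch
                  target = dc_switches[0]
              elif topology == "square":    # all cables of a spine go to the same switch
                  target = dc_switches[s_index]
              elif topology == "mesh":      # each pair of cables is split across both switches
                  target = dc_switches[u % 2]
              else:
                  raise ValueError("unknown topology")
              cables.append((f"{spine} port {u + 1}", target))
      return cables

  for run in uplink_cabling("mesh", 2):
      print(run)   # each spine switch gets one cable to each data center switch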

The diagrams below represent a subset of the supported topologies, and can be used as a reference to integrate the appliance into the data center network. Use the diagrams and the notes to determine the appropriate cabling and switch configuration for your installation.


Figure showing six examples of supported uplink topologies. The examples are explained in the diagram notes below.

The logical connection between the appliance and the data center is implemented entirely in layer 3. In the OSI model (Open Systems Interconnection model), layer 3 is known as the network layer, which uses the source and destination IP address fields in its header to route traffic between connected devices.

Private Cloud Appliance supports two logical connection options: you must choose between static routing and dynamic routing. Both routing options are supported by all three physical topologies.

Connection Type Description

Static Routing

When static routing is selected, all egress traffic goes through a single default gateway IP address configured on data center network devices. This gateway IP address must be in the same subnet as the appliance uplink IP addresses, so it is reachable from the spine switches. The data center network devices can use SVIs (Switch Virtual Interfaces) with VLAN IDs in the range of 2-3899.

All gateways configured within a virtual cloud network (VCN) automatically have a route rule to direct all traffic intended for external destinations to the IP address of the default gateway.

Dynamic Routing

When dynamic routing is selected, BGP (Border Gateway Protocol) is used to establish a TCP connection between two Autonomous Systems: the appliance network and the data center network. This configuration requires a registered or private ASN (Autonomous System Number) on each side of the connection. The Private Cloud Appliance BGP configuration uses ASN 136025 by default; this can be changed during initial configuration.

For BGP routing, two routing devices in the data center must be connected to the two spine switches in the appliance rack. Corresponding interfaces (port channels) between the spine switches and the data center network devices must be in the same subnet. It is considered good practice to use a dedicated /30 subnet for each point-to-point circuit, which is also known as a route hand-off network. This setup provides redundancy and multipathing.

Dynamic routing is also supported in a triangle topology, where both spine switches are physically connected to the same data center network device. In this configuration, two BGP sessions are still established: one from each spine switch. However, this approach reduces the level of redundancy.
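As an example of the route hand-off networks mentioned above, a dedicated /30 subnet per point-to-point circuit can be carved out of a small block provided by the data center team. The sketch below is illustrative only; the parent block and device names are made up:

  import ipaddress

  handoff_block = ipaddress.ip_network("192.0.2.0/29")           # example block from the data center
  circuits = ["spine1 <-> dc-router1", "spine2 <-> dc-router2"]  # one BGP session per circuit

  for name, subnet in zip(circuits, handoff_block.subnets(new_prefix=30)):
      spine_ip, router_ip = list(subnet.hosts())                  # two usable addresses per /30
      print(f"{name}: {subnet}  spine side {spine_ip}  data center side {router_ip}")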

Supported Routing Designs

The table below shows which routing designs are supported depending on the physical topology in your data center and the logical connection you choose to implement.

Note that link aggregation across multiple devices (vPC or MLAG) is only supported with static routing. When dynamic routing is selected, link aggregation is restricted to ports of the same switch.

When the uplinks are cabled in a mesh topology, a minimum of 2 physical connections per spine switch applies. To establish BGP peering, 2 subnets are required. If the uplink count changes, the port channels are reconfigured but the dedicated subnets remain the same.

The uplinks to the data center run a variety of protocols to provide redundancy and reduce link failure detection and recovery times on these links. These protocols work with the triangle, square, or mesh topologies.

The suite of uplink protocols includes:

  • Bidirectional Forwarding Detection (BFD)
  • Virtual Router Redundancy Protocol (VRRP)
  • Hot Standby Router Protocol (HSRP)
  • Equal Cost Multi-Path (ECMP)

Each is briefly described in the following sections of this topic.

BFD

In most router networks, connection failures are detected by the loss of the “hello” packets sent by routing protocols. However, detection by this method often takes more than one second, during which high-speed links keep sending large numbers of packets toward a destination they cannot reach, which burdens the link buffers. Increasing the “hello” packet rate burdens the router CPU instead.

Bidirectional Forwarding Detection (BFD) is a built-in mechanism that alerts the routers at each end of a failed link that there is a problem much more quickly than routing protocol timers can, reducing the load on buffers and CPUs. BFD works even in situations where there are switches or hubs between the routers.

BFD requires no configuration and has no user-settable parameters.

VRRPv3

The Virtual Router Redundancy Protocol version 3 (VRRPv3) is a networking protocol that uses the concept of a virtual router to group physical routers together and make them appear as one to participating hosts. This increases the availability and reliability of routing paths through automatic default gateway selections on an IP subnetwork.

With VRRPv3, the primary/active and secondary/standby routers act as one virtual router. This virtual router becomes the default gateway for any host on the subnet participating in VRRPv3. One physical router in the group becomes the primary/active router for packet forwarding. However, if this router fails, another physical router in the group takes over the forwarding role, adding redundancy to the router configuration. The VRRPv3 “network” is limited to the local subnet and does not advertise routes beyond the local subnet.
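A highly simplified model of this behavior (an assumption for illustration, not the appliance or switch implementation) is a priority-based election among the healthy members of the group:

  from dataclasses import dataclass

  @dataclass
  class Router:
      name: str
      priority: int        # higher priority wins the election
      healthy: bool = True

  def elect_active(group):
      candidates = [r for r in group if r.healthy]
      # Real VRRPv3 breaks ties on IP address; the name is used here for simplicity.
      return max(candidates, key=lambda r: (r.priority, r.name)) if candidates else None

  group = [Router("router-a", priority=200), Router("router-b", priority=100)]
  print(elect_active(group).name)   # router-a forwards for the virtual gateway IP
  group[0].healthy = False          # the active router fails
  print(elect_active(group).name)   # router-b takes over the same virtual IP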

HSRP

Cisco routers often use a redundancy protocol called the Hot Standby Router Protocol (HSRP) to improve router availability. Similar to VRRP, HSRP groups physical routers into a single virtual router. If the default physical router fails, another router in the group uses HSRP to take over default packet forwarding without burdening the host devices.

ECMP

Equal Cost Multi-Path (ECMP) is a way to make better use of network bandwidth, especially in more complex router networks with many redundant links.

Normally, router networks with multiple paths to another destination network choose one active route to a gateway router as the “best” path and use the other paths as standbys in case of failure. The decision about which path to a network gateway router to use is usually determined by its “cost” from the routing protocol perspective. In cases where the costs over several links to reach the network gateways are equal, the router simply chooses one based on some criterion. This makes routing decisions easy, but it wastes network bandwidth because the network links on the paths that were not chosen sit idle.

ECMP is a way to send traffic on multiple path links with equal cost, making more efficient use of network bandwidth.
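A common way to keep each flow on a single path while spreading flows across all equal-cost links is to hash the flow identifiers. The following sketch is an assumption for illustration, not the hashing used by the data switches:

  import hashlib

  def pick_path(flow, equal_cost_paths):
      # Hash the 5-tuple so that packets of the same flow always use the same path.
      key = "|".join(str(field) for field in flow).encode()
      index = int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % len(equal_cost_paths)
      return equal_cost_paths[index]

  uplinks = ["uplink-1", "uplink-2", "uplink-3", "uplink-4"]
  flow = ("10.0.1.5", "203.0.113.10", 6, 44321, 443)   # src, dst, protocol, sport, dport
  print(pick_path(flow, uplinks))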

Administration Network

In an environment with elevated security requirements, you can optionally segregate administrative appliance access from the data traffic. The administration network physically separates configuration and management traffic from the operational activity on the data network by providing dedicated secured network paths for appliance administration operations. In this configuration, the entire Service Enclave can be accessed only over the administration network. This also includes the monitoring, metrics collection and alerting services, the API service, and all component management interfaces.

Setting up the administration network requires additional Ethernet connections from the next-level data center network devices to port 5 on each of the spine switches in the appliance. Inside the administration network, the spine switches must each have one IP address and a virtual IP shared between the two. A default gateway is required to route traffic, and NTP and DNS services must be enabled. The management nodes must be assigned host names and IP addresses in the administration network – one each individually and one shared between all three.
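The required parameters can be summarized as a checklist. The structure below is only an illustration (the field names and all addresses are placeholders, not the actual installation checklist fields):

  admin_network = {
      "spine_switch_ips": ["198.51.100.11", "198.51.100.12"],   # one address per spine switch
      "spine_virtual_ip": "198.51.100.10",                      # shared between the two spine switches
      "default_gateway": "198.51.100.1",
      "ntp_servers": ["198.51.100.53"],
      "dns_servers": ["198.51.100.54"],
      "management_nodes": {                                     # one address per node, plus one shared
          "mgmt-node-1": "198.51.100.21",
          "mgmt-node-2": "198.51.100.22",
          "mgmt-node-3": "198.51.100.23",
          "shared": "198.51.100.20",
      },
  }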

A separate administration network can be used with both static and dynamic routing. The use of a VLAN is supported, but when combined with static routing the VLAN ID must be different from the one configured for the data network.

Reserved Network Resources

The network infrastructure and system components of Private Cloud Appliance need a large number of IP addresses and several VLANs for internal operation. It is critical to avoid conflicts with the addresses in use in the customer data center as well as the CIDR ranges configured in the virtual cloud networks (VCNs).

These IP address ranges are reserved for internal use by Private Cloud Appliance:

Reserved IP Addresses Description

CIDR blocks in Shared Address Space

The Shared Address Space, with IP range 100.64.0.0/10, was implemented to connect customer-premises equipment to the core routers of Internet service providers.

To allocate IP addresses to the management interfaces and ILOMs (Oracle Integrated Lights Out Manager) of hardware components, two CIDR blocks are reserved for internal use: 100.96.0.0/23 and 100.96.2.0/23.

CIDR blocks in Class E address range

Under the classful network addressing architecture, Class E is the part of the 32-bit IPv4 address space ranging from 240.0.0.0 to 255.255.255.255. It was reserved for future use and cannot be used on the public Internet.

To accommodate the addressing requirements of all infrastructure networking over the physical 100Gbit connections, the entire 253.255.0.0/16 subnet is reserved. It is further subdivided into multiple CIDR blocks in order to group IP addresses by network function or type.

The various CIDR blocks within the 253.255.0.0/16 range are used to allocate IP addresses for the Kubernetes containers running the microservices, the virtual switches, routers and gateways enabling the VCN data network, the hypervisors, the appliance chassis components, and so on.

Link Local CIDR block

A link-local address belongs to the 169.254.0.0/16 IP range, and is valid only for connectivity within a host's network segment, because the address is not guaranteed to be unique outside that network segment. Packets with link-local source or destination addresses are not forwarded by routers.

The link-local CIDR block 169.254.239.0/24, as well as the IP address 169.254.169.254, are reserved for functions such as DNS requests, compute instance metadata transfer, and cloud service endpoints.

All VCN traffic – from one VCN to another, as well as between a VCN and external resources – flows across the 100Gbit connections and is carried by VLAN 3900. Traffic related to server management is carried by VLAN 3901. All VLANs with higher IDs are also reserved for internal use, and VLAN 1 is the default for untagged traffic. The remaining VLAN range of 2-3899 is available for customer use.
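Because of these reservations, it is worth checking candidate VCN CIDR ranges and VLAN IDs before configuration. The sketch below (illustrative only, not an appliance tool) performs such a check with the ipaddress module:

  import ipaddress

  RESERVED_CIDRS = [ipaddress.ip_network(c) for c in (
      "100.96.0.0/23", "100.96.2.0/23",         # device management ranges
      "253.255.0.0/16",                         # internal infrastructure networking
      "169.254.239.0/24", "169.254.169.254/32"  # link-local reservations
  )]
  CUSTOMER_VLANS = range(2, 3900)               # VLAN 1 and VLANs 3900 and above are reserved

  def check(vcn_cidr, vlan_id):
      network = ipaddress.ip_network(vcn_cidr)
      cidr_ok = not any(network.overlaps(reserved) for reserved in RESERVED_CIDRS)
      vlan_ok = vlan_id in CUSTOMER_VLANS
      return cidr_ok, vlan_ok

  print(check("10.25.0.0/16", 200))       # (True, True)
  print(check("253.255.10.0/24", 3901))   # (False, False)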

Oracle Exadata Integration

Optionally, Private Cloud Appliance can be integrated with Oracle Exadata for a high-performance combination of compute capacity and database optimization. In this configuration, database nodes are directly connected to reserved ports on the spine switches of the Private Cloud Appliance. Four 100Gbit ports per spine switch are reserved and split into 4x25Gbit breakout ports, providing a maximum of 32 total cable connections. Each database node is cabled directly to both spine switches, meaning up to 16 database nodes can be connected to the appliance. Database nodes from different Oracle Exadata racks can be connected.
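The connection limits follow from simple arithmetic, shown here only as a sanity check:

  SPINE_SWITCHES = 2
  RESERVED_PORTS_PER_SPINE = 4
  BREAKOUTS_PER_PORT = 4        # each 100Gbit port is split into 4x25Gbit breakout ports
  CABLES_PER_DB_NODE = 2        # each database node connects to both spine switches

  total_connections = SPINE_SWITCHES * RESERVED_PORTS_PER_SPINE * BREAKOUTS_PER_PORT
  max_db_nodes = total_connections // CABLES_PER_DB_NODE
  print(total_connections, max_db_nodes)   # 32 16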

Once the cable connections are in place, the administrator configures an Exadata network, which enables traffic between the connected database nodes and a set of compute instances. These prerequisites apply:

  • The Exadata network must not overlap with any subnets in the on-premises network.

  • The VCNs containing compute instances that connect to the database nodes must have a dynamic routing gateway (DRG) configured.

  • The relevant subnet route tables must contain rules to allow traffic to and from the Exadata network.

The Exadata network configuration determines which Exadata clusters are exposed and which subnets have access to those clusters. Access can be enabled or disabled per Exadata cluster and per compute subnet. In addition, the Exadata network can be exposed through the appliance's external network, allowing other resources within the on-premises network to connect to the database nodes through the spine switches of the appliance. The Exadata network configuration is created and managed through the Service CLI.