Plan High Availability for Network Resources
One of the first steps in working with Oracle Cloud Infrastructure is to set up a virtual cloud network (VCN) for your cloud resources. Ensuring the high availability of this network is one of the most important considerations in your architecture design.
- Determine the right size of your network's subnets.
- Plan high availability configurations for these key components: Load Balancers, IPSec VPN Connections, and FastConnect Circuits.
Determine the Right Size of Subnets
A subnet is a subdivision of a cloud network. Establishing high availability of your network requires sizing this resource correctly.
Each subnet in a VCN consists of a contiguous range of IP addresses that do not overlap with other subnets in the VCN (for example, 172.16.1.0/24). The first two IP addresses and the last one in the subnet's CIDR are reserved by the Oracle Cloud Infrastructure Networking service. You can't change the size of the subnet after it is created, so it's important to determine the size you need before creating any subnets. Consider the future growth of your workloads and leave sufficient capacity to meet high availability requirements, such as the need to set up standby Compute instances.
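As a rough sizing aid, the reservation rule above can be turned into a small calculation. The following Python sketch uses only the standard library; the function names, the `growth_factor` planning multiplier, and the /16 to /30 search range are illustrative assumptions, not Oracle recommendations.

```python
import ipaddress
import math

# Per the text above: the first two addresses and the last address
# in each subnet CIDR are reserved by the Networking service.
RESERVED_PER_SUBNET = 3


def usable_addresses(cidr: str) -> int:
    """Return how many addresses in the subnet remain for your own resources."""
    subnet = ipaddress.ip_network(cidr, strict=True)
    return max(subnet.num_addresses - RESERVED_PER_SUBNET, 0)


def smallest_prefix_for(required_hosts: int, growth_factor: float = 2.0) -> int:
    """Return the longest IPv4 prefix length that fits the hosts plus headroom.

    growth_factor is a hypothetical multiplier for future growth and
    standby instances; choose a value that matches your own forecast.
    """
    target = math.ceil(required_hosts * growth_factor)
    # Search from small subnets to large ones (assumed range: /30 down to /16).
    for prefix in range(30, 15, -1):
        if 2 ** (32 - prefix) - RESERVED_PER_SUBNET >= target:
            return prefix
    raise ValueError("requirement does not fit in a /16; split the workload")


print(usable_addresses("172.16.1.0/24"))  # 256 total minus 3 reserved
print(smallest_prefix_for(100))           # prefix length for 100 hosts with 2x headroom
```

Because a subnet cannot be resized after creation, it is usually cheaper to round up to the next larger prefix than to migrate workloads later.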
Plan High Availability for Load Balancers
Oracle Cloud Infrastructure Load Balancing provides automated traffic distribution from one entry point to multiple servers reachable from your VCN. The service offers a load balancer with your choice of a public or private IP address and provisioned bandwidth.
The Load Balancing service improves resource utilization, facilitates scaling, and helps ensure high availability. It supports routing incoming requests to various backend sets based on virtual hostname, path route rules, or a combination of both.
To accept traffic from the internet, you create a public load balancer. The service assigns it a public IP address that serves as the entry point for incoming traffic. You can associate the public IP address with a friendly DNS name through any DNS vendor.
A public load balancer is regional in scope. It is inherently highly available across availability domains. In a region that has a single availability domain, the load balancer nodes are distributed across fault domains. To achieve high availability for your systems, you can put the systems behind a public load balancer. For instance, you can put your web server VMs as backend server sets behind a public load balancer, as illustrated in the following diagram:
Note:
The architecture shows multiple availability domains (ADs). For a region that has a single AD, adjust the architecture to distribute your resources across the fault domains within the AD.
To isolate your load balancer from the internet and simplify your security posture, you create a private load balancer. The Load Balancing service assigns it a private IP address that serves as the entry point for incoming traffic.
When you create a private load balancer, the service requires only one subnet to host both the primary and standby load balancers. The load balancer can be regional or AD-specific, depending on the scope of its host subnet.
Description of the illustration pvt-lb.png
Note:
The architecture shows multiple availability domains (ADs). For a region that has a single AD, adjust the architecture to distribute your resources across the fault domains within the AD.
- Deploy two private load balancers, one in each availability domain.
- Configure two custom DNS VMs in the VCN.
- Modify the VCN Default DHCP options to use a Custom DNS Resolver and set the DNS servers to the IP addresses of the DNS VMs.
- Add a new round-robin DNS zone entry for the private load balancer FQDN with a low TTL.
- Add two A records with the IP addresses of the two private load balancers.
- Use the FQDN of the private load balancer when accessing the private load balancer.
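To make the round-robin behavior in the steps above concrete, here is a toy Python model of a DNS name backed by two A records. It is a sketch of the general technique, not of any particular DNS server; the FQDN, the IP addresses, and the 30-second TTL are hypothetical values.

```python
from collections import deque


class RoundRobinRecordSet:
    """Toy model of a DNS name whose A records are served in rotation."""

    def __init__(self, fqdn: str, addresses: list[str], ttl_seconds: int) -> None:
        if not addresses:
            raise ValueError("at least one A record is required")
        self.fqdn = fqdn
        # A low TTL limits how long clients cache an address that may have failed.
        self.ttl_seconds = ttl_seconds
        self._records = deque(addresses)

    def answer(self) -> list[str]:
        """Return the A records, then rotate so the next answer starts elsewhere."""
        response = list(self._records)
        self._records.rotate(-1)
        return response


# Hypothetical private load balancer addresses, one per availability domain.
zone = RoundRobinRecordSet(
    "lb.internal.example.com", ["10.0.1.10", "10.0.2.10"], ttl_seconds=30
)

# Clients that use the first record in each answer alternate between the two.
first_choices = [zone.answer()[0] for _ in range(4)]
print(first_choices)
```

Round-robin DNS spreads new lookups across both load balancers but does not detect failures by itself, which is why the low TTL matters: it shortens the time clients keep using a cached address.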
Understand FastConnect and VPN High Availability Design
Highly available, fault-tolerant network connections are key to well-architected systems. To achieve them, design your network for redundancy so that it meets the requirements of the Oracle Cloud Infrastructure IPSec VPN and FastConnect service level agreements.
- Plan for regular maintenance performed by Oracle, your provider, or your own organization.
- Avoid single points of failure, even if you are planning to use multiple interfaces for availability. High availability connections require redundant hardware, even when connecting from the same physical location.
- Consider a dual provider approach to ensure network diversity when selecting FastConnect providers.
- Provision sufficient network capacity to ensure that the failure of one network connection doesn’t overwhelm and degrade redundant connections.
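The capacity guideline in the last item can be checked with simple arithmetic: after the largest single link fails, the remaining links should still carry peak load. The Python sketch below illustrates that check; the function name and the example figures are assumptions chosen for illustration.

```python
def survives_single_failure(link_capacities_gbps: list[float], peak_load_gbps: float) -> bool:
    """Return True if the remaining links can carry peak load after the largest link fails."""
    if len(link_capacities_gbps) < 2:
        # A single link is a single point of failure regardless of its size.
        return False
    remaining = sum(link_capacities_gbps) - max(link_capacities_gbps)
    return remaining >= peak_load_gbps


# Two 10-Gbps links carrying 8 Gbps at peak: one link can absorb the load.
print(survives_single_failure([10, 10], peak_load_gbps=8))

# The same links carrying 12 Gbps at peak: a failure would saturate the survivor.
print(survives_single_failure([10, 10], peak_load_gbps=12))
```

Removing the largest link is the worst case for a single failure, so a design that passes this check also tolerates the loss of any smaller link.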
Plan High Availability for IPSec VPN Connections
You can choose to implement IPSec VPN connections to connect your data center to Oracle Cloud Infrastructure. An IPSec VPN connection is easy to set up and cost-effective.
To enable redundancy, each Oracle Cloud Infrastructure dynamic routing gateway (DRG) has multiple VPN endpoints, so each IPSec VPN connection consists of multiple redundant IPSec tunnels that use static routes to route traffic. To ensure high availability, you must set up the VPN connections within your internal network to use either path when needed, as illustrated in the following diagram:
Description of the illustration vpn-redundancy.png
If your data centers span multiple geographical locations, we recommend using a broad CIDR (0.0.0.0/0) as a static route in addition to the CIDR of the specific geographical location. This broad CIDR provides high availability and flexibility to your network design.
For instance, the following diagram shows two networks in separate geographical areas that each connect to Oracle Cloud Infrastructure. Each area has a single on-premises router, so two IPSec VPN connections can be created. Note that each IPSec VPN connection has two static routes: one for the CIDR of the particular geographical area, and a broad 0.0.0.0/0 static route.
Description of the illustration redundancy-multiple-onprem-network.png
In one scenario, the CPE 1 router in the preceding diagram goes down. If Subnet 1 and Subnet 2 can communicate with each other, the VCN is still able to access the systems in Subnet 1 because of the 0.0.0.0/0 static route that goes to CPE 2. The following diagram illustrates this scenario:
Description of the illustration vpn-redundancy-multiple-onprem-networks-failover.png
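The failover in this scenario follows from longest-prefix matching: the specific CIDR is preferred while its connection is up, and the broad 0.0.0.0/0 route takes over when it is not. The Python sketch below is a simplified model of that logic, not the DRG's actual implementation; the subnet CIDRs, the host address, and the CPE labels are hypothetical.

```python
import ipaddress
from typing import Optional

# (static route CIDR, next hop, connection is up) -- all values are hypothetical.
Route = tuple[str, str, bool]


def select_next_hop(destination: str, routes: list[Route]) -> Optional[str]:
    """Pick the next hop of the most specific matching route whose connection is up."""
    dest = ipaddress.ip_address(destination)
    candidates = []
    for cidr, next_hop, is_up in routes:
        network = ipaddress.ip_network(cidr)
        if is_up and dest in network:
            candidates.append((network.prefixlen, next_hop))
    if not candidates:
        return None
    # Longest prefix wins; ties keep the first route listed.
    return max(candidates, key=lambda c: c[0])[1]


def build_routes(cpe1_up: bool, cpe2_up: bool) -> list[Route]:
    """Each VPN connection carries its area's CIDR plus a broad 0.0.0.0/0 route."""
    return [
        ("10.20.0.0/16", "CPE 1", cpe1_up),  # Subnet 1 (assumed CIDR)
        ("0.0.0.0/0", "CPE 1", cpe1_up),
        ("10.40.0.0/16", "CPE 2", cpe2_up),  # Subnet 2 (assumed CIDR)
        ("0.0.0.0/0", "CPE 2", cpe2_up),
    ]


host_in_subnet_1 = "10.20.0.5"
print(select_next_hop(host_in_subnet_1, build_routes(cpe1_up=True, cpe2_up=True)))
print(select_next_hop(host_in_subnet_1, build_routes(cpe1_up=False, cpe2_up=True)))
```

With both routers up, traffic to the Subnet 1 host uses CPE 1. With CPE 1 down, only the broad route through CPE 2 matches, which works only if Subnet 1 and Subnet 2 can reach each other on premises.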
In another scenario, you add a new geographical area with Subnet 3 and connect it to Subnet 2. You would add a route rule to your VCN’s route table for Subnet 3 so that the VCN can reach systems in Subnet 3 without creating a new VPN connection because of the 0.0.0.0/0 static route that goes to CPE 2. The following diagram illustrates this scenario:
Description of the illustration vpn-redundancy-additional-onprem-network.png
Plan High Availability for FastConnect Circuits
Oracle Cloud Infrastructure FastConnect provides an easy way to create a dedicated, private connection between your data center and Oracle Cloud Infrastructure. FastConnect provides higher-bandwidth options and a more reliable and consistent networking experience compared to internet-based connections.
- Use private peering to extend your existing infrastructure into a virtual cloud network (VCN) in Oracle Cloud Infrastructure (for example, to implement a hybrid cloud, or in a lift-and-shift scenario). Communication across the connection is with IPv4 private addresses (typically RFC 1918).
- Use public peering to access public services in Oracle Cloud Infrastructure without using the internet (for example, to access the Oracle Cloud Infrastructure Console and APIs, or public load balancers in your VCN). Communication across the connection is with IPv4 public IP addresses. Without FastConnect, the traffic destined for public IP addresses would be routed over the internet. With FastConnect, that traffic goes over your private physical connection.
You can either connect directly to Oracle Cloud Infrastructure routers in provider points-of-presence (POPs) or use one of Oracle’s many partners to connect from POPs around the world to your Oracle Cloud Infrastructure Networking resources. Oracle provides features that allow you to build fault-tolerant connections, including multiple POPs per region and multiple FastConnect routers per POP.
- Multiple FastConnect locations within each metro area
- Multiple routers in each FastConnect location
- Multiple physical circuits in each FastConnect location
- Availability domain redundancy: Connect to any FastConnect location and access services located in any availability domain within a region. This configuration provides availability domain resiliency via multiple POPs per region. Peering connections terminate on routers in the POP.
- Data center location redundancy: Connect at two different FastConnect locations per region.
- Router redundancy: Connect to two different routers per FastConnect location.
- Circuit redundancy: Have multiple physical connections at any of the FastConnect locations. Each of these circuits can have multiple physical links in an aggregated interface/LAG, which adds another level of redundancy.
- Partner/provider redundancy: Connect to the FastConnect locations by using single or multiple partners.
- Colocation (port speed of 10 Gbps): By colocating with Oracle in a FastConnect location
- Oracle provider (port speeds in 1-Gbps and 10-Gbps increments): By connecting to an Oracle provider
For the Oracle provider scenario, we recommend that you set up redundant circuits at two different FastConnect locations, through the same provider or through different providers. With this configuration, you have redundancy at both the circuit level and the data center level. The following diagram illustrates a FastConnect connection with two virtual circuits and two different FastConnect locations:
Description of the illustration fastconnect-multiple-fc-locations.png
Oracle’s FastConnect partners have redundant links to the Oracle network. As a customer of the partner, you should have redundant links to the partner’s network. These connections should be on different routers, both in your network and in the partner’s network. When you provision virtual circuits, provision them across your multiple provider links.
This diagram illustrates these redundant connections:
Description of the illustration fastconnect-dual-vc.png
- Avoid Impact During Planned Maintenance
When you want to perform maintenance on one of your routers, you can configure Border Gateway Protocol (BGP) local preference on the routes learned over each virtual circuit so that the local preference is higher on the router that will stay in service. BGP local preference is used to modify outbound traffic preference in an on-premises network.
You can modify traffic from Oracle to your network by using BGP AS prepending. On the router where the maintenance will be performed, prepend your local BGP AS number. Doing so causes the Oracle Cloud network to prefer the FastConnect virtual circuit that has the shorter AS path.
After you modify the BGP local preference and AS prepending, monitor your router’s virtual circuit interface counters and verify that the in and out packet counter rates are very low. The only traffic remaining on the link should be BGP protocol traffic.
- Continuously Test Redundant Paths
During normal operation, we recommend using all available paths between your on-premises network and the Oracle Cloud. Doing so ensures that if a failure occurs, your redundant path is already known to work. By contrast, an active/backup design requires you to trust that an unexercised backup path will work during a failure. For this reason, you should consider using equal BGP local preference and BGP AS path length on all paths.
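The effect of local preference and AS prepending described above can be illustrated with a heavily simplified model of BGP path selection. The Python sketch below compares only two attributes (higher local preference first, then shorter AS path), whereas real BGP implementations evaluate more criteria; the router names, preference values, and the private AS number 65000 are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpPath:
    via: str                  # router or virtual circuit the route was learned over
    local_pref: int           # higher is preferred
    as_path: tuple[int, ...]  # shorter is preferred


def best_path(paths: list[BgpPath]) -> BgpPath:
    """Simplified selection: highest local preference, then shortest AS path."""
    return max(paths, key=lambda p: (p.local_pref, -len(p.as_path)))


# Outbound (on-premises to Oracle): raise local preference on the router
# that stays in service so traffic leaves through it.
outbound = best_path([
    BgpPath(via="router-a (stays in service)", local_pref=200, as_path=(31898,)),
    BgpPath(via="router-b (maintenance)", local_pref=100, as_path=(31898,)),
])

# Inbound (Oracle to on-premises): router-b prepends the local AS number,
# so the path through router-a is shorter and is preferred.
inbound = best_path([
    BgpPath(via="router-a (stays in service)", local_pref=100, as_path=(65000,)),
    BgpPath(via="router-b (maintenance)", local_pref=100, as_path=(65000, 65000, 65000)),
])

print(outbound.via)
print(inbound.via)
```

Both selections settle on the router that stays in service, which is the state to confirm through the interface counters before starting maintenance. The AS number shown on the outbound paths is a placeholder for the peer's AS and does not affect the result in this sketch.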
Use Both IPSec VPN and FastConnect
To have an additional level of redundancy, you can set up both IPSec VPN and FastConnect to connect your on-premises data centers to Oracle Cloud Infrastructure.
When you set up both an IPSec VPN connection and FastConnect virtual circuits to the same DRG, remember that the IPSec VPN uses static routes but FastConnect uses BGP. Oracle Cloud Infrastructure advertises a route for each of your VCN’s subnets over the FastConnect virtual circuit BGP session, and overrides the default route selection behavior to prefer BGP routes over static routes if a static route overlaps with a route advertised by your on-premises network. The following diagram illustrates this configuration: