Learn About Designing a Kubernetes Topology in the Cloud

Enterprises are increasingly deploying cloud-native workloads in lightweight containers. In a production environment, you need a way to manage the containers and ensure that there is no downtime. Kubernetes is an open-source orchestration engine for managing containerized applications and services.

Oracle Cloud Infrastructure Container Engine for Kubernetes is a managed, scalable, and highly available service that you can use to deploy your containerized applications to Kubernetes clusters in the cloud.

Architecture

The architecture of a Kubernetes-based topology in the cloud depends on factors such as whether the containerized workloads should be accessible from the public internet, the size and number of node pools, and the fault-tolerance requirements of your workloads.

The following diagram shows a reference architecture of a Kubernetes cluster in an Oracle Cloud Infrastructure region that contains multiple availability domains.


Architecture for a region that has multiple availability domains
The architecture contains the following components:
  • Virtual cloud network (VCN): All the resources in the topology are in a single VCN.
  • Subnets:

    The VCN in this architecture contains four subnets: two public and two private. One of the public subnets is for the bastion host; the other is for the internet-facing load balancer. Of the two private subnets, one is for an admin host that contains the tooling necessary to manage the Kubernetes cluster. The other private subnet is for the nodes of the Kubernetes cluster.

    All the subnets are regional; that is, they span all the availability domains in the region (abbreviated as AD1, AD2, and AD3 in the architecture diagram), so they are protected against the failure of any single availability domain. You can use the subnets for resources that you deploy to any availability domain in the region.

  • Network gateways
    • Service gateway (optional)

      The service gateway enables resources in the VCN to access Oracle services such as Oracle Cloud Infrastructure Object Storage, Oracle Cloud Infrastructure File Storage, and Oracle Cloud Infrastructure Database privately; that is, without exposing the traffic to the public internet. Connections over the service gateway can be initiated only from the resources within the VCN, not from the services that the resources communicate with.

    • NAT gateway (optional)

      The NAT gateway enables compute instances that are attached to private subnets in the VCN to access the public internet. Connections through the NAT gateway can be initiated only from the resources within the VCN, not from the public internet.

    • Internet gateway

      The internet gateway enables connectivity between the public internet and any resources in public subnets within the VCN.

  • Bastion host (optional)

    The bastion host is a compute instance that serves as the entry point to the topology from outside the cloud.

    The bastion host is typically provisioned in a DMZ. It enables you to protect sensitive resources by placing them in private networks that can't be accessed directly from outside the cloud. You expose a single, known entry point that you can audit regularly, so you avoid exposing the more sensitive components of the topology without compromising access to them.

    The bastion host in the sample topology is attached to a public subnet and has a public IP address. An ingress security rule is configured to allow SSH connections to the bastion host from the public internet. To provide an additional level of security, you can allow SSH access to the bastion host only from a specific block of IP addresses.

    You can access Oracle Cloud Infrastructure instances in private subnets through the bastion host. To do this, enable ssh-agent forwarding, which allows you to connect to the bastion host and then access the next server by forwarding the credentials from your computer. You can also access the instances in the private subnet by using dynamic SSH tunneling. The dynamic tunnel provides a SOCKS proxy on a local port, but the connections originate from the remote host.

  • Load balancer nodes:

    The load balancer nodes intercept and distribute traffic to the available Kubernetes nodes running your containerized applications. If the applications must be accessible from the public internet, then use public load balancers; otherwise, use private load balancers, which don't have a public IP address. The architecture shows two load balancer nodes, each in a distinct availability domain.

  • Admin host (optional):

    By using an admin host, you can avoid installing and running infrastructure-management tools such as kubectl, helm, and the Oracle Cloud Infrastructure CLI outside the cloud. In the reference architecture, the admin host is in a private subnet and can be accessed through the bastion host. To be able to run the Oracle Cloud Infrastructure CLI on the admin host, you must designate it as an instance principal.

  • Kubernetes worker nodes:

    The Kubernetes worker nodes are the compute instances on which you can deploy your containerized applications. All the worker nodes in this reference architecture are in a single node pool, and are attached to a private subnet. You can create multiple node pools, if required.

    The worker nodes in the reference architecture are not accessible directly from the public internet. Users of the containerized applications can access them through the load balancer. Administrators can access the worker nodes through the bastion host.

    The architecture shows three worker nodes, each in a distinct availability domain within the region: AD1, AD2, and AD3. The Kubernetes master nodes run in Oracle’s tenancy, and are not shown.
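
The network layout described in the component list above can be sketched as a small Python model. The CIDR blocks, the subnet names, and the helper validate_layout are hypothetical and chosen only for illustration; the source does not prescribe any address ranges. The sketch checks that every subnet fits inside the VCN and that no two subnets overlap, and it records which gateway each subnet's default route would plausibly point to (the internet gateway for public subnets, the optional NAT gateway for private ones).

```python
import ipaddress
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Subnet:
    name: str
    cidr: str
    public: bool
    purpose: str

    @property
    def network(self):
        return ipaddress.ip_network(self.cidr)

    @property
    def default_route_target(self):
        # Public subnets reach the internet through the internet gateway;
        # private subnets use the (optional) NAT gateway for outbound traffic.
        return "internet-gateway" if self.public else "nat-gateway"


# Hypothetical address plan: these CIDR blocks and names are assumptions.
VCN_CIDR = "10.0.0.0/16"
SUBNETS = [
    Subnet("bastion", "10.0.1.0/24", True, "bastion host"),
    Subnet("lb", "10.0.2.0/24", True, "internet-facing load balancer"),
    Subnet("admin", "10.0.3.0/24", False, "admin host"),
    Subnet("workers", "10.0.4.0/24", False, "Kubernetes worker nodes"),
]


def validate_layout(vcn_cidr, subnets):
    """Return a list of problems; an empty list means the plan is consistent."""
    vcn = ipaddress.ip_network(vcn_cidr)
    problems = []
    for s in subnets:
        if not s.network.subnet_of(vcn):
            problems.append(f"{s.name} ({s.cidr}) is outside the VCN {vcn_cidr}")
    for a, b in combinations(subnets, 2):
        if a.network.overlaps(b.network):
            problems.append(f"{a.name} overlaps {b.name}")
    return problems


if __name__ == "__main__":
    for s in SUBNETS:
        kind = "public" if s.public else "private"
        print(f"{s.name:8} {s.cidr:14} {kind:8} -> {s.default_route_target}")
    print(validate_layout(VCN_CIDR, SUBNETS) or "layout OK")
```

A check like this is useful mainly as a planning aid before the subnets are created; the actual route tables and security lists are still configured in the Networking service.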

If the region in which you want to deploy your containerized applications contains a single availability domain, then the worker nodes are distributed across the fault domains (FD) within the availability domain, as shown in the following architecture.


Architecture for a region that has a single availability domain
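
The placement behavior described for the two architectures can be illustrated with a round-robin sketch: worker nodes are spread across availability domains when the region has several, and across fault domains when it has only one. The function place_workers, the worker names, and the fault-domain labels are hypothetical. Container Engine for Kubernetes performs the real placement, and this sketch is not a description of its algorithm.

```python
from itertools import cycle


def place_workers(node_count, availability_domains, fault_domains):
    """Assign worker nodes to placement targets in round-robin order.

    With several availability domains, nodes are spread across them.
    With a single availability domain, nodes are spread across its
    fault domains instead.
    """
    if not availability_domains:
        raise ValueError("at least one availability domain is required")
    if len(availability_domains) > 1:
        targets = [(ad, None) for ad in availability_domains]
    else:
        if not fault_domains:
            raise ValueError("a single-AD region needs at least one fault domain")
        targets = [(availability_domains[0], fd) for fd in fault_domains]
    rotation = cycle(targets)
    # Each entry is (node name, availability domain, fault domain or None).
    return [(f"worker-{i}", *next(rotation)) for i in range(1, node_count + 1)]


if __name__ == "__main__":
    fds = ["FD1", "FD2", "FD3"]  # hypothetical labels
    for row in place_workers(3, ["AD1", "AD2", "AD3"], fds):
        print("multi-AD region: ", row)
    for row in place_workers(3, ["AD1"], fds):
        print("single-AD region:", row)
```

In both cases the goal is the same: the loss of one failure boundary (an availability domain or a fault domain) should leave the remaining worker nodes running.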

About Required Services and Permissions

This solution requires the following services and permissions:

Service                                                        Permissions Required
Oracle Cloud Infrastructure Identity and Access Management     Manage dynamic groups and policies.
Oracle Cloud Infrastructure Networking                         Manage VCNs, subnets, internet gateways, NAT gateways, service gateways, route tables, and security lists.
Oracle Cloud Infrastructure Compute                            Manage compute instances.
Oracle Cloud Infrastructure Container Engine for Kubernetes    Manage clusters and node pools.

See Policy Configuration for Cluster Creation and Deployment.
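
As a rough illustration, the permissions listed in the table could be expressed as IAM policy statements. The sketch below assembles such statements in Python. The group name k8s-admins and the compartment name k8s-dev are hypothetical, and the mapping of each table row to a policy resource type (for example, cluster-family for clusters and node pools) and to a tenancy or compartment scope is an assumption that should be checked against the policy documentation referenced above before use.

```python
# Assumed mapping from the table rows to policy resource types.
# Each entry is (resource type, whether the statement is scoped to the tenancy).
REQUIRED_PERMISSIONS = [
    ("dynamic-groups", True),           # Identity and Access Management
    ("policies", True),                 # Identity and Access Management
    ("virtual-network-family", False),  # Networking
    ("instance-family", False),         # Compute
    ("cluster-family", False),          # Container Engine for Kubernetes
]


def policy_statements(group, compartment):
    """Build one 'manage' statement per required resource type."""
    statements = []
    for resource, tenancy_wide in REQUIRED_PERMISSIONS:
        scope = "tenancy" if tenancy_wide else f"compartment {compartment}"
        statements.append(f"Allow group {group} to manage {resource} in {scope}")
    return statements


if __name__ == "__main__":
    # Hypothetical group and compartment names.
    for statement in policy_statements("k8s-admins", "k8s-dev"):
        print(statement)
```

Generating the statements from one list keeps the policy in step with the table: adding a service to the topology means adding one entry rather than editing several statements by hand.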