Deploy Serverless Kubernetes with OKE Virtual Nodes

Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE) provides different operation modes: Managed Nodes and Virtual Nodes. Oracle Cloud Infrastructure (OCI) manages the control plane, but you choose the operation mode. With Managed Nodes, you provision the nodes within your tenancy and you're responsible for maintenance operations such as upgrading, scaling, and patching the worker nodes. OCI provides automated steps for these operations, but it is up to you to initiate the operations. With Virtual Nodes, OCI deploys, monitors, and manages software abstractions of actual nodes in your OCI tenancy. Use Virtual Nodes for a serverless Kubernetes experience to run containerized applications at scale when you want to focus on your workloads, pods, and application logic without the operational overhead of managing, scaling, upgrading, and troubleshooting the node infrastructure.

Both modes of operation can support anything from basic applications to the most mission-critical ones. Virtual Nodes simplify Kubernetes operations and provide the best price performance. The tradeoff is control: with Managed Nodes, you have greater control over your node infrastructure. You can configure your Kubernetes resources to use HostPort and HostNetwork, run DaemonSets, or use other options that your applications or tools may require. In most use cases, these options aren't required.

Virtual Nodes make the most sense when you don't need fine-grained control over running and managing your containers. If your applications require configuration of your node infrastructure that is unavailable with Virtual Nodes, then use Managed Nodes.

Architecture

This architecture depicts an application server deployed in an Oracle Cloud Infrastructure Container Engine for Kubernetes cluster with Virtual Nodes. The application server lets you execute create, read, update, and delete (CRUD) operations on an Oracle MySQL Database Service database deployed in another subnet within the customer's tenancy. The application server is accessed externally through a load balancer service that maps to a Kubernetes ingress controller. The application requires a password to access the Oracle MySQL Database Service; this password is stored in an Oracle Cloud Infrastructure Vault (OCI Vault).

To integrate the OKE cluster with OCI Vault, the External Secrets Operator is deployed in the OKE cluster. This allows you to define a SecretStore resource that maps to the OCI Vault holding the password for the database. Once the SecretStore resource is mapped, the OKE cluster can access the password as a Kubernetes secret. An Oracle Cloud Infrastructure Identity and Access Management role with permission to read from the OCI Vault allows the pod to access the secret. The role is associated with the Kubernetes service account used in the pod's definition to grant it access to read from the OCI Vault.
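
The mapping described above can be sketched as two External Secrets Operator resources: a SecretStore that points at the vault and an ExternalSecret that copies one vault secret into a Kubernetes secret. The following Python sketch builds both manifests as JSON. Every name, namespace, OCID, and region in it is a hypothetical placeholder, and the field names (`vault`, `region`, `principalType`) and the `external-secrets.io/v1beta1` API version are assumptions based on the operator's Oracle Vault provider, so verify them against the operator version that you install.

```python
import json

# Hypothetical placeholders: replace with values from your own tenancy.
VAULT_OCID = "ocid1.vault.oc1..exampleuniqueID"
REGION = "us-ashburn-1"
NAMESPACE = "app"
# Assumed API version; newer operator releases may serve a different one.
ESO_API_VERSION = "external-secrets.io/v1beta1"


def secret_store(name: str) -> dict:
    """Build a SecretStore that maps to a single OCI Vault."""
    return {
        "apiVersion": ESO_API_VERSION,
        "kind": "SecretStore",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {
            "provider": {
                "oracle": {
                    "vault": VAULT_OCID,
                    "region": REGION,
                    # Assumption: authenticate through the pod's workload identity.
                    "principalType": "Workload",
                }
            }
        },
    }


def external_secret(name: str, store: str, vault_secret: str, k8s_secret: str) -> dict:
    """Build an ExternalSecret that copies one vault secret into a Kubernetes secret."""
    return {
        "apiVersion": ESO_API_VERSION,
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {
            "refreshInterval": "1h",
            "secretStoreRef": {"kind": "SecretStore", "name": store},
            "target": {"name": k8s_secret, "creationPolicy": "Owner"},
            "data": [
                # The key inside the Kubernetes secret, and the vault secret it comes from.
                {"secretKey": "password", "remoteRef": {"key": vault_secret}}
            ],
        },
    }


manifests = [
    secret_store("oci-vault"),
    external_secret("mysql-password", "oci-vault", "mysql-admin-password", "mysql-credentials"),
]

# Wrap both resources in a v1 List so they can be applied in one step.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Because kubectl accepts JSON, you can pipe the output of this script to `kubectl apply -f -`. The application pod can then read the database password from the `mysql-credentials` secret (a hypothetical name) like any other Kubernetes secret.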

The following diagram illustrates this reference architecture.

Description of the illustration k8n-virtual-nodes.png

k8n-virtual-nodes-oracle.zip

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Compartment

    Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Fault domains

    A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you distribute resources across multiple fault domains, your applications can tolerate physical server failure, system maintenance, and power failures inside a fault domain.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Load balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Network address translation (NAT) gateway

    A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.

  • Vault

    Oracle Cloud Infrastructure Vault enables you to centrally manage the encryption keys that protect your data and the secret credentials that you use to secure access to your resources in the cloud. You can use the Vault service to create and manage vaults, keys, and secrets.

  • Registry

    Oracle Cloud Infrastructure Registry is an Oracle-managed registry that enables you to simplify your development-to-production workflow. Registry makes it easy for you to store, share, and manage development artifacts, like Docker images. The highly available and scalable architecture of Oracle Cloud Infrastructure ensures that you can deploy and manage your applications reliably.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Container Engine for Kubernetes

    Oracle Cloud Infrastructure Container Engine for Kubernetes is a fully managed, scalable, and highly available service that you can use to deploy your containerized applications to the cloud. You specify the compute resources that your applications require, and Container Engine for Kubernetes provisions them on Oracle Cloud Infrastructure in an existing tenancy. Container Engine for Kubernetes uses Kubernetes to automate the deployment, scaling, and management of containerized applications across clusters of hosts.

  • Oracle MySQL Database Service

    Oracle MySQL Database Service is a fully managed Oracle Cloud Infrastructure (OCI) database service that lets developers quickly develop and deploy secure, cloud native applications. Optimized for and exclusively available in OCI, Oracle MySQL Database Service is 100% built, managed, and supported by the OCI and MySQL engineering teams.

    Oracle MySQL Database Service has an integrated, high-performance analytics engine (HeatWave) to run sophisticated real-time analytics directly against an operational MySQL database.

  • Ingress controller

    An Ingress controller (ing) is a component that runs in a Kubernetes cluster and manages the Ingress resources. It receives traffic from the external network, routes it to the correct service, and performs load balancing and SSL termination. The Ingress controller typically runs as a separate pod in the cluster and can be scaled independently from the services it manages.

  • Kubernetes secrets

    Kubernetes secrets can include sensitive configuration data such as authentication tokens, passwords, and SSH keys. Secrets enable you to control sensitive data and reduce the risk of exposing the data to unauthorized users. Oracle Container Engine for Kubernetes supports the encryption of Kubernetes secrets at rest.

  • External Secrets Operator

    Kubernetes External Secrets Operator integrates Oracle Container Engine for Kubernetes with Oracle Cloud Infrastructure Vault. The operator reads information from external APIs and automatically injects the values into a Kubernetes Secret.

  • SecretStore

    The Kubernetes cluster control plane stores sensitive configuration data (such as authentication tokens, certificates, and credentials) as Kubernetes secret objects in etcd. Etcd is an open source distributed key-value store that Kubernetes uses for cluster coordination and state management. In the Kubernetes clusters created by Container Engine for Kubernetes, etcd writes and reads data to and from block storage volumes in the Oracle Cloud Infrastructure Block Volumes service. By default, Oracle encrypts data in block volumes at rest, including etcd and Kubernetes secrets. Oracle manages this default encryption using a master encryption key, without requiring any action on your part. For additional control over the lifecycle of the master encryption key and how it is used, you can choose to manage the master encryption key yourself, rather than have Oracle manage it for you.

    When you create a new cluster, you can specify that Kubernetes secrets in etcd are to be encrypted using the Oracle Key Management Cloud Service.

  • Pod

    A pod is a group of one or more containers and their shared storage, and any specific options on how these should be run together. Typically, containers in a pod share the same network and memory space and can access shared volumes for storage. These shared resources allow the containers in a pod to communicate internally in a seamless way as if they were installed on a single logical host.

  • Virtual Node

    A Virtual Node is a software abstraction of an actual node. Virtual Nodes are deployed in OCI's tenancy and are entirely monitored and managed by OCI.

Recommendations

Use the following recommendations as a starting point. Your requirements might differ from the architecture described here.
  • VCN

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

    Use regional subnets.

  • Network security groups (NSGs)

    You can use NSGs to define a set of ingress and egress rules that apply to specific VNICs. We recommend using NSGs rather than security lists, because NSGs enable you to separate the VCN's subnet architecture from the security requirements of your application.


  • Load balancer bandwidth

    While creating the load balancer, you can either select a predefined shape that provides a fixed bandwidth, or specify a custom (flexible) shape where you set a bandwidth range and let the service scale the bandwidth automatically based on traffic patterns. With either approach, you can change the shape at any time after creating the load balancer.
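
The VCN recommendations above (private address space, no overlap with connected networks, subnets sized for growth) can be checked before you create anything. The following Python sketch uses only the standard `ipaddress` module. The CIDR blocks and subnet names are hypothetical examples, and the figure of three reserved addresses per subnet is an assumption taken from how OCI networking is generally documented, so confirm it for your environment.

```python
import ipaddress
from itertools import combinations

# Standard private IPv4 ranges (RFC 1918).
RFC1918 = [ipaddress.ip_network(c) for c in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
# Assumption: OCI reserves three addresses in every subnet.
OCI_RESERVED_PER_SUBNET = 3


def usable_addresses(subnet_cidr: str) -> int:
    """Return the number of addresses left for resources such as pods."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - OCI_RESERVED_PER_SUBNET


def check_vcn_plan(vcn_cidr: str, subnets: dict, peer_cidrs: list) -> list:
    """Return a list of problems found in the plan; an empty list means none were found."""
    vcn = ipaddress.ip_network(vcn_cidr)
    problems = []
    if not any(vcn.subnet_of(block) for block in RFC1918):
        problems.append(f"{vcn} is outside the private address space")
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    for name, net in nets.items():
        if not net.subnet_of(vcn):
            problems.append(f"subnet {name} ({net}) is not inside {vcn}")
    for (a, net_a), (b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"subnets {a} and {b} overlap")
    for peer in peer_cidrs:
        if vcn.overlaps(ipaddress.ip_network(peer)):
            problems.append(f"{vcn} overlaps connected network {peer}")
    return problems


# Hypothetical plan: one VCN, three regional subnets, one on-premises network.
plan = {"load-balancer": "10.0.0.0/24", "database": "10.0.1.0/24", "pods": "10.0.16.0/20"}
print(check_vcn_plan("10.0.0.0/16", plan, ["192.168.0.0/16"]))
print(usable_addresses(plan["pods"]))
```

With VCN-native pod networking, each pod consumes an address from the pod subnet, so the value returned by `usable_addresses` is a rough upper bound on how many pods that subnet can hold.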

Considerations

Consider the following when working with Virtual Nodes.

To start with Virtual Nodes, you must first create an Oracle Cloud Infrastructure Container Engine for Kubernetes cluster with a Virtual Node Pool or add a Virtual Node Pool to an existing Container Engine for Kubernetes cluster.

  • Select the shape and placement of your virtual nodes.
    • The shape determines the type of processors along with the amount of CPU and memory resources that can be allocated to each pod. Each pod can allocate up to the memory and CPU limits of the selected shape.

    • Virtual Nodes scale pods according to your Horizontal Pod Autoscaler (HPA) configuration. There is no need to configure a node autoscaler.

    • You can place your Virtual Nodes in the availability domains and fault domains that best suit your high availability (HA) needs. For the maximum level of redundancy, place a Virtual Node in each fault domain within the availability domain of your OKE cluster.

    • Use Kubernetes node labels to specify the placement of your pods on Virtual Nodes. To get an even distribution of pods across nodes, use the PodTopologySpread constraint.

  • Virtual Nodes use VCN-native pod networking. The number of pods available in the cluster is limited by the number of IP addresses available in the subnet.

    • Native Pod Networking provides every pod with an IP address native to the subnets of the VCN, which allows each pod to benefit from built-in OCI network security features such as VCN Flow logs, routing policies, VTAP, and security groups.
    • Consider using a subnet range that allows for the number of pods and nodes you expect to use and provides room for growth.
    • Define Network security groups (NSGs) to limit the type of traffic that can ingress and egress from the pods.
    • Use VCN Flow logs to inspect all network traffic between pods or use VTAP to capture the network traffic.
  • Pods in Virtual Nodes are given access to other OCI services with workload identity.

    There are cases where a pod may need access to other services within OCI.

    • Use Workload Identity to grant privileges to pods in Virtual Nodes. Workload Identity associates Oracle Cloud Infrastructure Identity and Access Management roles with Kubernetes service accounts that are assigned to pods.
  • Virtual Nodes provide hypervisor-level isolation for your pod.

    Security vulnerabilities could potentially allow malicious processes inside a container to "escape" and affect the node's kernel, potentially bringing down the node. The malicious code can also access data in memory or storage and exfiltrate the data. The exfiltrated data can potentially belong to another tenant in a multi-tenant environment.

    • Virtual Nodes provide hypervisor-level isolation for your pods. A pod does not share compute, memory, or network resources with other pods in the cluster, which mitigates the risk that malicious code brings down a node or exfiltrates data belonging to another tenant.
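
Several of the considerations above come together in the workload definition: the PodTopologySpread constraint, the Horizontal Pod Autoscaler, and the service account used for Workload Identity. The following Python sketch builds a Deployment and an HPA as JSON manifests. The application name, image path, service account name, replica counts, resource requests, and the 70% CPU target are hypothetical values chosen for illustration. The sketch spreads pods by the standard `kubernetes.io/hostname` node label, which is an assumption about what "even distribution across nodes" should mean for your cluster.

```python
import json

APP = "web-app"  # hypothetical application name


def deployment(image: str, replicas: int, service_account: str) -> dict:
    """Build a Deployment whose pods spread evenly across nodes."""
    labels = {"app": APP}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": APP, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Service account that the IAM policy for Workload Identity refers to.
                    "serviceAccountName": service_account,
                    "topologySpreadConstraints": [
                        {
                            "maxSkew": 1,  # pod counts per node differ by at most one
                            "topologyKey": "kubernetes.io/hostname",
                            "whenUnsatisfiable": "ScheduleAnyway",
                            "labelSelector": {"matchLabels": labels},
                        }
                    ],
                    "containers": [
                        {
                            "name": APP,
                            "image": image,
                            # CPU requests are needed for utilization-based autoscaling.
                            "resources": {"requests": {"cpu": "500m", "memory": "1Gi"}},
                        }
                    ],
                },
            },
        },
    }


def autoscaler(min_replicas: int, max_replicas: int, cpu_percent: int) -> dict:
    """Build an HPA that scales the Deployment on average CPU utilization."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": APP},
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": APP},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {"type": "Utilization", "averageUtilization": cpu_percent},
                    },
                }
            ],
        },
    }


workload = [
    deployment("<region-key>.ocir.io/<namespace>/web-app:1.0", 3, "web-app-sa"),
    autoscaler(3, 12, 70),
]
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": workload}, indent=2))
```

No node autoscaler appears in this sketch because, as noted above, Virtual Nodes only need the pod-level autoscaler. Keep the per-pod resource requests within the CPU and memory limits of the shape that you selected for the Virtual Node Pool.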

Acknowledgments

  • Author: Chiping Hwang