Autoscale Oracle Cloud Infrastructure Container Instances

Applications deployed using containers need to scale based on demand. For example, a student enrollment application needs to scale up as a semester starts, but then scale down once the semester is in progress. Oracle Cloud Infrastructure Container Instances can host your containerized applications and then autoscale them as needed.

OCI Container Instances enables you to easily run applications on a server-less compute optimized for containers. You can also easily launch one or more containers with flexibility to specify compute shape, resource allocation, networking, and other optional configurations.

Use this reference architecture to autoscale OCI Container Instances based on resource utilization.

Architecture

This architecture uses a combination of Oracle notification services, alarms, and Oracle functions to autoscale OCI Container Instances.

The following diagram illustrates this reference architecture.



autoscaling-container-instances-diagram-oracle.zip

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Compartment

    Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Fault domains

    A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you distribute resources across multiple fault domains, your applications can tolerate physical server failure, system maintenance, and power failures inside a fault domain.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Network address translation (NAT) gateway

    A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Cloud Guard

    You can use Oracle Cloud Guard to monitor and maintain the security of your resources in Oracle Cloud Infrastructure. Cloud Guard uses detector recipes that you can define to examine your resources for security weaknesses and to monitor operators and users for risky activities. When any misconfiguration or insecure activity is detected, Cloud Guard recommends corrective actions and assists with taking those actions, based on responder recipes that you can define.

  • Load balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

  • Security zone

    Security zones ensure Oracle's security best practices from the start by enforcing policies such as encrypting data and preventing public access to networks for an entire compartment. A security zone is associated with a compartment of the same name and includes security zone policies or a "recipe" that applies to the compartment and its sub-compartments. You can't add or move a standard compartment to a security zone compartment.

  • Registry

    Oracle Cloud Infrastructure Registry is an Oracle-managed registry that enables you to simplify your development-to-production workflow. Registry makes it easy for you to store, share, and manage development artifacts, like Docker images. The highly available and scalable architecture of Oracle Cloud Infrastructure ensures that you can deploy and manage your applications reliably.

  • Logging
    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:
    • Audit logs: Logs related to events emitted by the Audit service.
    • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
    • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.
  • Policy

    An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.

  • Vault

    Oracle Cloud Infrastructure Vault enables you to centrally manage the encryption keys that protect your data and the secret credentials that you use to secure access to your resources in the cloud. You can use the Vault service to create and manage vaults, keys, and secrets.

Recommendations

Use the following recommendations as a starting point to create a secure and robust environment. Your requirements might differ from the architecture described here.
  • VCN

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

  • Cloud Guard

    Clone and customize the default recipes provided by Oracle to create custom detector and responder recipes. These recipes enable you to specify what type of security violations generate a warning and what actions are allowed to be performed on them. For example, you might want to detect Object Storage buckets that have visibility set to public.

    Apply Cloud Guard at the tenancy level to cover the broadest scope and to reduce the administrative burden of maintaining multiple configurations.

    You can also use the Managed List feature to apply certain configurations to detectors.

  • Security Zones

    For resources that require maximum security, Oracle recommends that you use security zones. A security zone is a compartment associated with an Oracle-defined recipe of security policies that are based on best practices. For example, the resources in a security zone must not be accessible from the public internet and they must be encrypted using customer-managed keys. When you create and update resources in a security zone, Oracle Cloud Infrastructure validates the operations against the policies in the security-zone recipe, and denies operations that violate any of the policies.

  • Network security groups (NSGs)

    You can use NSGs to define a set of ingress and egress rules that apply to specific VNICs. We recommend using NSGs rather than security lists, because NSGs enable you to separate the VCN's subnet architecture from the security requirements of your application.

Considerations

Consider the following points when deploying this reference architecture.

  • Performance

    This solution is faster than creating a new container instance and adding it to the backend set in the function itself, since we do not have to wait for the container instance to create an API call to return the IP address of the container instance to add it to the load balancer's backend set. Oracle Cloud Infrastructure Functions has an execution limit of five minutes for each call.

  • Security

    Make sure to place all the container instances in a private subnet and only the public load balancer in the public subnet.

  • Scalability

    Make sure to set the minimum and maximum number of container instances. Test the solution thoroughly for scaling a large number of instances.

  • Cost

    Using Oracle Cloud Infrastructure Registry is free of cost. The container instance incurs a cost, but function incurs a cost only for the amount of time it is under execution.

Deploy

Follow these high-level steps to autoscale container instances.

  1. Create a stack on Oracle Cloud Infrastructure Resource Manager to create container instances, create a load balancer, and add the container instances to the load balancer backend set using the IP of the container instance.
  2. Create the stack so that it takes in the following parameters:
    • Compartment_OCID
    • Region
    • Tenancy_OCID
    • Container_Instances_Count
    • Is_Public_IP_Assigned
    • Container_instance_registry_secret
    • Container_Instance_Image_URL
    • Private_Subnet_OCID
    • Public_Subnet_OCID
  3. Create an Oracle function that reads the Container_Instance_Count from the above Resource Manager stack and updates it based on the required scaling up or scaling down call of the container instances.
  4. Once the Container_Instance_Count is updated in the Oracle function, apply the Terraform stack on Resource Manager with the updated Container_Instance_Count; then exit the function.
    The Resource Manager stack takes care of adding or terminating the OCI container instance based on the call and adds or removes it from the load balancer backend set.
  5. Add the following configurations to your Oracle function:
    • Min_Container_Instance
    • Max_Container_Instance
    • Compartment_ID
    • Resource_Manager_Stack_ID
  6. Push the newly created Oracle function image to Oracle Cloud Infrastructure Registry (OCIR).
  7. Create a Topic under notification services and give it an appropriate name.
  8. Create a subscription under the created topic and select function as the protocol.

    Note:

    Select the function you created by selecting the correct compartment and application.
  9. Create an alarm definition in the oci_computecontainerinstance namespace.
    1. Select CPU_utilization or Mem_Utilization as the Metric.
    2. Select the Container Instance name as the dimension.
    3. Set the required threshold for the Alarm to Fire.
  10. Test this autoscaling architecture to scale up and scale down OCI Container Instances.

Explore More

Learn more about OCI Container Instances.

Review these additional resources:

Acknowledgments

Authors: Christophe Pruvost, Karthic Ravindran, Chandrashekar Avadhani

Contributors: John Sulyok