Takamol: Deploy Kubernetes and Microservices for a Government HR Platform on Oracle Cloud

To help match working-age citizens to both government and private sector employers, Riyadh-based Takamol Holding runs its job training, up-skilling, and talent development services platform in a hybrid infrastructure, where its cloud environment runs multiple Kubernetes clusters in Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE).

Founded in 2013 as a division of the Saudi Arabian Ministry of Human Resources and Social Development, Takamol has spent nearly a decade modernizing human resource programs throughout the kingdom.

Because its on-premises infrastructure proved too costly to operate and maintain, Takamol decided to launch its new labor and economic development platforms on a container-based microservices architecture running on Oracle Cloud Infrastructure (OCI). Today, hundreds of thousands of people use Takamol's platforms daily to find, apply for, and prepare for gainful employment. Recently, Takamol also migrated a donations platform, which is used by both institutional and individual donors.

In the Oracle Cloud region in Jeddah, Takamol's platform supports 5,000 simultaneous users and processes up to 10,000 requests per minute. Takamol uses OCI's managed services to automate updates to its cloud native platforms quickly and with zero downtime. In its OCI tenancy, Takamol can quickly scale up during peak times and scale down at off-peak times, providing just-in-time support at the lowest possible cost.

Highlights of Takamol's cloud native deployment on OCI include:

  • Container-based architecture using Oracle Cloud Infrastructure Container Engine for Kubernetes
  • Horizontal pod autoscaling (HPA) to meet high demand during peak hours
  • Zero-trust network access
  • NGINX web application firewall
  • PostgreSQL for the database and RabbitMQ for message queuing in its stateful components

Architecture

Takamol Holding engineers connect to a zero-trust network access tool and are authenticated through their own single sign-on before gaining access to the virtual cloud network (VCN).

Platform users are routed through Oracle Cloud Infrastructure Load Balancing, which distributes user requests across three separate fault domains. The requests are then sent to the Oracle Container Engine for Kubernetes (OKE) ingress controller, where they are inspected before being routed to their final destination.
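
As a rough illustration of spreading requests across three fault domains, the following Python sketch rotates requests over one backend per fault domain. It assumes a simple round-robin policy; the backend names are hypothetical, and the source does not say which balancing policy Takamol uses.

```python
from collections import Counter
from itertools import cycle

# Hypothetical backends, one per fault domain. The backend names are
# placeholders and are not taken from Takamol's deployment.
BACKENDS = {
    "ingress-a": "FAULT-DOMAIN-1",
    "ingress-b": "FAULT-DOMAIN-2",
    "ingress-c": "FAULT-DOMAIN-3",
}


def round_robin(backends, request_count):
    """Assign each request to the next backend in turn."""
    rotation = cycle(backends)
    return [next(rotation) for _ in range(request_count)]


def requests_per_fault_domain(assignments, backends):
    """Count how many requests landed in each fault domain."""
    return Counter(backends[name] for name in assignments)


if __name__ == "__main__":
    assignments = round_robin(BACKENDS, 9)
    for domain, count in sorted(requests_per_fault_domain(assignments, BACKENDS).items()):
        print(domain, count)
```

With an even rotation, losing one fault domain removes only about a third of the serving capacity in this simplified model.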

Within the Kubernetes cluster, Takamol uses multiple open-source tools to process user requests, including NGINX as a reverse proxy server, load balancer, and API gateway. These services are scaled across the Kubernetes cluster by using horizontal pod autoscaling (HPA) to meet high demand during peak hours. Takamol also uses F5 NGINX App Protect DoS for layer 7 denial-of-service (DoS) protection, along with the F5 NGINX App Protect WAF. Most of Takamol's applications are stateless, following the twelve-factor app methodology, so they don't require in-application caches or storage. Instead, Takamol's applications use external storage services, which makes them easy to deploy, autoscale, and manage within the Kubernetes cluster.
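
To make the autoscaling behavior concrete, the sketch below reproduces the replica calculation described in the Kubernetes HPA documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up, and no change is made when the ratio is within a small tolerance of 1.0. The metric values and the minimum and maximum bounds are illustrative assumptions, not Takamol's settings.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Compute the replica count an HPA-style controller would aim for.

    desired = ceil(current_replicas * current_metric / target_metric),
    clamped to [min_replicas, max_replicas]. Scaling is skipped when the
    ratio is within `tolerance` of 1.0.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Peak hours: average CPU at 90% against a 60% target (assumed values).
    print(desired_replicas(4, 90, 60))
    # Off-peak: average CPU at 30% against the same target.
    print(desired_replicas(8, 30, 60, min_replicas=2))
```

Because the applications are stateless, adding or removing replicas in this way does not require moving any in-process data.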

Takamol also uses Argo CD, a declarative GitOps continuous delivery tool for Kubernetes. Using Argo CD, Takamol can deploy its workloads declaratively, without providing direct access to the cluster, which makes it possible to have its cluster deployed to a private subnet. Rather than developers updating applications directly, Argo CD reads from a GitLab repository to deploy new services, without giving GitLab direct access to update the cluster. For its stateful components, Takamol uses PostgreSQL for its database and RabbitMQ for message queuing.
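
The pull-based model described above can be pictured as a reconciliation loop: an agent inside the cluster compares the desired state stored in Git with the live state and works out what to change. The following Python sketch is a toy model of that idea only. It is not Argo CD's implementation, and the service names and image tags are hypothetical.

```python
def plan_sync(desired, live):
    """Return the actions needed to make the live state match the desired state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))
    for name in sorted(live):
        if name not in desired:
            # Resources that are no longer declared in Git are removed.
            actions.append(("prune", name))
    return actions


def apply_sync(desired, live):
    """Apply the plan in place; afterwards the live state equals the desired state."""
    for action, name in plan_sync(desired, live):
        if action == "prune":
            del live[name]
        else:
            live[name] = desired[name]
    return live


if __name__ == "__main__":
    # Desired state as it might be declared in a Git repository (hypothetical).
    desired = {"api": {"image": "api:1.2"}, "web": {"image": "web:2.0"}}
    # Live state as observed in the cluster (hypothetical).
    live = {"api": {"image": "api:1.1"}, "legacy": {"image": "legacy:0.9"}}
    print(plan_sync(desired, live))
    apply_sync(desired, live)
    print(plan_sync(desired, live))
```

Because the agent pulls from the repository, the only credentials involved are the agent's read access to Git, which is consistent with keeping the cluster in a private subnet.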

The load balancer, Kubernetes cluster, and open-source tools are each located in separate subnets. Although these subnets are isolated from each other, they can send and receive information through communication ports. Using Oracle VCN flow logs and a security information and event management (SIEM) solution in its security operations center (SOC), Takamol can visualize communications between the different subnets without having to install additional tools. In the coming months, Takamol plans to send its VCN flow logs through Oracle Functions to deliver network logs to the SIEM solution.
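
One way to picture what flow logs reveal about traffic between subnets is to group flow records by source subnet, destination subnet, and action. The Python sketch below does this for a simplified record format. The subnet CIDR ranges, the record fields, and the sample addresses are all assumptions made for illustration; they are not Takamol's address plan or the exact flow log schema.

```python
import ipaddress
from collections import Counter

# Hypothetical subnet layout; the source does not give the real CIDR ranges.
SUBNETS = {
    "load-balancer": ipaddress.ip_network("10.0.0.0/24"),
    "kubernetes": ipaddress.ip_network("10.0.1.0/24"),
    "tools": ipaddress.ip_network("10.0.2.0/24"),
}


def subnet_of(address):
    """Return the name of the subnet that contains the address, or 'external'."""
    ip = ipaddress.ip_address(address)
    for name, network in SUBNETS.items():
        if ip in network:
            return name
    return "external"


def summarize(records):
    """Count flows per (source subnet, destination subnet, action)."""
    counts = Counter()
    for record in records:
        key = (subnet_of(record["src"]), subnet_of(record["dst"]), record["action"])
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Simplified, made-up flow records.
    records = [
        {"src": "10.0.0.5", "dst": "10.0.1.9", "action": "ACCEPT"},
        {"src": "10.0.2.4", "dst": "10.0.1.9", "action": "REJECT"},
    ]
    for key, count in summarize(records).items():
        print(key, count)
```

A summary like this is the kind of aggregate a SIEM dashboard could chart to show which subnets talk to each other and which attempts were rejected.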

  • Roughly 90% of Takamol's architecture was built by using infrastructure as code (IaC), combining the open-source Oracle Cloud Infrastructure Terraform Provider with Takamol's own in-house modules. This approach reduces the human effort needed to deploy and manage the infrastructure, enabling faster changes with significantly reduced risk of human error.
  • Takamol's development, testing, and preproduction environments replicate all of the services in its production environment, which keeps the environments consistent. None of these environments are interconnected.
  • Database backups are taken with pgBackRest, which archives and stores backups in Oracle Cloud Infrastructure Block Volumes. This provides long-term storage for the database while supporting point-in-time recovery (PITR).
  • Oracle Cloud Infrastructure Object Storage is used by microservices, metrics, OKE logs, and GitLab runners to cache data. It also provides cost-effective, long-term storage for backups of Takamol's PostgreSQL databases.
  • Oracle Cloud Infrastructure Registry and Oracle Cloud Infrastructure Identity and Access Management policies help Takamol control user access to the repositories. Previously, the company used Docker Hub, which did not provide access control as fine-grained as OCI's. Takamol also uses the security scan feature built into OCI Registry.
  • Takamol uses Loki (a time-series database for logs), Prometheus for metrics collection, Tempo for traces, and Grafana for visualization, all of which are centralized in a single OKE cluster.
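
The point-in-time recovery mentioned in the list above generally works by restoring the most recent base backup that finished before the requested time and then replaying archived write-ahead log (WAL) records up to that time. The Python sketch below models only the first step, choosing the base backup. It is a simplified illustration rather than pgBackRest's actual logic, and the backup labels and timestamps are made up.

```python
from datetime import datetime


def pick_base_backup(backups, target):
    """Return the latest backup that finished at or before the target time."""
    candidates = [backup for backup in backups if backup["stop"] <= target]
    if not candidates:
        raise ValueError("no backup precedes the requested recovery time")
    return max(candidates, key=lambda backup: backup["stop"])


if __name__ == "__main__":
    # Hypothetical backup catalog with nightly full backups.
    backups = [
        {"label": "full-monday", "stop": datetime(2024, 1, 1, 2, 0)},
        {"label": "full-tuesday", "stop": datetime(2024, 1, 2, 2, 0)},
        {"label": "full-wednesday", "stop": datetime(2024, 1, 3, 2, 0)},
    ]
    target = datetime(2024, 1, 2, 15, 30)
    chosen = pick_base_backup(backups, target)
    # WAL would then be replayed from chosen["stop"] up to the target time.
    print(chosen["label"])
```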

The following diagram illustrates this reference architecture.

takamol-oci-arch-oracle.zip

As part of its roadmap, Takamol plans to move more of its services to managed, cloud native services:

  • Run a disaster recovery site out of the Oracle Cloud region in Neom.
  • Leverage Oracle Cloud Infrastructure Search with OpenSearch for a distributed, fully managed, and maintenance-free full-text search engine.
  • Leverage Oracle Autonomous Data Warehouse for database workloads.
  • Use Oracle Cloud Infrastructure Vulnerability Scanning Service to scan for vulnerabilities, particularly in Docker images.

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Compartment

    Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.

  • Availability domain

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Fault domain

    A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you distribute resources across multiple fault domains, your applications can tolerate physical server failure, system maintenance, and power failures inside a fault domain.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.

  • Load balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

  • Container Engine for Kubernetes

    Oracle Cloud Infrastructure Container Engine for Kubernetes is a fully managed, scalable, and highly available service that you can use to deploy your containerized applications to the cloud. You specify the compute resources that your applications require, and Container Engine for Kubernetes provisions them on Oracle Cloud Infrastructure in an existing tenancy. Container Engine for Kubernetes uses Kubernetes to automate the deployment, scaling, and management of containerized applications across clusters of hosts.

  • Compute

    The Oracle Cloud Infrastructure Compute service enables you to provision and manage compute hosts in the cloud. You can launch compute instances with shapes that meet your resource requirements for CPU, memory, network bandwidth, and storage. After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you no longer need it.

  • Service connectors

    Oracle Cloud Infrastructure Service Connector Hub is a cloud message bus platform that orchestrates data movement between services in OCI. Data is moved using service connectors. A service connector specifies the source service that contains the data to be moved, the tasks to perform on the data, and the target service to which the data must be delivered when the specified tasks are completed.

    You can use Oracle Cloud Infrastructure Service Connector Hub to quickly build a logging aggregation framework for SIEM systems. An optional task might be a function task to process data from the source or a log filter task to filter log data from the source.

  • Registry

    Oracle Cloud Infrastructure Registry is an Oracle-managed registry that enables you to simplify your development-to-production workflow. Registry makes it easy for you to store, share, and manage development artifacts, like Docker images. The highly available and scalable architecture of Oracle Cloud Infrastructure ensures that you can deploy and manage your applications reliably.

  • Identity and Access Management (IAM)

    Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.

  • Policy

    An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.

  • Vault

    Oracle Cloud Infrastructure Vault enables you to centrally manage the encryption keys that protect your data and the secret credentials that you use to secure access to your resources in the cloud. You can use the Vault service to create and manage vaults, keys, and secrets.

  • Cloud Guard

    You can use Oracle Cloud Guard to monitor and maintain the security of your resources in Oracle Cloud Infrastructure. Cloud Guard uses detector recipes that you can define to examine your resources for security weaknesses and to monitor operators and users for risky activities. When any misconfiguration or insecure activity is detected, Cloud Guard recommends corrective actions and assists with taking those actions, based on responder recipes that you can define.

  • Logging

    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:

    • Audit logs: Logs related to events emitted by the Audit service.
    • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
    • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.

  • Object storage

    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" data that you need to access quickly and frequently. Use archive storage for "cold" data that you retain for long periods of time and rarely access.

  • Analytics

    Oracle Analytics Cloud is a scalable and secure public cloud service that empowers business analysts with modern, AI-powered, self-service analytics capabilities for data preparation, visualization, enterprise reporting, augmented analysis, and natural language processing and generation. With Oracle Analytics Cloud, you also get flexible service management capabilities, including fast setup, easy scaling and patching, and automated lifecycle management.
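
To make the VCN, subnet, and security list components in the list above more concrete, the following Python sketch carves a VCN address range into non-overlapping subnets and evaluates a packet against allow-only ingress rules. The CIDR ranges, subnet names, and rule fields are hypothetical and chosen only for illustration; the sketch relies on the standard `ipaddress` module rather than any OCI API.

```python
import ipaddress
from itertools import combinations


def carve_subnets(vcn_cidr, new_prefix, names):
    """Split a VCN CIDR block into equally sized subnets, one per name."""
    vcn = ipaddress.ip_network(vcn_cidr)
    return dict(zip(names, vcn.subnets(new_prefix=new_prefix)))


def any_overlap(subnets):
    """Return True if any two subnets share addresses."""
    return any(a.overlaps(b) for a, b in combinations(subnets.values(), 2))


def is_allowed(rules, source_ip, protocol, port):
    """Security rules only allow traffic; anything unmatched is dropped."""
    source = ipaddress.ip_address(source_ip)
    return any(
        source in ipaddress.ip_network(rule["source"])
        and rule["protocol"] == protocol
        and rule["min_port"] <= port <= rule["max_port"]
        for rule in rules
    )


if __name__ == "__main__":
    # Hypothetical address plan for three isolated subnets.
    subnets = carve_subnets("10.0.0.0/16", 24, ["load-balancer", "kubernetes", "tools"])
    print({name: str(network) for name, network in subnets.items()})

    # Hypothetical ingress rule on the Kubernetes subnet: HTTPS from the
    # load balancer subnet only.
    ingress_rules = [
        {"source": str(subnets["load-balancer"]), "protocol": "TCP",
         "min_port": 443, "max_port": 443},
    ]
    print(is_allowed(ingress_rules, "10.0.0.15", "TCP", 443))
    print(is_allowed(ingress_rules, "10.0.2.7", "TCP", 443))
```

The second function reflects the isolation described earlier in the architecture: subnets exchange traffic only on the ports that a rule explicitly opens.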

Acknowledgments

  • Authors: Robert Huie, Sasha Banks-Louie
  • Contributors: Tim Graves, Faisal Alsanie, Robert Lies

    Takamol Team: Mohammed BinSabbar