Deploy an Autoscaling Virus Scanner Using Oracle Container Engine for Kubernetes

Scan the files in Oracle Cloud Infrastructure (OCI) Object Storage to detect viruses and malware and to reduce the risk of identity theft and fraud. A virus scanner checks every file that enters your OCI solution before the file is used.

Architecture

This architecture creates a virus scanner to scan files uploaded to Oracle Cloud Infrastructure (OCI) Object Storage. The virus scanner is deployed on Oracle Container Engine for Kubernetes and uses Kubernetes Event-driven Autoscaling to manage virus scan jobs.

Virus scan jobs are configured to scan single files and zip files. When files are uploaded to the created Object Storage bucket, OCI Events and OCI Queue trigger virus scan jobs on Oracle Container Engine for Kubernetes. By default, at most three jobs run simultaneously; you can change this limit in the Kubernetes Event-driven Autoscaling configuration. After scanning, each file is moved to a different Object Storage bucket depending on the scan result (clean or infected). When there are no files to scan, the Kubernetes Event-driven Autoscaler scales the nodes in pool 2 down to zero. When files arrive for scanning, it scales the nodes back up.
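
The job limit and the result-based routing described above can be sketched in a few lines. This is a simplified model, not the actual Kubernetes Event-driven Autoscaling algorithm: the function names, the bucket names, and the rule of subtracting running jobs from the wanted count are assumptions made for illustration. The sketch uses Python for brevity, although the application itself is written mostly in Node.js.

```python
MAX_CONCURRENT_JOBS = 3  # default limit stated in the text; configurable


def jobs_to_start(pending_messages: int, running_jobs: int,
                  max_jobs: int = MAX_CONCURRENT_JOBS) -> int:
    """Return how many new scan jobs to launch for the current queue depth.

    Hypothetical helper: caps the wanted job count at max_jobs and
    subtracts the jobs that are already running.
    """
    if pending_messages < 0 or running_jobs < 0:
        raise ValueError("counts must be non-negative")
    wanted = min(pending_messages, max_jobs)
    return max(0, wanted - running_jobs)


def destination_bucket(infected: bool,
                       clean_bucket: str = "scanned-clean",
                       infected_bucket: str = "scanned-infected") -> str:
    """Pick the target bucket from the scan result (bucket names are made up)."""
    return infected_bucket if infected else clean_bucket


if __name__ == "__main__":
    print(jobs_to_start(pending_messages=5, running_jobs=0))
    print(jobs_to_start(pending_messages=0, running_jobs=0))
    print(destination_bucket(infected=True))
```

With an empty queue the function returns zero, which corresponds to the state in which the node pool can be scaled down.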

The virus scanner uses uvscan, a third-party scanner from Trellix, in its free trial version. The application code is written mostly in Node.js and uses the Oracle Cloud Infrastructure SDK for JavaScript.

The following diagram illustrates this reference architecture.



[Diagram: oke-antivirus-architecture]

The architecture has the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Fault domains

    A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you distribute resources across multiple fault domains, your applications can tolerate physical server failure, system maintenance, and power failures inside a fault domain.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Dynamic routing gateway (DRG)

    The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, and between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.

  • Network address translation (NAT) gateway

    A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.

  • Load balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

  • Object storage

    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can securely store and then retrieve data directly from the internet or from within the cloud platform. You can scale storage without degrading performance or service reliability. Use standard storage for "hot" data that you need to access quickly and frequently. Use archive storage for "cold" data that you retain for long periods and rarely access.

  • Monitoring

    The Oracle Cloud Infrastructure Monitoring service actively and passively monitors your cloud resources. It uses metrics to track the resources and alarms to notify you when those metrics meet alarm-specified triggers.

  • Logging

    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:

    • Audit logs: Logs related to events emitted by the Audit service.
    • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
    • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.

  • Functions

    Oracle Functions is a fully managed, multitenant, highly scalable, on-demand, Functions-as-a-Service (FaaS) platform. It is powered by the Fn Project open source engine. Functions enable you to deploy your code, and either call it directly or trigger it in response to events. Oracle Functions uses Docker containers hosted in Oracle Cloud Infrastructure Registry.

  • Queue

    Oracle Cloud Infrastructure Queue provides a scalable system to process messages while handling complex management tasks such as guaranteed at-least-once processing, tracking, and client isolation. This centralized service also manages message ordering and processing state, which allows stateless client processes to offload cursor tracking.

  • Events

    Oracle Cloud Infrastructure services emit events, which are structured messages that describe the changes in resources. Events are emitted for create, read, update, or delete (CRUD) operations, resource lifecycle state changes, and system events that affect cloud resources.

  • Registry

    Oracle Cloud Infrastructure Registry is an Oracle-managed registry that enables you to simplify your development-to-production workflow. Registry makes it easy for you to store, share, and manage development artifacts, like Docker images. The highly available and scalable architecture of Oracle Cloud Infrastructure ensures that you can deploy and manage your applications reliably.
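
To illustrate how the Events, Queue, and Object Storage components can fit together, the following sketch turns a queue message into a scan request. It assumes that the message body is the JSON of an object-creation event. The event shape is modeled on Object Storage events of type com.oraclecloud.objectstorage.createobject, but treat the field names as assumptions and check them against the messages in your own tenancy. ScanRequest and parse_scan_request are hypothetical names and are not taken from the deployed code.

```python
import json
from typing import NamedTuple, Optional

# Assumed event type for a newly created object.
CREATE_EVENT = "com.oraclecloud.objectstorage.createobject"


class ScanRequest(NamedTuple):
    namespace: str
    bucket: str
    object_name: str


def parse_scan_request(message_body: str) -> Optional[ScanRequest]:
    """Extract the object to scan from an event; return None if not applicable."""
    event = json.loads(message_body)
    if event.get("eventType") != CREATE_EVENT:
        return None  # ignore events that do not describe a new object
    data = event.get("data", {})
    details = data.get("additionalDetails", {})
    try:
        return ScanRequest(
            namespace=details["namespace"],
            bucket=details["bucketName"],
            object_name=data["resourceName"],
        )
    except KeyError:
        return None  # malformed event: required field missing


# Illustrative message with made-up values.
SAMPLE = json.dumps({
    "eventType": CREATE_EVENT,
    "data": {
        "resourceName": "report.zip",
        "additionalDetails": {
            "namespace": "example_namespace",
            "bucketName": "uploads",
        },
    },
})

if __name__ == "__main__":
    print(parse_scan_request(SAMPLE))
```

Because the queue guarantees at-least-once processing, the same event can be delivered more than once, so a consumer built along these lines should tolerate duplicate scan requests.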

Deploy

The code for deploying the virus scanner is available on GitHub.

  1. Go to GitHub.
  2. Clone or download the repository to your local computer.
  3. Follow the instructions in the README document.

Explore More

Learn more about deploying a virus scanner on Oracle Container Engine for Kubernetes.

Acknowledgments

  • Author: Mika Rinne
  • Contributors: Marta Tolosa, Badr Tharwat