Low cost antivirus for Object Storage

Improve security and maintain compliance by building a low-cost virus scanner that scans objects as they are created in Oracle Cloud Infrastructure Object Storage. This architecture uses a single Oracle Cloud Infrastructure Compute instance and ClamAV, an open source antivirus engine.

Architecture

This architecture shows two methods of conducting scanning operations on your files in Object Storage.

One method relies on the Oracle Cloud Infrastructure Events service and the Oracle Cloud Infrastructure Streaming service to notify a script or program on the scanner that one or more new files have been added to the file bucket. The other method scans the file bucket on a schedule that you set.
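
Both methods end in the same step: a script on the scanner runs ClamAV against a file copied from the bucket. The following is a minimal Python sketch of that step, assuming the `clamscan` command-line tool is installed on the instance and the object has already been downloaded to a local path. The names `ScanResult`, `classify_exit_code`, and `scan_file` are illustrative and are not taken from the reference code.

```python
import subprocess
from enum import Enum


class ScanResult(Enum):
    CLEAN = "clean"
    INFECTED = "infected"
    ERROR = "error"


def classify_exit_code(code: int) -> ScanResult:
    """Map a clamscan exit status to a result (0 = no virus, 1 = virus found)."""
    if code == 0:
        return ScanResult.CLEAN
    if code == 1:
        return ScanResult.INFECTED
    # clamscan uses 2 for "some error(s) occurred"; any other status is treated the same way.
    return ScanResult.ERROR


def scan_file(path: str, clamscan: str = "clamscan") -> ScanResult:
    """Scan one local file that was already downloaded from the bucket."""
    try:
        completed = subprocess.run(
            [clamscan, "--no-summary", path],
            capture_output=True,
            text=True,
            check=False,  # a non-zero status is a scan result, not an exception
        )
    except OSError:
        # The clamscan binary is missing or cannot be executed.
        return ScanResult.ERROR
    return classify_exit_code(completed.returncode)
```

Treating every status other than 0 and 1 as an error keeps a failed scan from being mistaken for a clean file.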

The following diagram illustrates this reference architecture.

[Illustration: architecture-antivirus.png]

The architecture has the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Compartment

    A compartment is a collection of related resources. Compartments are a fundamental component of Oracle Cloud Infrastructure for organizing and isolating cloud resources. Compartments are tenancy-wide and cross all regions.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that is allowed in and out of the subnet.

  • Instance Principal

    Instance Principal is the IAM service feature that enables instances to be authorized actors (or principals) to perform actions on service resources. Each compute instance has its own identity, and it authenticates using the certificates that are added to it.

  • Dynamic Group

    Dynamic groups allow you to group Oracle Cloud Infrastructure compute instances as "principal" actors, similar to user groups. You can then create policies to permit instances to make API calls against Oracle Cloud Infrastructure services.

  • Policies

    A policy is a document that specifies who can access which Oracle Cloud Infrastructure resources that your company has, and how. A policy simply allows a group to work in certain ways with specific types of resources in a particular compartment or tenancy.

  • Object Storage Service

    Oracle Cloud Infrastructure Object Storage service is an internet-scale, high-performance storage platform that offers reliable and cost-efficient data durability. You can store an unlimited amount of unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos.

  • Compute Instances

    Oracle Cloud Infrastructure Compute lets you provision and manage compute hosts. You can launch compute instances with shapes that meet your resource requirements (CPU, memory, network bandwidth, and storage). After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you do not need it.

    The architecture has one compute instance. It hosts the ClamAV antivirus engine and other tooling to manage the scanning process.

  • Events

    Oracle Cloud Infrastructure services emit events, which are structured messages that describe the changes in resources. Events are emitted for create, read, update, or delete (CRUD) operations, resource lifecycle state changes, and system events that affect cloud resources.

    In this architecture, the Events service is used to track the creation of an object in an Object Storage bucket.

  • Streaming

    Oracle Cloud Infrastructure Streaming provides a fully managed, scalable, and durable storage solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. You can use Streaming for ingesting high-volume data, such as application logs, operational telemetry, and web click-stream data, or for other use cases where data is produced and processed continually and sequentially in a publish-subscribe messaging model.
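
In the event-driven method, the scanner reads messages from the stream and has to work out which object each message announces. The following Python sketch shows one way to do that. It assumes that each stream message value is the base64-encoded JSON of an Events message, that object creation uses the event type `com.oraclecloud.objectstorage.createobject`, and that the bucket and namespace appear under `data.additionalDetails`. Verify these field names against the Events documentation before relying on them. `NewObject` and `parse_stream_value` are hypothetical names.

```python
import base64
import json
from typing import NamedTuple, Optional

# Assumed event type for "object created" in Object Storage.
CREATE_OBJECT = "com.oraclecloud.objectstorage.createobject"


class NewObject(NamedTuple):
    namespace: str
    bucket: str
    name: str


def parse_stream_value(value_b64: str) -> Optional[NewObject]:
    """Decode one stream message value and return the object it announces, if any."""
    try:
        event = json.loads(base64.b64decode(value_b64))
    except ValueError:
        # Covers invalid base64, invalid UTF-8, and invalid JSON.
        return None
    if not isinstance(event, dict) or event.get("eventType") != CREATE_OBJECT:
        return None
    data = event.get("data") or {}
    details = data.get("additionalDetails") or {}
    try:
        return NewObject(details["namespace"], details["bucketName"], data["resourceName"])
    except KeyError:
        # The message is missing a field this sketch expects.
        return None
```

Messages that are not object-creation events, or that cannot be decoded, return `None` so the caller can skip them and continue reading the stream.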

Recommendations

Your requirements might differ from the architecture described here. Use the following recommendations as a starting point.

  • VCN and Subnets

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

  • Compartment

    By default, any Oracle Cloud tenancy has a default root compartment named after the tenancy itself. The tenancy administrator (default root compartment administrator) is any user who is a member of the default Administrators group.

    For this architecture, create a compartment that contains all resources to isolate them and improve security.

  • Instance Principal

    Create a dynamic group and policies to allow the VM to access Object Storage buckets without making them public.

  • Oracle Image

    Create the VM using Oracle Cloud Developer Image, which comes with OCI-CLI, Python, and Git installed and ready to use.
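
The CIDR recommendations above can be checked before you create the VCN. The following Python sketch uses only the standard `ipaddress` module. It treats "standard private IP address space" as the three RFC 1918 ranges, which is an assumption, and the function names and example blocks are illustrative.

```python
import ipaddress
from itertools import combinations

# The three RFC 1918 private IPv4 ranges.
RFC1918 = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(cidr: str) -> bool:
    """Return True if the CIDR block lies entirely inside one RFC 1918 range."""
    net = ipaddress.ip_network(cidr)
    return net.version == 4 and any(net.subnet_of(block) for block in RFC1918)


def find_overlaps(cidrs):
    """Return every pair of CIDR blocks in the list that overlap each other."""
    networks = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(a), str(b)) for a, b in combinations(networks, 2) if a.overlaps(b)]
```

For example, pass the planned VCN blocks together with the blocks of your on-premises network to `find_overlaps`; an empty list means no conflicts were found among the blocks you listed.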

Considerations

Consider the following points when deploying this reference architecture:

  • Frequency

    The frequency with which you execute your scan depends on the volume and frequency of incoming objects. A general guideline is to start with a weekly scan and adjust based on the time taken to process all objects in the bucket.

  • Performance

    Several factors affect performance, but the most important are the number of files that need to be processed and the instance shape.

  • Security

    Use instance principals and a dynamic group to restrict access to Object Storage buckets. Use Oracle Cloud Infrastructure Identity and Access Management (IAM) policies to assign privileges to the specific dynamic group, so that you don't need to make the buckets public.

    Encryption is enabled for Oracle Cloud Infrastructure Object Storage by default and can’t be turned off.

  • Cost

    Oracle Cloud Infrastructure instances and their block storage are billed per use, so with scheduled scans you pay only while the antivirus runs against the bucket. To scan incoming objects as they arrive, you need to keep an instance running 24x7. Depending on the volume of objects, you can use a free-tier instance, share an instance with another service, or start a dedicated one. You don't need to pay for outbound data transfer or for the open source antivirus.
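
To make the security consideration concrete, the following Python sketch renders a dynamic group matching rule and read-only policy statements for the scanner. The group, compartment, and bucket names and the OCID are placeholders, and the `read` verb is an assumption: a scanner that moves or deletes infected objects would need broader permissions. The helper names are hypothetical.

```python
def matching_rule(compartment_ocid: str) -> str:
    """Matching rule that puts every instance in one compartment into the dynamic group."""
    return f"ALL {{instance.compartment.id = '{compartment_ocid}'}}"


def read_only_policy(dynamic_group: str, compartment: str, bucket: str) -> list:
    """Policy statements that let the dynamic group read one bucket and its objects."""
    return [
        f"Allow dynamic-group {dynamic_group} to read buckets in compartment {compartment}",
        f"Allow dynamic-group {dynamic_group} to read objects in compartment {compartment} "
        f"where target.bucket.name='{bucket}'",
    ]


# Placeholder values for illustration only.
print(matching_rule("ocid1.compartment.oc1..example"))
for statement in read_only_policy("av-scanner-group", "av-scanner", "uploads"):
    print(statement)
```

Scoping the object statement to a single bucket keeps the scanner from reading other buckets in the same compartment.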

Deploy

The Terraform code for this reference architecture is available as a sample stack in Oracle Cloud Infrastructure Resource Manager. You can also download the code from GitHub, and customize it to suit your specific requirements.

  • Deploy using the sample stack in Oracle Cloud Infrastructure Resource Manager:
    1. Go to Deploy to Oracle Cloud.

      If you aren't already signed in, enter the tenancy and user credentials.

    2. Select the region where you want to deploy the stack.
    3. Follow the on-screen prompts and instructions to create the stack.
    4. After creating the stack, click Terraform Actions, and select Plan.
    5. Wait for the job to be completed, and review the plan.

      To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.

    6. If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
  • Deploy using the Terraform code in GitHub:
    1. Go to GitHub.
    2. Clone or download the repository to your local computer.
    3. Follow the instructions in the README document.