Stream IoT Data to an Autonomous Database Using Serverless Functions

Workloads that leverage Internet of Things (IoT) devices need to scale efficiently in real time. As you deploy more devices and sensors, the volume and variety of streamed data are bound to grow. Use serverless functions and an autonomous database in Oracle Cloud to automate and scale the processing of streamed IoT data.

Architecture

In this architecture, data from IoT devices flows in through an API gateway to serverless functions, which use the Streaming service to upload the data to an autonomous database in Oracle Cloud. Users outside the cloud can access the data through a Flask-based web server running on an Oracle Cloud Infrastructure Compute instance.

The following diagram illustrates this architecture.



oci-arch-iot-streaming-oracle.zip
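
As a rough illustration of the function step in this flow, the following minimal Python sketch validates one device reading and wraps it as a key/value message ready to be published to a stream. The field names (`device_id`, `temperature`, `timestamp`) and both helper functions are hypothetical and are not taken from the reference implementation. The sketch also assumes that the Streaming service expects base64-encoded message keys and values; verify this against the Streaming API documentation before relying on it.

```python
import base64
import json

# Hypothetical schema for a device reading; adjust to your own payloads.
REQUIRED_FIELDS = ("device_id", "temperature", "timestamp")


def build_stream_message(raw_body: bytes) -> dict:
    """Validate one IoT reading and wrap it as a key/value stream message."""
    reading = json.loads(raw_body)
    missing = [field for field in REQUIRED_FIELDS if field not in reading]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Using the device ID as the key keeps one device's readings together.
    key = str(reading["device_id"]).encode("utf-8")
    value = json.dumps(reading, sort_keys=True).encode("utf-8")
    # Assumption: keys and values are sent as base64-encoded strings.
    return {
        "key": base64.b64encode(key).decode("ascii"),
        "value": base64.b64encode(value).decode("ascii"),
    }


def decode_stream_message(message: dict) -> dict:
    """Reverse of build_stream_message, as a consumer might apply it."""
    return json.loads(base64.b64decode(message["value"]))
```

In a real function, the resulting dictionary would be passed to the Streaming client rather than returned to the caller, and invalid payloads would typically be rejected with an error response.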

The architecture has the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

    In this architecture, the autonomous database and a function to set up the database are attached to a private subnet. The compute instance that hosts the web server and the functions that process the streams are deployed in a public subnet.

  • Network security groups (NSG)

    NSGs act as virtual firewalls for your cloud resources. With the zero-trust security model of Oracle Cloud Infrastructure, all traffic is denied by default, and you control which network traffic is allowed inside a VCN. An NSG consists of a set of ingress and egress security rules that apply to only a specified set of VNICs in a single VCN.

    Access to the database and the web server in this architecture is controlled through separate NSGs.

  • API Gateway

    Oracle API Gateway enables you to publish APIs with private endpoints that are accessible from within your network, and which you can expose to the public internet if required. The endpoints support API validation, request and response transformation, CORS, authentication and authorization, and request limiting.

  • Streaming

    Oracle Cloud Infrastructure Streaming provides a fully managed, scalable, and durable storage solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. You can use Streaming for ingesting high-volume data, such as application logs, operational telemetry, and web click-stream data, or for other use cases where data is produced and processed continually and sequentially in a publish-subscribe messaging model.

  • Functions

    Oracle Functions is a fully managed, multitenant, highly scalable, on-demand, Functions-as-a-Service (FaaS) platform. It is powered by the Fn Project open source engine. Functions enable you to deploy your code, and either call it directly or trigger it in response to events. Oracle Functions uses Docker containers hosted in Oracle Cloud Infrastructure Registry.

  • Autonomous database

    This architecture uses an autonomous database (Oracle Autonomous Data Warehouse or Oracle Autonomous Transaction Processing) with a private endpoint.

    Oracle Autonomous Data Warehouse is a self-driving, self-securing, self-repairing database service that is optimized for data warehousing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.

    Oracle Autonomous Transaction Processing is a self-driving, self-securing, self-repairing database service that is optimized for transaction processing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.

  • Web server

    In this architecture, a Flask micro-framework endpoint is deployed on a compute instance. The Flask-based application can expose the data in the autonomous database as dynamic web content.
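
The subnet rules described for the VCN component (each subnet lies inside the VCN and does not overlap with the others) can be checked before deployment. The following is a minimal sketch that uses only the Python standard library; the function name `check_subnets` and the example CIDR blocks are hypothetical and are not taken from the Terraform code for this architecture.

```python
import ipaddress
from itertools import combinations


def check_subnets(vcn_cidr: str, subnet_cidrs: dict) -> list:
    """Return a list of layout problems; an empty list means none were found."""
    vcn = ipaddress.ip_network(vcn_cidr)
    problems = []
    if not vcn.is_private:
        problems.append(f"VCN range {vcn} is not in the private address space")
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    # Every subnet must be fully contained in the VCN range.
    for name, net in nets.items():
        if not net.subnet_of(vcn):
            problems.append(f"{name} ({net}) is outside the VCN range {vcn}")
    # No two subnets may share addresses.
    for (name_a, net_a), (name_b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")
    return problems
```

For example, `check_subnets("10.0.0.0/16", {"public": "10.0.0.0/24", "private": "10.0.1.0/24"})` reports no problems, whereas placing the private subnet at `10.0.0.128/25` would be flagged as an overlap with the public subnet.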

Recommendations

Use the following recommendations as a starting point. Your requirements might differ.

  • VCN sizing

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

  • Compute shapes

    In this architecture, an Oracle Linux 7.8 image and the VM.Standard2.1 shape are used for the Flask-based web server. Choose a shape that's appropriate for your application’s resource needs.

  • API gateway features

    The API Gateway endpoints support API validation, request and response transformation, CORS, authentication and authorization, and request limiting. Choose the features that suit your business and IT needs.

  • Stream partitioning

    The Streaming service gives you a partitioned, append-only log of messages: a stream. A partition is a section of a stream. Partitions allow you to distribute a stream by splitting messages across multiple nodes. You can place each partition on a separate machine to enable multiple consumers to read from a stream in parallel. For large and compute-intensive workloads, consider increasing the number of partitions.

  • Autonomous database version

    Use the latest available version for the autonomous database.
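
To show why adding partitions helps parallel consumers, as described under stream partitioning, the following minimal sketch spreads messages across partitions by hashing the message key. This is an illustrative scheme only, not the algorithm that the Streaming service uses internally; the function names and the `device-N` keys are hypothetical.

```python
import hashlib
from collections import Counter


def pick_partition(key: str, partition_count: int) -> int:
    """Map a message key to a partition index in a stable, repeatable way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % partition_count


def distribution(keys, partition_count: int) -> Counter:
    """Count how many keys land on each partition."""
    return Counter(pick_partition(key, partition_count) for key in keys)
```

Because the mapping depends only on the key, all readings from one device land on the same partition and keep their order, while different devices are spread across partitions that separate consumers can read in parallel.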

Considerations

When implementing this architecture, consider your requirements for the following parameters:

  • Request throttling

    After creating the API gateway and deploying one or more APIs, you might want to limit the rate at which front-end clients can send requests to backend services. Choose a request-rate limit that maintains high availability and fair use by protecting the backend resources from being overwhelmed by too many requests. You might also need to prevent denial-of-service (DoS) attacks or control and constrain resource consumption. Ultimately, you apply a rate limit globally to all the routes in an API deployment specification.

  • Service limits

    When designing your architecture, consider the service limits for the Streaming and Functions services. See the Service Limits documentation listed in the Explore More section.

  • Scalability
    • Database

      You can manually scale the number of CPU cores of the database up or down at any time. The autoscaling feature of autonomous databases allows your database to use up to three times the current base number of CPU cores at any time. As demand increases, autoscaling automatically increases the number of cores in use. Autonomous databases allow you to scale the storage capacity at any time without affecting availability or performance.

    • Application

      You can scale your Flask application by using the instance pool and autoscaling features.

      Instance pools enable you to provision and create multiple compute instances based on the same configuration within the same region.

      Use autoscaling to automatically adjust the number of compute instances in an instance pool based on performance metrics, such as CPU utilization. Autoscaling helps you provide consistent performance for users during periods of high demand and reduce your costs when the demand is low.

    • Functions

      Oracle Functions creates and removes function containers automatically based on the request load. You pay only when the functions are invoked and for the duration that they run.

  • Application availability

    Fault domains provide the best resilience within an availability domain. If you need higher availability, consider using multiple availability domains or multiple regions where feasible.

  • Backups
    • Database

      Oracle Cloud Infrastructure automatically backs up autonomous databases and retains the backups for 60 days. You can restore and recover your database to any point in time during the retention period. You can also create manual backups to supplement the automatic backups. Manual backups are stored in an Oracle Cloud Infrastructure Object Storage bucket that you create, and are retained for 60 days.

    • Application

      The Oracle Cloud Infrastructure Block Volumes service lets you create point-in-time backups of data on a block volume. You can restore these backups to new volumes at any time.

      You can also use the service to make a point-in-time, crash-consistent backup of a boot volume without application interruption or downtime. Boot and block volumes have the same backup capabilities.

  • Security
    • Access control

      Use policies to restrict who can access your resources in the cloud and the actions that they can perform.

    • Network security

      The Networking service offers two virtual firewall features that use security rules to control traffic at the packet level: security lists and network security groups (NSG). An NSG consists of a set of ingress and egress security rules that apply only to a set of VNICs of your choice in a single VCN. For example, you can choose all the compute instances that act as web servers in the web tier of a multitier application in your VCN.

      NSG security rules function the same as security list rules. However, for an NSG security rule's source or destination, you can specify an NSG instead of a CIDR block. So, you can easily write security rules to control traffic between two NSGs in the same VCN or traffic within a single NSG. When you create a database system, you can specify one or more NSGs. You can also update an existing database system to use one or more NSGs.
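
To make the request throttling consideration above more concrete, the following minimal sketch shows a token-bucket limiter, one common way to cap a request rate while allowing short bursts. It illustrates the general idea only and is not a description of how API Gateway enforces its limits; the class name and parameters are hypothetical. The caller passes the current time explicitly so that the behavior is easy to reason about and test.

```python
class TokenBucket:
    """Allow up to `rate_per_second` requests on average, with bursts up to `burst`."""

    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = None  # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is within the limit."""
        if self.last is not None:
            elapsed = max(0.0, now - self.last)
            # Refill in proportion to elapsed time, never above capacity.
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `TokenBucket(rate_per_second=2, burst=2)`, two requests arriving at the same instant are accepted and a third is rejected; half a second later, one more request is accepted because one token has been refilled.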

Deploy

The Terraform code for this reference architecture is available in GitHub. You can pull the code into Oracle Cloud Infrastructure Resource Manager with a single click, create the stack, and deploy it. Alternatively, download the code from GitHub to your computer, customize the code, and deploy the architecture by using the Terraform CLI.

  • Deploy by using Oracle Cloud Infrastructure Resource Manager:
    1. Click Deploy to Oracle Cloud.

      If you aren't already signed in, enter the tenancy and user credentials.

    2. Review and accept the terms and conditions.
    3. Select the region where you want to deploy the stack.
    4. Follow the on-screen prompts and instructions to create the stack.
    5. After creating the stack, click Terraform Actions, and select Plan.
    6. Wait for the job to be completed, and review the plan.

      To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.

    7. If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
  • Deploy by using the Terraform CLI:
    1. Go to GitHub.
    2. Download or clone the code to your local computer.
    3. Follow the instructions in the README.

Change Log

This log lists only the significant changes: