Modern App Development - Event-Driven

Build event-driven applications on Oracle Cloud Infrastructure that subscribe to changes in your cloud resources and to events generated by your application, enabling you to respond in a near real-time manner.

In the cloud, an event is any significant occurrence or change in a system. The core tenets of an event-driven architecture are to capture, communicate, process, and persist events. Most modern apps built with microservices rely on an event-driven architecture.

Event-driven architectures typically use either a publish-subscribe (pub-sub) model or an event-stream model.
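
The difference between the two models can be illustrated with a minimal in-memory sketch. The class names `PubSubTopic` and `EventStream` are hypothetical and are not Oracle Cloud Infrastructure APIs. The sketch only assumes the usual distinction: a pub-sub topic pushes each event to the subscribers registered at publish time and does not retain it, while an event stream keeps an ordered log that each consumer reads from its own offset.

```python
from typing import Callable, Dict, List


class PubSubTopic:
    """Pub-sub model: events are pushed to current subscribers and not retained."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[dict], None]] = []

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: dict) -> None:
        # Only subscribers registered before this call see the event.
        for callback in self._subscribers:
            callback(event)


class EventStream:
    """Event-stream model: events are appended to a log and read by offset."""

    def __init__(self) -> None:
        self._log: List[dict] = []
        self._offsets: Dict[str, int] = {}  # next offset to read, per consumer

    def append(self, event: dict) -> int:
        self._log.append(event)
        return len(self._log) - 1  # offset assigned to the event

    def read(self, consumer: str, max_events: int = 10) -> List[dict]:
        # Each consumer progresses independently and can start late.
        start = self._offsets.get(consumer, 0)
        batch = self._log[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch
```

In this sketch, a consumer that subscribes to the topic late misses earlier events, whereas a consumer that starts reading the stream late still receives the full retained log.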

Successful event-driven apps are flexible, scalable, secure, use consistent and deterministic deployment processes, and are simple to manage. Oracle Cloud Infrastructure provides the platform building blocks for deploying optimal event-driven apps. This document presents a recommended architecture pattern and the design principles it incorporates.

Design Principles

This event-driven architecture implements modern app development principles in the following ways:

  • Use lightweight, open-source frameworks and mature programming languages

    When feasible, apps should adhere to open-industry standards. For event-driven apps, we recommend using the open CloudEvents format, created by the Cloud Native Computing Foundation (CNCF), to ensure the best possible experience. You should also use microservices-based architectures as the underpinning for event-driven apps.

    When you want lightweight, time-bound processing of events on the event consumer side, use Oracle Functions, which is fully managed and serverless, scales automatically, and supports numerous languages. For heavier computing needs, use lightweight container applications running on Oracle Container Engine for Kubernetes (OKE), with an API gateway as the interface to your containers. If your organization has specialized needs, such as incorporating artificial intelligence or machine learning into your apps, use bare metal or GPU compute nodes from the Oracle Cloud Infrastructure Compute service.

  • Build apps as services that communicate over APIs

    A key architectural feature that enables maximum availability is the loose coupling that is built into Oracle Cloud Infrastructure event-driven applications. In a loosely coupled architecture, event producers don’t know which event consumers are listening for an event, and the event doesn’t know the consequences of its occurrence, so event producers and event consumers can operate fully independently. We recommend decoupling event producers from consumer services so that each side can scale, deploy, and update on its own schedule.

  • Use fully managed services to eliminate complexity across application development, runtimes, and data management

    As a rule of thumb, fully managed services are the best choice for operating in the cloud because infrastructure maintenance and security are built in. For event-driven apps, Oracle Cloud Infrastructure services handle most routine tasks, such as polling for, detecting, and routing events. You must invest manual effort in managing your systems only when event producers or consumers are not managed by Oracle Cloud Infrastructure. For example, if you use your own Kafka cluster as an event router, you must manage its availability, resiliency, and operational maintenance. Managed services eliminate time-consuming self-management and hard-coding, freeing up your time for business-differentiating work.

  • Instrument end-to-end monitoring and tracing

    Event routers such as Service Connector Hub and the Events service emit metrics to Oracle Cloud Infrastructure Monitoring, which makes it easy to monitor your application events and to build custom metrics and alarms. For end-to-end distributed tracing, we recommend building custom dashboards with logging rules (Logging Analytics) and alarm-based monitoring (Oracle Cloud Infrastructure Notifications) so that administrators can discover and quickly react to any issues. Additionally, event routers can provide a single-pane-of-glass view of an event's footprint.

  • Eliminate single points of failure by using automated data replication and failure recovery

    To fulfill regulatory and compliance needs, you can back up messaging data in Oracle Cloud Infrastructure Object Storage for long-term retention. Use a serverless service such as Service Connector Hub to move data from the messaging service to object storage, and enable Oracle Cloud Infrastructure Object Storage's backup features to achieve multi-region backup. Implement a cross-region disaster recovery strategy by using Kafka MirrorMaker 2.0 deployed on fault-tolerant Oracle Container Engine for Kubernetes to asynchronously replicate data between streams. This setup enables a Recovery Time Objective (RTO) and Recovery Point Objective (RPO) of minutes. Use remote VCN peering to ensure minimal latency during the data transfer.

    Incorporate idempotency into apps by storing the offsets of processed messages in external storage such as object storage or an autonomous database. Detect and discard duplicates by querying the external storage. Categorize errors: easily recoverable errors should allow for a replay of messages, whereas unrecoverable errors must be written to a separate stream, a dead-letter queue, or object storage without blocking the primary execution pipeline.
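
The idempotency and error-handling guidance above can be sketched as follows. This is a minimal illustration, not an Oracle Cloud Infrastructure API: `IdempotentConsumer`, `RecoverableError`, and the in-memory `processed` set and `dead_letters` list are hypothetical stand-ins for the external offset storage and the dead-letter destination, and keying on a (partition, offset) pair is an assumption.

```python
from typing import Callable, List, Set, Tuple


class RecoverableError(Exception):
    """Transient failure: the message can be replayed."""


class IdempotentConsumer:
    def __init__(self, handler: Callable[[dict], None], max_attempts: int = 3) -> None:
        self._handler = handler
        self._max_attempts = max_attempts
        # Stand-in for external storage (object storage or a database).
        self.processed: Set[Tuple[int, int]] = set()
        # Stand-in for a separate stream, dead-letter queue, or object storage.
        self.dead_letters: List[dict] = []

    def _dead_letter(self, key: Tuple[int, int], message: dict, reason: str) -> str:
        self.dead_letters.append({"key": key, "message": message, "reason": reason})
        # Record the offset so a redelivery does not block the pipeline again.
        self.processed.add(key)
        return "dead-lettered"

    def consume(self, partition: int, offset: int, message: dict) -> str:
        key = (partition, offset)
        if key in self.processed:
            return "duplicate"  # already handled: discard the redelivery
        for _ in range(self._max_attempts):
            try:
                self._handler(message)
            except RecoverableError:
                continue  # replay the message
            except Exception as exc:
                return self._dead_letter(key, message, repr(exc))
            else:
                self.processed.add(key)
                return "processed"
        return self._dead_letter(key, message, "retries exhausted")
```

A production consumer would need the offset record and the side effects of the handler to be committed atomically (or the handler itself to be idempotent); the sketch leaves that out.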

Other considerations:

  • Use idempotent messaging strategies

    Enterprise apps expect event routers to ensure at-least-once delivery, which guarantees that events are delivered even in error scenarios. However, one unintended effect of at-least-once delivery is that event consumers can be triggered repeatedly for the same message, which can lead to duplicate or incorrect app behavior. To address this issue, event consumer idempotency needs to be at the core of an event-driven architecture. Idempotency means that processing the same message multiple times produces the same result as processing it once.

  • Maximize the value of your data with advanced analytics

    Use advanced analytics to research and discover information that is created by or that describes events. Review, integrate, and visualize this data and gain insights at key stages of the event processing lifecycle.

  • Optimize your apps for interoperability

    Many Oracle Cloud Infrastructure customers operate broad IT portfolios, spanning cloud providers and on-premises implementations. When building event-driven apps, there are many ways that you can architect your apps to talk to third-party services. Any application, whether it’s hosted on the cloud or on-premises, can produce custom logs by using the Oracle Cloud Infrastructure Logging SDK or REST APIs. This log data can then be used as a source in Service Connector Hub, which means that any third-party application can become an event producer. For event consumption, integrate with a Kafka-compatible target so that Service Connector Hub can write events to the streaming service using Kafka Connect. If the target doesn't support Kafka, you can use Service Connector Hub and Oracle Cloud Infrastructure Notifications to call the HTTP endpoint for the target. Oracle Cloud Infrastructure Notifications supports Datadog, Slack, PagerDuty, SMS, and email as targets. Additionally, if your event consumer has legacy APIs (for example, SOAP) or employs nonstandard protocols, you can use Service Connector Hub and custom functions.
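
Interoperability between heterogeneous producers and consumers is easier when events share a common envelope, such as the CloudEvents format recommended under the design principles. The following minimal sketch builds an event with the required CloudEvents 1.0 attributes (`specversion`, `id`, `source`, `type`) using only the Python standard library. The helper name `make_cloud_event` and the example type and source values are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Optional


def make_cloud_event(event_type: str, source: str, data: Any,
                     subject: Optional[str] = None) -> dict:
    """Build a dict following the CloudEvents 1.0 JSON event format."""
    event = {
        # Required context attributes.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional context attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # Event payload.
        "data": data,
    }
    if subject is not None:
        event["subject"] = subject
    return event


# Hypothetical example values; serialize before sending to an event router.
payload = json.dumps(
    make_cloud_event("com.example.order.created", "/example/orders", {"orderId": 42})
)
```

The `id` should be unique per `source` so that consumers can also use the pair for duplicate detection.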

Architecture

This architecture uses modern app development principles to create event-driven applications.



The architecture has the following components:

  • Virtual cloud network (VCN) and subnet

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Streaming

    Oracle Cloud Infrastructure Streaming provides a fully managed, scalable, and durable storage solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. You can use Streaming to ingest high-volume data, such as application logs, operational telemetry, and web click-stream data, or for other use cases where data is produced and processed continually and sequentially in a publish-subscribe messaging model.

  • Functions

    Oracle Functions is a fully managed, multitenant, highly scalable, on-demand, Functions-as-a-Service (FaaS) platform. It is powered by the Fn Project open source engine. Oracle Functions enables you to deploy your code, and either call it directly or trigger it in response to events. Oracle Functions uses Docker containers hosted in Oracle Cloud Infrastructure Registry.

  • Service connectors

    Oracle Cloud Infrastructure Service Connector Hub is a cloud message bus platform that orchestrates data movement between services in Oracle Cloud Infrastructure. Data is moved using service connectors. A service connector specifies the source service that contains the data to be moved, the tasks to perform on the data, and the target service to which the data must be delivered when the specified tasks are completed.

    You can use Oracle Cloud Infrastructure Service Connector Hub to quickly build a logging aggregation framework for SIEM systems. An optional task might be a function task to process data from the source or a log filter task to filter log data from the source.

  • Autonomous Transaction Processing

    Oracle Autonomous Transaction Processing is a self-driving, self-securing, self-repairing database service that is optimized for transaction processing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.

  • Notifications

    The Oracle Cloud Infrastructure Notifications service broadcasts messages to distributed components through a publish-subscribe pattern, delivering secure, highly reliable, low latency, and durable messages for applications hosted on Oracle Cloud Infrastructure.

  • Oracle Analytics Cloud

    Oracle Analytics Cloud is a scalable and secure public cloud service that provides a full set of capabilities to explore and perform collaborative analytics for you, your workgroup, and your enterprise. With Oracle Analytics Cloud, you also get flexible service management capabilities, including fast setup, easy scaling and patching, and automated lifecycle management.
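
The service connector model described in the component list (a source, optional tasks, and a target) can be pictured with a small conceptual sketch. This is plain Python, not the Service Connector Hub API: `ServiceConnectorSketch`, `run_once`, and the `log_filter` task are hypothetical names used only to show the order in which data moves.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

Record = dict
Task = Callable[[List[Record]], List[Record]]


@dataclass
class ServiceConnectorSketch:
    source: Callable[[], Iterable[Record]]   # where the data is read from
    target: Callable[[List[Record]], None]   # where the data is delivered
    tasks: List[Task] = field(default_factory=list)  # optional processing steps

    def run_once(self) -> int:
        """Read one batch, apply each task in order, and deliver the result."""
        batch = list(self.source())
        for task in self.tasks:
            batch = task(batch)
        if batch:
            self.target(batch)
        return len(batch)


def log_filter(batch: List[Record]) -> List[Record]:
    # Example of a log filter task: keep only error-level records.
    return [record for record in batch if record.get("level") == "ERROR"]
```

For example, a connector built with `tasks=[log_filter]` delivers only the error-level records of each batch to its target, which mirrors the log filter task mentioned above.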

Alternatives and Anti-Patterns

Consider alternate designs based on existing investments, operational familiarity, or other constraints.

  • Use an event-driven architecture that implements a highly scalable, resilient, and flexible event processing framework as an Event Mesh. Events, in this context, are data packages that are dynamically created within an application runtime or generated by an infrastructure state change. Individual events must be processed quickly and consistently. Oracle Transactional Event Queues (TEQ) supports all event-generating sources, because streams of data are sent to the queues and topics in TEQ. Based on event delivery, PL/SQL procedures or Oracle Functions calls are executed as notification callbacks for actions that require further processing. Usually, event data triggers an alert or call to action, which is then stored in Autonomous Database to support analytics, self-healing actions, and reporting by Oracle Analytics Cloud.

  • Use Oracle GoldenGate to move heterogeneous data to the cloud in real-time, including Change Data Capture (CDC) scenarios and state changes generated by events (triggers).

  • Use a custom application server, which might include an embedded third-party service bus component such as RabbitMQ. These solutions offer flexibility and the ability to focus on very specific feature sets. This design requires significant initial development investment, ongoing maintenance, and subject matter expert (SME) administrative effort. Custom application servers might also require additional design investment and overhead to deliver data redundancy and high availability.

  • Use Kafka clusters in a self-managed cloud or on-premises environment. Although this solution offers scalability and high availability, it demands significant specialized developer knowledge along with extensive SME operational administration overhead. Consider this option carefully before selecting it because of the lead time to production and the risk of a high total cost of ownership (TCO).

Example Use Cases

The following solution is often a simple enhancement to existing tools and operating processes.

This use case uses real-time facial recognition to secure large private campuses and semiprivate or public venues. It offers:
  • A transparent layer of security to enhance existing key card, access code, and biometric security protocols
  • Rapid person-of-interest identification during incident investigations
  • A secure, serverless, fully managed, highly scalable platform with global reach
The data flow is as follows:
  • Image (face tile) data from distributed devices flows in through an API gateway for secure ingestion.
  • Serverless functions analyze the image data for a face vector match, validate time and location, and pass the data to an Oracle Autonomous Transaction Processing database and, through a service gateway, to the streaming service, which buffers the data for additional rapid processing.
  • Serverless functions record detailed activity information in an Oracle Autonomous Data Warehouse instance and generate alert notifications according to security protocols.
  • Threat security teams respond to alerts and incidents.
  • Analysts use Oracle Analytics Cloud to investigate incidents, generate reports, and perform trend and pattern analysis.
  • Security administrators access data through secure, encrypted network channels to maintain the image vector repository, manage alerts and incidents, and update and apply security protocol policies.
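
The face vector match step in the data flow can be sketched as a nearest-match lookup. The source does not specify the matching method, so cosine similarity, the 0.9 threshold, and the names `cosine_similarity` and `best_match` are all assumptions made for illustration. The repository here is an in-memory dictionary standing in for the image vector repository.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(face_vector: Sequence[float],
               repository: Dict[str, Sequence[float]],
               threshold: float = 0.9) -> Optional[Tuple[str, float]]:
    """Return (person_id, score) for the closest known vector at or above the threshold."""
    best: Optional[Tuple[str, float]] = None
    for person_id, known_vector in repository.items():
        score = cosine_similarity(face_vector, known_vector)
        if score >= threshold and (best is None or score > best[1]):
            best = (person_id, score)
    return best
```

A function that receives a face tile would call `best_match` with the extracted vector, then validate time and location before writing the result to the database and the stream.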