Deploy a Migrated MongoDB Workload to Oracle Autonomous Transaction Processing Serverless

Migrate an existing workload that uses a document database, in this case MongoDB, to Oracle Autonomous Transaction Processing Serverless (ATP Serverless) database on Oracle Cloud Infrastructure (OCI) to modernize the development of your JSON-centric applications alongside other multi-model workloads.

Workloads and applications that use documents and document databases to evolve data schemas and applications are popular because of the flexibility they offer developers. Schema flexibility, rapid development, and scalability enable accelerated prototyping of application features, easier application evolution, and the ability to iteratively build smaller applications and features that developers can scale to address a large user base. However, these types of workloads have their challenges, including weaker transactional guarantees, limited data query versatility, and the inability to support other workloads on documents, such as analytics or machine learning.

What if these workloads could combine the advantages of traditional document databases with the benefits of relational databases? For instance, they could gain stronger transactional guarantees and added functionality such as analytics and machine learning, without the need to replicate data to another database or system.

Autonomous Transaction Processing Serverless is a fully automated database service optimized to run transactional, analytical, and batch workloads concurrently. To accelerate performance, it’s preconfigured for row format, indexes, and data caching while providing scalability, availability, transparent security, and real-time operational analytics. Application developers and DBAs can rapidly and cost-effectively develop and deploy applications without sacrificing functionality or atomicity, consistency, isolation, and durability (ACID) properties.

Functional Architecture

This reference architecture assumes that you have a workload with an application and a MongoDB database, either on-premises or in the cloud, and will migrate to OCI. It describes the future state architecture, its benefits, how you can deploy it and what additional features you can use to augment the existing workload.

This reference architecture focuses on the deployment of the migrated workload and not on the migration process itself. For more details on the migration process, see the Explore More section.

One of the key products used in this architecture is Oracle Database API for MongoDB, which enables applications to interact with collections of JSON documents in Oracle Database using MongoDB drivers, tools, and SDKs. This enables existing application code to work with data stored in Autonomous Transaction Processing Serverless (ATP Serverless) without the need to refactor the code.

The following diagram depicts a typical application composed of a database, back-end and front-end tiers.



mongodb-atp-s-logical-arch-migration-oracle.zip

The MEAN stack is a popular stack used to implement this pattern:
  • MongoDB: Document database
  • Express: Back-end framework
  • Angular: Front-end framework
  • Node.js: Back-end server

This example uses a MEAN stack to migrate an existing deployment to OCI and ATP Serverless.

The migration of this workload to OCI and ATP Serverless is straightforward and consists, at a high level, of the following steps:

  1. Deploy an ATP Serverless instance, enabling Oracle Database API for MongoDB at creation time.
  2. Migrate metadata and data from MongoDB to ATP Serverless.
  3. Deploy application servers to run Node.js and Express, using VMs, containers, or Kubernetes, in the same region and availability domain as ATP Serverless.
  4. Deploy the back-end application code to the application servers.
  5. Connect the back-end application to ATP Serverless using the same MongoDB tools and drivers used on the current application.
  6. Connect users to the new application URI.
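Step 5 usually comes down to pointing the existing MongoDB driver at a new connection string. The following is a minimal sketch of how such a string might be assembled. The option list (`authMechanism=PLAIN`, `authSource=$external`, `ssl=true`, `retryWrites=false`, `loadBalanced=true`) and port 27017 reflect the commonly documented form for Oracle Database API for MongoDB, but treat them as assumptions and copy the exact string shown for your own database. The host name, user, and password are hypothetical placeholders.

```python
from urllib.parse import quote_plus


def build_mongodb_api_uri(host: str, user: str, password: str, port: int = 27017) -> str:
    """Assemble a MongoDB-style URI for Oracle Database API for MongoDB.

    Assumption: the query options below match the string shown for your
    database; verify them before use.
    """
    options = "&".join([
        "authMechanism=PLAIN",
        "authSource=$external",
        "ssl=true",
        "retryWrites=false",
        "loadBalanced=true",
    ])
    # Percent-encode credentials so characters such as '@' or '/' stay valid in a URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    # The path segment names the default database, here the database user (schema).
    return f"mongodb://{credentials}@{host}:{port}/{quote_plus(user)}?{options}"


uri = build_mongodb_api_uri(
    host="example-host.adb.example-region.oraclecloudapps.com",  # hypothetical host
    user="app_user",       # hypothetical database user
    password="p@ss/word",  # hypothetical password
)
print(uri)
```

Because only the connection string changes, the back-end code that already passes a URI to its MongoDB driver can typically read this value from configuration instead of being rewritten.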

After the workload is migrated to ATP Serverless, several features are available to augment the existing functionality. You can 1) support additional nonfunctional requirements, such as improved scalability, resiliency, or high availability, or 2) add functional features such as operational reporting, analytics, and machine learning in place, without the need to copy data out of the database.

To improve scalability and high availability, use the Autonomous Transaction Processing Serverless auto scaling feature. Once enabled with a single click or API call, it allows the workload to use up to 3 times the baseline capacity without any downtime. Note that Autonomous Transaction Processing Serverless uses Oracle Real Application Clusters (Oracle RAC) technology for high availability. For the back-end tier, use compute instance pools with auto scaling rules to enable application high availability and scalability.
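The "up to 3 times the baseline capacity" rule can be illustrated with a small calculation. This is a sketch of the arithmetic only, not of how the service schedules resources. The function name and the generic "capacity units" are hypothetical.

```python
def effective_capacity(base_units: float, demand_units: float, autoscale: bool = True) -> float:
    """Return the capacity a workload could consume for a given demand.

    Assumption: with auto scaling enabled the ceiling is 3x the baseline,
    and without it the ceiling is the baseline itself.
    """
    ceiling = base_units * 3 if autoscale else base_units
    # Capacity never falls below the provisioned baseline and never exceeds the ceiling.
    return min(max(demand_units, base_units), ceiling)


print(effective_capacity(base_units=4, demand_units=2))    # below baseline -> 4
print(effective_capacity(base_units=4, demand_units=10))   # within the 3x window -> 10
print(effective_capacity(base_units=4, demand_units=20))   # capped at 3 x 4 -> 12
```

A demand peak above three times the baseline is therefore a signal to raise the baseline rather than to rely on auto scaling alone.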

Since Autonomous Transaction Processing Serverless is built on top of multi-model, multi-workload database technology, you can add features that rely on relational, spatial, graph or vector data types that work alongside the existing application.

Physical Architecture

The physical architecture includes public and private subnets in OCI with a secondary backup region to support high availability.

The architecture supports the following:

  • Front-end tier
    • Application users can connect from the internet or the corporate network.
    • User connection is secured using an OCI Web Application Firewall.
    • User connection to the application is load balanced for increased resiliency and scalability.
    • Load balancer is deployed with high availability.
  • Back-end tier
    • Application servers are deployed in a high availability fashion using an instance pool.
    • The instance pool uses auto scaling to achieve horizontal scalability.
    • The instance pool is configured to deploy instances in the same availability domain as ATP Serverless, colocating the application and the database to reduce connection latency.
    • The instance pool is configured to distribute instances across fault domains in the availability domain where ATP Serverless is placed, to increase workload resiliency.
  • Database tier
    • ATP Serverless provides high availability because Oracle Real Application Clusters (Oracle RAC) and several database nodes underpin the service instance. Therefore, the database tier is highly available and resilient by default.
    • Oracle Database API for MongoDB enabled in ATP Serverless enables you to use existing application code without changes.
    • The Oracle Database API for MongoDB is highly resilient, and that resiliency is guaranteed internally by ATP Serverless.
    • ATP Serverless can use auto scaling, adjusting to increases and decreases of system load.
    • ATP Serverless business continuity is achieved with Oracle Autonomous Data Guard based cross-region disaster recovery.
    • For a cross-region Oracle Autonomous Data Guard standby, the Recovery Time Objective (RTO) is fifteen minutes and the Recovery Point Objective (RPO) is one minute.
  • Disaster recovery
    • Two regions support cross-region disaster recovery for the entire cloud deployment.
    • The standby region supports a warm standby where cloud instances are predeployed to lower the total recovery time objective (RTO).
    • ATP Serverless in the primary region has an Oracle Autonomous Data Guard cross-region peer in the standby region.
    • The back-end tier instance pool is preconfigured with a minimum number of instances, and the OCI Full Stack Disaster Recovery plan, which automates each step of the failover, can change the number of compute instances that should run after the failover. The metric-based auto scaling configuration defines the minimum and maximum number of instances in the pool, as well as the metrics used to scale out and in.
    • The standby region is deployed with a similar topology to reduce the overall recovery time objective.
    • OCI Full Stack Disaster Recovery automates failover of the whole stack to the standby region and failback to the primary region.
  • Networking
    • The dynamic routing gateways deployed in both regions are peered.
    • On-premises connectivity leverages both Oracle Cloud Infrastructure FastConnect and site-to-site VPN for redundancy.
    • All incoming traffic from on-premises and from the internet is first routed into the hub VCN and then into the workload VCN.
    • The deployment uses a hub-and-spoke network design to increase the security posture and accommodate other workload VCNs.
    • Services are deployed with private endpoints to increase the security posture.
    • The JSON workload VCN is segregated into several private subnets to increase the security posture.
  • Security
    • All data is secure in transit and at rest.
    • Potential design improvements, not depicted in this deployment for simplicity's sake, include using a full CIS-compliant landing zone and leveraging a network firewall deployed in the hub VCN. A network firewall improves the overall security posture by inspecting all traffic and enforcing policies.
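The back-end tier behavior described above (metric-based auto scaling bounded by a minimum and maximum, with instances spread across fault domains) can be sketched as follows. This is an illustrative model, not the OCI auto scaling implementation. The CPU thresholds, the one-instance step, and the function names are hypothetical, and the model assumes three fault domains per availability domain.

```python
from collections import Counter

# Assumption: three fault domains in the availability domain hosting ATP Serverless.
FAULT_DOMAINS = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]


def desired_pool_size(current: int, cpu_percent: float, minimum: int, maximum: int,
                      scale_out_above: float = 70.0, scale_in_below: float = 30.0) -> int:
    """Apply a simple threshold rule, then clamp to the pool's min/max bounds."""
    if cpu_percent > scale_out_above:
        target = current + 1   # scale out by one instance
    elif cpu_percent < scale_in_below:
        target = current - 1   # scale in by one instance
    else:
        target = current       # metric is inside the steady band
    return max(minimum, min(maximum, target))


def spread_across_fault_domains(instance_count: int) -> Counter:
    """Place instances round-robin so no fault domain holds more than one extra."""
    return Counter(FAULT_DOMAINS[i % len(FAULT_DOMAINS)] for i in range(instance_count))


size = desired_pool_size(current=4, cpu_percent=85.0, minimum=2, maximum=6)
print(size)
print(spread_across_fault_domains(size))
```

In a disaster recovery setup, the standby pool could keep `minimum` low to save cost while the recovery plan raises the instance count after failover, as described in the list above.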


mongodb-atp-s-physical-arch-oracle.zip

The architecture has the following main components:

  • Region

    An OCI region is a localized geographic area that contains one or more data centers, hosting availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an OCI region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping classless inter-domain routing (CIDR) blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • FastConnect

    Oracle Cloud Infrastructure FastConnect creates a dedicated, private connection between your data center and OCI. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.

  • Dynamic routing gateway (DRG)

    The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, and between a VCN and a network outside the region, such as a VCN in another OCI region, an on-premises network, or a network in another cloud provider.

  • Network address translation (NAT) gateway

    A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.

  • Service gateway

    A service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and does not traverse the internet.

  • Internet gateway

    An internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Load balancer

    Oracle Cloud Infrastructure Load Balancing provides automated traffic distribution from a single entry point to multiple servers.

  • Web Application Firewall

    Oracle Cloud Infrastructure Web Application Firewall (WAF) is a payment card industry (PCI) compliant, regional and edge enforcement service that is attached to an enforcement point, such as a load balancer or a web application domain name. WAF protects applications from malicious and unwanted internet traffic. WAF can protect any internet-facing endpoint, providing consistent rule enforcement across your applications.

  • Oracle Autonomous Transaction Processing Serverless

    Oracle Autonomous Transaction Processing Serverless is a fully automated database service optimized to run transactional, analytical, and batch workloads concurrently. To accelerate performance, it’s preconfigured for row format, indexes, and data caching while providing scalability, availability, transparent security, and real-time operational analytics. Application developers and DBAs can rapidly, easily, and cost-effectively develop and deploy applications without sacrificing functionality or atomicity, consistency, isolation, and durability (ACID) properties.

  • Full Stack Disaster Recovery

    Oracle Cloud Infrastructure Full Stack Disaster Recovery is an orchestration and management service that provides comprehensive disaster recovery capabilities for all layers of an application stack, including infrastructure, middleware, database, and application.

  • Object Storage

    OCI Object Storage provides access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store data directly from the internet or from within the cloud platform. You can scale storage without experiencing any degradation in performance or service reliability.

    Use standard storage for "hot" data that you need to access quickly and frequently. Use archive storage for "cold" data that you retain for long periods of time and rarely access.

  • Oracle Database API for MongoDB

    Oracle Database API for MongoDB enables applications to interact with collections of JSON documents in Oracle Database using MongoDB drivers, tools, and SDKs.
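The VCN and subnets description above states that each subnet is a contiguous address range that must not overlap with other subnets in the same VCN. The following sketch checks a planned layout against that rule using Python's standard `ipaddress` module. The CIDR blocks and subnet names are hypothetical examples, not values from this architecture.

```python
from ipaddress import ip_network
from itertools import combinations
from typing import Dict, List


def validate_subnets(vcn_cidr: str, subnet_cidrs: Dict[str, str]) -> List[str]:
    """Return a list of problems found in a planned subnet layout (empty if none)."""
    vcn = ip_network(vcn_cidr)
    subnets = {name: ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    problems = []
    # Every subnet must sit inside the VCN's address range.
    for name, net in subnets.items():
        if not net.subnet_of(vcn):
            problems.append(f"{name} ({net}) is outside the VCN range {vcn}")
    # No two subnets may share addresses.
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")
    return problems


# Hypothetical layout for a workload VCN with three private subnets.
plan = {"load-balancer": "10.0.0.0/24", "app": "10.0.1.0/24", "database": "10.0.2.0/24"}
print(validate_subnets("10.0.0.0/16", plan))

# A conflicting layout: the database range falls inside the app range.
bad_plan = {"app": "10.0.1.0/24", "database": "10.0.1.128/25"}
print(validate_subnets("10.0.0.0/16", bad_plan))
```

Running a check like this before provisioning can catch addressing mistakes early, which matters in a hub-and-spoke design where several VCNs must also avoid overlapping with each other.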

Physical Architecture Variant

The fully managed MongoDB API provided by ATP Serverless is the best solution for most workloads since it is easier to manage. This physical architecture variant uses a customer-managed Oracle REST Data Services deployment running in each application server.

If you need to manually control the configuration and management of Oracle REST Data Services, for example, to allow the application to use larger connection pools, then customer-managed Oracle REST Data Services is an option.

Note:

Use this architecture variant if there is a specific workload requirement to do so. Only advanced users should deploy this architecture variant.

This section only describes the differences compared to the previously described physical architecture, so all physical architecture design principles are valid unless stated otherwise.

The architecture diagram below depicts how the variant is deployed. For simplicity, only the cloud resources deployed in the JSON Workload VCN are depicted, since the rest of the deployment is the same as described before.



mongodb-atp-s-arch-variant-oracle.zip

The following is the front-end tier for the variant:
  • Back-end application code is deployed in application servers that are part of an instance pool.
  • The incoming user requests are distributed by the load balancer, so the front-end tier is horizontally scalable and doesn't have a single point of failure.
  • Customer-managed Oracle REST Data Services is installed on each application server and configured to enable the MongoDB API, so that the application can connect to the database using MongoDB tools and drivers.
  • Customer-managed Oracle REST Data Services is configured to meet the workload's nonfunctional requirements, for instance, by configuring larger connection pools or using a different database service.
  • Both the back-end code and the customer-managed Oracle REST Data Services are preinstalled and preconfigured in the instance configuration used by the pool, so that any instance added to the pool can run the back end and connect to the database as soon as it is provisioned.
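Because this variant runs one Oracle REST Data Services connection pool per application server, the total number of database connections grows with the instance pool. The following sketch shows the sizing arithmetic under that assumption. The function names and the connection budget are hypothetical; the real limit depends on how the database service is configured.

```python
def total_database_connections(instances: int, pool_max_per_instance: int) -> int:
    """Worst-case connections opened when every instance fills its pool."""
    return instances * pool_max_per_instance


def max_pool_size_per_instance(db_connection_budget: int, max_instances: int) -> int:
    """Largest per-instance pool that stays within a database connection budget."""
    if max_instances <= 0:
        raise ValueError("max_instances must be positive")
    # Integer division keeps the total at or below the budget.
    return db_connection_budget // max_instances


# Hypothetical numbers: the pool can scale out to 6 instances, budget of 300 connections.
print(max_pool_size_per_instance(db_connection_budget=300, max_instances=6))
print(total_database_connections(instances=6, pool_max_per_instance=50))
```

Sizing against the auto scaling maximum, rather than the current instance count, avoids exhausting database connections when the pool scales out.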

Recommendations

Use the following recommendations as a starting point to further improve and evolve the workload. Your requirements might differ from the architecture described here.

Consider the following:

  • Application Deployment

    Use a container-based deployment with Oracle Cloud Infrastructure Kubernetes Engine (OKE) if the application can run in containers.

  • Security
    • Use Oracle Data Safe to further increase the workload security posture and be able to perform database auditing.
  • Observability
    • Use OCI Audit to perform forensic auditing for all OCI services beyond Oracle Autonomous Database Serverless.
    • Use OCI Monitoring, OCI Logging, and OCI Logging Analytics to gain full visibility into the environment's operating status.
  • Operational Efficiency
    • Use Elastic Pools if the ATP Serverless JSON workload is part of a wider database fleet for increased cost efficiency.
    • Enable Oracle Cloud Infrastructure Database Management. This service provides a comprehensive set of database performance monitoring and management features to streamline management of the ATP Serverless instance.
  • Application Evolution
    • Deploy operational analytics and real-time reporting in ATP Serverless using SQL and a front end, such as APEX or Oracle Analytics Cloud, without moving data out of the database, for trusted and real-time data analysis.
    • Use ATP Serverless and Oracle Machine Learning to build and train models with JSON data without moving data and to deploy the models alongside the existing workload for efficient inferencing.
    • For additional use cases beyond the application core, consider using Oracle Autonomous Database Select AI and database views querying JSON and holding metadata. This enables users to query JSON data using natural language.
    • Use ATP Serverless to store additional data types (relational, vector, spatial or graph) for added workload functionality and flexibility.

Acknowledgments

  • Authors: José Cruz
  • Contributors: Massimo Castelli, Simon Griffith, Hermann Baer, Matt DeMarco, Julian Dontcheff