Deploy a Migrated MongoDB Workload to Oracle Autonomous JSON Database@Azure

Migrate an existing workload that uses a document database, in this case MongoDB, to Azure and Oracle Autonomous JSON Database deployed in Azure, a cloud document database service that makes it simple to modernize the development of your JSON-centric applications.

Workloads and applications that use documents and document databases to evolve data schemas are popular because of the flexibility they offer developers. Schema flexibility, rapid development, and scalability make it easier to prototype application features, evolve applications, and build small applications and features iteratively, with the assurance that developers can scale them to serve a large user base. However, these workloads have their challenges, including weaker transactional guarantees, limited query versatility, and the inability to support other workloads on documents, such as analytics or machine learning.

What if these workloads could keep all the advantages of traditional document databases and, at the same time, gain the benefits of relational databases? For instance, they could have stronger transactional guarantees and use additional functionality such as analytics and machine learning, without the need to replicate data to another database or system.

Autonomous JSON Database features NoSQL-style document APIs (Simple Oracle Document Access (SODA) and Oracle Database API for MongoDB), serverless scaling, high-performance ACID transactions, comprehensive security, and low pay-per-use pricing. Autonomous JSON Database automates provisioning, configuring, tuning, scaling, patching, encrypting, and repairing of databases, eliminating database management and delivering 99.95% availability.

Functional Architecture

This architecture assumes, as a starting point, that a workload with an application and a MongoDB database already exists, either on-premises or in the cloud, and will be migrated to Azure and Oracle Database@Azure. It describes the future state architecture, its benefits, how it can be deployed, and what additional features you can use to augment the existing workload.

One of the key features used in this architecture is Oracle Database API for MongoDB, which enables applications to interact with collections of JSON documents in Oracle Database using MongoDB drivers, tools, and SDKs. Existing application code can work with data stored in Oracle Autonomous JSON Database without the need to refactor it.

The following diagram depicts a typical application composed of database, back-end, and front-end tiers.



mongodb-ajd-azure-logical-arch-oracle.zip

A popular stack for implementing this pattern is the MEAN stack, which uses MongoDB as the document database, Express as the back-end framework, Angular as the front-end framework, and Node.js as the back-end server. This document uses a MEAN stack as an example of an existing deployment that will be migrated to Azure and Autonomous JSON Database.

The migration of this workload to Azure and Autonomous JSON Database is straightforward and consists, at a high level, of the following steps:

  1. Deploy an Autonomous JSON Database instance, enabling Oracle Database API for MongoDB at creation time.
  2. Migrate metadata and data from MongoDB to Autonomous JSON Database.
  3. Deploy application servers to run Node.js and Express, using Azure App Service, VMs, containers, or Kubernetes, in the same region and availability domain as Autonomous JSON Database.
  4. Deploy the back-end application code to the application servers.
  5. Connect the back-end application to Autonomous JSON Database using the same MongoDB tools and drivers used by the current application.
  6. Have users connect to the new application URI.
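
As an illustration of step 5, the following minimal sketch assembles a MongoDB-style connection URI of the kind a MongoDB driver would be given. It is written in Python for brevity, and the same string can be passed to the Node.js driver. The host name, user, and password are hypothetical placeholders, and the query options reflect the general format documented for Oracle Database API for MongoDB as best understood here. Treat the connection string shown for your own database instance as authoritative.

```python
from urllib.parse import quote_plus


def build_mongo_api_uri(host: str, user: str, password: str, port: int = 27017) -> str:
    """Assemble a MongoDB-style URI for Oracle Database API for MongoDB.

    Assumption: the option list below mirrors the documented URI format;
    verify it against the string displayed for your database.
    """
    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    options = "&".join(
        [
            "authMechanism=PLAIN",
            "authSource=$external",
            "ssl=true",
            "retryWrites=false",
            "loadBalanced=true",
        ]
    )
    # The database name in the path is the database user (schema) name.
    return f"mongodb://{credentials}@{host}:{port}/{quote_plus(user)}?{options}"


# Hypothetical host and credentials, for illustration only.
uri = build_mongo_api_uri("ajd.example.internal", "app_user", "p@ss/word")
print(uri)
```

Because only the URI changes, the back-end code that opens the connection can typically stay as it is, with the new value supplied through configuration.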

Note that this reference architecture focuses on the deployment of the migrated workload and not on the migration process itself. For more details on the migration process, refer to the Explore More section.

After the workload is migrated to Autonomous JSON Database, you can use several features to augment the existing functionality, whether to 1) meet additional nonfunctional requirements, such as improved scalability, resiliency, or high availability, or 2) add functional features, such as operational reporting, analytics, and machine learning, without the need to copy data out of the database.

To improve scalability and high availability, use the Autonomous JSON Database auto scaling feature. A single click or API call allows the workload to use up to 3 times the baseline capacity without any downtime. Note that Autonomous JSON Database uses Oracle Real Application Clusters (Oracle RAC) technology for high availability. For the back-end tier, use compute instance pools with auto scaling rules, thus enabling application high availability and scalability.
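
The sizing arithmetic implied by the 3 times limit can be sketched as follows. This is a rough planning aid under the assumption that capacity is counted in whole compute units. The function names are illustrative and are not part of any Oracle API.

```python
import math

# "Up to 3 times the baseline capacity", as stated in the text above.
AUTO_SCALING_FACTOR = 3


def max_capacity(baseline_units: int) -> int:
    """Largest capacity auto scaling can reach from a given baseline."""
    return baseline_units * AUTO_SCALING_FACTOR


def min_baseline_for_peak(peak_units: int) -> int:
    """Smallest baseline whose auto scaling ceiling still covers the peak."""
    return math.ceil(peak_units / AUTO_SCALING_FACTOR)


# Example: a baseline of 4 units can burst to 12, and a peak of 10 units
# needs a baseline of at least 4.
print(max_capacity(4))
print(min_baseline_for_peak(10))
```

If the expected peak exceeds the ceiling for the current baseline, the baseline itself has to be raised, because auto scaling alone cannot cover it.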

Since Autonomous JSON Database is built on top of a multimodel, multiworkload database technology, you can add features that rely on relational, spatial, graph or vector data types that work alongside the existing application. It is common that users want to perform analytics on top of JSON data. Using SQL in Autonomous JSON Database simplifies creating operational and analytical reporting, using the same engine and data.

Autonomous JSON Database has a limit of 20 GB of non-JSON data. If your data volume requirements change, you can easily convert to Oracle Autonomous Database Serverless, which supports the same features. Storage used by views and materialized views doesn't count toward the 20 GB non-JSON data limit, so you can create and use them freely, for instance, to support operational analytics using SQL on top of JSON documents.

Physical Architecture

The physical architecture includes Autonomous JSON Database deployed using delegated subnets in two Microsoft Azure regions to support high availability. OCI services support automatic backup to Oracle Cloud Infrastructure Object Storage.

The architecture supports the following:

  • Front-end tier
    • Application users can connect from the internet or the corporate network.
    • User connection is routed to the active region that is running the application, using Microsoft Azure Front Door.
    • User connection is secured using Azure Web Application Firewall.
    • User connection to the application is load balanced using App Service.
  • Back-end tier
    • Application is deployed in a high availability fashion using Azure App Service.
    • Azure App Service AutoScale is used to achieve horizontal scalability.
  • Database tier
    • Autonomous JSON Database provides high availability because Oracle Real Application Clusters (Oracle RAC) and several database nodes underpin the service instance. Therefore, by default, the database tier is highly available and resilient.
    • Oracle Database API for MongoDB enabled in Autonomous JSON Database allows you to use existing application code without changes.
    • The Oracle Database API for MongoDB is highly resilient, and that resiliency is guaranteed internally by Autonomous JSON Database.
    • Autonomous JSON Database can use auto scaling, adjusting to increases and decreases of system load.
    • Autonomous JSON Database business continuity is achieved through backup-based cross-region disaster recovery. Alternatively, you can use refreshable clones.
  • Disaster Recovery
    • Two regions support cross-region disaster recovery for the entire cloud deployment.
    • The Autonomous JSON Database in the primary region has a backup-based cross-region peer in the secondary region.
    • The second region is deployed with a similar topology to reduce the overall recovery time objective.
    • Use a warm DR strategy to reduce the overall RTO. In a warm DR strategy, the back-end tier cloud resources are already provisioned alongside the Autonomous JSON Database standby database.
    • Alternatively, you can provision the back-end tier resources only in the event of a failure, which decreases the cost of running the DR resources but increases the overall RTO.
  • Networking
    • All incoming application traffic from on-premises and from the internet is routed by Azure Front Door.
    • Autonomous JSON Database is deployed with a private endpoint to increase the security posture.
    • The Azure App Service Web App is deployed using an integration subnet and VNet to reach the Autonomous JSON Database instance.
    • The Application VNet is peered with the Database VNet, and traffic is allowed to flow between the Web App and Autonomous JSON Database.
  • Security
    • All data is secure in transit and at rest.
    • The following potential design improvements are not depicted in this deployment for simplicity's sake:
      • Automate application disaster recovery using Azure Automation runbooks to switch Front Door endpoints and validate post-failover app health.
      • Leverage a hub-and-spoke topology to enforce centralized network security.
      • Leverage a network firewall, deployed in the hub VNet, to improve the overall security posture by inspecting all traffic and enforcing policies.
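
The trade-off between the warm and on-demand DR strategies described under Disaster Recovery can be made concrete with a small estimate. The sketch below assumes that the failover steps run one after another and uses placeholder durations that are purely hypothetical. Substitute values measured in your own DR tests.

```python
from dataclasses import dataclass


@dataclass
class DrTimings:
    """Hypothetical step durations, in minutes, for a single failover."""

    database_failover: float     # switch to the cross-region peer database
    backend_provisioning: float  # create back-end resources in the second region
    traffic_switch: float        # repoint Front Door to the second region


def estimated_rto(timings: DrTimings, warm: bool) -> float:
    """Sum the sequential steps; a warm standby skips back-end provisioning."""
    provisioning = 0.0 if warm else timings.backend_provisioning
    return timings.database_failover + provisioning + timings.traffic_switch


# Placeholder numbers only; they are not measurements of any real service.
timings = DrTimings(database_failover=30, backend_provisioning=45, traffic_switch=5)
warm_rto = estimated_rto(timings, warm=True)
cold_rto = estimated_rto(timings, warm=False)
print(f"warm: {warm_rto} min, on-demand: {cold_rto} min")
```

Under these assumptions, the difference between the two estimates is exactly the back-end provisioning time, which is the cost saving being traded against a longer recovery.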

The following diagram illustrates this reference architecture.



mongodb-ajd-azure-physical-arch.zip

The architecture has the following Microsoft components:

  • Azure Firewall Manager

    Azure Firewall Manager is a centralized security management service that simplifies the deployment and configuration of Azure Firewall across multiple regions and subscriptions. It allows for hierarchical policy management, enabling global and local firewall policies to be applied consistently. When integrated with Azure Virtual WAN (vWAN) and a secure hub, Azure Firewall Manager enhances security by automating traffic routing and filtering without the need for user-defined routes. This integration ensures that traffic between virtual networks, branch offices, and the internet is securely managed and monitored, providing a robust and streamlined network security solution.

  • Azure Front Door

    Azure Front Door is a cloud-based service that acts as a global entry point for web applications, providing high-performance content delivery, intelligent Layer 7 load balancing, and integrated security features like Web Application Firewall (WAF) and DDoS protection to ensure fast, reliable, and secure user experiences.

  • Azure region

    An Azure region is a geographical area in which one or more physical Azure data centers, called availability zones, reside. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

    Azure and OCI regions are localized geographic areas. For Oracle Database@Azure, an Azure region is connected to an OCI region, with availability zones (AZs) in Azure connected to availability domains (ADs) in OCI. Azure and OCI region pairs are selected to minimize distance and latency.

  • Azure availability zone

    Azure availability zones are physically separate locations within an Azure region, designed to ensure high availability and resiliency by providing independent power, cooling, and networking.

  • Azure Virtual Network (VNet)

    Azure Virtual Network (VNet) is the fundamental building block for your private network in Azure. VNet enables many types of Azure resources, such as Azure virtual machines (VMs), to securely communicate with each other, the internet, and on-premises networks.

  • Azure App Service

    Azure App Service is a fully managed platform-as-a-service (PaaS) that enables building, hosting, and scaling web applications, APIs, and mobile backends without managing the underlying infrastructure.

  • Azure App Service integration subnet

    An Azure App Service integration subnet is a dedicated subnet within an Azure Virtual Network that is delegated for use by App Service plans. It enables web apps to make outbound connections to private resources within the virtual network or its peered networks, but not to receive inbound traffic from the VNet.

  • Azure delegated subnet

    A delegated subnet allows you to insert a managed service, specifically a platform-as-a-service (PaaS) service, directly into your virtual network as a resource. You have full integration management of external PaaS services within your virtual networks.

The architecture has the following Oracle components:

  • OCI region

    An OCI region is a localized geographic area that contains one or more data centers, hosting availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • OCI Object Storage

    OCI Object Storage provides access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store data directly from applications or from within the cloud platform. You can scale storage without experiencing any degradation in performance or service reliability.

    Use standard storage for "hot" storage that you need to access quickly and frequently. Use archive storage for "cold" storage that you retain for long periods of time and rarely access.

  • OCI Private Endpoint

    OCI Private Endpoint provides no-cost, private, secure access to one of many OCI services from within a virtual cloud network (VCN) or on-premises network.

Architecture Variant

This variant of the proposed physical architecture uses a customer-managed Oracle REST Data Services deployment running in each application server. However, the fully managed MongoDB API provided by Autonomous JSON Database is the best solution for most workloads since it is easier to manage.

If there are requirements to manually control the configuration and management of Oracle REST Data Services, for example, to allow the application to use larger connection pools, then using customer-managed Oracle REST Data Services is an option.

Note:

Use this architecture variant if there is a specific workload requirement to do so. Only advanced users should deploy this architecture variant.

This section only describes the differences compared to the previously described physical architecture, so all physical architecture design principles are valid unless stated otherwise.

The following architecture diagram depicts how the variant is deployed. For simplicity, only the cloud resources deployed in the JSON Workload VCN are depicted, since the rest of the deployment is the same as the physical architecture described earlier.



mongodb-ajd-azure-arch-variant-oracle.zip

The following describes the front-end tier for the variant:
  • The incoming user requests are distributed by the App Service load balancer, so the front-end tier is horizontally scalable and doesn't have a single point of failure.
  • The back-end application is deployed in the App Service Scale Unit's workers.
  • The application is deployed using a container as the publishing method.
  • Build the container image with both the application and Oracle REST Data Services installed and configured, so that both run in the same container.
  • Each worker runs the container image that colocates the application and Oracle REST Data Services in the same runtime environment.
  • Customer-managed Oracle REST Data Services on the workers is configured to enable the MongoDB API, so that the application can connect to the database using MongoDB tools and drivers.
  • Customer-managed Oracle REST Data Services is configured to adjust to the workload non-functional requirements, for instance, by configuring larger connection pools or using a different database service.
  • Both the back-end code and the customer-managed Oracle REST Data Services are preinstalled and preconfigured in the container image used on the workers. When App Service scales horizontally, new workers are able to run the back-end application and connect to the database after provisioning.

Recommendations

Use the following recommendations as a starting point to further improve and evolve the workload. Your requirements might differ from the architecture described here.
  • Application Deployment
    • Consider using a container-based deployment on Azure Kubernetes Service (AKS) if you need advanced orchestration, networking, and security features that might not be available in App Service.
  • Security
    • Consider using Oracle Data Safe to further increase the workload security posture and perform database auditing.
  • Observability
    • Consider using Azure Monitor to monitor Autonomous JSON Database metrics alongside the monitoring data of all other Azure services.
  • Disaster Recovery
    • Consider automating and orchestrating disaster recovery for all the layers of the stack using Azure Site Recovery or custom scripts that detect failures and initiate failover processes.
  • Operational Efficiency
    • If the Autonomous JSON Database workload is part of a wider database fleet, then consider using Elastic Pools for increased cost efficiency.
    • Consider enabling Oracle Cloud Infrastructure Database Management, an OCI service that provides a comprehensive set of database performance monitoring and management features, to streamline management of the Autonomous JSON Database instance.
  • Application Evolution
    • Consider deploying operational analytics and real-time reporting in Autonomous JSON Database using SQL and a front end such as APEX or Power BI, without moving data out of the database, for trusted and real-time data analysis.
    • Consider using Autonomous JSON Database for machine learning using Oracle Machine Learning (OML), to build and train models with JSON data without any need for data movement and to deploy the models alongside the existing workload for efficient inferencing.
    • For additional use cases beyond the application core, consider using Autonomous JSON Database Select AI together with database views that query JSON and hold metadata, so that users can query JSON data using natural language.
    • Consider using Autonomous JSON Database to store additional data types (relational, vector, spatial, or graph) up to 20 GB for added workload functionality and flexibility.

Acknowledgments

  • Authors: José Cruz
  • Contributors: Massimo Castelli, Simon Griffith, Hermann Baer, Matt DeMarco, Julian Dontcheff