Deploy a Migrated MongoDB Workload to Oracle Autonomous JSON Database

Migrate an existing workload that uses a document database, in this case MongoDB, to Oracle Autonomous JSON Database on Oracle Cloud Infrastructure (OCI) to modernize the development of your JSON-centric applications.

Workloads and applications that use documents and document databases are popular because of the flexibility they offer developers as data schemas and applications evolve. Schema flexibility, rapid development, and scalability enable rapid prototyping of application features, easier application evolution, and the ability to build small applications and features iteratively and then scale them to a large user base. However, these workloads have their challenges, including weaker transactional guarantees, limited query versatility, and the inability to support other workloads on documents, such as analytics or machine learning.

What if these workloads could keep all the advantages of traditional document databases and, at the same time, gain the benefits of relational databases? For instance, they could have stronger transactional guarantees and use additional functionality such as analytics and machine learning, without the need to replicate data to another database or system.

Autonomous JSON Database features NoSQL-style document APIs (Simple Oracle Document Access (SODA) and Oracle Database API for MongoDB), serverless scaling, high performance ACID transactions, comprehensive security, and low pay-per-use pricing. Autonomous JSON Database automates provisioning, configuring, tuning, scaling, patching, encrypting, and repairing of databases, eliminating database management and delivering 99.95% availability.

Functional Architecture

This architecture assumes, as a starting point, that a workload with an application and a MongoDB database already exists, deployed either on premises or in the cloud, and will be migrated to OCI. It describes the future state architecture, its benefits, how it can be deployed, and what additional features can be used to augment the existing workload.

One of the key features used in this architecture is Oracle Database API for MongoDB, which enables applications to interact with collections of JSON documents in Oracle Database using MongoDB drivers, tools, and SDKs. Existing application code can work with data stored in Autonomous JSON Database without being refactored.

The following diagram depicts a typical application composed of database, back-end, and front-end tiers.

[Diagram: mongodb-json-logical-arch-migration-oracle.zip]

A popular stack used to implement this pattern is the MEAN stack: MongoDB as the document database, Express as the back-end framework, Angular as the front-end framework, and Node.js as the back-end server. This document uses a MEAN stack as an example of an existing deployment that will be migrated to OCI and Autonomous JSON Database.

The migration of this workload to OCI and Autonomous JSON Database is straightforward and consists, at a high level, of the following steps:

  1. Deploy an Autonomous JSON Database instance, enabling Oracle Database API for MongoDB at creation time.
  2. Migrate metadata and data from MongoDB to Autonomous JSON Database.
  3. Deploy application servers to run Node.js and Express, using VMs, containers, or Kubernetes, in the same region and availability domain as Autonomous JSON Database.
  4. Deploy the back-end application code to the application servers.
  5. Connect the back-end application to Autonomous JSON Database using the same MongoDB tools and drivers used by the current application.
  6. Direct users to the new application URI.
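
Step 5 usually comes down to swapping the connection string that the MongoDB driver receives. The following is a minimal sketch of how such a string could be assembled. The host name and credentials are hypothetical, and the query options are an assumption based on the format commonly shown for Oracle Database API for MongoDB; copy the exact connection string from your own database details page rather than relying on this one.

```python
from urllib.parse import quote_plus


def build_mongodb_api_uri(host: str, user: str, password: str, port: int = 27017) -> str:
    """Build a MongoDB-style connection URI for a database schema user.

    The option list below is an assumption; verify it against the
    connection string displayed for your own database.
    """
    # Percent-encode credentials so characters such as '@' or '/' do not break the URI.
    encoded_user = quote_plus(user)
    encoded_password = quote_plus(password)
    options = (
        "authMechanism=PLAIN&authSource=$external"
        "&ssl=true&retryWrites=false&loadBalanced=true"
    )
    # The database name in the path is assumed to be the schema (user) name.
    return f"mongodb://{encoded_user}:{encoded_password}@{host}:{port}/{encoded_user}?{options}"


# Hypothetical values, for illustration only.
uri = build_mongodb_api_uri("db.example.com", "app_user", "p@ss/word")
print(uri)
```

The resulting string is passed to the existing MongoDB driver in place of the old MongoDB URI, so the rest of the application code stays as it is.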

Note that this reference architecture focuses on the deployment of the migrated workload, not on the migration process itself. For more details on the migration process, refer to the Explore More section.

After the workload is migrated to Autonomous JSON Database, you can use several features to augment the existing functionality, either to 1) support additional nonfunctional requirements, such as improved scalability, resiliency, or high availability, or 2) add functional features such as operational reporting, analytics, and machine learning, without the need to copy data out of the database.

To improve scalability and high availability, use the Autonomous JSON Database auto scaling feature. Once enabled with a single click or API call, it allows the workload to use up to 3 times the baseline capacity, without any downtime. Note that Autonomous JSON Database uses Oracle Real Application Clusters (Oracle RAC) technology for high availability. For the back-end tier, use compute instance pools with auto scaling rules to provide application high availability and scalability.
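
As a quick sizing aid, the auto scaling ceiling follows directly from the baseline. The sketch below only restates the "up to 3 times the baseline" rule from the paragraph above as arithmetic; the function names and the unit of capacity are illustrative assumptions, not part of any OCI API.

```python
def autoscaling_ceiling(baseline_units: float, multiplier: int = 3) -> float:
    """Return the maximum capacity available when auto scaling is enabled."""
    if baseline_units <= 0:
        raise ValueError("baseline_units must be positive")
    return baseline_units * multiplier


def fits_with_autoscaling(peak_units: float, baseline_units: float) -> bool:
    """Return True when the expected peak stays within the auto scaling ceiling."""
    return peak_units <= autoscaling_ceiling(baseline_units)


# A baseline of 4 units can burst to 12, so a peak of 10 fits and 13 does not.
print(autoscaling_ceiling(4))
print(fits_with_autoscaling(10, 4), fits_with_autoscaling(13, 4))
```

If the expected peak does not fit, the baseline itself has to be raised, because auto scaling cannot exceed the ceiling.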

Because Autonomous JSON Database is built on a multimodel, multiworkload database technology, you can add features that rely on relational, spatial, graph, or vector data types and that work alongside the existing application. Users commonly want to perform analytics on JSON data, and using SQL in Autonomous JSON Database simplifies creating operational and analytical reports with the same engine and data.

Autonomous JSON Database has a limit of 20 GB of non-JSON data, but it can easily be converted to Autonomous Transaction Processing Serverless, which supports the same features, if data volume requirements change. Note that storage used by views and materialized views doesn't count toward the 20 GB non-JSON data limit, so you can easily create and use them, for instance, to support operational analytics using SQL on top of JSON documents.
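
The limit rule above can be expressed as a small check. This is only a sketch of the accounting described in the text: the object kinds and sizes are made-up inputs, and how you obtain real sizes from the database is outside its scope.

```python
from typing import Iterable, Tuple

NON_JSON_LIMIT_GB = 20.0  # limit stated in the text above
EXEMPT_KINDS = {"view", "materialized_view"}  # storage that does not count toward the limit


def non_json_usage_gb(objects: Iterable[Tuple[str, float]]) -> float:
    """Sum the size of non-JSON objects, skipping views and materialized views.

    Each item is a (kind, size_in_gb) pair describing one non-JSON object.
    """
    return sum(size for kind, size in objects if kind not in EXEMPT_KINDS)


def within_limit(objects: Iterable[Tuple[str, float]]) -> bool:
    """Return True when counted non-JSON storage stays within the limit."""
    return non_json_usage_gb(objects) <= NON_JSON_LIMIT_GB


# Hypothetical inventory: the 15 GB materialized view is ignored by the check.
inventory = [("table", 12.0), ("materialized_view", 15.0), ("table", 6.5)]
print(non_json_usage_gb(inventory), within_limit(inventory))
```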

Physical Architecture

The physical architecture includes public and private subnets in OCI with a secondary backup region to support high availability.

The architecture supports the following:

  • Front-end tier
    • Application users can connect from the internet or the corporate network
    • User connection is secured using a Web Application Firewall
    • User connection to the application is load balanced for increased resiliency and scalability
    • Load balancer is deployed with high availability
  • Back-end tier
    • Application servers are deployed in a high availability fashion using an instance pool
    • Instance pool is used with auto scaling to achieve horizontal scalability
    • The instance pool is configured to deploy instances in the same availability domain as the Autonomous JSON Database, colocating application and database to minimize connection latency
    • The instance pool is configured to distribute instances across fault domains in the availability domain where the Autonomous JSON Database is placed, to increase workload resiliency
  • Database tier
    • Autonomous JSON Database provides high availability as Oracle Real Application Clusters (Oracle RAC) and several database nodes underpin the service instance. Therefore, by default the database tier is highly available and resilient.
    • Oracle Database API for MongoDB enabled in Autonomous JSON Database allows you to use existing application code without changes.
    • The Oracle Database API for MongoDB is highly resilient, and that resiliency is guaranteed internally by Autonomous JSON Database.
    • Autonomous JSON Database can use auto scaling, adjusting to increases and decreases of system load.
    • Autonomous JSON Database business continuity is achieved through backup-based cross-region disaster recovery. Alternatively, you can use refreshable clones.
  • Disaster Recovery
    • Two regions support cross-region disaster recovery for the entire cloud deployment.
    • The Autonomous JSON Database in the primary region has a backup-based cross-region peer in the secondary region.
    • The second region is deployed with a similar topology to reduce the overall recovery time objective (RTO).
    • To reduce the overall RTO, you can use a warm DR strategy, where the back-end tier cloud resources are already provisioned, alongside the Autonomous JSON Database standby database.
    • Alternatively, you can provision the back-end tier resources only when a failure occurs, which decreases the cost of running the DR resources but increases the overall RTO.
    • Potential design improvements not depicted on this deployment for simplicity's sake include using OCI Full Stack Disaster Recovery to automate disaster recovery for the load balancer and back-end tier.
  • Networking
    • The Dynamic Routing Gateways deployed in both regions are peered.
    • On-premises connectivity uses both OCI FastConnect and site-to-site VPN for redundancy.
    • All incoming traffic from on premises and from the internet is first routed into the hub VCN and then into the workload VCN.
    • A hub-and-spoke network design increases the security posture and accommodates other workload VCNs.
    • Services are deployed with private endpoints to increase the security posture.
    • The JSON workload VCN is segregated into several private subnets to increase the security posture.
  • Security
    • All data is secure in transit and at rest.
    • Potential design improvements not depicted on this deployment for simplicity's sake include using a full CIS-compliant landing zone and leveraging a network firewall deployed in the hub VCN. A network firewall will improve the overall security posture by inspecting all traffic and by enforcing policies.
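
To see why spreading the back-end tier across fault domains improves resiliency, the sketch below assigns instances to the three fault domains of an availability domain in round-robin order and reports the share of instances lost if one fault domain fails. The instance pool performs the real placement itself; the instance names and helper functions here are hypothetical and serve only to illustrate the effect.

```python
from collections import Counter
from typing import Dict, List

# Each OCI availability domain has three fault domains.
FAULT_DOMAINS = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]


def spread_across_fault_domains(instance_names: List[str]) -> Dict[str, str]:
    """Assign instances to fault domains in round-robin order."""
    return {
        name: FAULT_DOMAINS[index % len(FAULT_DOMAINS)]
        for index, name in enumerate(instance_names)
    }


def largest_loss_fraction(placement: Dict[str, str]) -> float:
    """Return the fraction of instances lost if the fullest fault domain fails."""
    if not placement:
        raise ValueError("placement must contain at least one instance")
    counts = Counter(placement.values())
    return max(counts.values()) / len(placement)


# Five hypothetical application servers: at most two share a fault domain.
placement = spread_across_fault_domains([f"app-{i}" for i in range(5)])
print(placement)
print(largest_loss_fraction(placement))
```

With all instances in a single fault domain the same calculation yields 1.0, which is the case the distribution rule is meant to avoid.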

The following diagram illustrates this architecture.

[Diagram: mongodb-json-physical-arch-oracle.zip]

The architecture has the following components:

  • Region

    An OCI region is a localized geographic area that contains one or more data centers, hosting availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an OCI region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping classless inter-domain routing (CIDR) blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • FastConnect

    Oracle Cloud Infrastructure FastConnect creates a dedicated, private connection between your data center and OCI. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.

  • Dynamic routing gateway (DRG)

    The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, and between a VCN and a network outside the region, such as a VCN in another OCI region, an on-premises network, or a network in another cloud provider.

  • Network address translation (NAT) gateway

    A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.

  • Service gateway

    A service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and does not traverse the internet.

  • Internet gateway

    An internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Load balancer

    Oracle Cloud Infrastructure Load Balancing provides automated traffic distribution from a single entry point to multiple servers.

  • Web Application Firewall

    Oracle Cloud Infrastructure Web Application Firewall (WAF) is a payment card industry (PCI) compliant, regional-based and edge enforcement service that is attached to an enforcement point, such as a load balancer or a web application domain name. WAF protects applications from malicious and unwanted internet traffic. WAF can protect any internet-facing endpoint, providing consistent rule enforcement across your applications.

  • Application server

    Application servers use a secondary peer that, like the database, will take over processing in the event of a disaster. Application servers use configuration and metadata that is stored both in the database and the file system. Application server clustering provides protection in the scope of a single region but ongoing modifications and new deployments need to be replicated to the secondary location on an ongoing basis for a consistent disaster recovery.

  • Oracle Database API for MongoDB

    Oracle Database API for MongoDB enables applications to interact with collections of JSON documents in Oracle Database using MongoDB drivers, tools, and SDKs.

  • Autonomous JSON Database

    Oracle Autonomous JSON Database is a cloud document database service that makes it simple to develop JSON-centric applications. It features simple document APIs, serverless scaling, high performance ACID transactions, comprehensive security, and low pay-per-use pricing. Autonomous JSON Database automates provisioning, configuring, tuning, scaling, patching, encrypting, and repairing the database.

  • Object Storage

    OCI Object Storage provides access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store data directly from the internet or from within the cloud platform. You can scale storage without experiencing any degradation in performance or service reliability.

    Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and seldom or rarely access.
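
The VCN and subnet description above states that each subnet is a contiguous address range that doesn't overlap the other subnets in the VCN. A minimal sketch of that segmentation, using Python's standard ipaddress module and a hypothetical 10.0.0.0/16 VCN, could look like this:

```python
import ipaddress
from itertools import islice
from typing import List


def carve_subnets(vcn_cidr: str, new_prefix: int, count: int) -> List[ipaddress.IPv4Network]:
    """Return the first `count` equally sized subnets of a VCN CIDR block."""
    vcn = ipaddress.ip_network(vcn_cidr)
    # subnets() yields contiguous, non-overlapping blocks of the requested prefix length.
    subnets = list(islice(vcn.subnets(new_prefix=new_prefix), count))
    if len(subnets) < count:
        raise ValueError("the VCN CIDR block is too small for the requested subnets")
    return subnets


# Hypothetical layout: one /24 each for load balancer, application, and database.
for name, subnet in zip(["load-balancer", "application", "database"],
                        carve_subnets("10.0.0.0/16", 24, 3)):
    print(name, subnet)
```

The subnet names and sizes are illustrative; choose prefix lengths that match the number of resources planned for each tier.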

Architecture Variant

The fully managed MongoDB API provided by Autonomous JSON Database is the best solution for most workloads since it is easier to manage.

If you need to manually control the configuration and management of Oracle REST Data Services, for example, to allow the application to use larger connection pools, then customer-managed Oracle REST Data Services is an option.

Note:

Use this architecture variant if there is a specific workload requirement to do so. Only advanced users should deploy this architecture variant.

This section only describes the differences compared to the previously described physical architecture, so all physical architecture design principles are valid unless stated otherwise.

The architecture diagram below depicts how the variant is deployed. For simplicity, only the cloud resources deployed in the JSON Workload VCN are depicted, since the rest of the deployment is the same as described before.

[Diagram: mongodb-json-arch-variant-oracle.zip]

The following is the front-end tier for the variant:
  • Back-end application code is deployed in application servers that are part of an instance pool.
  • The incoming user requests are distributed by the load balancer, so the front-end tier is horizontally scalable and doesn't have a single point of failure.
  • Customer-managed Oracle REST Data Services is installed on each application server and configured to enable the MongoDB API, so that the application can connect to the database using MongoDB tools and drivers.
  • Customer-managed Oracle REST Data Services is configured to meet the workload's nonfunctional requirements, for instance, by configuring larger connection pools or using a different database service.
  • Both the back-end code and the customer-managed Oracle REST Data Services are preinstalled and preconfigured in the instance configuration used by the pool, so that any instance added to the pool can run the back end and connect to the database as soon as it is provisioned.

Recommendations

Use the following recommendations as a starting point to further improve and evolve the workload. Your requirements might differ from the architecture described here.
  • VCN

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

  • Application Deployment

    Consider using a container-based deployment on Oracle Kubernetes Engine (OKE) if the application can run in containers.

  • Security

    Consider using Data Safe to further increase the workload security posture and to enable database auditing.

  • Observability
    • Consider using OCI Audit to perform forensic auditing for all OCI services beyond Autonomous JSON Database.
    • Consider using Monitoring, Logging and Logging Analytics to have full visibility of the environment operating status.
  • Disaster Recovery

    Consider using OCI Full Stack Disaster Recovery to automate and orchestrate disaster recovery for all the layers of the stack.

  • Operational Efficiency
    • Consider using Elastic Pools if the Autonomous JSON workload is part of a wider database fleet, for increased cost efficiency.
    • Consider enabling Database Management, an OCI service that provides a comprehensive set of database performance monitoring and management features, to streamline management of the Autonomous JSON Database instance.
  • Application Evolution
    • Consider deploying operational analytics and real-time reporting in Autonomous JSON Database using SQL and a front end such as APEX or Oracle Analytics Cloud, without moving data out of the database, for trusted and real-time data analysis.
    • Consider using Autonomous JSON Database for machine learning with Oracle Machine Learning (OML), to build and train models on JSON data without any data movement and to deploy the models alongside the existing workload for efficient inferencing.
    • For additional use cases beyond the application core, consider using Autonomous JSON Database Select AI together with database views that query JSON and hold metadata, so that users can query JSON data using natural language.
    • Consider using Autonomous JSON Database to store additional data types (relational, vector, spatial, or graph), up to 20 GB, for added workload functionality and flexibility.
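
The VCN recommendation above asks for CIDR blocks that sit in the standard private IP address space and don't overlap any network you plan to connect privately. The sketch below checks both conditions for IPv4 with Python's standard ipaddress module. The candidate and existing CIDR blocks are hypothetical, and the private ranges are the three RFC 1918 blocks.

```python
import ipaddress
from typing import Iterable, List

# RFC 1918 private IPv4 address ranges.
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(cidr: str) -> bool:
    """Return True when an IPv4 CIDR block lies entirely inside a private range."""
    network = ipaddress.ip_network(cidr)
    return any(network.subnet_of(private) for private in PRIVATE_RANGES)


def find_overlaps(candidate: str, existing: Iterable[str]) -> List[str]:
    """Return the existing CIDR blocks that overlap the candidate block."""
    network = ipaddress.ip_network(candidate)
    return [cidr for cidr in existing if network.overlaps(ipaddress.ip_network(cidr))]


# Hypothetical networks: an on-premises range, another VCN, and a lab network.
existing_networks = ["10.0.0.0/16", "10.1.128.0/17", "192.168.0.0/24"]
print(is_rfc1918("10.1.0.0/16"))
print(find_overlaps("10.1.0.0/16", existing_networks))
```

An empty result from find_overlaps, together with a True result from is_rfc1918, indicates that the candidate block satisfies both conditions.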

Acknowledgments

  • Authors: José Cruz
  • Contributors: Massimo Castelli, Simon Griffith, Hermann Baer, Matt DeMarco, Julian Dontcheff