Modernize and Consolidate AWS RDS for MySQL and MongoDB Atlas on Oracle Autonomous Database

Learn how one customer replaced a complex, multiple-database architecture on Amazon Web Services (AWS) that used MySQL and MongoDB Atlas with a single, converged Autonomous Database.

The database was migrated by using Oracle Cloud Infrastructure GoldenGate, a fully managed, native cloud service that moves data in real time and at scale. The migration also preserved a large percentage of application code by using Oracle Database API for MongoDB.

The legacy architecture deployed on AWS was a complex data-processing workflow built on a managed Amazon RDS for MySQL database and MongoDB Atlas. It was designed to ensure data consistency and accuracy while providing the information to the customer's own customers in a flexible JSON format.

The following diagram illustrates the legacy data flow:



[Diagram: legacy data flow. Download: aws-rds-oci-adw-flow1-oracle.zip]

Multiple, fragmented systems were coupled together, increasing the complexity of data loading, transformation, and synchronization. The legacy architecture was not only complex, but also wasted resources and was error-prone. The complexity of the system:

  • Increased the chances of data processing errors, leading to lower data quality and less than 100% accuracy
  • Increased the processing times, leading to suboptimal time-to-production
  • Increased the operational overhead, wasting time and resources on problem-solving and debugging

Architecture

The production database is migrated from Amazon RDS for MySQL and MongoDB Atlas in AWS to Oracle Autonomous Database deployed in Oracle Cloud Infrastructure (OCI) US-East (Ashburn) region by using OCI GoldenGate provisioned in the same region and virtual cloud network (VCN).

Unidirectional replication between the two clouds is used only for migration.

Applications are deployed in AWS US-East (Ashburn). After migration, the application tier is also migrated to an OCI compute instance. Dedicated connectivity is provided by Oracle Cloud Infrastructure FastConnect and an OCI FastConnect partner to connect databases running in AWS to OCI GoldenGate. For a link to a list of OCI FastConnect partners by region, refer to the Explore More section.

The high-level migration and consolidation steps are as follows:

  1. Prepare the MySQL and MongoDB databases for replication using OCI GoldenGate.
  2. Provision two OCI GoldenGate deployments for running Extracts: MySQL deployment for MySQL database and Big Data deployment for MongoDB.
  3. Create connections from OCI GoldenGate to MySQL and MongoDB.
  4. For the target, create an Oracle deployment in OCI GoldenGate to replicate data to Oracle Autonomous Database.
  5. Create a connection from OCI GoldenGate to Oracle Autonomous Database.
  6. Assign connections to the respective deployments.
  7. Create an initial load Extract for both MySQL and MongoDB deployments.
  8. Create distribution service processes to send the data from the source MySQL and MongoDB deployments to the target Oracle deployment.
  9. Create a Replicat process on the target Oracle deployment to replicate initial load data.
  10. Create a change-data capture Extract for both MySQL and MongoDB deployments.
  11. Create a change-data capture Replicat process on the target Oracle deployment to replicate data.
  12. Migrate the application from Amazon EC2 to the OCI compute instance.
  13. Connect the application in OCI to Oracle Autonomous Database.
  14. Disconnect and decommission the applications and databases on AWS.
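
The initial load and change-data capture steps (7 through 11) work together: a snapshot is copied first, and changes captured after the snapshot are then replayed on the target. The following Python sketch is a simplified simulation of that idea and does not use OCI GoldenGate. The Change record, its position field, and the migrate function are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One captured change (hypothetical record layout)."""
    position: int                 # increasing position in the change log
    op: str                       # "insert", "update", or "delete"
    key: int
    value: Optional[str] = None


def apply_change(table: dict[int, str], change: Change) -> None:
    """Apply one captured change to a simple key/value table."""
    if change.op == "delete":
        table.pop(change.key, None)
    else:
        # Inserts and updates are both treated as upserts in this sketch.
        table[change.key] = change.value


def migrate(snapshot: dict[int, str], snapshot_position: int,
            log: list[Change]) -> dict[int, str]:
    """Copy the snapshot, then replay only the changes captured after it."""
    target = dict(snapshot)                          # initial load
    for change in sorted(log, key=lambda c: c.position):
        if change.position > snapshot_position:      # change-data capture
            apply_change(target, change)
    return target


# The source database and the log of changes captured so far.
source = {1: "alice", 2: "bob"}
log = [Change(1, "insert", 1, "alice"), Change(2, "insert", 2, "bob")]

# The initial load reads a snapshot taken at log position 2.
snapshot, snapshot_position = dict(source), 2

# The application keeps writing to the source while the load runs.
for change in [Change(3, "update", 2, "robert"),
               Change(4, "insert", 3, "carol"),
               Change(5, "delete", 1)]:
    apply_change(source, change)
    log.append(change)

target = migrate(snapshot, snapshot_position, log)
print(target == source)  # → True
```

Because only changes after the snapshot position are replayed, the target converges on the source without applying any change twice. This is why the source can stay online during the migration.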

With converged database capabilities that support relational and document-store requirements within a single database, Oracle Autonomous Database running on OCI provides the core platform to achieve the customer’s goals. Oracle Autonomous Database enables them to store the raw data as-is in its relational format with enforced data integrity. Data for the end-user applications is returned in the required JSON format in milliseconds. The data is readily available through SQL and JSON and through Oracle Database API for MongoDB, without any further processing.
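
To make the converged idea concrete, the following Python sketch stores data relationally with enforced integrity and returns it as a JSON document. It uses the SQLite module from the Python standard library purely as a stand-in so that the example runs anywhere. It is not how Oracle Autonomous Database or Oracle Database API for MongoDB is accessed, and the table names, column names, and customer_document function are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item        TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, "widget"), (11, 1, "gear")])


def customer_document(conn: sqlite3.Connection, customer_id: int) -> str:
    """Assemble one JSON document from the relational rows."""
    name = conn.execute("SELECT name FROM customers WHERE id = ?",
                        (customer_id,)).fetchone()[0]
    orders = conn.execute(
        "SELECT id, item FROM orders WHERE customer_id = ? ORDER BY id",
        (customer_id,)).fetchall()
    return json.dumps({
        "_id": customer_id,
        "name": name,
        "orders": [{"id": order_id, "item": item} for order_id, item in orders],
    })


# Data integrity is enforced: an order for a missing customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (12, 99, 'bolt')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(customer_document(conn, 1))
print(rejected)  # → True
```

The point of the sketch is that a single store holds the authoritative relational rows and also serves the document shape, so no second database or synchronization job is needed.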

The following diagram illustrates the architecture:



[Diagram: migration architecture. Download: aws-rds-oci-adw-arch-oracle.zip]

In this architecture, OCI GoldenGate and Oracle Autonomous Database are provisioned in OCI, and OCI GoldenGate facilitates unidirectional replication from Amazon RDS for MySQL and MongoDB Atlas.

By using OCI GoldenGate, you can also design, run, orchestrate, and monitor data replication and streaming analytics tasks without having to allocate or manage any compute environments. OCI GoldenGate is not required to run the application and database after the migration is completed, and it can then be removed.

Consolidating the former database architecture into a single Oracle Autonomous Database running on OCI eliminates the need to operate multiple single-purpose databases and the complex, synchronized extract, transform, and load (ETL) processes between them. The consolidation also reduces processing time while increasing data quality and accuracy.

The following diagram illustrates the current data flow:



[Diagram: current data flow. Download: aws-rds-oci-adw-flow2-oracle.zip]

The customer realized the following immediate business benefits:

  • 15% faster project time-to-production and better quality to deliver higher customer satisfaction
  • Increased application performance through a simpler architecture and Oracle Autonomous Database capabilities
  • 10% operational cost savings through less infrastructure and reduced application and database management
  • Increased developer productivity to focus on innovation rather than on fixing problems

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domain

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • FastConnect

    Oracle Cloud Infrastructure FastConnect provides an easy way to create a dedicated, private connection between your data center and Oracle Cloud Infrastructure. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.

  • Dynamic routing gateway (DRG)

    The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, and between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.

  • Bastion service

    Oracle Cloud Infrastructure Bastion provides restricted and time-limited secure access to resources that don't have public endpoints and that require strict resource access controls, such as bare metal and virtual machines, Oracle MySQL Database Service, Autonomous Transaction Processing (ATP), Oracle Container Engine for Kubernetes (OKE), and any other resource that allows Secure Shell Protocol (SSH) access. With Oracle Cloud Infrastructure Bastion service, you can enable access to private hosts without deploying and maintaining a jump host. In addition, you gain improved security posture with identity-based permissions and a centralized, audited, and time-bound SSH session. Oracle Cloud Infrastructure Bastion removes the need for a public IP for bastion access, eliminating the hassle and potential attack surface when providing remote access.

  • Compute

    The Oracle Cloud Infrastructure Compute service enables you to provision and manage compute hosts in the cloud. You can launch compute instances with shapes that meet your resource requirements for CPU, memory, network bandwidth, and storage. After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you no longer need it.

  • Object storage

    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and seldom or rarely access.

  • Identity and Access Management (IAM)

    Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.

  • Logging

    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:

      • Audit logs: Logs related to events emitted by the Audit service.
      • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
      • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.

  • Monitoring

    The Oracle Cloud Infrastructure Monitoring service actively and passively monitors your cloud resources. It uses metrics to track the resources and alarms to notify you when those metrics meet alarm-specified triggers.

  • Cloud Guard

    You can use Oracle Cloud Guard to monitor and maintain the security of your resources in Oracle Cloud Infrastructure. Cloud Guard uses detector recipes that you can define to examine your resources for security weaknesses and to monitor operators and users for risky activities. When any misconfiguration or insecure activity is detected, Cloud Guard recommends corrective actions and assists with taking those actions, based on responder recipes that you can define.

  • GoldenGate

    Oracle Cloud Infrastructure GoldenGate is a fully managed service that ingests data from sources residing on premises or in any cloud. It uses GoldenGate change data capture (CDC) technology for nonintrusive, efficient capture of data and delivers that data to Oracle Autonomous Database in real time and at scale, so that relevant information is available to consumers as quickly as possible.

  • AWS RDS for MySQL

    Amazon Relational Database Service (RDS) for MySQL is a fully managed MySQL offering from Amazon Web Services (AWS).

  • Autonomous Transaction Processing

    Oracle Autonomous Transaction Processing is a self-driving, self-securing, self-repairing database service that is optimized for transaction processing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.

Recommendations

Use the following recommendations as a starting point. Your requirements might differ from the architecture described here.

  • Choice of interconnection location

    This architecture requires one or more geographic locations for its components: the OCI region and associated OCI FastConnect edge node, and the AWS region and associated AWS Direct Connect edge node. To achieve the optimal end-to-end latency, we recommend that you select a location that has each of these architectural elements in close proximity.

  • Provisioning

    For an Oracle database that uses a private endpoint, allow traffic only from the VCN that you specify. This blocks access to the database from all public IP addresses and from other VCNs. Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

  • Application design

    When using active-active replication, the time zones must be the same on both database systems so that timestamp-based conflict detection and resolution can operate effectively.

  • Data uniqueness

    When using active-active replication, assign each database a unique range of sequence values so that the origin of each row is easy to identify and key conflicts are prevented. Conflict resolution procedures must be implemented on all systems in an active-active configuration. Conflicts should be identified immediately and handled with as much automation as possible.
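
The provisioning recommendation about non-overlapping CIDR blocks can be checked before any network is created. The following Python sketch uses the standard ipaddress module to list every overlapping pair in a planned address layout. The network names and CIDR values are hypothetical examples, not values from this architecture.

```python
from __future__ import annotations

import ipaddress
from itertools import combinations


def overlapping_pairs(networks: dict[str, str]) -> list[tuple[str, str]]:
    """Return the names of every pair of networks whose CIDR blocks overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [(a, b) for a, b in combinations(parsed, 2)
            if parsed[a].overlaps(parsed[b])]


# Hypothetical address plan for the networks that will be privately connected.
planned = {
    "oci-vcn": "10.0.0.0/16",
    "aws-vpc": "10.1.0.0/16",
    "on-premises": "10.0.128.0/20",
}

print(overlapping_pairs(planned))  # → [('oci-vcn', 'on-premises')]
```

An empty result means that the plan is safe to use for private connections. Any reported pair must be re-addressed before the networks are connected.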
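
The data uniqueness recommendation can be illustrated with a small Python sketch in which each site draws keys from its own reserved block of values. This is only one possible scheme, shown under assumptions: the block size, the SiteSequence class, and the owning_site function are hypothetical, and a real deployment would configure database sequences instead.

```python
RANGE_SIZE = 1_000_000  # hypothetical number of values reserved for each site


def site_range(site_index: int) -> range:
    """Return the block of key values reserved for one site."""
    first = site_index * RANGE_SIZE + 1
    return range(first, first + RANGE_SIZE)


def owning_site(key: int) -> int:
    """Identify which site generated a key, based on the block it falls in."""
    return (key - 1) // RANGE_SIZE


class SiteSequence:
    """Hands out keys from a single site's reserved block."""

    def __init__(self, site_index: int) -> None:
        self._values = iter(site_range(site_index))

    def next_value(self) -> int:
        try:
            return next(self._values)
        except StopIteration:
            raise RuntimeError("sequence range exhausted") from None


site_a = SiteSequence(0)
site_b = SiteSequence(1)

key_a = site_a.next_value()
key_b = site_b.next_value()
print(key_a, key_b)                            # → 1 1000001
print(owning_site(key_a), owning_site(key_b))  # → 0 1
```

Because the blocks never intersect, two sites cannot generate the same key, and the origin of any row can be read directly from its key.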

Considerations

When implementing a split-stack deployment, consider these options.

  • Network

    With a multicloud solution, networking is a key determinant of overall system performance. It is the customer’s responsibility to ensure that the cloud-to-cloud network (bandwidth and latency) is fully tested to ensure the application performance meets the defined business requirements.

  • Cost

    Autoscaling Oracle CPUs allows the database to automatically adjust its compute resources (CPU and memory) based on workload demands. During peak workloads, Oracle Autonomous Transaction Processing (ATP) can allocate more resources to ensure optimal performance, and during periods of lower demand, it can scale down to save costs. This elasticity is particularly useful for handling fluctuating workloads efficiently and cost-effectively.

  • Conflict Resolution Strategy

    Define a clear conflict resolution strategy in your application. When conflicting changes are detected, your application should be programmed to handle them based on predefined rules. These rules could prioritize changes from one source over another or merge conflicting changes based on specific criteria.
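
For the network consideration, a first latency check can be as simple as timing TCP connections from one cloud to an endpoint in the other. The following Python sketch is a minimal starting point and not a substitute for a full bandwidth and latency test. The function names are hypothetical, and the host and port must be supplied by the reader.

```python
from __future__ import annotations

import socket
import statistics
import time


def tcp_connect_latency_ms(host: str, port: int, samples: int = 10,
                           timeout: float = 3.0) -> list[float]:
    """Time repeated TCP connections to host:port, in milliseconds."""
    results = []
    for _ in range(samples):
        started = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            pass  # only the connection setup time is measured
        results.append((time.perf_counter() - started) * 1000.0)
    return results


def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Summarize samples with min, median, nearest-rank 95th percentile, and max."""
    ordered = sorted(latencies_ms)
    p95_index = (95 * len(ordered) + 99) // 100 - 1  # integer ceiling of 0.95 * n, minus 1
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


# Example with made-up samples; no network traffic is generated here.
example = summarize([float(value) for value in range(1, 21)])
print(example)
```

To measure a real path, pass the result of tcp_connect_latency_ms for the database host and listener port to summarize, and compare the 95th percentile against the latency that the application can tolerate.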
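
One possible set of predefined rules is "latest change wins, with a fixed source priority as the tie-breaker." The following Python sketch shows that rule under assumptions: the RowVersion record, the source names, and the priority table are hypothetical. Timestamps are normalized to UTC before comparison so that two systems in different time zones are compared on the same clock.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RowVersion:
    """One side of a conflicting change (hypothetical record layout)."""
    source: str
    modified_at: datetime  # must be timezone-aware
    data: dict


# Hypothetical priority table; the lower number wins when timestamps are equal.
SOURCE_PRIORITY = {"primary": 0, "secondary": 1}


def resolve(a: RowVersion, b: RowVersion) -> RowVersion:
    """Latest change wins; equal timestamps fall back to source priority."""
    for version in (a, b):
        if version.modified_at.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
    a_utc = a.modified_at.astimezone(timezone.utc)
    b_utc = b.modified_at.astimezone(timezone.utc)
    if a_utc != b_utc:
        return a if a_utc > b_utc else b
    return min((a, b), key=lambda version: SOURCE_PRIORITY[version.source])


# 09:00 at UTC-5 is 14:00 UTC; 15:30 at UTC+2 is 13:30 UTC.
primary = RowVersion("primary",
                     datetime(2024, 1, 1, 9, 0, tzinfo=timezone(timedelta(hours=-5))),
                     {"status": "shipped"})
secondary = RowVersion("secondary",
                       datetime(2024, 1, 1, 15, 30, tzinfo=timezone(timedelta(hours=2))),
                       {"status": "packed"})

winner = resolve(primary, secondary)
print(winner.source)  # → primary
```

Comparing the local wall-clock times alone would have picked the secondary change, which is why consistent time zones matter for timestamp-based resolution.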

Acknowledgments

  • Authors: Vivek Verma
  • Contributors: Wei Han, Robert Lies