Migrate applications and enable disaster recovery with ZConverter ZDM

ZConverter's Disaster Recovery Manager (ZDM) enables source-agnostic migration and recovery of complete applications. This architecture shows how you can migrate and back up applications from both on-premises systems and other cloud systems to Oracle Cloud Infrastructure (OCI), and then recover those systems when needed.

ZDM uses ZConverter's proprietary .ZIA image format to create source-agnostic backups. The .ZIA format enables you to capture the entire workload of a server, including operating systems, applications, data, IT services, and their dependencies, all in a consistent state. This allows you to back up critical applications, data, and infrastructure from any source to OCI, regardless of the machine type, disk type, hypervisor, platform, or geography.

During the backup process, each application is compressed and encrypted, and then replicated over the WAN to OCI for long-term storage. You can then use the backups to recover applications in OCI after a disaster, a ransomware attack, or whenever you need to migrate a particular server.

ZConverter enables disaster recovery between dissimilar cloud platforms, hypervisors, and disk formats, which means you can recover the following sources to OCI:

  • VMware, Hyper-V, KVM, Xen
  • AWS, Azure, GCP, Alibaba Cloud
  • OpenStack, CloudStack, Classic Oracle Cloud
  • On-premises (Bare Metal)

Architecture

This architecture shows how you can migrate and back up applications from both on-premises systems and other cloud systems to OCI, and then recover those systems when needed.

You must install a ZDM instance in each source and target environment. ZDM discovers the machines to be protected and pushes an agent to those machines. The backup process then begins and stores backup images as source-agnostic .ZIA files, which are then offloaded to object storage for long-term retention. All backups are replicated to a target ZDM, where they are ready to be used to provision new machines during recovery.

The following diagram illustrates the logical data flow of the backup and replication process. In this case, sample applications that reside in a third-party cloud and in the customer's own data center are backed up and replicated to OCI as the target destination.



The following diagram illustrates the physical layout of ZDM servers located in a customer's data center and in a third-party cloud provider, with OCI as the target destination for the backup files.



zconverter-topology-oracle.zip

The architecture has the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Dynamic routing gateway (DRG)

    The DRG is a virtual router that provides a path for private network traffic between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.

  • FastConnect

    Oracle Cloud Infrastructure FastConnect provides an easy way to create a dedicated, private connection between your data center and Oracle Cloud Infrastructure. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.

  • Site-to-Site VPN

    Site-to-Site VPN provides IPSec VPN connectivity between your on-premises network and VCNs in Oracle Cloud Infrastructure. The IPSec protocol suite encrypts IP traffic before the packets are transferred from the source to the destination and decrypts the traffic when it arrives.

  • Bastion service

    Oracle Cloud Infrastructure Bastion provides restricted and time-limited secure access to resources that don't have public endpoints and that require strict resource access controls, such as bare metal and virtual machines, Oracle MySQL Database Service, Autonomous Transaction Processing (ATP), Oracle Container Engine for Kubernetes (OKE), and any other resource that allows Secure Shell Protocol (SSH) access. With Oracle Cloud Infrastructure Bastion service, you can enable access to private hosts without deploying and maintaining a jump host. In addition, you gain improved security posture with identity-based permissions and a centralized, audited, and time-bound SSH session. Oracle Cloud Infrastructure Bastion removes the need for a public IP for bastion access, eliminating the hassle and potential attack surface when providing remote access.

  • Load balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

  • Network security group (NSG)

    NSGs act as virtual firewalls for your cloud resources. With the zero-trust security model of Oracle Cloud Infrastructure, all traffic is denied, and you can control the network traffic inside a VCN. An NSG consists of a set of ingress and egress security rules that apply to only a specified set of VNICs in a single VCN.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Block volume

    With block storage volumes, you can create, attach, connect, and move storage volumes, and change volume performance to meet your storage, performance, and application requirements. After you attach and connect a volume to an instance, you can use the volume like a regular hard drive. You can also disconnect a volume and attach it to another instance without losing data.

  • Object storage

    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" storage that you need to access quickly and frequently. Use archive storage for "cold" storage that you retain for long periods of time and rarely access.

  • On-premises network

    This network is the local network used by your organization. It is one of the spokes of the topology.

  • Logging

    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:

    • Audit logs: Logs related to events emitted by the Audit service.
    • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
    • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.

  • Monitoring

    The Oracle Cloud Infrastructure Monitoring service actively and passively monitors your cloud resources. It uses metrics to track resources and alarms to notify you when those metrics meet alarm-specified triggers.

  • Policy

    An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.

  • Identity and access management (IAM)

    Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.

  • Audit

    The Oracle Cloud Infrastructure Audit service automatically records calls to all supported Oracle Cloud Infrastructure public application programming interface (API) endpoints as log events. Currently, all services support logging by Oracle Cloud Infrastructure Audit.

  • ZConverter Cloud Disaster Recovery Manager (ZDM)

    ZDM enables backup, migration, disaster recovery, and ransomware protection.

  • .ZIA

    .ZIA is ZConverter’s proprietary source-agnostic image format. It provides the ability to back up, replicate, migrate, and recover operating systems, applications, data, associated IT services, and necessary dependencies, all simultaneously.

Recommendations

Use the following recommendations as a starting point to back up or migrate your workloads to OCI. Your requirements might differ from the architecture described here.
  • Databases

    We recommend you use the database's replication features rather than this solution.

  • Network and DNS resolution

    Network and DNS resolution is typically configured and managed separately when working with disaster recovery scenarios.

  • VCN

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

  • Load balancer bandwidth

    While creating the load balancer, you can either select a predefined shape that provides a fixed bandwidth, or specify a custom (flexible) shape where you set a bandwidth range and let the service scale the bandwidth automatically based on traffic patterns. With either approach, you can change the shape at any time after creating the load balancer.

  • System configuration for ZConverter Disaster Recovery Manager (ZDM)

    Use the following recommendations to help configure your systems:

    • Minimum server specification for ZDM: a virtual machine with 2 cores, 8 GB RAM, and a 60 GB root partition.
    • Create a single object-based storage repository for long-term retention of full and incremental backups. Configure your object storage in a WORM (write once, read many) state to protect against deletion, lost or corrupted data, and ransomware attacks.
    • Use Terraform to auto-provision application servers at the time of recovery.
    • Use Terraform to create additional ZDM pairs by automatically provisioning VCNs, subnets, gateways, and so on.
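
The VCN recommendation above (CIDR blocks inside the standard private address space that don't overlap with any network you plan to connect) can be checked before you provision anything. The following is a minimal sketch using only Python's standard `ipaddress` module. The network names and CIDR values in `plan` are hypothetical placeholders, and `check_cidr_plan` is an illustrative helper, not part of ZDM or any OCI tooling.

```python
import ipaddress
from itertools import combinations

# RFC 1918 private IPv4 ranges (the "standard private IP address space").
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(network):
    """Return True if the network lies entirely inside one RFC 1918 range."""
    return any(network.subnet_of(private) for private in PRIVATE_RANGES)


def check_cidr_plan(plan):
    """Check a mapping of network name -> IPv4 CIDR string.

    Returns a list of human-readable problems; an empty list means the
    plan passed both checks.
    """
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    problems = []

    # Check 1: every block is within the private address space.
    for name, net in networks.items():
        if not is_rfc1918(net):
            problems.append(f"{name} ({net}) is outside the RFC 1918 private ranges")

    # Check 2: no two blocks overlap.
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")

    return problems


# Hypothetical plan: a target VCN plus the networks it will connect to.
plan = {
    "oci-vcn": "10.0.0.0/16",
    "on-premises": "192.168.0.0/16",
    "other-cloud": "10.0.128.0/20",
}

for problem in check_cidr_plan(plan):
    print(problem)
```

In this hypothetical plan, the `other-cloud` block falls inside the `oci-vcn` block, so the check reports one overlap. Choosing a different range for either network clears it.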

Considerations

When enabling migration and disaster recovery with ZDM, consider the following options.

  • Load balancers

    Note that you cannot apply this solution to load balancers.

  • Replication schedule

    Your replication schedule depends on your recovery point objective (RPO). You can adjust replication periods to be shorter or longer depending on your needs.

  • Scalability
    • A pair of ZConverter Disaster Recovery Manager (ZDM) instances is required for each source and target combination.
    • For source environments that are write intensive or that contain very large amounts of data, you can scale up the replication process by adding ZDM servers in the source environment, by adding more cores and RAM to the ZDM VM, or by moving to a multi-processor bare metal server platform that runs local SSD drives or is attached to high-performance network-attached storage (NAS).
    • Each ZDM pair protects approximately 200 servers. The number of servers you can support with a single ZDM pair depends mostly on the amount of data you are capturing.
  • Security
    • ZConverter compresses and encrypts the applications, data, and operating systems at the time of backup, and the workloads remain encrypted and compressed in the target environment until the time of recovery.
    • Use Oracle Cloud Infrastructure Identity and Access Management (IAM) policies to control who can access your cloud resources and what operations can be performed.
    • To protect passwords or any other secrets, consider using the Oracle Cloud Infrastructure Vault service.
  • Cost

    You can use ZDM free of charge to set up a proof-of-concept deployment, which may include completing the production disaster recovery installation, the disaster recovery drill training, and creating a Terraform script for auto-provisioning servers during a recovery. Once you are actively using ZDM, there is a monthly subscription fee that includes cloud disaster recovery, migration, on-premises and cloud backup, and ransomware protection.
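
For the replication schedule and scalability considerations above, a rough planning calculation can help frame the discussion. The sketch below is a simplified model and not how ZDM itself schedules or sizes anything. It assumes that each replication cycle captures the source at the start of the cycle and takes a fixed time to transfer, and it uses the approximate figure of 200 servers per ZDM pair from this document as a default. All function names and sample numbers are hypothetical.

```python
import math

# Approximate figure taken from the text above; the real limit depends
# mostly on the amount of data being captured.
SERVERS_PER_PAIR = 200


def zdm_pairs_needed(server_count, servers_per_pair=SERVERS_PER_PAIR):
    """Estimate the number of ZDM pairs for one source/target combination."""
    if server_count < 0 or servers_per_pair <= 0:
        raise ValueError("server_count must be >= 0 and servers_per_pair > 0")
    return math.ceil(server_count / servers_per_pair)


def worst_case_data_loss_minutes(interval_minutes, transfer_minutes):
    """Upper bound on lost data under the simplified model.

    If the source fails just before an in-flight replication completes,
    the newest usable copy at the target is from the previous cycle, so
    up to one full interval plus one transfer time of changes is lost.
    """
    return interval_minutes + transfer_minutes


def meets_rpo(interval_minutes, transfer_minutes, rpo_minutes):
    """Return True if the schedule stays within the RPO under this model."""
    return worst_case_data_loss_minutes(interval_minutes, transfer_minutes) <= rpo_minutes


# Hypothetical example: 450 servers, hourly replication, 15-minute transfer,
# and a 90-minute RPO.
print("ZDM pairs needed:", zdm_pairs_needed(450))
print("Schedule meets RPO:", meets_rpo(60, 15, 90))
```

If the check fails, the model suggests either shortening the replication interval or reducing transfer time, for example by adding ZDM capacity in the source environment as described under Scalability.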

Deploy

The ZConverter ZDM tool is available from the Oracle Cloud Marketplace.
  1. Go to Oracle Cloud Marketplace.
  2. Click Get App.
  3. Follow the on-screen prompts.

Acknowledgments

  • Author: Zaid Al Qaddoumi
  • Contributors: Ananta Ahluwalia, Bruce Templeton, Won Lee