1 Disaster Recovery Introduction

This chapter provides an introduction to the Oracle Fusion Middleware Disaster Recovery solution.

It contains the following topics:

Disaster Recovery Overview
Disaster Recovery for Oracle Fusion Middleware Components

1.1 Disaster Recovery Overview

This section provides an overview of Oracle Fusion Middleware Disaster Recovery.

It contains the following topics:

Problem Description and Common Solutions
Terminology

1.1.1 Problem Description and Common Solutions

Providing Maximum Availability Architecture is one of the key requirements for any Oracle Fusion Middleware enterprise deployment. Oracle Fusion Middleware includes an extensive set of high availability features, such as process death detection and restart, server clustering, server migration, clusterware integration, GridLink, load balancing, failover, backup and recovery, rolling upgrades, and rolling configuration changes. These features protect an enterprise deployment from unplanned downtime and minimize planned downtime.

Additionally, enterprise deployments need protection from unforeseen disasters and natural calamities. One protection solution involves setting up a standby site at a geographically different location than the production site. The standby site may have equal or fewer services and resources compared to the production site. Application data, metadata, configuration data, and security data are replicated to the standby site on a periodic basis. The standby site is normally in a passive mode; it is started when the production site is not available. This deployment model is sometimes referred to as an active/passive model. This model is normally adopted when the two sites are connected over a WAN and network latency does not allow clustering across the two sites.

Hot-pluggability is both a core strategy for and a key feature of Oracle Fusion Middleware. Built for the heterogeneous enterprise, Oracle Fusion Middleware consists of modular component software that runs on a range of popular platforms and interoperates with middleware technologies and business applications from other software vendors such as IBM, Microsoft, and SAP. For instance, Oracle Fusion Middleware products and technologies such as ADF, Oracle BPEL Process Manager, Oracle Enterprise Service Bus, Oracle Web Services Manager, Adapters, Oracle Access Manager, Oracle Identity Manager, Rules, Oracle TopLink, and Oracle Business Intelligence Publisher can run on non-Oracle containers such as IBM WebSphere and JBoss, in addition to running on the Oracle WebLogic Server container.

The Oracle Fusion Middleware Disaster Recovery solution uses storage replication technology for disaster protection of Oracle Fusion Middleware middle tier components. It supports hot-pluggable deployments, and it is compatible with third party vendor recommended solutions.

Disaster protection for Oracle databases that are included in your Oracle Fusion Middleware topology is provided through Oracle Data Guard.

This document describes how to deploy the Oracle Fusion Middleware Disaster Recovery solution for enterprise deployments on Linux and UNIX operating systems, making use of storage replication technology and Oracle Data Guard technology.

1.1.2 Terminology

This section defines the following Disaster Recovery terminology:

asymmetric topology: An Oracle Fusion Middleware Disaster Recovery configuration that is different across tiers on the production site and standby site. For example, an asymmetric topology can include a standby site with fewer hosts and instances than the production site. Section 4.4, "Creating an Asymmetric Standby Site" describes how to create asymmetric topologies.
disaster: A sudden, unplanned catastrophic event that causes unacceptable damage or loss. A disaster is an event that compromises an organization's ability to provide critical functions, processes, or services for some unacceptable period of time and causes the organization to invoke its recovery plans.
Disaster Recovery: The ability to safeguard against natural or unplanned outages at a production site by having a recovery strategy for applications and data to a geographically separate standby site.
alias host name: This guide differentiates between the terms alias host name and physical host name.

The alias host name is an alternate way to access the system besides its real network name. Typically, it resolves to the same IP address as the network name of the system. This can be defined in the name resolution system such as DNS, or locally in the local hosts file on each system. Multiple alias host names can be defined for a given system.

See also the physical host name definition later in this section.
physical host name: The physical host name is the host name of the system as returned by the gethostname() call or the hostname command. Typically, the physical host name is also the network name used by clients to access the system. In this case, an IP address is associated with this name in the DNS (or the given name resolution mechanism in use) and this IP is enabled on one of the network interfaces to the system.

A given system typically has one physical host name. It can also have one or more additional network names, corresponding to IP addresses enabled on its network interfaces, that are used by clients to access it over the network. Further, each network name can be aliased with one or more alias host names.

See also the alias host name definition earlier in this section.
virtual host name: Virtual host name is a network addressable host name that maps to one or more physical machines via a load balancer or a hardware cluster. For load balancers, the name "virtual server name" is used interchangeably with virtual host name in this book. A load balancer can hold a virtual host name on behalf of a set of servers, and clients communicate indirectly with the machines using the virtual host name. A virtual host name in a hardware cluster is a network host name assigned to a cluster virtual IP. Because the cluster virtual IP is not permanently attached to any particular node of a cluster, the virtual host name is not permanently attached to any particular node either.

Note:

Whenever the term "virtual host name" is used in this document, it is assumed to be associated with a virtual IP address. In cases where just the IP address is needed or used, it will be explicitly stated.
virtual IP: Also called cluster virtual IP or load balancer virtual IP. Generally, a virtual IP can be assigned to a hardware cluster or a load balancer. To present a single system view of a cluster to network clients, a virtual IP serves as an entry point IP address to the group of servers that are members of the cluster.

A hardware cluster uses a cluster virtual IP to present to the outside world the entry point into the cluster (it can also be set up on a standalone machine). The hardware cluster's software manages the movement of this IP address between the two physical nodes of the cluster while clients connect to this IP address without the need to know which physical node this IP address is currently active on. In a typical two-node hardware cluster configuration, each machine has its own physical IP address and physical host name, while there could be several cluster IP addresses. These cluster IP addresses float or migrate between the two nodes. The node with current ownership of a cluster IP address is active for that address.

A load balancer also uses a virtual IP as the entry point to a set of servers. These servers tend to be active at the same time. This virtual IP address is not assigned to any individual server but to the load balancer which acts as a proxy between servers and their clients.
production site setup: The process of creating the production site. To create the production site using the procedure described in this manual, you must plan and create physical host names and alias host names, create mount points and symbolic links (if applicable) on the hosts to the Oracle home directories on the shared storage where the Oracle Fusion Middleware instances will be installed, install the binaries and instances, and deploy the applications. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links.
site failover: The process of making the current standby site the new production site after the production site becomes unexpectedly unavailable (for example, due to a disaster at the production site). This book also uses the term "failover" to refer to a site failover.
site switchback: The process of reverting the current production site and the current standby site to their original roles. Switchbacks are planned operations done after the switchover operation has been completed. A switchback restores the original roles of each site: the current standby site becomes the production site and the current production site becomes the standby site. This book also uses the term "switchback" to refer to a site switchback.
site switchover: The process of reversing the roles of the production site and standby site. Switchovers are planned operations done for periodic validation or to perform planned maintenance on the current production site. During a switchover, the current standby site becomes the new production site, and the current production site becomes the new standby site. This book also uses the term "switchover" to refer to a site switchover.
site synchronization: The process of applying changes made to the production site at the standby site. For example, when a new application is deployed at the production site, you should perform a synchronization so that the same application is also deployed at the standby site.
standby site setup: The process of creating the standby site. To create the standby site using the procedure described in this manual, you must plan and create physical host names and alias host names, and create mount points and symbolic links (if applicable) to the Oracle home directories on the standby shared storage. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links.
symmetric topology: An Oracle Fusion Middleware Disaster Recovery configuration that is completely identical across tiers on the production site and standby site. In a symmetric topology, the production site and standby site have the identical number of hosts, load balancers, instances, and applications. The same ports are used for both sites. The systems are configured identically and the applications access the same data. This manual describes how to set up a symmetric Oracle Fusion Middleware Disaster Recovery topology for an enterprise configuration.
topology: The production site and standby site hardware and software components that comprise an Oracle Fusion Middleware Disaster Recovery solution.
target: Targets are core Enterprise Manager entities that represent the infrastructure and business components in an enterprise. These components need to be monitored and managed for efficient functioning of the business. Examples of targets are an Oracle Fusion Middleware farm and an Oracle Database.
system: A System is the set of targets (hosts, databases, application servers, and so on) that work together to host your applications. To monitor an application in Enterprise Manager, you first create a System that consists of the database, listener, application server, and host targets on which the application runs.
site: A site is a set of different targets in a datacenter needed to run a group of applications. For example, a site could consist of Oracle Fusion Middleware instances, databases, storage, and so on. A datacenter may have more than one site defined by Oracle Site Guard, and each site is managed independently for operations like switchover and failover.
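
The difference between a physical host name and an alias host name, as defined earlier in this section, can be illustrated with a short Python sketch. It is not part of the Oracle Fusion Middleware Disaster Recovery solution. The hosts-file content, host names, and IP addresses are hypothetical, and the helper functions parse_hosts and is_alias_of are introduced here only for illustration.

```python
import socket

# Hypothetical hosts-file content; the names and addresses are illustrative only.
SAMPLE_HOSTS = """\
# IP address     network name            alias host names
10.0.1.11        prodweb1.example.com    web1 webhost1
10.0.1.12        prodapp1.example.com    app1
"""


def parse_hosts(text):
    """Map every host name found in hosts-file text to its IP address."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *hostnames = line.split()
        for name in hostnames:
            names[name] = ip
    return names


def is_alias_of(names, alias, network_name):
    """Return True when alias is a different name for the same IP address."""
    return (
        alias != network_name
        and alias in names
        and network_name in names
        and names[alias] == names[network_name]
    )


names = parse_hosts(SAMPLE_HOSTS)

# The physical host name of the machine running this sketch, as returned by
# the gethostname() call.
physical = socket.gethostname()
print("physical host name:", physical)
print("web1 is an alias:", is_alias_of(names, "web1", "prodweb1.example.com"))
```

In this sketch, web1 and webhost1 are two alias host names for the same system, which matches the statement that multiple alias host names can be defined for a given system.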

1.2 Disaster Recovery for Oracle Fusion Middleware Components

This section provides an introduction to setting up Disaster Recovery for a common Oracle Fusion Middleware enterprise deployment.

It contains the following topics:

Oracle Fusion Middleware Disaster Recovery Architecture Overview
Components Described in this Document

1.2.1 Oracle Fusion Middleware Disaster Recovery Architecture Overview

This section describes the deployment architecture for Oracle Fusion Middleware components.

The product binaries and configuration for Oracle Fusion Middleware components and applications are deployed in Oracle home directories on the middle tier. Additionally, most of the products also have metadata or run-time data stored in a database repository.

Therefore, the Oracle Fusion Middleware Disaster Recovery solution keeps middle tier file system data and middle tier data stored in databases at the production site synchronized with the standby site.

The Oracle Fusion Middleware Disaster Recovery solution supports these methods of providing data protection for Oracle Fusion Middleware data and database content:

Oracle Fusion Middleware product binaries, configuration, and metadata files

Use storage replication technologies.
Database content

Use Oracle Data Guard for Oracle databases (and vendor-recommended solutions for third party databases).

Figure 1-1 shows an overview of an Oracle Fusion Middleware Disaster Recovery topology:

Figure 1-1 Production and Standby Site for Oracle Fusion Middleware Disaster Recovery Topology

Some of the key aspects of the solution in Figure 1-1 are:

The solution has two sites. The current production site is running and active, while the second site is serving as a standby site and is in passive mode.
Hosts on each site have mount points defined for accessing the shared storage system for the site.
On both sites, the Oracle Fusion Middleware components are deployed on the site's shared storage system. This involves creating all the Oracle home directories, which include product binaries and configuration data for middleware components, in volumes on the production site's shared storage and then installing the components into the Oracle home directories on the shared storage. In Figure 1-1, a separate volume is created in the shared storage for each Oracle Fusion Middleware host cluster (note the Web, Application, and Security volumes created for the Web Cluster, Application Cluster, and Security Cluster in each site's shared storage system).
Mount points must be created on the shared storage for the production site. The Oracle Fusion Middleware software for the production site will be installed into Oracle home directories using the mount points on the production site shared storage. Symbolic links may also need to be set up on the production site hosts to the Oracle Fusion Middleware home directories on the shared storage at the production site. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links.
Mount points must be created on the shared storage for the standby site. Symbolic links may also need to be set up on the standby site hosts to the Oracle Fusion Middleware home directories on the shared storage at the standby site. Note that symbolic links are required only in cases where the storage system does not guarantee consistent replication across multiple volumes. See Section 3.2.3, "Storage Replication" for more details about symbolic links. The mount points and symbolic links for the standby site hosts must be identical to those set up for the equivalent production site hosts.
Storage replication technology is used to copy the middle tier file systems and other data from the production site's shared storage to the standby site's shared storage.
After storage replication is enabled, application deployment, configuration, metadata, data, and product binary information is replicated from the production site to the standby site.
It is not necessary to perform any Oracle software installations at the standby site hosts. When the production site storage is replicated to the standby site storage, the equivalent Oracle home directories and data are written to the standby site storage.
Schedule incremental replications at a specified interval. The recommended interval is once a day for the production deployment, where the middle tier configuration does not change very often. Additionally, you should force a manual synchronization whenever you make a change to the middle tier configuration at the production site (for example, if you deploy a new application at the production site). Some Oracle Fusion Middleware components generate data on the file system, which may require more frequent replication based on recovery point objectives. Please refer to Chapter 2, "Recommendations for Fusion Middleware Components" for detailed Disaster Recovery recommendations for Oracle Fusion Middleware components.
Before forcing a manual synchronization, you should take a snapshot of the site to capture its current state. This ensures that the snapshot gets replicated to the standby site storage and can be used to roll back the standby site to a previous synchronization state, if desired. Recovery to the point of the previously successful replication (for which a snapshot was created) is possible when a replication fails.
Oracle Data Guard is used to replicate all Oracle database repositories, including Oracle Fusion Middleware repositories and custom application databases. For information about using Oracle Data Guard to provide disaster protection for Oracle databases, see Section 3.3, "Database Considerations."
If your Oracle Fusion Middleware Disaster Recovery topology includes any third party databases, use the vendor-recommended solution for those databases.
User requests are initially routed to the production site.
When there is a failure or planned outage of the production site, you perform the following steps to enable the standby site to assume the production role in the topology:
1. Stop the replication from the production site to the standby site (when a failure occurs, replication may have already been stopped due to the failure).
2. Perform a failover or switchover of the Oracle databases using Oracle Data Guard.
3. Start the services and applications on the standby site.
4. Use a global load balancer to re-route user requests to the standby site. At this point, the standby site has assumed the production role.
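
The order of the four steps above can be sketched as a small Python model. This is only an illustration of the sequence under stated assumptions, not an Oracle API or tool: the Topology class, the assume_production_role function, and the site names are hypothetical, and in a real deployment each step is carried out with the storage vendor's replication tooling, Oracle Data Guard, the component start scripts, and the global load balancer.

```python
from dataclasses import dataclass, field


@dataclass
class Topology:
    """Toy model of a two-site topology (hypothetical, for illustration)."""
    production: str
    standby: str
    replication_active: bool = True
    log: list = field(default_factory=list)


def assume_production_role(topology, planned):
    """Record the four steps in order, then swap the site roles."""
    # Step 1: stop storage replication. After a failure it may already
    # have stopped, in which case there is nothing to do.
    if topology.replication_active:
        topology.replication_active = False
        topology.log.append("replication stopped")

    # Step 2: a planned outage calls for a database switchover,
    # an unplanned failure calls for a database failover.
    topology.log.append("database switchover" if planned else "database failover")

    # Step 3: start the services and applications on the standby site.
    topology.log.append(f"services started on {topology.standby}")

    # Step 4: re-route user requests to the standby site.
    topology.log.append(f"requests routed to {topology.standby}")

    # The standby site has now assumed the production role.
    topology.production, topology.standby = topology.standby, topology.production
    return topology


topology = assume_production_role(Topology("site-a", "site-b"), planned=False)
for entry in topology.log:
    print(entry)
```

Keeping the steps in a single function makes the ordering explicit: the databases change roles before the middle tier services start, and user requests are re-routed only after those services are running.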

1.2.2 Components Described in this Document

The Oracle Fusion Middleware Disaster Recovery solution supports components from various Oracle product suites, including:

Oracle WebLogic Server

See Section 2.1, "Recommendations for Oracle WebLogic Server" for Disaster Recovery recommendations for Oracle WebLogic Server components.
Oracle ADF

See Section 2.2, "Recommendations for Oracle ADF" for Disaster Recovery recommendations for Oracle Application Development Framework (Oracle ADF).
Oracle WebCenter Portal components:
- Oracle WebCenter Portal: Spaces
- Oracle WebCenter Portal's Portlet Producers
- Oracle WebCenter Portal's Discussion Server
- Oracle WebCenter Content Server
- Oracle WebCenter Portal Pagelet Producer
- Oracle WebCenter Portal Activity Graph Engines
- Oracle WebCenter Portal's Personalization
- Oracle WebCenter Portal's Analytics Collector
- Oracle WebCenter Portal Services Producer
See Section 2.3, "Recommendations for Oracle WebCenter Portal" for Disaster Recovery recommendations for Oracle WebCenter Portal components.
Oracle SOA Suite components:
- Oracle SOA Service Infrastructure
- Oracle BPEL Process Manager
- Oracle Mediator
- Oracle Human Workflow
- Oracle B2B
- Oracle Web Services Manager
- Oracle User Messaging Service
- Oracle JCA Adapters
- Oracle Business Activity Monitoring
- Oracle Business Process Management
See Section 2.4, "Recommendations for Oracle SOA Suite" for Disaster Recovery recommendations for Oracle SOA Suite components.
Oracle Identity Management components:
- Oracle Internet Directory
- Oracle Virtual Directory
- Oracle Directory Integration Platform
- Oracle Identity Federation
- Oracle Directory Services Manager
- Oracle Access Manager
- Oracle Adaptive Access Manager
- Oracle Identity Manager
- Oracle Identity Navigator
See Section 2.5, "Recommendations for Oracle Identity Management" for Disaster Recovery recommendations for Oracle Identity Management components.
Oracle Portal, Forms, Reports, and Business Intelligence Discoverer components:
- Oracle Portal
- Oracle Forms
- Oracle Reports
- Oracle Business Intelligence Discoverer (Discoverer)
See Section 2.6, "Recommendations for Oracle Portal, Forms, Reports, and Discoverer" for Disaster Recovery recommendations for these components.
Oracle Web Tier components:
- Oracle HTTP Server
- Oracle Web Cache
See Section 2.7, "Recommendations for Oracle Web Tier Components" for Disaster Recovery recommendations for Oracle Web Tier components.
Oracle WebCenter Content:
- Oracle WebCenter Content
- Oracle WebCenter Content: Inbound Refinery
- Oracle WebCenter Content: Imaging
- Oracle WebCenter Content: Information Rights
- Oracle WebCenter Content: Records
See Section 2.8, "Recommendations for Oracle WebCenter Content" for Disaster Recovery recommendations for Oracle WebCenter Content components.
Oracle Business Intelligence:
- Oracle Business Intelligence Enterprise Edition (EE)
- Oracle Business Intelligence Publisher
- Oracle Real-Time Decisions
  
See Section 2.9, "Recommendations for Oracle Business Intelligence" for Disaster Recovery recommendations for Oracle Business Intelligence components.