6 Transaction Recovery Spanning Multiple Sites or Data Centers

This chapter describes best practices for XA transaction recovery of WebLogic domains across physical sites as part of a Disaster Recovery (DR) solution.

This chapter includes the following sections:

Understanding XA Transaction Recovery in Disaster Recovery

Maximizing availability and providing protection from unforeseen disasters and natural calamities are key requirements of a disaster recovery solution for an enterprise deployment. Active-passive solutions involve setting up and pairing a standby site at a geographically different location with an active (production) site. The standby site is normally in a passive mode; it is started when the production site is not available. One aspect of a disaster recovery solution is to ensure that all XA transactions of affected WebLogic domains can be recovered if a production site is no longer available. The following sections provide the requirements and guidelines to recover XA transactions from failed production domains. See "Overview of Disaster Recovery" in Oracle® Fusion Middleware Disaster Recovery Guide for more information on disaster recovery solutions for enterprise deployments.

Requirements for XA Transaction Disaster Recovery

The following conditions and configuration requirements are necessary to enable the successful recovery of XA transactions following the failure of a production site in an active-passive disaster recovery solution:

  • All active-passive domain pairs are configured with a symmetric topology: the two domains in each pair are identical and have the same domain configuration.

  • The timing of the failover is manually controlled. Oracle Enterprise Manager Cloud Control is one option that provides the ability to manage disaster recovery of WebLogic Server domains across multiple data centers.

  • The workload on the active and passive sites can be maintained at less than full capacity during runtime, so that capacity remains consistent during both runtime and recovery.

  • The listen address of all managed servers and the node manager must use DNS names. IP addresses or direct hostnames should not be used anywhere in a domain configuration.

    • Before initiating the Transaction Recovery service by starting servers in the passive domain, update the DNS server to point the DNS names to the machines in the passive data center.

      For example: Active domain Domain1 has two managed servers running on two machines, Mc1 and Mc2. In the domain configuration, use the corresponding DNS names dns-1 and dns-2. When the active domain fails and you want to activate the corresponding passive domain, update the DNS server so that dns-1 and dns-2 point to Mc3 and Mc4 respectively. Then start the passive domain, Domain2.

    • Do not use DNS names that include an underscore; they are not valid in WebLogic Server domains. DNS names with a dash are valid.

  • You have several options for the transaction log (TLog): a default store, a JDBC TLog, or a determiner resource when transactions span only one WLS server. A default store must be in a common area (usually NFS or SAN). A JDBC TLog uses a database as a storage location common to all WLS servers and is typically replicated using DataGuard/Active DataGuard to ensure high availability. When possible, eliminate TLogs when XA transactions span a single transaction manager by using determiner resources. See XA Transactions without TLogs.

  • All persistent stores are JDBC stores.

  • Transaction service migration within a cluster is only supported if the entire cluster, including the corresponding domains and servers, is failed over to a recovery site. Specifically, the administrator must ensure that the node manager, the entire cluster, the corresponding domains, and any impacted servers have been shut down on the failed site before starting them all on the recovery site.

  • Transactions that span WebLogic domains can only be recovered in a site failover if all domains involved in the transaction are failed over.

  • The domain information is kept in one or more shared locations to ensure that domain configurations are in sync. Applications are also kept in one or more shared locations to ensure that they are in sync.

    Note:

    Pack/Unpack could be another approach to keep configurations in sync.
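The listen address rules above (DNS names only, no IP addresses, no underscores, dashes allowed) lend themselves to an automated check. The following is a minimal sketch in Python, not a WebLogic utility: the function name listen_address_problems is hypothetical, and the label pattern is an assumption based on common hostname syntax. It cannot tell a DNS alias from a direct hostname, so that rule still needs a manual review.

```python
import ipaddress
import re

# Assumed hostname label syntax: letters, digits, and dashes,
# starting and ending with a letter or digit (no underscore).
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def listen_address_problems(address):
    """Return a list of problems found in one listen address (empty if none)."""
    try:
        ipaddress.ip_address(address)
        return ["IP address used instead of a DNS name"]
    except ValueError:
        pass  # not an IP literal, so treat it as a name

    problems = []
    if "_" in address:
        problems.append("DNS name contains an underscore")
    for label in address.rstrip(".").split("."):
        # Underscores are already reported once above.
        if "_" not in label and not _LABEL.match(label):
            problems.append("invalid DNS label: %r" % label)
    return problems


if __name__ == "__main__":
    for candidate in ("dns-1", "dns_1", "10.0.0.5"):
        print(candidate, listen_address_problems(candidate))
```

Running the check against every managed server and node manager listen address before a failover drill is one way to catch a stray IP address early.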

See "Setting Up and Managing Disaster Recovery Sites" in Oracle® Fusion Middleware Disaster Recovery Guide for detailed information on conditions and requirements for setting up active and passive recovery sites for enterprise deployments.
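The ordering in the requirements above (shut down everything on the failed site, repoint the DNS names, then start the passive domain) can be sketched as a small planning function. This is an illustrative Python sketch only: plan_failover and its arguments are hypothetical, the machine and DNS names come from the Domain1 example, and the function produces a list of steps rather than calling any DNS or WebLogic API.

```python
def plan_failover(dns_records, passive_hosts, failed_site_running):
    """Return the ordered steps for activating the passive domain.

    dns_records:         DNS name -> machine currently serving it
    passive_hosts:       DNS name -> machine in the passive data center
    failed_site_running: names of servers still running on the failed site
    """
    # Servers on the failed site must be down before anything is started
    # on the recovery site.
    if failed_site_running:
        raise RuntimeError(
            "shut down on the failed site first: "
            + ", ".join(sorted(failed_site_running))
        )

    # Symmetric topology: every DNS name needs a passive counterpart.
    missing = sorted(set(dns_records) - set(passive_hosts))
    if missing:
        raise ValueError("no passive machine for: " + ", ".join(missing))

    steps = []
    for name in sorted(dns_records):
        steps.append(
            "repoint %s: %s -> %s" % (name, dns_records[name], passive_hosts[name])
        )
    # DNS is updated before the passive servers start, so that
    # transaction recovery runs against the repointed names.
    steps.append("start passive domain")
    return steps


if __name__ == "__main__":
    active = {"dns-1": "Mc1", "dns-2": "Mc2"}
    passive = {"dns-1": "Mc3", "dns-2": "Mc4"}
    for step in plan_failover(active, passive, set()):
        print(step)
```

Raising an error when servers are still running on the failed site reflects the requirement that failover timing stays under manual control; the sketch refuses to produce a plan rather than guessing.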

Example Active-Active Domain Configuration for XA Transaction Recovery

This section provides an example of an active-active disaster recovery solution for WebLogic Server domains. Active-active domain solutions require two or more active domain configurations and are used to improve scalability and availability. In this example, an active domain is located at each data center and is paired with a symmetrically configured passive standby domain at another data center. The data centers are geographically distributed to protect your environment from disasters such as floods or regional network outages. See "Overview of Disaster Recovery" in the Oracle® Fusion Middleware Disaster Recovery Guide.

As shown in Figure 6-1, the example has the following configuration:

  • Site1 Domain1 - Active

  • Site1 Domain2 - Passive

  • Site2 Domain2 - Active

  • Site2 Domain1 - Passive

  • All servers point to the same DB instance

  • Databases are replicated using Oracle Active DataGuard

Figure 6-1 Example Active-Active Disaster Recovery (AADR) Domain Configuration

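The pairing in the configuration list above can be written down as data, which makes it easy to see which standby domain takes over when a site is lost. The following Python sketch is illustrative only: the TOPOLOGY table mirrors the site and domain names in the list, and domains_to_activate is a hypothetical helper, not part of any Oracle tool.

```python
# Layout from the example configuration: (site, domain) -> role
TOPOLOGY = {
    ("Site1", "Domain1"): "active",
    ("Site1", "Domain2"): "passive",
    ("Site2", "Domain2"): "active",
    ("Site2", "Domain1"): "passive",
}


def domains_to_activate(topology, failed_site):
    """Return the (site, domain) standby pairs to start when failed_site is lost."""
    # Domains that were active on the failed site need a new home.
    lost = {
        domain
        for (site, domain), role in topology.items()
        if site == failed_site and role == "active"
    }
    # Their passive twins on the surviving sites are the ones to start.
    return sorted(
        (site, domain)
        for (site, domain), role in topology.items()
        if site != failed_site and role == "passive" and domain in lost
    )


if __name__ == "__main__":
    print(domains_to_activate(TOPOLOGY, "Site1"))
    print(domains_to_activate(TOPOLOGY, "Site2"))
```

After a failover the surviving site runs both domains, which is why the requirements call for running each site at less than full capacity.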

Additional Information on Maximum Availability Architecture

Oracle provides a number of resources with additional information on how to configure environments for maximum availability, see: