Oracle® Fail Safe Concepts and Administration Guide
Release 3.3.3 for Windows
Part No. B12070-01
Oracle Fail Safe offers a number of configuration options to satisfy a variety of architecture and failover requirements.
This chapter discusses the following topics:
Customizing Your Configuration
Integrating Clients and Applications
There are two basic ways to deploy highly available solutions: active/passive (standby) configurations and active/active configurations.
Although the two configurations differ in how work is allocated among the cluster nodes, they share the following features:
One or more Oracle homes are created on a private disk (usually the system disk) on each node.
All necessary Oracle product executable files are installed in the Oracle homes on each node.
All data files, configuration files, log files, HTML files, and so on that are required by the application being made highly available are placed on cluster disks, so that they can be accessed by each cluster node.
The Oracle Services for MSCS software automatically runs as needed on one or more cluster nodes to ensure proper configuration and failover.
Figure 1-4 shows the software and hardware components in a cluster configured with Oracle Fail Safe.
The simplest configuration is an active/passive configuration, in which one or more nodes host the entire cluster workload (such as Oracle databases and Oracle HTTP Servers) while one node remains idle as a standby server, ready to take over processing if a node running an application fails. This solution guarantees that performance for the fail-safe workload is the same before and after failover.
Figure 3-1 shows a two-node configuration with Oracle Services for MSCS, an Oracle HTTP Server, and an Oracle database running on Node 1, and with Node 2 as a standby node. Currently, nothing is running on Node 2. Node 2 will take over the workload of Node 1 in the event of a failover.
Figure 3-1 Active/Passive (Standby) Two-Node Configuration
Figure 3-2 shows a four-node configuration with Oracle Services for MSCS and an Oracle database running on Node 1, an Oracle HTTP Server and an Oracle database running on Node 2, and an Oracle HTTP Server and an Oracle database running on Node 3. Node 4 is the standby node. Currently, nothing is running on Node 4. In the event of a failover, Node 4 will take over the failover workload.
Figure 3-2 Active/Passive (Standby) Four-Node Configuration
The active/active configuration is more cost-effective than the active/passive configuration because each node shares the application processing tasks, while also backing up other nodes in the event of a failure. If one node fails, another node must be capable of running its own applications and services as well as those that fail over from the failed node. This configuration provides a flexible architecture that lets you divide the workload to best meet your business needs.
Figure 3-3 shows a two-node active/active configuration with an Oracle database running on both cluster nodes. In addition, an Oracle HTTP Server and a generic service are running on Node 1, and Oracle Services for MSCS and an Oracle HTTP Server are running on Node 2. In Figure 3-3, an Oracle database is used for marketing on Node 1, and for sales on Node 2. The cluster disks owned by Node 1 store the marketing files, and the cluster disks owned by Node 2 store the sales files.
Figure 3-3 Active/Active Configuration
In the active/active configuration, all nodes are actively processing applications during normal operations. This configuration provides better performance (higher throughput) when all nodes are operating, but slower failover and possibly reduced performance when a node fails. Also, the client connections are distributed over all nodes.
Balancing workload means making trade-offs concerning the size of the normal workload on each system. If all systems run at nearly full capacity, then few resources are available to handle the load of another system in an outage, and client systems will experience significantly slower response during and after a failover. If you have the resources to quickly repair or replace a failed system, then the temporary period during which one cluster node serves both workloads will be brief; a short period of slow response is tolerated better than a long one. In fact, some businesses prefer having applications run more slowly than usual to having a period of downtime.
Alternatively, running all systems at slightly under 50% to 75% capacity (depending on the number of nodes in the cluster) ensures that clients experience no loss of response time after a failover, but the equivalent of an entire system can remain idle under normal conditions, much as in an active/passive configuration.
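The capacity figures above follow from simple arithmetic: if a failed node's workload is spread evenly across the surviving nodes, each node must keep enough headroom to absorb its share. A minimal sketch of that calculation (the function name is illustrative, not part of the product):

```python
def max_safe_utilization(n_nodes: int) -> float:
    """Maximum per-node utilization such that, if one node fails and its
    workload is spread evenly across the n-1 survivors, no survivor
    exceeds full capacity: u + u/(n-1) <= 1, so u <= (n-1)/n."""
    if n_nodes < 2:
        raise ValueError("a cluster needs at least two nodes")
    return (n_nodes - 1) / n_nodes

# Two-node cluster: each node should stay under 50% utilization.
print(f"{max_safe_utilization(2):.0%}")
# Four-node cluster: each node should stay under 75% utilization.
print(f"{max_safe_utilization(4):.0%}")
```

This matches the 50% (two-node) to 75% (four-node) range cited above; if failover policies instead send an entire group to a single surviving node, the two-node figure of 50% applies regardless of cluster size.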
Oracle Fail Safe can be configured to avoid some of the performance problems with this type of configuration. For example, you can:
Enable failover only for your mission-critical applications
Use different database parameter files on each node so that fewer system resources are used after a failover
Configure each component (Oracle database, Oracle HTTP Server, and so on) into a separate group with its own failover and failback policies
This is possible because Oracle Fail Safe lets you configure each cluster node to host several virtual servers.
Combine the scripting support of Oracle Fail Safe (using the FSCMD command described in Chapter 5) with a system monitoring tool (such as Oracle Enterprise Manager) to automate the movement of groups for load-balancing purposes
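As an illustration of the last point, a monitoring tool could invoke a small wrapper script when a node's load crosses a threshold. The sketch below builds an FSCMD command line in the style described in Chapter 5; the MOVEGROUP verb and the /CLUSTER, /NODE, and /DOMAIN switches are shown for illustration, and the group, cluster, and node names are hypothetical, so verify the exact syntax against Chapter 5 before using it:

```python
import subprocess

def move_group(group: str, cluster: str, node: str, domain: str,
               dry_run: bool = True) -> list[str]:
    """Build (and optionally run) an FSCMD command that moves a cluster
    group to another node. With dry_run=True the command is only returned,
    which is useful for logging what the automation would do."""
    cmd = ["FSCMD", "MOVEGROUP", group,
           f"/CLUSTER={cluster}", f"/NODE={node}", f"/DOMAIN={domain}"]
    if not dry_run:
        # Requires the Oracle Fail Safe client tools on the PATH.
        subprocess.run(cmd, check=True)
    return cmd

# Example: a monitor detects high load on Node 1 and rebalances.
print(" ".join(move_group("SALES-GROUP", "NTCLU", "NODE2", "MYDOMAIN")))
```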
Although the nodes do not need to be physically identical, it is wise to select servers with enough power, memory, disk host adapters, and disk drives to support an adequate level of service should failover occur at a busy time of the day.
To operate in an Oracle Fail Safe environment, client applications do not require any special programming or changes. Client applications that work with an Oracle resource on a single node will continue to function correctly in an Oracle Fail Safe environment, without recoding, recompiling, or relinking. This is because clients can use the virtual server to access the application.
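Because clients address the virtual server rather than a physical node, their connect descriptors need no change after failover. A minimal tnsnames.ora sketch illustrates the idea; the net service name, virtual host name, and service name here are hypothetical placeholders:

```
MKTG =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fsvirtual1)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = mktg.example.com))
  )
```

The HOST value is the virtual server's network name, which moves with the group on failover, so the same descriptor resolves to whichever node currently hosts the database.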
Chapters 7 through 9 each contain a section on integrating clients and applications. Chapter 7 describes how to make your clients and applications fail over transparently when a database fails over to another node in the cluster.