Oracle® Fail Safe Concepts and Administration Guide
Release 3.3.3 for Windows
Part No. B12070-01

2 Cluster Concepts

Oracle Fail Safe high-availability solutions use Microsoft cluster hardware and Microsoft Cluster Server (MSCS) software.

To take advantage of the high-availability options that Oracle Fail Safe offers, it is important to understand MSCS concepts.

This chapter discusses the following topics:

Topic                                            Reference
Cluster Technology                               Section 2.1
Resources, Groups, and High Availability         Section 2.2
Groups, Virtual Addresses, and Virtual Servers   Section 2.3
Allocating IP Addresses for Virtual Addresses    Section 2.4
Cluster Group and Cluster Alias                  Section 2.5
Failover                                         Section 2.6
Failback                                         Section 2.7

2.1 Cluster Technology

The Windows systems that are members of a cluster are called cluster nodes. The cluster nodes are joined together through a public shared storage interconnect as well as a private internode network connection.

The internode network connection, sometimes referred to as a heartbeat connection, allows one node to detect the availability or unavailability of another node. Typically, a private interconnect (that is distinct from the public network connection used for user and client application access) is used for this communication. If one node fails, the cluster software immediately fails over the workload of the unavailable node to an available node, and remounts on the available node any cluster resources that were owned by the failed node. Clients continue to access cluster resources without any changes.

Figure 2-1 shows the network connections in a two-node Microsoft cluster configuration.

Figure 2-1 Microsoft Cluster System

2.1.1 How Clusters Provide High Availability

Until cluster technology became available, reliability for PC systems was attained through hardware redundancy, such as RAID, mirrored drives, and dual power supplies. Although disk redundancy is important in creating a highly available system, this method alone cannot ensure the availability of your system and its applications.

By connecting servers in a Windows cluster with MSCS software, you provide server redundancy, with each server (node) having exclusive access to a subset of the cluster disks during normal operations. A cluster is far more effective than independent standalone systems, because each node can perform useful work, yet still is able to take over the workload and disk resources of a failed cluster node.

By design, a cluster provides high availability by managing component failures and supporting the addition and removal of components in a way that is transparent to users. Additional benefits include services such as failure detection and recovery, and the ability to manage the cluster nodes as a single system.


Note:

See your hardware documentation for information about using redundant hardware, such as RAID technology, to increase high availability.

2.1.2 System-Level Configuration

There are different ways to set up and use a cluster configuration. Oracle Fail Safe supports the following configurations:

  • Active/passive configurations

  • Active/active configurations

See Chapter 3 for information about these configurations.

2.1.3 Disk-Level Configuration

When an MSCS cluster is recovering from a failure, a surviving node gains access to the failed node's disk data through a shared-nothing configuration.

In a shared-nothing configuration, all nodes are cabled physically to the same disks, but only one node can access a given disk at a time. Even though all nodes are physically connected to the disks, only the node that owns the disks can access them.

Figure 2-2 shows that if a node in a two-node cluster becomes unavailable, the other cluster node can assume ownership of the disks and application workloads that were owned by the failed node and continue processing operations for both nodes.

Figure 2-2 Shared-Nothing Configuration

2.1.4 The Quorum Resource

The quorum resource maintains the configuration data (metadata) necessary for recovery of the cluster in case of a power outage or damage to data in memory. The quorum resource is accessible to other cluster resources so that all cluster nodes have access to the cluster metadata. The quorum resource performs these services:

  • Determines which cluster node controls the cluster

  • Stores logging information necessary to recover the cluster from a failure

  • Maintains access to the most current cluster metadata

The quorum resource can be owned by only one cluster node at a time. If a cluster node becomes isolated (cannot communicate with the other cluster nodes because of a network failure, for example), then the node that gains control of the quorum resource takes over the workload of the isolated node as though a failover had occurred.

To view the location of the quorum resource and the maximum size of the quorum log, select the cluster in the Oracle Fail Safe Manager tree view, then click the Quorum tab. To change the location of the quorum resource or the maximum size of the quorum log, open MSCS Cluster Administrator, then in the File menu select Properties, then click the Quorum tab.

2.2 Resources, Groups, and High Availability

When a server node becomes unavailable, its cluster resources (for example, disks, Oracle databases and applications, and IP addresses) that are configured for high availability are moved to an available node in units called groups. The following sections describe resources and groups, and how they are configured for high availability.

2.2.1 Resources

A cluster resource is any physical or logical component that is available to a computing system and has the following characteristics:

  • It can be brought online and taken offline.

  • It can be managed in a cluster.

  • It can be hosted by only one node in a cluster at a given time, but can be potentially owned by another cluster node. (For example, a resource is owned by a given node. After a failover, that resource is owned by another cluster node. However, at any given time only one of the cluster nodes can access the resource.)

2.2.2 Groups

A group is a logical collection of cluster resources that forms a minimal unit of failover. During a failover, the group of resources is moved to another cluster node. A group is owned by only one cluster node at a time. All resources required for a given workload (database, disks, and other applications) should reside in the same group.

For example, a group created to configure an Oracle database for high availability using Oracle Fail Safe might include the following resources:

  • All disks used by the Oracle database

  • An Oracle database instance

  • One or more virtual addresses, each one consisting of a network name and an IP address

  • An Oracle Net network listener that listens for connection requests to databases in the group

  • An Oracle Intelligent Agent that manages communications between Oracle Enterprise Manager and the databases in the group

Note that when you add a resource to a group, the disks it uses are also included in the group. For this reason, if two resources use the same disk, they cannot be placed in different groups. If both resources are to be fail-safe, both must be placed in the same group.

Oracle Fail Safe helps you to create groups and add the resources needed to run applications. For step-by-step instructions on creating a group, see the Oracle Fail Safe Tutorial.

2.2.3 Resource Dependencies

Figure 2-3 shows a group created to make a Sales database highly available. When you add a resource to a group, Oracle Fail Safe Manager automatically adds the other resources upon which the resource you added depends; these relationships are called resource dependencies. For example, when you add a single-instance database to a group, Oracle Fail Safe adds the shared-nothing disks used by the database instance and configures Oracle Net files to work with each group. Oracle Fail Safe also tests the ability of each group to fail over on each node.

Each node in the cluster can own one or more groups. Each group is composed of an independent set of related resources. The dependencies among resources in a group define the order in which the cluster software brings the resources online and offline. For example, a failure causes the Oracle application or database (and Oracle Net listener) to be brought offline first, followed by the physical disks, network name, and IP address. On the failover node, the order is reversed; MSCS brings the IP address online first, then the network name, then the physical disks, and finally the Oracle database and Oracle Net listener or application.
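The ordering rule described above can be sketched as a topological sort over a dependency graph. The following Python sketch is illustrative only: the resource names and the specific dependency edges are assumptions chosen to be consistent with the example order in the text. They are not read from MSCS or Oracle Fail Safe.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each key depends on the resources in its set.
DEPENDENCIES = {
    "ip_address": set(),
    "network_name": {"ip_address"},
    "physical_disk": set(),
    "oracle_database": {"physical_disk", "network_name"},
    "oracle_net_listener": {"network_name"},
}


def online_order(dependencies):
    """Return an order in which every resource follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def offline_order(dependencies):
    """Return the reverse order, so dependent resources go offline first."""
    return list(reversed(online_order(dependencies)))


if __name__ == "__main__":
    print("online: ", online_order(DEPENDENCIES))
    print("offline:", offline_order(DEPENDENCIES))
```

More than one valid order can exist for a given graph. The sketch guarantees only that a resource comes online after everything it depends on, and goes offline before it.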

2.2.4 Resource Types

Each resource type (such as a generic service, physical disk, Oracle database, and so on) is associated with a resource dynamic-link library (DLL) and is managed in the cluster environment using this resource DLL. There are standard MSCS resource DLLs as well as custom Oracle resource DLLs. The same resource DLL may support several different resource types.

MSCS provides resource DLLs for the resource types that it supports, such as IP addresses, physical disks, generic services, and many others. (A generic service resource is a Windows service that is supported by a resource DLL provided in MSCS.)

Oracle Fail Safe uses many of the MSCS resource DLLs to monitor resource types for which Oracle Fail Safe provides custom support, such as Oracle HTTP Server and generic services.

Oracle provides a custom DLL for the Oracle database resource type. MSCS uses the Oracle resource DLL to manage the Oracle database resources (bring online and take offline) and to monitor the resources for availability.

Oracle Fail Safe provides the following resource DLL files to enable MSCS to communicate with and monitor Oracle database resources:

  • FsResOdbs.dll provides functions that enable MSCS to bring an Oracle database online or offline and check its status through Is Alive polling.

  • FsResOdbsEx.dll provides a resource administration extension DLL file that is used by the MSCS Cluster Administrator to display the properties of the Oracle database resource.

For example, when you use Oracle Fail Safe Manager to add an Oracle database to a group, Oracle Fail Safe creates the database resource and an Oracle listener resource.

Figure 2-4 shows how Oracle Fail Safe Manager displays resource types. Note that the Oracle HTTP Server resource type is displayed as an Oracle HTTP Server in Oracle Fail Safe Manager and as a generic service in MSCS Cluster Administrator.

Because Oracle Fail Safe has more information than MSCS about Oracle cluster resources, Oracle recommends that you use Oracle Fail Safe Manager (or the FSCMD command) to configure and administer Oracle databases and applications.

Figure 2-4 Resource Types for Highly Available Oracle HTTP Servers

See also:

  • The Oracle Fail Safe Installation Guide for complete information about the custom resource DLLs provided by Oracle Fail Safe

  • The MSCS documentation set for information about standard resource types and resource DLLs

2.3 Groups, Virtual Addresses, and Virtual Servers

A virtual address is a network address at which resources in a group can be accessed, regardless of the cluster node hosting those resources. A virtual address provides a constant node-independent network location that allows clients to easily access resources without needing to know which physical cluster node is hosting those resources.

Because groups move from an unavailable node to an available one during a failure, a client cannot connect to an application that uses an address that is identified with only one node. You identify a virtual address for a group in Oracle Fail Safe Manager by adding a unique network name and IP address to a group.

Figure 2-5 shows the wizard page in Oracle Fail Safe Manager that helps you add one or more virtual addresses to a group. For step-by-step instructions on adding a virtual address to a group, see the Oracle Fail Safe Tutorial.

Figure 2-5 Add Resource to Group - Virtual Address Wizard Page

Once you add a virtual address to a group, the group becomes a virtual server. Although at least one virtual address per group is required for client access, you can assign multiple virtual addresses to a group. You might assign multiple virtual addresses to provide increased bandwidth or to segment security for the resources in a group.

Each group appears to users and client applications as a highly available virtual server, independent of the physical identity of one particular node. To access the resources in a group, clients always connect to the virtual address of the group. To the client, the virtual server is the interface to the cluster resources and looks like a physical node.

Figure 2-6 shows a two-node cluster with a group configured on each node. Clients access these groups through Virtual Servers A and B. By accessing the cluster resources through the virtual address of a group, as opposed to the physical address of an individual node, you ensure successful remote connection regardless of which cluster node is hosting the group.

Figure 2-6 Accessing Cluster Resources Through a Virtual Server

2.4 Allocating IP Addresses for Virtual Addresses

When you set up a cluster, allocate at least the following number of IP addresses:

  • One for each cluster node

  • One for the cluster alias

  • One for each group

For example, the configuration in Figure 2-6 requires five IP addresses: one for each of the two cluster nodes, one for the cluster alias, and one for each of the two groups. (Note that you can specify multiple virtual addresses for a group; see Section 4.7 for details.)
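The arithmetic in this example can be written as a small helper. This is a sketch that generalizes from the five-address example: the function name and its parameters are hypothetical, and the rule it encodes (one address per node, one for the cluster alias, one per virtual address in each group) is inferred from that example.

```python
def minimum_ip_addresses(node_count, virtual_addresses_per_group):
    """Return the minimum number of IP addresses to allocate.

    node_count: number of cluster nodes.
    virtual_addresses_per_group: one entry per group, giving the number
        of virtual addresses assigned to that group (at least 1 each).
    """
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if any(count < 1 for count in virtual_addresses_per_group):
        raise ValueError("each group needs at least one virtual address")
    cluster_alias = 1  # the cluster alias needs its own address
    return node_count + cluster_alias + sum(virtual_addresses_per_group)


if __name__ == "__main__":
    # Two nodes, two groups with one virtual address each.
    print(minimum_ip_addresses(2, [1, 1]))
```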

See the Oracle Fail Safe Installation Guide for more information about allocating IP addresses for your Oracle Fail Safe environment.

2.5 Cluster Group and Cluster Alias

The cluster alias is a node-independent network name that identifies a cluster and is used for cluster-related system management. MSCS creates a group called the Cluster Group, and the cluster alias is the virtual address of this group. Oracle Services for MSCS is a resource in the Cluster Group, making it highly available and ensuring that Oracle Services for MSCS is always available to coordinate Oracle Fail Safe processing on all cluster nodes.

In an Oracle Fail Safe environment, the cluster alias is used only for system management. Oracle Fail Safe Manager interacts with the cluster components and MSCS using the cluster alias.

When you populate the tree view in Oracle Fail Safe Manager, you specify the cluster alias, as shown in Figure 2-7. The cluster alias is not the same as the computer name of any node in the cluster. By specifying the cluster alias when you add the cluster to the tree view, you ensure that when Oracle Fail Safe Manager connects to that cluster it will be using the virtual server where Oracle Services for MSCS is running; the cluster alias is always in the Cluster Group (the same group as Oracle Services for MSCS). See the Oracle Fail Safe Tutorial for step-by-step instructions on populating the Oracle Fail Safe Manager tree view and connecting to a cluster.

Figure 2-7 Cluster Alias in Add Cluster to Tree Dialog Box

Client applications do not use the cluster alias when communicating with a cluster resource. Rather, clients use one of the virtual addresses of the group that contains that resource.

2.6 Failover

The process of taking a group offline on one node and bringing it back online on another node is called failover. After a failover occurs, resources in the group are accessible as long as one of the cluster nodes that is configured to run those resources is available. MSCS continually monitors the state of the cluster nodes and the resources in the cluster.

A failover can be unplanned or planned. The following sections describe these types of failover in more detail.

2.6.1 Unplanned Failover

There are two types of unplanned group failovers, which can occur due to one of the following:

  • Failure of a resource configured for high availability

  • Failure or unavailability of a cluster node

2.6.1.1 Unplanned Failover Due to a Resource Failure

An unplanned failover due to a resource failure is detected and performed as described in the following list.

  1. The cluster software detects that a resource has failed.

    To detect a resource failure, the cluster software periodically queries the resource (through the resource DLL) to see if it is up and running. See Section 2.6.4 for more information.

  2. The cluster software implements the resource restart policy. The restart policy states whether or not the cluster software should attempt to restart the resource on the current node, and if so, how many attempts within a given time period should be made to restart it. For example, the resource restart policy might specify that Oracle Fail Safe should attempt to restart the resource three times in 900 seconds.

    If the resource is restarted, then the cluster software resumes monitoring the resource (step 1) and failover is avoided.

    If the resource is not, or cannot be, restarted on the current node, then the cluster software applies the resource failover policy.

    The resource failover policy determines whether or not the resource failure should result in a group failover. If the resource failover policy states that the group should not fail over, then the resource is left in the failed state and failover does not occur.

    Figure 2-11 shows the property page on which you can view or modify the resource restart and failover policies.

  3. If the resource failover policy states that the group should fail over if a resource is not (or cannot be) restarted, then the group fails over to another node. The node to which the group fails over is determined by which nodes are running, the resource's possible owner nodes list, and the group's preferred owner nodes list. See Section 2.6.7 for more information about the resource possible owner nodes list, and see Section 2.6.10 for more information about the group preferred owner nodes list.

  4. Once a group has failed over, the group failover policy is applied. The group failover policy specifies the number of times during a given time period that the cluster software should allow the group to fail over before that group is taken offline. The group failover policy lets you prevent a group from repeatedly failing over. See Section 2.6.8 for more information about the group failover policy.

  5. The failback policy determines if the resources and the group to which they belong are returned to a given node if that node is taken offline (either due to a failure or an intentional restart) and then placed back online. See Section 2.7 for information about failback.

    In Figure 2-8, Virtual Server A is failing over to Node B due to a failure of one of the resources in Group 1.
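Steps 2 and 3 of the list above amount to a short decision flow, sketched here in Python. The class, function, and outcome names are hypothetical. For brevity the sketch counts restart attempts only and does not model the time period in which those attempts must occur.

```python
from dataclasses import dataclass


@dataclass
class ResourcePolicy:
    restart_attempts: int   # 0 means "do not restart on the current node"
    fail_over_group: bool   # resource failover policy


def handle_resource_failure(policy, try_restart):
    """Apply the restart policy, then the failover policy.

    try_restart is a callable that returns True when a restart succeeds.
    Returns one of "restarted", "group_failover", or "left_failed".
    """
    for _ in range(policy.restart_attempts):
        if try_restart():
            return "restarted"       # monitoring resumes, no failover
    if policy.fail_over_group:
        return "group_failover"      # group moves to another node
    return "left_failed"             # resource stays in the failed state


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # third restart attempt succeeds
    policy = ResourcePolicy(restart_attempts=3, fail_over_group=True)
    print(handle_resource_failure(policy, lambda: next(outcomes)))
```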

2.6.1.2 Unplanned Failover Due to Node Failure or Unavailability

An unplanned failover that occurs because a cluster node becomes unavailable is performed as described in the following list:

  1. The cluster software detects that a cluster node is no longer available.

    To detect node failure or unavailability, the cluster software periodically queries the nodes in the cluster (using the private interconnect).

  2. The groups on the failed or unavailable node fail over to one or more other nodes as determined by the available nodes in the cluster, each group's preferred owner nodes list, and the possible owner nodes list of the resources in each group. See Section 2.6.7 for more information about the resource possible owner nodes list, and see Section 2.6.10 for more information about the group preferred owner nodes list.

  3. Once a group has failed over, the group failover policy is applied. The group failover policy specifies the number of times during a given time period that the cluster software should allow the group to fail over before that group is taken offline. See Section 2.6.8 for more information about the group failover policy.

  4. The failback policy determines if the resources and the groups to which they belong are moved to a node when it becomes available once more. See Section 2.7 for information about failback.

    Figure 2-9 shows Group 1 failing over when Node A fails. Client applications (connected to the failed server) must reconnect to the server after failover occurs. If the application is performing updates to an Oracle database and uncommitted database transactions are in progress when a failure occurs, the transactions are rolled back.

Note that steps 3 and 4 in this list are the same as steps 4 and 5 in the previous list (in Section 2.6.1.1). Once a failover begins, the process is the same, regardless of whether the failover was caused by a failed resource or a failed node.

2.6.2 Planned Group Failover

A planned group failover is the process of intentionally taking client applications and cluster resources offline on one node and bringing them back online on another node. This allows administrators to perform routine maintenance tasks (such as hardware and software upgrades) on one cluster node while users continue to work on another node. Besides performing maintenance tasks, you might want to do a planned failover to balance the load across the nodes in the cluster. In other words, you can use a planned failover to move a group from one node to another. In fact, to implement a planned failover, you perform a move group operation in Oracle Fail Safe Manager (see the online help in Oracle Fail Safe Manager for instructions).

During a planned failover, Oracle Services for MSCS works with MSCS to efficiently move the group from one node to another. Client connections are lost and clients must manually reconnect at the virtual server address of the application, unless you have configured transparent application failover (see Section 7.9 for information about transparent application failover). Then, you can take your time performing the upgrade, because Oracle Fail Safe allows clients to work uninterrupted on another cluster node while the original node is offline. (If a group contains an Oracle database, the database is checkpointed prior to any planned failover to ensure rapid database recovery on the new node.)

2.6.3 Group and Resource Policies That Affect Failover

Values for the various resource and group failover policies are set to default values when you create a group or add a resource to a group using Oracle Fail Safe Manager. However, you can reset the values in these policies with the Group Failover property page, the Group Failback property page, and the Resource Policies property page. You can set values for the group failback policy at group creation time or afterward, using the Group Failback property page.

Figure 2-10 shows the page for setting group failover policies. To access this page, select the group of interest in the Oracle Fail Safe Manager tree view and then click the Failover tab.

Figure 2-11 shows the page for setting resource policies. To access this page, select the resource of interest in the Oracle Fail Safe Manager tree view and then click the Policies tab.

Figure 2-10 Group Failover Property Page

Figure 2-11 Resource Policies Property Page

2.6.4 How a Resource Failure Is Detected

All resources that have been configured for high availability are monitored for their status by the cluster software. Resource failure is detected on the basis of three values:

  • Pending timeout value

    The pending timeout value specifies how long the cluster software should wait for a resource in a pending state to come online (or offline) before considering that resource to have failed. By default, this value is 180 seconds.

  • Is Alive interval

    The Is Alive interval specifies how frequently the cluster software should check the state of the resource. You can use either the default value for the resource type or specify a number (in milliseconds). This check is more thorough, but also uses more system resources than the check done during a Looks Alive interval.

  • Looks Alive interval

    The Looks Alive interval specifies how frequently the cluster software should check the registered state of the resource to determine if the resource appears to be active. You can use either the default value for the resource type or specify a number (in milliseconds). This check is less thorough, but also uses fewer system resources, than the check done during an Is Alive interval.
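The three values above can be grouped into one settings object, as in the following Python sketch. The class and method names are hypothetical. Only the 180-second pending timeout default comes from the text; the two interval values are illustrative assumptions, not documented defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorSettings:
    pending_timeout_s: int = 180          # default stated in the text
    is_alive_interval_ms: int = 60_000    # assumed value for illustration
    looks_alive_interval_ms: int = 5_000  # assumed value for illustration

    def pending_timed_out(self, seconds_in_pending_state):
        """True when a resource has been pending longer than the timeout."""
        return seconds_in_pending_state > self.pending_timeout_s

    def checks_in_window(self, window_ms):
        """Return (looks_alive_checks, is_alive_checks) that fit in window_ms."""
        return (window_ms // self.looks_alive_interval_ms,
                window_ms // self.is_alive_interval_ms)


if __name__ == "__main__":
    settings = MonitorSettings()
    print(settings.pending_timed_out(200))
    print(settings.checks_in_window(60_000))
```

With these assumed intervals, the lighter Looks Alive check runs many times for each Is Alive check, which reflects the cost trade-off described above.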

2.6.5 Resource Restart Policy

Once it is determined that a resource has failed, the cluster software applies the restart policy for the resource. The resource restart policy provides two options, as shown in Figure 2-11:

  • The cluster software should not attempt to restart the resource on the current node. Instead, it should immediately apply the resource failover policy.

  • The cluster software should attempt to restart the resource on the current node a specified number of times within a given time period. If the resource cannot be restarted, then the cluster software should apply the resource failover policy.

2.6.6 Resource Failover Policy

The resource failover policy determines whether or not the group containing the resource should fail over if the resource is not (or cannot be) restarted on the current node. If the policy states that the group containing the failed resource should not fail over, then the resource is left in the failed state on the current node. (The group may still fail over eventually: if another resource in the group fails and its policy states that the group should fail over, then the group fails over.) If the policy states that the group containing the failed resource should fail over, then the group fails over to another cluster node as determined by the group preferred owner nodes list. (See Section 2.6.10 and Section 2.7.1 for a description of the preferred owner nodes list.)

2.6.7 Resource Possible Owner Nodes List

The possible owner nodes list consists of all nodes on which a given resource is permitted to run. A node on which a resource is permitted to run is defined as follows:

  • The DLL for the given resource is installed on the node.

  • You have not specified that the node should be excluded from the possible owner nodes list.

In addition, although it is not a requirement, you should ensure that all resources that are permitted to run on a given node are also configured to run on that node. Otherwise, a group containing that resource may fail over to the node, but be unable to restart the resource. A resource is configured to run on a possible owner node when you do either of the following:

As mentioned previously, you can specify that a node should be excluded from the possible owner nodes list. For example, suppose you have a four-node cluster and each node has the Oracle database and the Oracle Fail Safe database resource DLLs installed. You have the choice of specifying that all four nodes are possible owner nodes for the resource. However, suppose Node 3 does not have sufficient memory to run both the database instance and the rest of its workload. You might decide to remove Node 3 from the possible owner nodes list for the database resource.

You specify the possible owner nodes list for a resource when you add it to a group. You can adjust the possible owner nodes list for a resource that has been made highly available using one of the following property pages:

  • The General property page for the resource

    The General property page for the resource does not show you how modifications to the possible owner nodes list of the resource will affect the group to which the resource belongs. If you use this property page to modify the possible owner nodes list of a resource, make sure you do not inadvertently create a situation where the resources in the group have no node in common in their possible owner nodes lists.

  • The Nodes property page for the group containing the resource

    The Nodes property page presents the possible owner nodes list for the group. However, the possible owner nodes list is not actually an attribute of a group. Oracle Fail Safe determines which nodes to present in the possible owner nodes list for a group by finding the intersection of the possible owner nodes list of each resource in the group. Using this property page, you can see if removing one of the possible owner nodes will result in no nodes being a possible owner node for a group. Figure 2-12 is an example of the Nodes property page. Note that if you make a change to the possible owner nodes list for a group, this change is applied to all resources in the group, except disk resources.

In a two-node cluster, the possible owner nodes list for every resource usually includes both nodes. To provide failover capabilities, at least two cluster nodes must be possible owner nodes for a resource.
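The intersection rule that the Nodes property page applies can be sketched in a few lines of Python. The function, node, and resource names are hypothetical, and the data mirrors the four-node example in which Node 3 is removed from the database resource's list.

```python
def group_possible_owners(resource_owner_lists):
    """Return the nodes that are possible owners for every resource.

    resource_owner_lists maps a resource name to the collection of nodes
    in that resource's possible owner nodes list.
    """
    node_sets = [set(nodes) for nodes in resource_owner_lists.values()]
    return set.intersection(*node_sets) if node_sets else set()


if __name__ == "__main__":
    owners = group_possible_owners({
        "database": ["Node1", "Node2", "Node4"],  # Node3 excluded
        "listener": ["Node1", "Node2", "Node3", "Node4"],
    })
    print(sorted(owners))
    if len(owners) < 2:
        print("warning: fewer than two possible owners, failover not possible")
```

An empty result corresponds to the situation that the General property page cannot warn about: no node can host every resource in the group.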


Note:

Assume you add a new node to the cluster and Oracle Fail Safe or MSCS DLLs (or both) are installed on that node. That node becomes a possible owner for resources supported by the installed DLLs. If resources have not yet been configured for high availability on that node, a group can fail over to that node and be unable to restart the resources on that node.

However, if you run the Verify Group command, Oracle Fail Safe checks that the resources in the specified group are configured to run on each node that is a possible owner for the group. If it finds a possible owner node where the resources in the group are not configured to run, then Oracle Fail Safe configures them for you.

Therefore, Oracle strongly recommends you run the Verify Group command for each group for which the new node is listed as a possible owner. Section 6.1.2 describes the Verify Group command.


2.6.8 Group Failover Policy

If the resource failover policy states that the group containing the resource should fail over if the resource cannot be restarted on the current node, then the group fails over and the group failover policy is applied. Similarly, if a node becomes unavailable, the groups on that node fail over and the group failover policy is applied.

The group failover policy specifies the number of times during a given time period that the cluster software should allow the group to fail over before that group is taken offline. The failover policy provides a means to prevent a group from failing over repeatedly.

The group failover policy consists of a failover threshold and a failover period:

  • Failover threshold

    The failover threshold specifies the maximum number of times failover can occur (during the failover period) before the cluster software stops attempting to fail over the group.

  • Failover period

    The failover period is the time during which the cluster software counts the number of times a failover occurs. If the number of failovers during the failover period exceeds the failover threshold, then the cluster software stops attempting to fail over the group.

For example, if the failover threshold is 3 and the failover period is 5, the cluster software allows the group to fail over 3 times within 5 hours before discontinuing failovers for that group.

When the first failover occurs, a timer to measure the failover period is set to 0 and a counter to measure the number of failovers is set to 1. The timer is not reset to 0 when the failover period expires. Instead, the timer is reset to 0 when the first failover occurs after the failover period has expired.

For example, assume again that the failover period is 5 hours and the failover threshold is 3. As shown in Figure 2-13, when the first group failover occurs at point A, the timer is set to 0. Assume a second group failover occurs 4.5 hours later at point B, and the third group failover occurs at point C. Because the failover period has been exceeded when the third group failover occurs (at point C), group failovers are allowed to continue, the timer is reset to 0, and the failover counter is reset to 1.

Assume that another failover occurs at point D (after 7 total hours have elapsed since point A, and 2.5 hours have elapsed since point B). You might expect that failovers will be discontinued, because the failovers at points B, C, and D have occurred within a 5-hour timeframe. However, because the timer for measuring the failover period was reset to 0 at point C, the failover threshold has not been exceeded, and the cluster software allows the group to fail over.

Assume that another failover occurs at point E. When a problem that ordinarily results in a failover occurs at point F, the cluster software does not fail over the group. Three failovers (at points C, D, and E) have already occurred within the 5-hour failover period that began when the timer was reset to 0 at point C. The cluster software leaves the group on the current node in a failed state.
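The timer and counter behavior described above can be sketched in Python. This is a minimal model for illustration, not the cluster software's implementation. The class name `GroupFailoverPolicy` and its method are hypothetical, and the times used for points C, E, and F (6, 8, and 9 hours) are assumptions, because the text gives times only for points A, B, and D. The sketch also assumes that the failover period has expired only when strictly more than 5 hours have elapsed since the timer was last reset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupFailoverPolicy:
    """Hypothetical model of a failover threshold and failover period."""
    threshold: int                          # maximum failovers per period
    period_hours: float                     # length of the failover period
    timer_start: Optional[float] = None     # when the timer was last reset
    count: int = 0                          # failovers counted since the reset

    def request_failover(self, now_hours: float) -> bool:
        """Return True if the group is allowed to fail over at this time."""
        expired = (
            self.timer_start is None
            or now_hours - self.timer_start > self.period_hours
        )
        if expired:
            # First failover, or first failover after the period expired:
            # reset the timer to 0 and the counter to 1.
            self.timer_start = now_hours
            self.count = 1
            return True
        if self.count < self.threshold:
            self.count += 1
            return True
        # Threshold reached within the period: leave the group failed.
        return False


policy = GroupFailoverPolicy(threshold=3, period_hours=5)
# Points A, B, D come from the text; C, E, F are assumed times.
timeline = {"A": 0.0, "B": 4.5, "C": 6.0, "D": 7.0, "E": 8.0, "F": 9.0}
results = {point: policy.request_failover(t) for point, t in timeline.items()}
```

With these assumed times, the failovers at points A through E are allowed and the one at point F is refused, which matches the walkthrough above.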

Figure 2-13 Failover Threshold and Failover Period Timeline


2.6.9 Effect of Resource Restart Policy and Group Failover Policy on Failover

Both the resource restart policy and the failover policy of the group containing the resource affect the failover behavior of a group.

For example, suppose the Northeast database is in a group called Customers, and you specify the following:

  • On the Policies property page for the Northeast database:

    • Attempt to restart the database on the current node 3 times within 600 seconds (10 minutes)

    • If the resource fails and cannot be restarted, fail over the group

  • On the Failover property page for the Customers group:

    • The failover threshold for the group containing the resource is 20

    • The failover period for the group containing the resource is 1 hour

Assume a database failure occurs. Oracle Fail Safe attempts to restart the database instance on the current node. The attempt to restart the database instance fails three times within a 10-minute period. Therefore, the Customers group fails over to another node.

On that node, Oracle Fail Safe attempts to restart the database instance and fails three times within a 10-minute period, so the Customers group fails over again. Oracle Fail Safe continues its attempts to restart the database instance, and the Customers group continues to fail over, until the database instance restarts or the group has failed over 20 times within a 1-hour period. If the database instance cannot be restarted and the group fails over fewer than 20 times within any 1-hour period, the Customers group fails over repeatedly. In such a case, consider lowering the failover threshold to limit repeated failovers.
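To see why the threshold matters, the following Python sketch counts failovers for a database that never restarts. It is a rough model under stated assumptions, not the behavior of any real API: the function name `simulate_failovers`, the fixed `cycle_minutes` (the time from one failover to the next, including the failed restart attempts), and the 10-hour simulation horizon are all hypothetical. The failover period is treated as expired only when strictly more than 60 minutes have elapsed since the timer was reset.

```python
def simulate_failovers(cycle_minutes, threshold=20, period_minutes=60,
                       horizon_minutes=600):
    """Return (failovers_performed, stopped_by_policy).

    Models a database that can never be restarted, so every cycle of
    failed restart attempts ends in a request to fail over the group.
    """
    timer_start = None   # when the failover period timer was last reset
    count = 0            # failovers counted in the current period
    failovers = 0        # total failovers performed
    now = cycle_minutes  # the first restart attempts are exhausted here
    while now <= horizon_minutes:
        if timer_start is None or now - timer_start > period_minutes:
            timer_start, count = now, 1      # new period begins
        elif count < threshold:
            count += 1                       # still under the threshold
        else:
            return failovers, True           # threshold reached: stop
        failovers += 1
        now += cycle_minutes
    return failovers, False


fast = simulate_failovers(cycle_minutes=2)    # 20 failovers fit in one hour
slow = simulate_failovers(cycle_minutes=10)   # about 6 failovers per hour
```

In this model, a 2-minute cycle reaches the threshold of 20 and the policy stops further failovers, while a 10-minute cycle never reaches it, so the group keeps failing over for the whole simulated horizon.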

2.6.10 Group Failover and the Preferred Owner Nodes List

When you create a group, you can create a preferred owner nodes list for both group failover and failback. (When the cluster contains only two nodes, you specify this list for failback only.) You create an ordered list of nodes to indicate the preference you have for each node in the list to own that group.

For example, in a four-node cluster, you might specify the following preferred owner nodes list for a group containing a database:

  • Node 1

  • Node 4

  • Node 3

This indicates that when all four nodes are running, you prefer for the group to run on Node 1. If Node 1 is not available, then your second choice is for the group to run on Node 4. If neither Node 1 nor Node 4 is available, then your next choice is for the group to run on Node 3. You have omitted Node 2 from the preferred owner nodes list. However, if it is the only choice available to the cluster software (because Node 1, Node 4, and Node 3 have all failed), then the group will fail over to Node 2. (This will happen even if Node 2 is not a possible owner for all resources in the group. In such a case, the group fails over, but remains in a failed state.)

When a failover occurs, the cluster software uses the preferred owner nodes list to determine the node to which it should fail over the group. The cluster software will fail over the group to the top-most node in the list that is up and running and is a possible owner node for the group. Section 2.6.11 describes in more detail how the cluster software determines the node to which a group will fail over.

See Section 2.7.1 for information about how the group preferred owner nodes list affects failback.

2.6.11 Determining the Failover Node for a Group

The node to which a group will fail over is determined by the following three lists:

  • List of available cluster nodes

    The list of available cluster nodes consists of all nodes that are running when a group failover occurs. For example, suppose you have a four-node cluster. If one node is down when a group fails over, then the list of available cluster nodes is reduced to three.

  • List of possible owner nodes for each resource in the group (See Section 2.6.7.)

  • List of preferred owner nodes for the group containing the resources (See Section 2.6.10.)

The cluster software determines the nodes to which your group can possibly fail over by finding the intersection of the available cluster nodes and the common set of possible owners of all resources in the group. For example, assume you have a four-node cluster and a group on Node 3 called Test_Group. You have specified the possible owners for the resources in Test_Group as shown in Table 2-1.

Table 2-1 Example of Possible Owners for Resources in Group Test_Group

Node      Possible Owner    Possible Owner    Possible Owner
          for Resource 1    for Resource 2    for Resource 3
Node 1    Yes               Yes               Yes
Node 2    Yes               No                Yes
Node 3    Yes               Yes               Yes
Node 4    Yes               Yes               Yes

By reviewing Table 2-1, you see that the intersection of possible owners for all three resources is:

  • Node 1

  • Node 3

  • Node 4

Assume that Node 3 (where Test_Group currently resides) fails. The available nodes list is now:

  • Node 1

  • Node 2

  • Node 4

To determine the nodes to which Test_Group can fail over, the cluster software finds the intersection of the possible owner nodes list for all resources in the group and the available nodes list. In this example, the intersection of these two lists is Node 1 and Node 4.

To determine the node to which it should fail over Test_Group, the cluster software uses the preferred owner nodes list of the group. Assume that you have set the preferred owner nodes list for Test_Group to be:

  • Node 3

  • Node 4

  • Node 1

Because Node 3 has failed, the cluster software will fail over Test_Group to Node 4. If neither Node 3 nor Node 4 is available, then the cluster software will fail over Test_Group to Node 1. If Nodes 1, 3, and 4 are all unavailable, then the group will fail over to Node 2. However, because Node 2 is not a possible owner for all of the resources in Test_Group, the group will remain in a failed state on Node 2.
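The selection logic in this section can be sketched in Python using the Test_Group example. This is an illustrative model, not the cluster software's algorithm: the function name `choose_failover_node` and the state labels are hypothetical, each resource is assumed to have a non-empty possible owners list, and ties among nodes that are not on the preferred owner nodes list are broken alphabetically, which is an assumption the text does not specify.

```python
def choose_failover_node(available, possible_owners, preferred):
    """Return (node, state) for a group that must fail over.

    available       -- nodes that are running
    possible_owners -- maps each resource to its possible owner nodes
    preferred       -- ordered preferred owner nodes list for the group
    """
    # Nodes that are possible owners for every resource in the group.
    common_owners = set.intersection(
        *(set(nodes) for nodes in possible_owners.values())
    )
    candidates = common_owners & set(available)

    # Top-most preferred node that is running and is a possible owner.
    for node in preferred:
        if node in candidates:
            return node, "online"
    # A possible owner that was omitted from the preferred list.
    if candidates:
        return sorted(candidates)[0], "online"
    # Only non-owner nodes are left: the group moves but stays failed.
    if available:
        return sorted(available)[0], "failed"
    return None, "offline"


owners = {
    "Resource 1": {"Node 1", "Node 2", "Node 3", "Node 4"},
    "Resource 2": {"Node 1", "Node 3", "Node 4"},
    "Resource 3": {"Node 1", "Node 2", "Node 3", "Node 4"},
}
preferred = ["Node 3", "Node 4", "Node 1"]

node3_down = choose_failover_node(["Node 1", "Node 2", "Node 4"], owners, preferred)
nodes34_down = choose_failover_node(["Node 1", "Node 2"], owners, preferred)
only_node2 = choose_failover_node(["Node 2"], owners, preferred)
```

The three calls reproduce the cases described above: Node 4 when only Node 3 is down, Node 1 when Nodes 3 and 4 are down, and Node 2 in a failed state when it is the only node left.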

2.7 Failback

Failback is the process of automatically returning a group of cluster resources from the failover node to a preferred owner node after the preferred owner node returns to operational status. A preferred owner node is a node on which you want a group to reside whenever that node is available.

You can set a failback policy that specifies when (and if) groups should fail back to a preferred owner node from the failover node. For example, you can set a group to fail back immediately or between specific hours that you choose. Or, you can set the failback policy so that a group does not fail back, but continues to run on the node where it currently resides. Figure 2-14 shows the property page for setting the failback policy for a group.
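The three failback choices described above (never, immediately, or between specific hours) can be modeled with a small Python check. This is only a sketch under assumptions: the function name `failback_allowed`, the policy labels, and the hour parameters are hypothetical and are not the names used on the property page. The window is treated as including the start hour and excluding the end hour, and a window whose start is later than its end is assumed to wrap past midnight.

```python
def failback_allowed(policy, hour, start_hour=0, end_hour=0):
    """Return True if a group may fail back at the given hour (0-23).

    policy -- "never", "immediate", or "window" (hypothetical labels)
    """
    if policy == "never":
        return False
    if policy == "immediate":
        return True
    if policy == "window":
        if start_hour <= end_hour:
            # Window within one day, for example 01:00 up to 05:00.
            return start_hour <= hour < end_hour
        # Window that wraps past midnight, for example 22:00 up to 04:00.
        return hour >= start_hour or hour < end_hour
    raise ValueError(f"unknown failback policy: {policy!r}")


night_ok = failback_allowed("window", hour=23, start_hour=22, end_hour=4)
noon_ok = failback_allowed("window", hour=12, start_hour=22, end_hour=4)
```

Restricting failback to off-peak hours in this way keeps the brief interruption caused by moving the group away from periods of heavy client use.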

Figure 2-14 Group Failback Policy Property Page


2.7.1 Group Failback and the Preferred Owner Nodes List

When you create a group on a cluster, you can create a preferred owner nodes list for group failover and failback. (When the cluster contains only two nodes, you specify this list for failback only.) You create an ordered list of nodes that indicates the nodes where you prefer a group to run. When a previously unavailable node comes back online, the cluster software reads the preferred owner nodes list for each group on the cluster to determine whether or not the node that just came online is a preferred owner node for any group on the cluster. If the node that just came online is higher on the preferred owner nodes list than the node on which the group currently resides, then the group is failed back to the node that just came back online.

For example, in a four-node cluster, you might specify the following preferred owner nodes list for the group called My_Group:

  • Node 1

  • Node 4

  • Node 3

Assume that My_Group has failed over to, and is currently running on, Node 4 because Node 1 had been taken offline. Now Node 1 is back online. The cluster software reads the preferred owner nodes list for My_Group (and all other groups on the cluster); it finds that the preferred owner node for My_Group is Node 1. It will fail back My_Group to Node 1, if failback is enabled.

If My_Group is currently running on Node 3 (because neither Node 1 nor Node 4 is available) and Node 4 comes back online, then My_Group will fail back to Node 4 if failback is enabled. Later, when Node 1 becomes available, My_Group will fail back once more, this time to Node 1. When you specify a preferred owner nodes list, be careful not to create a situation where failback happens frequently and unnecessarily. For most applications, listing two nodes in the preferred owner nodes list is sufficient.

Unexpected results can occur when a group has been manually moved to a node. Assume all nodes are available and My_Group is currently running on Node 3 (because you moved it there with a move group operation). If Node 4 is restarted, My_Group will fail back to Node 4, even though Node 1 (the node highest in the preferred owner nodes list for My_Group) is also running.

When a node comes back online, the cluster software checks to see if the node that just came back online is higher on the preferred owner nodes list than the node where each group currently resides. If so, all such groups are moved to the node that just came back online.
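The comparison the cluster software makes when a node comes back online can be sketched as follows. This is a minimal model with hypothetical names (`failback_target` is not a real API), and it assumes that a group currently running on a node that is not on its preferred owner nodes list ranks below every listed node.

```python
def failback_target(returning_node, current_node, preferred,
                    failback_enabled=True):
    """Return the node that owns the group after returning_node is back."""
    if not failback_enabled or returning_node not in preferred:
        return current_node
    returning_rank = preferred.index(returning_node)
    # Assumption: an unlisted current node ranks below all listed nodes.
    if current_node in preferred:
        current_rank = preferred.index(current_node)
    else:
        current_rank = len(preferred)
    # Fail back only if the returning node is higher on the list.
    return returning_node if returning_rank < current_rank else current_node


preferred = ["Node 1", "Node 4", "Node 3"]

# My_Group failed over to Node 4; Node 1 comes back online.
after_node1_returns = failback_target("Node 1", "Node 4", preferred)
# My_Group was manually moved to Node 3; Node 4 is restarted.
after_manual_move = failback_target("Node 4", "Node 3", preferred)
```

Note that only the returning node is compared with the current node. That is why, in the manual-move scenario, the group lands on Node 4 even though Node 1 is running and ranks higher.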

See Section 2.6.10 for information about how the group preferred owner nodes list affects failover.

2.7.2 Client Reconnection After Failover

Node failures affect only those users and applications:

  • That are directly connected to applications hosted by the failed node

  • Whose transactions were being handled when the node failed

Typically, users and applications connected to the failed node lose the connection and must reconnect to the failover node (through the node-independent virtual address) to continue processing. With a Web application, uncommitted form input or report context is lost. Users reconnect to the application by reloading the URL in the Web browser. With a database, any transactions that were in progress and uncommitted at the time of the failure are rolled back. Client applications that are configured for transparent application failover experience a brief interruption in service; to the client applications, it appears that the node was quickly restarted. The service is automatically restarted on the failover node—without operator intervention.

See Section 7.9 for information about transparent application failover.