4 Management for High Availability

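The unique advantage offered by Oracle Fail Safe is its ability to help you easily configure resources in a Windows cluster environment. This chapter discusses the following topics:

  • Configuring Resources for Failover

  • How Does Oracle Fail Safe Use the Wizard Input?

  • Managing Cluster Security

  • Discovering Standalone Resources

  • Renaming Resources

  • Using Oracle Fail Safe in a Multiple Oracle Homes Environment

  • Configurations Using Multiple Virtual Addresses

  • Adding a Node to an Existing Cluster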

For step-by-step procedures to configure standalone resources into groups, and for information about managing those resources once they are in groups, refer to Chapter 8 and Chapter 9 in this manual and to Oracle Fail Safe Tutorial and online help.

4.1 Configuring Resources for Failover

Using the Oracle Fail Safe Manager wizards, you can configure failover with minimal manual work. Oracle Fail Safe Manager helps to configure resources into groups so that when one node in a cluster fails, another cluster node immediately takes over the resources in the failed node's groups.

The wizards minimize the risk of introducing configuration problems during implementation and also reduce the level of expertise required to configure resources for high availability. Most policies that you set with the wizards can be modified later with Oracle Fail Safe Manager.

The following list summarizes the basic tasks to perform to implement failover for resources. Except for the first task, all of these tasks must be performed using Oracle Fail Safe Manager.

  1. Ensure that the products that you want to configure with Oracle Fail Safe are properly installed. (This is described in the Oracle Fail Safe Installation Guide for Microsoft Windows.)

  2. Start Oracle Fail Safe Manager.

  3. Verify the cluster.

  4. Use the Validate action to validate the standalone Oracle Database you are adding.

  5. Add resources to the group.

  6. Verify the group.

  7. Update any Oracle Net file (such as the tnsnames.ora file) on client systems.

Note:

Depending on the type of resource being configured, there may be additional steps or considerations.

Refer to the tutorial and online help in Oracle Fail Safe Manager for step-by-step guidance on using the Oracle Fail Safe Manager wizards.
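As an illustration of the last task in the list (updating Oracle Net files on client systems), the following minimal Python sketch renders a tnsnames.ora entry that points at a virtual address instead of a physical node. The function name, the alias SALESDB, the host name virtual-server-a, the port 1521, and the SID SALES are all hypothetical values chosen for the example; they do not come from this manual.

```python
def render_tnsnames_entry(alias, virtual_host, port, sid):
    """Return the text of a tnsnames.ora entry for a clustered database.

    The HOST value is the virtual address of the group, not the name of
    a physical cluster node (all argument values are illustrative).
    """
    return (
        f"{alias} =\n"
        "  (DESCRIPTION =\n"
        f"    (ADDRESS = (PROTOCOL = TCP)(HOST = {virtual_host})(PORT = {port}))\n"
        f"    (CONNECT_DATA = (SID = {sid}))\n"
        "  )\n"
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(render_tnsnames_entry("SALESDB", "virtual-server-a", 1521, "SALES"))
```

Because the entry names the virtual address, the same client file remains valid no matter which cluster node currently hosts the group.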

4.2 How Does Oracle Fail Safe Use the Wizard Input?

Once the wizard collects all the required information, Oracle Fail Safe Manager interacts with Oracle Fail Safe (which in turn interacts with Microsoft Windows Failover Clusters) to facilitate a high-availability environment.

Based on the information that you provide with the wizards, Oracle Fail Safe derives any additional information it requires to configure the environment.

Most resources are configured by Oracle Fail Safe using a similar series of steps. Oracle Fail Safe performs the following specific steps to configure a highly available Oracle Database:

  1. Configures access to the database using a virtual address:

    1. Configures Oracle Net to use the virtual address or addresses associated with the database on all nodes listed in the possible owner nodes list for the database. (On a two-node cluster, this is both cluster nodes. On clusters that consist of more than two nodes, specify the possible owner nodes for a resource as a step in the Add Resource to Group Wizard.)

    2. Duplicates the network configuration information on all nodes in the possible owner nodes list.

  2. Configures the database to:

    1. Verify that all data files used by the database resource are on cluster disks and are not currently used by applications in other groups. If the cluster disks are in another group, but not used by applications in that group, then Oracle Fail Safe moves the disks into the same group with the database resource.

    2. Create the failback policy for the database resources based on choices you made in the wizard.

    3. Populate the group with these resources:

      • Each disk resource used by the cluster group

      • Oracle Database

      • Oracle Net listener

  3. Performs the following steps on each of the possible owner nodes for the group to which the database has been added, one at a time:

    1. Creates an Oracle instance with the same name on the node.

    2. Verifies that the node can bring the database online and offline by failing it over to the node to ensure that the failover policy works.

  4. Shuts down Oracle Database after testing failover on all nodes in the possible owner nodes list. If the preferred owner node list is empty, then the group remains on the last node to which it was failed over as part of the configuration process.

By performing these steps, Oracle Fail Safe ensures that the resource is correctly configured and capable of failing over and failing back to all possible owner nodes of the group to which it has been added.
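The per-node part of this sequence (steps 3 and 4) can be pictured with a small Python model. This is only a sketch of the order of operations described above, not of how Oracle Fail Safe is implemented; the class DatabaseGroup, the function configure_and_test, and the node and instance names are hypothetical. The sketch models only the case in which the preferred owner node list is empty.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatabaseGroup:
    """Hypothetical stand-in for a group that contains one database."""
    possible_owners: List[str]
    current_node: str = ""
    steps: List[str] = field(default_factory=list)


def configure_and_test(group: DatabaseGroup, instance_name: str) -> DatabaseGroup:
    """Visit each possible owner node in turn, one at a time."""
    for node in group.possible_owners:
        # Step 3.1: an instance with the same name is created on the node.
        group.steps.append(f"create instance {instance_name} on {node}")
        # Step 3.2: the group is failed over to the node to test it there.
        group.current_node = node
        group.steps.append(f"verify online/offline of {instance_name} on {node}")
    # Step 4: the database is shut down after all nodes have been tested.
    group.steps.append(f"shut down {instance_name}")
    # With an empty preferred owner list, the group stays on the last node.
    return group


if __name__ == "__main__":
    result = configure_and_test(DatabaseGroup(["NodeA", "NodeB"]), "ORCL")
    print(result.current_node)
    for step in result.steps:
        print(step)
```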

Figure 4-1 shows a two-node active/active cluster configuration in which each node hosts a group with a database.

Figure 4-1 Virtual Servers and Addressing in an Oracle Fail Safe Environment


The virtual servers (A and B) and their network addresses are known by all clients and cluster nodes. The listener.ora file on each cluster node and the tnsnames.ora file on each client workstation contain the network name and address information for each virtual server.

For failover to work properly, the host name (virtual address), database instance, SID entry, and protocol information in the tnsnames.ora and listener.ora files must match on every server node that is a possible owner of the resources in the group and on every client system.
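A consistency check of this kind could be sketched in Python as follows. The sketch assumes the relevant values have already been extracted from each file into a dictionary; it does not parse tnsnames.ora or listener.ora. The key names, the function find_mismatches, and the sample values are hypothetical.

```python
# Settings that must agree across nodes and clients (key names are illustrative).
REQUIRED_KEYS = ("host", "instance", "sid", "protocol")


def find_mismatches(reference, others):
    """Compare each settings dictionary in `others` with `reference`.

    Returns {source_name: [keys whose values differ from the reference]}.
    An empty result means every source agrees with the reference.
    """
    mismatches = {}
    for name, settings in others.items():
        differing = [k for k in REQUIRED_KEYS if settings.get(k) != reference.get(k)]
        if differing:
            mismatches[name] = differing
    return mismatches


if __name__ == "__main__":
    node_a = {"host": "virtual-server-a", "instance": "SALES", "sid": "SALES", "protocol": "TCP"}
    node_b = dict(node_a)
    client = dict(node_a, host="node-a")  # wrongly points at a physical node
    print(find_mismatches(node_a, {"Node B": node_b, "client": client}))
```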

For example, during normal operations, Virtual Server A is active on Node A. Node B is the failover node for Virtual Server A. The cluster disks are connected to both nodes so that resources can run on either node in the cluster, but service for the resources in each group is provided by only one cluster node at a time.

If a system failure occurs on Node A, then Group 1 becomes active on Node B using the same virtual address and port number as it had on Node A. Node B takes over the workload from Node A transparently to clients, which continue to access Group 1 using Virtual Server A and Group 2 using Virtual Server B. Clients continue to access the resources in a group using the same virtual server name and address, without considering the physical node that is serving the group.
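The effect of this failover on clients can be modeled in a few lines of Python. The sketch below is a toy model under the assumption of a two-node cluster as in Figure 4-1; the function fail_over and the dictionary layout are hypothetical. It shows that only the hosting node changes, while the virtual server names that clients use stay the same.

```python
def fail_over(hosting, failed_node, failover_node):
    """Return a new mapping of virtual server -> hosting node after a failure.

    Every virtual server hosted on `failed_node` moves to `failover_node`;
    the virtual server names (what clients connect to) are unchanged.
    """
    return {
        virtual_server: (failover_node if node == failed_node else node)
        for virtual_server, node in hosting.items()
    }


if __name__ == "__main__":
    before = {"Virtual Server A": "Node A", "Virtual Server B": "Node B"}
    after = fail_over(before, failed_node="Node A", failover_node="Node B")
    print(after)
```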

4.3 Managing Cluster Security

To accomplish administrative tasks associated with Oracle Fail Safe, you need the appropriate privileges to manage Oracle resources and applications and to perform operations through Oracle Fail Safe Manager.

Table 4-1 provides a quick reference to the privileges required to use the services in an Oracle Fail Safe environment. For more information, refer to the sections listed in the last column.

Table 4-1 Permissions and Privileges

| Service | Required Privileges | Reference |
|---|---|---|
| Oracle Fail Safe | Domain user account that has Administrator privileges on all cluster nodes | Section 4.3.1 |
| Oracle Fail Safe Manager | Domain user account that has Administrator privileges on all cluster nodes | Section 4.3.2 |
| Oracle Database | Database administrator account with SYSDBA privileges | Section 8.5 |

4.3.1 Oracle Fail Safe

Oracle Fail Safe accesses database resources from two different Windows services: the Cluster Service service and the Oracle Fail Safe service. The Cluster Service service implements the database resource DLL functions, that is, the common resource functions that start and stop the database resource, and determine if the database resource is functioning properly by issuing simple database queries against the database ("Is Alive" polling). The Oracle Fail Safe service processes requests from the Oracle Fail Safe clients, such as Oracle Fail Safe Manager or PowerShell cmdlets, that are related to Oracle cluster resources.

Each of these services executes in the context of the Log On As user specified for the particular service. The Oracle Fail Safe service executes under the account provided to the Oracle Fail Safe Security Setup tool during the installation of Oracle Fail Safe. Prior to Windows Server 2008, the Cluster Service service executed under the cluster account specified when the cluster was configured. In Windows Server 2008 and later, the Cluster Service service executes as user Local System.

All database connections must be properly authenticated, so Oracle Fail Safe must execute from a context that is authorized to connect to a database. If operating system authentication is being used to access a database (the database parameter REMOTE_LOGIN_PASSWORDFILE is set to NONE), then Oracle Database authenticates the access from the Windows service using the account name for that service. For the Oracle Fail Safe service, this means that authentication uses the Log On As account specified for the Oracle Fail Safe service. For the Cluster Service service, on installations that use a Windows Server version older than 2008, the cluster account is used. In Windows Server 2008 and later, the Oracle Fail Safe database resource DLL impersonates the Oracle Fail Safe account when connecting to the database. Thus, in Windows Server 2008 and later, even though the Cluster Service service executes as Local System, database access is authenticated using the Oracle Fail Safe account.

Prior to Windows Server 2008, it was possible for Oracle Fail Safe to access databases from two different user accounts: the one specified for the Cluster Service service and the one specified for the Oracle Fail Safe service. On systems using Windows Server 2008 and later, when using operating system authentication, Oracle Fail Safe attempts to authenticate database access using only the account specified for the Oracle Fail Safe service. See Section 8.3.3.4, "Validating Database" for more information regarding database authentication.
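The rules above for operating system authentication can be summarized as a small decision function. The following Python sketch only restates those rules; the function name, the service identifiers, and the returned labels are hypothetical and are not part of any Oracle Fail Safe interface.

```python
def database_auth_account(service, windows_2008_or_later):
    """Return which account authenticates database access for a service.

    `service` is "oracle_fail_safe" or "cluster_service" (illustrative
    identifiers). Applies only when operating system authentication is used.
    """
    fail_safe_account = "Oracle Fail Safe service Log On As account"
    if service == "oracle_fail_safe":
        return fail_safe_account
    if service == "cluster_service":
        if windows_2008_or_later:
            # The Cluster Service service runs as Local System, but the
            # database resource DLL impersonates the Oracle Fail Safe account.
            return fail_safe_account
        return "cluster account"
    raise ValueError(f"unknown service: {service}")


if __name__ == "__main__":
    for svc in ("oracle_fail_safe", "cluster_service"):
        for modern in (False, True):
            print(svc, modern, "->", database_auth_account(svc, modern))
```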

4.3.1.1 Changing the Oracle Fail Safe Server Account

The Oracle Fail Safe Server service must run under a domain user account that is a member of the Administrators group and has access to all nodes in the cluster. Oracle Fail Safe uses this account to change the configuration of Oracle resources in the cluster and to access the Oracle Databases that it manages. During the installation of Oracle Fail Safe, the installation process prompts for an account and password to be used by the Oracle Fail Safe server. To change the account used by Oracle Fail Safe, run the Set Credentials tool and specify the new account.

To change the Oracle Fail Safe account, from the Windows Start menu, select All Programs, then the Oracle Fail Safe home, and finally Set Credentials. An introduction screen is displayed.

Figure 4-2 shows the Oracle Fail Safe Server Credentials dialog box, which explains the purpose of the tool.

Figure 4-2 Oracle Fail Safe Server Credentials


Click Continue to enter the new credentials.

Figure 4-3 Windows Security Settings for the Oracle Fail Safe Server


4.3.2 Oracle Fail Safe Manager

The account used to log in to Oracle Fail Safe Manager must be a domain user account (not a local account) that has Administrator privileges on all cluster nodes.

4.4 Discovering Standalone Resources

Oracle Fail Safe automatically discovers (locates) and displays standalone resources in the Oracle Fail Safe Manager tree view when you select the Standalone Resources folder from the tree view. Chapter 8 and Chapter 9 contain information about how Oracle Fail Safe discovers each type of component that can be configured for high availability with Oracle Fail Safe.

4.5 Renaming Resources

Once a resource is added to a group, the resource name must not be changed. If the resource name must be changed, then use Oracle Fail Safe Manager to remove the resource from the group, and then add it back to the group using the new name.

4.6 Using Oracle Fail Safe in a Multiple Oracle Homes Environment

Oracle Fail Safe supports the multiple Oracle homes feature. The following list describes the requirements for using Oracle Fail Safe in a multiple Oracle homes environment:

  • Install Oracle Fail Safe in any one Oracle home on all cluster nodes. Only one version of Oracle Fail Safe can be installed and running on a node.

  • Use the latest release of Oracle Fail Safe Manager to manage multiple clusters. See Oracle Fail Safe Release Notes for Microsoft Windows for information about the compatibility of various versions of Oracle Fail Safe Manager and the Oracle Fail Safe server component.

    Note:

    Multiple releases of Oracle Fail Safe Manager can be installed on a system, but each release must be installed in a different Oracle home.

  • Each resource to be configured for high availability must be installed in the same Oracle home on all cluster nodes that are possible owners. The cluster Validate action validates this symmetry. See Section 7.1.1 for information about the cluster Validate action.

  • All databases and listeners in a group must come from the same Oracle home.

    On adding a database to a group, an Oracle Net listener resource is also added to the group. Optionally, you can add an Oracle Management Agent resource to the group. See Section 9.2 for more information.

    The listener is created in the same Oracle home where the database resides.
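The symmetry requirement in this list (each resource installed in the same Oracle home on every possible owner node) lends itself to a simple check. The Python sketch below is a conceptual model only and does not reflect how the cluster Validate action is implemented; the function name, the data layout, and the home names are hypothetical.

```python
def asymmetric_resources(homes_by_resource):
    """Return the resources whose Oracle home differs between nodes.

    `homes_by_resource` maps resource name -> {node name: Oracle home name}.
    A resource is symmetric when every possible owner node reports the
    same Oracle home for it.
    """
    return sorted(
        resource
        for resource, homes in homes_by_resource.items()
        if len(set(homes.values())) > 1
    )


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {
        "SALESDB": {"NodeA": "OraHome1", "NodeB": "OraHome1"},
        "HRDB": {"NodeA": "OraHome1", "NodeB": "OraHome2"},
    }
    print(asymmetric_resources(inventory))
```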

4.7 Configurations Using Multiple Virtual Addresses

Before any resources, other than generic services, can be added to a group using Oracle Fail Safe Manager, one or more virtual addresses must be added to the group. Client applications connect to the resources in a group using one of the virtual addresses in the group.

You can add up to 32 virtual addresses to a group, prior to adding resources, by starting the Add Resource to Group Wizard. In Microsoft Windows Failover Cluster Manager, select a group, then select the Add a Resource action from the Actions menu in the right pane of the screen to add a virtual address (also known as a client access point) to the group.

Note the following restrictions:

  • At least one virtual address must be added to a group before you can add another resource to the group. Only generic services can be added to a group that does not already contain a virtual address.

  • If the group contains one or more Oracle Databases, then:

    • All virtual addresses that you plan to configure with one or more databases in a group must be added to the group before you can add any databases to the group.

    • All databases in a group must use the same set of virtual addresses that you specify for the first database that you add to the group. (The set of virtual addresses can contain as few as one address.)

    See Section 8.3.3.2 for more information about configuring multiple virtual addresses with Oracle Databases.
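The restrictions above, together with the limit of 32 virtual addresses per group, can be expressed as a small Python model. This is a sketch of the rules as stated in this section, not of Oracle Fail Safe internals; the class GroupModel, its method names, and the resource kind strings are hypothetical.

```python
MAX_VIRTUAL_ADDRESSES = 32  # limit stated in this section


class GroupModel:
    """Toy model of a group that enforces the virtual address rules."""

    def __init__(self):
        self.virtual_addresses = []
        self.resources = []
        self.database_addresses = None  # address set used by the first database

    def add_virtual_address(self, address):
        if len(self.virtual_addresses) >= MAX_VIRTUAL_ADDRESSES:
            raise ValueError("a group can hold at most 32 virtual addresses")
        self.virtual_addresses.append(address)

    def add_resource(self, name, kind):
        # Only generic services may be added before a virtual address exists.
        if kind != "generic_service" and not self.virtual_addresses:
            raise ValueError("add a virtual address to the group first")
        self.resources.append(name)

    def add_database(self, name, addresses):
        chosen = frozenset(addresses)
        # The addresses must already be in the group before the database is added.
        if not chosen or not chosen <= set(self.virtual_addresses):
            raise ValueError("database addresses must already be in the group")
        # Every later database must use the same set as the first one.
        if self.database_addresses is not None and chosen != self.database_addresses:
            raise ValueError("all databases in a group must use the same addresses")
        self.add_resource(name, "database")
        self.database_addresses = chosen
```

For example, after adding the hypothetical addresses vip-public and vip-private and a first database on vip-public, an attempt to add a second database on vip-private raises ValueError.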

When you add a virtual address to a group, the group is accessible by clients at the same network address, regardless of which cluster node is hosting the group.

Multiple virtual addresses in a group provide flexible configuration options. For example, users can access a database over the public network while you perform a database backup operation over the private network. Or different virtual addresses can be allocated on different network segments to control security, with administrators accessing the database on one segment, while users access the database on another segment.

When you add more than one virtual address to a group, Oracle Fail Safe Manager asks you to specify the address that clients can use to access the resources in that group. If you add more than one resource to a group (for example, a database and a Custom Application), then you can dedicate one virtual address for users to access the database directly and another for users to access the Custom Application. Alternatively, if there are many database users, then you can have some users access the database using one virtual address and the others use the other virtual address, to balance the network traffic.

See the online help in Oracle Fail Safe Manager for information about adding a virtual address to a group.

4.8 Adding a Node to an Existing Cluster

Instructions for installing the software to add a new node to an existing cluster are provided in the Oracle Fail Safe Installation Guide for Microsoft Windows. Once that task is completed, one final step remains: select the Validate action for each group on the cluster for which the new node is a possible owner.

Assume you add a new node to the cluster and install Oracle Fail Safe on that node along with the DLLs for the resources you intend to run on that node. The new node becomes a possible owner for these resources. If the group or groups containing these resources fail over to the new node before the resources have been configured to run there, then the resources cannot be restarted on that node.

However, if you run the Validate action, then Oracle Fail Safe checks that the resources in the verified groups are configured to run on each node that is a possible owner for the group. If it finds a possible owner node where the resources in the group are not configured to run, then Oracle Fail Safe configures them for you.

Therefore, Oracle strongly recommends that you run the Validate operation for each group for which the new node is listed as a possible owner. Section 7.1.2 describes the Validate operation. Groups can also be verified using the Oracle Fail Safe PowerShell cmdlet Test-OracleClusterGroup, as described in Chapter 6.
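The check-and-repair behavior of the Validate operation described above can be pictured with a short Python model. This is a conceptual sketch only; it does not call Oracle Fail Safe or the Test-OracleClusterGroup cmdlet, and the function validate_group, its arguments, and the node and resource names are hypothetical.

```python
def validate_group(resources, possible_owners, configured):
    """Ensure every possible owner node is configured for the group's resources.

    `configured` maps node name -> set of resources configured on that node
    and is updated in place. Returns {node: [resources that were added]}.
    """
    repaired = {}
    for node in possible_owners:
        present = configured.setdefault(node, set())
        missing = set(resources) - present
        if missing:
            # Configure the resources that were missing on this node.
            present |= missing
            repaired[node] = sorted(missing)
    return repaired


if __name__ == "__main__":
    # NodeC has just been added to the cluster and is not yet configured.
    state = {"NodeA": {"db", "listener"}, "NodeB": {"db", "listener"}}
    print(validate_group(["db", "listener"], ["NodeA", "NodeB", "NodeC"], state))
```

Running the function a second time on the same state returns an empty result, which mirrors the idea that a validated group needs no further configuration on its possible owner nodes.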