Oracle® Fusion Middleware High Availability Guide
11g Release 1 (11.1.1)

Part Number E10106-14

13 Using Oracle Cluster Ready Services

This chapter describes concepts and configuration procedures for Oracle Cluster Ready Services.

13.1 Introduction to Oracle Clusterware

Oracle Clusterware manages the availability of user applications and Oracle databases in a clustered environment. In an Oracle Real Application Clusters (Oracle RAC) environment, Oracle Clusterware manages all of the Oracle database processes automatically. Anything managed by Oracle Clusterware is known as a cluster resource, which could be a database instance, a listener, a virtual IP (VIP) address, or an application process.

Oracle Clusterware was initially created to support Oracle RAC. Its benefits, however, are not limited to Oracle RAC; it can also be used to manage other applications through add-on modules (or action scripts). It is this flexibility and extensibility in Oracle Clusterware that forms the basis of a high availability solution for Oracle Fusion Middleware.

For more information about Oracle Clusterware, see the Oracle Clusterware Administration and Deployment Guide.

13.2 Cluster Ready Services and Oracle Fusion Middleware

Oracle Clusterware includes a high availability framework that offers protection to any resource with the help of resource-specific add-on modules. In Oracle Clusterware terminology, a resource is an object that Oracle Clusterware creates to identify the entity to be managed, such as an application, a virtual IP, or a shared disk. Oracle Clusterware monitors a resource to make sure it is always available, frequently checking its state and attempting to restart it if it is down. If restarting fails, the resource is started on a new node, a process called failover. Resource switchover, an intentional switch of the operating environment of a resource, is also supported through the appropriate user interface.

With this high availability framework, Oracle Clusterware manages resources through user-provided add-on modules. For example, to create a resource for an application running in a single process, the user must supply a module that starts and stops the process and checks its state. If the application fails, Oracle Clusterware attempts to restart it using this module. If the node on which the application is currently running fails, Oracle Clusterware attempts to restart it on another node, provided the application and resource are configured properly. You can configure the monitoring frequency for a resource and define its relationships to other resources.
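
As an illustration, a user-supplied add-on module is often a single action script that Oracle Clusterware calls with an argument of start, stop, or check. The following is a minimal sketch only; the application command (a background sleep stands in for it) and the pid file path are hypothetical.

```shell
#!/bin/sh
# Sketch of a user-supplied add-on module of the kind Oracle
# Clusterware invokes with "start", "stop", or "check".
# The application command and pid file path are hypothetical.
PIDFILE=/tmp/myapp.pid

action() {
  case "$1" in
    start)
      sleep 300 &                       # stand-in for the real application
      echo $! > "$PIDFILE"
      ;;
    stop)
      [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")" 2>/dev/null
      rm -f "$PIDFILE"
      ;;
    check)
      # Succeed only if the recorded process is still alive.
      [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
      ;;
  esac
}
```

A real script would finish with `action "$1"` and propagate the exit status back to Oracle Clusterware.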

Application Server Cluster Ready Services (ASCRS) relieves you from writing your own add-on modules for your critical and complex Oracle Fusion Middleware application environment, thereby giving you easy access to Oracle Clusterware's high availability features.

Note:

ASCRS is a solution for managing an Oracle Fusion Middleware environment that is already Cold Failover Cluster (CFC) enabled. Refer to Chapter 12, "Active-Passive Topologies for Oracle Fusion Middleware High Availability" for information on enabling CFC for the various Fusion Middleware components.

ASCRS consists of a frontend and a backend. The frontend is a command line interface, ascrsctl, with which you can perform administrative tasks such as resource creation, deletion, update, start, stop, or switchover between cluster nodes. The backend is the logic for life cycle management of the various Fusion Middleware resources. The frontend and backend have their own separate log files.

On Unix platforms and Windows Server 2008, ASCRS supports virtual IPs, shared disks, database listeners, database instances, OPMN managed instances, and WebLogic servers. You can create Oracle Clusterware resources from these middleware components, allowing Oracle Clusterware and ASCRS to maintain their high availability within the cluster.

Oracle Clusterware and ASCRS provide a means to improve the survivability of the various resources when their hosting environment is corrupted or lost. However, they do not prevent disk corruption or application malfunctioning caused by disk corruption.

ASCRS supports Oracle Clusterware version 10.2.0.4 or 11.1.0.7 and higher. It has online help that can be invoked using the following command:

ascrsctl help -c command -t resource_type

As an example, the following command shows the help for creating a virtual IP resource:

ascrsctl help -c create -t vip

To view the full contents of ascrsctl online help, see Appendix E, "ascrsctl Online Help."

13.3 Installing and Configuring ASCRS with CRS

As an extension of CRS, ASCRS must be installed within each CRS home on each node of the cluster and configured separately before it can be used.

With the ASCRS command line tool ascrsctl, you can give ASCRS control of various middleware components. Once a component is controlled by CRS, its runtime state is closely monitored and CRS takes the proper actions if the component fails. With ascrsctl, you can create a CRS resource, and once a resource is created, you can perform start, stop, update, switch, status, and delete operations on it.

13.3.1 Upgrading Older Versions of ASCRS to the Current ASCRS Version

If an older version of ASCRS is already in use, follow these steps to upgrade to the latest ASCRS version:

  1. Use the ASCRS stop command to take all ASCRS resources offline.

  2. Take note of all Fusion Middleware resource settings by viewing their status with the ASCRS status command.

  3. Delete these resources.

  4. Remove the CRS_HOME/ascrs directory.

  5. Install the new version of ASCRS as described in Section 13.3.2, "Installing ASCRS" and finish with the setup command.

  6. Using the information from step 2, recreate the Fusion Middleware resources with the ASCRS create command.

13.3.2 Installing ASCRS

Install ASCRS on each node of the cluster. To successfully install ASCRS on a particular node, check the following:

  • Operating system: ASCRS is supported on Unix platforms and Windows Server 2008. The system version and patch level should be compatible with the CRS version supported on that platform.

  • CRS version: CRS is installed on this system, started, and functioning correctly. The CRS version must be 10.2.0.4 or higher. For information about installing Oracle Clusterware and CRS, see the Oracle Clusterware Installation Guide for Linux.

  • User account: The ASCRS installation user should be the same as the owner of the CRS home, and the same as the owner of the application resources being managed. On Windows, this user must have administrator privileges.

Note:

To install ASCRS for CRS 10.2.0.4, Sun JDK (or JRE) 1.5 or higher must be installed on the local system. It is needed by ascrsctl, the command line tool.

To install ASCRS:

  1. Log in as the CRS owner.

  2. Insert the Oracle Fusion Middleware Companion CD and run the following commands to unzip and install the ascrs.zip file:

    cd CRS_HOME
    unzip Disk1/ascrs/ascrs.zip
    cd ascrs/bin
    setup
    

If the CRS version is 10.2.0.4, after JDK (or JRE) 1.5 or higher is installed, run the following command instead:

setup -j JDK/JRE_HOME

When the installation is complete, the following ASCRS directory structure appears:

CRS_HOME/ascrs/bin
CRS_HOME/ascrs/config
CRS_HOME/ascrs/lib
CRS_HOME/ascrs/log
CRS_HOME/ascrs/public
CRS_HOME/ascrs/perl
CRS_HOME/ascrs/sql
CRS_HOME/ascrs/wlst

13.3.3 Configuring ASCRS with Oracle Fusion Middleware

After you install ASCRS, it is ready for use with the default configuration. To customize logging locations, logging levels, or the default CRS properties, edit the config.xml file located in the CRS_HOME/ascrs/config directory.

The config.xml file contains the configuration for ascrsctl and ASCRS agent logging. To change either log location, specify an existing path name, or a path name within this CRS home using the ORACLE_HOME prefix. The available logging levels, in decreasing order of verbosity, are ALL, FINEST, FINER, FINE, INFO, WARNING, and SEVERE. Each resource has its own agent log file, which rolls over after its size exceeds rollover_size bytes.

CRS properties are grouped into policies. A policy name describes the characteristics of the CRS property values under that policy. A policy can be normal or fast; the fast policy means more frequent resource health checking and less delay in failover.

The following represents the default config.xml file shipped with ASCRS:

<?xml version="1.0" ?>
<config>
  <ascrsctl>
    <display level="normal"/>
    <log path="${ORACLE_HOME}/ascrs/log/ascrsctl.log" level="FINER"/>
    <resource-params target="vip" policy="normal">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="7"/>
      <param name="FAILOVER_DELAY" value="5"/>
      <param name="FAILURE_INTERVAL" value="50"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="2"/>
      <param name="SCRIPT_TIMEOUT" value="120"/>
      <param name="START_TIMEOUT" value="120"/>
      <param name="STOP_TIMEOUT" value="120"/>
    </resource-params>
 
    <resource-params target="vip" policy="fast">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="5"/>
      <param name="FAILOVER_DELAY" value="4"/>
      <param name="FAILURE_INTERVAL" value="30"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="2"/>
      <param name="SCRIPT_TIMEOUT" value="120"/>
      <param name="START_TIMEOUT" value="120"/>
      <param name="STOP_TIMEOUT" value="120"/>
    </resource-params>
 
    <resource-params target="disk" policy="normal">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="7"/>
      <param name="FAILOVER_DELAY" value="5"/>
      <param name="FAILURE_INTERVAL" value="50"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="2"/>
      <param name="SCRIPT_TIMEOUT" value="120"/>
      <param name="START_TIMEOUT" value="120"/>
      <param name="STOP_TIMEOUT" value="120"/>
    </resource-params>      
 
    <resource-params target="disk" policy="fast">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="5"/>
      <param name="FAILOVER_DELAY" value="4"/>
      <param name="FAILURE_INTERVAL" value="30"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="2"/>
      <param name="SCRIPT_TIMEOUT" value="120"/>
      <param name="START_TIMEOUT" value="120"/>
      <param name="STOP_TIMEOUT" value="120"/>
    </resource-params>
 
    <resource-params target="dblsnr" policy="normal">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="50"/>
      <param name="FAILOVER_DELAY" value="20"/>
      <param name="FAILURE_INTERVAL" value="300"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="4"/>
      <param name="SCRIPT_TIMEOUT" value="120"/>
      <param name="START_TIMEOUT" value="120"/>
      <param name="STOP_TIMEOUT" value="120"/>
    </resource-params>      
 
    <resource-params target="dblsnr" policy="fast">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="40"/>
      <param name="FAILOVER_DELAY" value="20"/>
      <param name="FAILURE_INTERVAL" value="250"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="4"/>
      <param name="SCRIPT_TIMEOUT" value="120"/>
      <param name="START_TIMEOUT" value="120"/>
      <param name="STOP_TIMEOUT" value="120"/>
    </resource-params>
 
    <resource-params target="db" policy="normal">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="120"/>
      <param name="FAILOVER_DELAY" value="30"/>
      <param name="FAILURE_INTERVAL" value="700"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="4"/>
      <param name="SCRIPT_TIMEOUT" value="300"/>
      <param name="START_TIMEOUT" value="300"/>
      <param name="STOP_TIMEOUT" value="300"/>
    </resource-params>      
 
    <resource-params target="db" policy="fast">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="60"/>
      <param name="FAILOVER_DELAY" value="20"/>
      <param name="FAILURE_INTERVAL" value="400"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="4"/>
      <param name="SCRIPT_TIMEOUT" value="300"/>
      <param name="START_TIMEOUT" value="300"/>
      <param name="STOP_TIMEOUT" value="300"/>
    </resource-params>
 
    <resource-params target="as" policy="normal">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="50"/>
      <param name="FAILOVER_DELAY" value="20"/>
      <param name="FAILURE_INTERVAL" value="350"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="4"/>
      <param name="SCRIPT_TIMEOUT" value="600"/>
      <param name="START_TIMEOUT" value="600"/>
      <param name="STOP_TIMEOUT" value="600"/>
    </resource-params>      
 
    <resource-params target="as" policy="fast">
      <param name="AUTO_START" value="1"/>
      <param name="CHECK_INTERVAL" value="40"/>
      <param name="FAILOVER_DELAY" value="10"/>
      <param name="FAILURE_INTERVAL" value="300"/>
      <param name="FAILURE_THRESHOLD" value="5"/>
      <param name="RESTART_ATTEMPTS" value="4"/>
      <param name="SCRIPT_TIMEOUT" value="600"/>
      <param name="START_TIMEOUT" value="600"/>
      <param name="STOP_TIMEOUT" value="600"/>
    </resource-params>
  </ascrsctl>
  <agent>
    <log path="${ORACLE_HOME}/ascrs/log" level="FINER" rollover_size="5242880"/>
  </agent>
</config>

Note:

When creating a resource, its parameter values should fall within the proper ranges. If a parameter is not configured for a policy, an internal default value is assumed.

The internal default parameter values are listed in Table E-1.

Consult Oracle Clusterware documentation for the definitions of these parameters before editing their values.

Because computing environments vary in speed, Oracle recommends measuring the application's start and stop latency before setting the script, start, and stop timeout values. Set these values to at least twice the observed latencies.
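
As a sketch, the following shell commands time a stand-in start command and derive a timeout of twice the observed latency; here `sleep 1` is a hypothetical placeholder for the component's real start command.

```shell
# Measure the start latency of the managed component, then size
# START_TIMEOUT at twice the observed value. "sleep 1" is a
# placeholder for the component's real start command.
start=$(date +%s)
sleep 1                                  # placeholder start command
end=$(date +%s)
latency=$((end - start))
suggested_timeout=$((latency * 2))
echo "latency=${latency}s suggested START_TIMEOUT=${suggested_timeout}s"
```

Repeat the measurement for the stop command to size STOP_TIMEOUT the same way.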

13.4 Using ASCRS to Manage Resources

With the ascrsctl command line tool, you manage CRS resources created for Fusion Middleware components. With this tool you can create, update, start, stop, switch, and delete resources.

As mentioned in a previous section, a resource is an object created by CRS to identify the entity to be managed, such as an application, a virtual IP, or a shared disk. If auto start for a resource is set to 1, CRS ensures the resource starts when CRS starts. Because Fusion Middleware resources depend on each other, starting or stopping one resource may affect other resources. Resource dependencies are established at resource creation through the ascrsctl syntax, and at runtime CRS uses this dependency information when starting and stopping resources.

13.4.1 Creating CRS Managed Resources

On Unix and Windows Server 2008, ASCRS supports WebLogic servers, OPMN managed instances, Oracle databases, Oracle database listeners, virtual IPs, and shared disks. After a CRS managed resource is created for one of these components, it can be managed by CRS.

CRS resources created with the ascrsctl command line follow a naming convention. Follow this naming convention to ensure that the resources function correctly. To avoid unexpected errors, Oracle recommends using the CRS installation exclusively for Oracle Fusion Middleware, so that all the CRS managed resources are created with ascrsctl.

Under this naming convention, the canonical name for a resource has the following format:

ora.name.cfctype

Where name refers to the short name of the resource, for example, sharedisk, or myvip, and type refers to one of the resource types, such as vip, disk, db, dblsnr or as.
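
The composition of the canonical name can be sketched in shell, using the example values from this section:

```shell
# Compose the canonical resource name ora.<name>.cfc<type>
# from a short name and a resource type.
name=myvip
type=vip
canonical="ora.${name}.cfc${type}"
echo "$canonical"
```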

For example, on Linux, the following command creates a virtual IP resource named ora.myvip.cfcvip from the IP address 192.168.1.10 on network interface eth0 with netmask 255.255.255.0:

ascrsctl create -name myvip -type vip -ipAddr 192.168.1.10 -netmask 255.255.255.0 -interface eth0

On Windows:

ascrsctl create -name myvip -type vip -ipAddr 192.168.1.10 -netmask 255.255.255.0 -interface "Public network"

Note:

If a resource has a dependent resource, set the check interval for the dependent resource higher or at least equal to that of the resource on which it depends.

13.4.1.1 Creating a Virtual IP Resource

The following information is required for creating a virtual IP resource:

  • A valid virtual host name or IP address that is not in use by any host on the network

  • A valid netmask for this host name or IP address

  • One or more valid network interface names that are present on all the cluster nodes where this IP is used

  • On Windows, the interface name refers to the network connection name, such as Public network.

Since a virtual IP resource is a system resource, on Unix the create or update command generates a script that must be executed by the root user to complete the operation.

The following command creates a virtual IP resource named ora.myvip.cfcvip on Linux:

ascrsctl create -name myvip -type vip -ipAddr 192.168.1.10 -netmask 255.255.255.0 -interface eth0

13.4.1.2 Creating a Shared Disk Resource

In a Fusion Middleware environment, shared disks are the disk storage used to hold the Oracle database software, the database data files, WebLogic servers, OPMN managed components, and their Oracle homes. Shared disks allow the same data to be used when application resources are switched among the nodes within a cluster.

When creating a shared disk resource, carefully consider the following:

On Unix:

  • Before creating a shared disk resource, create an empty signature file named .ascrssf on the root of the shared disk. The owner of the CRS home should own this file. This file is used by CRS after the resource is created.

  • You can specify nop for either the mount or the unmount command. Use nop for the mount command if the shared disk is never offline; if the disk does go offline for some reason, CRS detects this and marks the resource as down. The nop command can be used for the unmount command if the disk does not need to be unmounted by CRS. In such a case, be absolutely sure that the disk does not need to be unmounted: there are potential disk corruption issues if the shared disk is mounted on two nodes without protection. Again, the signature file is always needed on the shared disk.

  • The unmount command may fail if there are active processes using the shared disk. To prevent this failure, avoid accessing this disk from other applications while the disk resource is in the online state.

  • For complex mount and unmount commands, encapsulate the logic in executable scripts and specify the full path of these scripts as the mount and unmount commands. A proper unmount script is capable of killing other processes that are using the disk to ensure a successful and clean unmount. If the unmount command is in a script, do some basic file system checking, such as running an fsck command. Such a script should return 0 for success and 1 for failure.

  • A shared disk resource is a system resource. The create, update, or delete commands generate scripts that must be executed as root to complete the operation. Follow the instructions in the screen output.

  • If the signature file is at the mount point of the shared disk, the start/stop operation may fail. Having the signature file on the mount point signals ASCRS that the disk is mounted, even if it is not.

  • Validate the mount/unmount command before using it in the mc or umc parameters or in the script file. ASCRS does not validate these commands.

  • If the shared disk is not protected by a cluster file system, it could be corrupted if it is mounted from multiple nodes. To avoid this, before creating the ASCRS resource, mount the disk only on the node where you create the resource.
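
To illustrate the scripted approach described above, the following is a minimal unmount-script sketch. The mount point /asdisk is a hypothetical example; the script kills any processes still using the disk, attempts the unmount, and returns 0 for success or 1 for failure, as ASCRS expects.

```shell
#!/bin/sh
# Sketch of an unmount script for a shared disk resource.
# /asdisk is a hypothetical mount point. The script terminates
# processes still holding files open on the disk, then unmounts
# it, returning 0 on success and 1 on failure.
MOUNT_POINT=/asdisk

unmount_disk() {
    # Kill processes that still hold files open on the disk.
    fuser -km "$MOUNT_POINT" 2>/dev/null
    sleep 2
    if umount "$MOUNT_POINT"; then
        return 0    # clean unmount
    else
        return 1    # report failure to ASCRS
    fi
}

# A real script would finish with:  unmount_disk; exit $?
```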

On Windows Server 2008:

  • Open Microsoft Disk Management and take note of the shared disk number. A disk number is a non-negative integer, such as 0, 2, or 5.

  • Create an empty mount directory on the system drive on each cluster node, such as c:\oracle\asdisk.

  • Ensure this disk is no longer used by any application on any node.

  • From one node, in Disk Management, right-click the drive and bring it online, remove all partitions on it, create a single partition on this hard drive, and format it with NTFS. Remove any drive letter that may be assigned to it, and mount it to the directory you just created. Right-click the drive again and take it offline.

  • On each of the other nodes, open Microsoft Disk Management, bring this drive online, remove the drive letter, if any, and mount it to the directory you just created. Right-click the drive and take it offline.

  • Go to the node where you will create the disk resource and bring the disk online.

  • The disk root should now be accessible through the mount directory.

  • Create an empty signature file named .ascrssf on the root of the shared disk. The CRS home owner should own this file. This file is used by CRS after the resource is created.

  • The mount command is "diskmgr online disknumber" and the unmount command is "diskmgr offline disknumber", where diskmgr is an ASCRS built-in command. No additional software needs to be installed to use the diskmgr command.

To create a shared disk resource, on Unix, run the following ascrsctl command that includes a valid mount point, a mount command, and an unmount command:

ascrsctl create -n sharedisk -type disk -path /asdisk -mc "/bin/mount /dev/sda /asdisk" -umc "/bin/umount /asdisk"

To create a shared disk resource on Windows Server 2008, run an ascrsctl command similar to the following:

ascrsctl create -n sharedisk -type disk -path c:\oracle\asdisk -mc "diskmgr online 2" -umc "diskmgr offline 2"

After a resource is created, start it explicitly on the node where the create command is executed to make sure the already mounted disk is under ASCRS control:

ascrsctl start -n ora.sharedisk.cfcdisk -node <disk resource creation node>

Note:

On Windows Server 2008, mapped drives cannot be used as shared disk resources, and no general cluster file systems have been certified for this purpose.

13.4.1.3 Creating an Oracle Database Listener Resource

The following information is required for creating an Oracle listener resource:

  • A valid Oracle database home

  • The listener name

  • The virtual IP resource name for the listen address

  • The disk resource name for the Oracle home

Before creating the database listener resource, carefully check the following:

  • The database listener home is installed on a shared disk. A CRS resource has been created for the shared disk with an ascrsctl command, and the resource is started.

  • A CRS resource has been created for the virtual IP with an ascrsctl command, and the resource is started.

  • The listener is Cold Failover Cluster (CFC) enabled. See Section 12.2.4, "Transforming an Oracle Database" for details.

  • Ensure that the listener name and the Oracle home are valid. ASCRS does not do exhaustive validation of this information for this release.

  • On Windows, ensure that the start method of the listener Windows service is 'manual.'

Here is a syntax example for creating the resource:

ascrsctl create -n mydblsnr -type dblsnr -loh /cfcdb -ln LISTENER -disk ohdisk -vip myvip

For online help information for creating an Oracle database listener resource, use the following command:

ascrsctl help -c create -t dblsnr

13.4.1.4 Creating an Oracle Database Resource

A database resource is a resource for any one of the following:

  • Oracle database instance

  • Oracle Database Console (dbconsole) process

  • Oracle job scheduler process for the Windows platform

  • Oracle Volume Shadow Copy Service for the Windows platform

Creating an Oracle database instance resource requires the following:

  • A valid Oracle database home

  • The database sid name

  • Disk resource name for the Oracle home

  • Disk resource name(s) if data files reside on different shared disk(s)

  • The listener resource name

Before creating the Oracle database instance resource, carefully check the following:

  • On Windows, ensure the built-in user system is in DBA_GROUP and the start method of the corresponding Windows service is 'manual.'

  • The database home is installed on a shared disk. The data files of this database are on the same or different shared disk(s). CRS resources have been created for all these shared disks with ascrsctl and started.

  • A CRS resource has been created for the database listener with an ascrsctl command, and the resource is started.

  • The database is CFC enabled. See Section 12.2.4, "Transforming an Oracle Database" for details.

  • Ensure the database sid and Oracle home are valid. ASCRS does not do extensive validation of this information for this release.

The following is a syntax example for creating the database instance resource:

ascrsctl create -n mydb -type db -oh /cfcdb -lsnr mydblsnr -disk ohdisk datadisk

Creating all the other database resources requires the following:

  • A valid Oracle database home.

  • The database SID name.

  • The disk resource name for the Oracle home.

  • A valid virtual IP resource name. This is required for the Oracle Database Console (dbconsole) database resource only.

  • The database is CFC enabled.

For online help information for creating an Oracle database resource, use the following command:

ascrsctl help -c create -t db

13.4.1.5 Creating a Middleware Resource

OPMN instances and WebLogic servers are collectively called Application Server (AS) components and are managed by separate resources. Specifically, all OPMN managed components have to be managed by one resource and all servers under a WebLogic domain have to be managed by a different resource.

13.4.1.5.1 Creating a Resource for OPMN Managed Components

The following information is needed for creating a resource for an OPMN managed instance:

  • A valid instance home for the OPMN managed components.

  • A disk resource name for the instance home

  • A disk resource name for the instance's Oracle home if it is on a different shared disk

  • The names of the OPMN managed applications for inclusion in the resource. If you plan to include only a subset of the components, the remaining components will not be managed by CRS and should not be started outside CRS. By default, all the components are included.

Before creating the OPMN resource, ensure that the disk resources for the instance home (and for the Oracle home, if it is on a different disk) have been created and started, and that the instance is CFC enabled.

The following is a syntax example for creating the resource (the Oracle home and instance home are on the same disk, and all components are included):

ascrsctl create -n myopmn -type as -ch /cfcas -disk ohdisk

For online help information for creating an OPMN instance resource, use the following command:

ascrsctl help -c create -t as

13.4.1.5.2 Creating a Resource for WebLogic Servers

Creating a CRS resource for a WebLogic domain requires more preparation than other resource types. Due to its complexity, the procedure is divided into the following sections:

  • Basic Setup

  • Node Manager Setup

  • Server Setup

  • Creating the Resource

Basic Setup

Before starting the basic setup, be sure that WebLogic is installed on shared disk(s). The WebLogic Server software and the domain instance can be installed on either the same or separate shared disks.

In addition, before proceeding to the following Node Manager and Server Setup, ensure that the WebLogic Server environment is CFC enabled. See Section 12.2.2.3, "Transforming the Administration Server for Cold Failover Cluster" through Section 12.2.2.5, "Transforming Node Manager" for details on enabling CFC for the Oracle WebLogic Server environment. Once CFC is enabled, you can manually start and stop the server on the original node and the failover node(s) without noticeable difference.

To create the dependency resources:

  1. Create a CRS resource for each shared disk and start it on the node on which it was created.

  2. Create a CRS resource for the virtual IP with the ascrsctl command and start it on the same cluster node.

Node Manager Setup

To set up the Node Manager:

  1. For Windows Server 2008, on each node, create the Node Manager Windows service if it does not already exist by executing the following command from the WL_HOME/server/bin directory:

    installNodeMgrSvc.cmd
    

    From Windows Service Manager, make sure this service is in manual start mode.

  2. If you have not yet done so, change Node Manager's username and password. The initial password is randomly generated. To change the Node Manager password, in the WebLogic Server Administration Console, select Domain, Security, General, and then Advanced. Enter the new password and click Save.

  3. If you have changed anything in steps 1 or 2, restart the Node Manager.

    On Unix, run the following command from the WL_HOME/server/bin directory:

    startNodeManager.sh
    

    On Windows, start Node Manager from the service manager.

  4. Start the WebLogic Scripting Tool from the WL_HOME/common/bin directory. To persist Node Manager's user login information in the ascrscf.dat and ascrskf.dat files, use the following commands:

    nmConnect('nmUser','nmPasswd','hostname','nmPort','domainName','domainDir')
    storeUserConfig('WL_HOME/common/nodemanager/ascrscf.dat',
                       'WL_HOME/common/nodemanager/ascrskf.dat','true')
    nmDisconnect()
    exit()
    
  5. For Unix platforms, copy CRS_HOME/ascrs/public/cfcStartNodeManager.sh to the WL_HOME/server/bin directory, and make the script executable.

Note:

To keep the setup consistently in sync, step 4 must be performed whenever the Node Manager passwords or usernames are changed.

After you have started Node Manager for the first time, you can edit the nodemanager.properties file to set the StartScriptEnabled property. The nodemanager.properties file does not exist until Node Manager is started for the first time.

In the WL_HOME/common/nodemanager directory, set the StartScriptEnabled property in the nodemanager.properties file to true.

StartScriptEnabled=true

Check the nodemanager.properties file to ensure no value is assigned to ListenAddress, and that a valid port number is assigned to ListenPort.

When this property is set in the nodemanager.properties file, you no longer need to define it in the JAVA_OPTIONS environment variable.
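
For example, assuming the default Node Manager port of 5556, the relevant nodemanager.properties entries look like the following excerpt; adjust the port for your environment and leave ListenAddress empty:

```
# nodemanager.properties excerpt
ListenAddress=
ListenPort=5556
StartScriptEnabled=true
```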

Server Setup

  1. All WebLogic servers listen on the virtual IP. To ensure this is configured correctly, log in to the WebLogic Server Administration Console and navigate to Environment > Servers > server_name > Configuration > General page and verify that the virtual IP and the port number are both set correctly and click Save.

  2. Each server must also listen on the localhost. To ensure this is configured correctly, log into the WebLogic Server Administration Console and do the following:

    1. In the Domain tree, select Environment, Servers, server_name, Protocols, and then Channels.

    2. In the Change Center, click Lock & Edit.

    3. Click New, enter a channel name, select protocol t3, and continue to the next screen.

    4. Enter localhost for both the Listen Address and the External Listen Address.

    5. Enter the port number for both the Listen Port and the External Listen Port. This port number must be exactly the same as the port number used for the virtual IP.

    6. Continue to the next screen and verify that Enabled is selected.

    7. Click Finish.

    8. Click Activate Changes.

  3. If this is the Administration Server, ensure the DOMAIN_HOME/servers/serverName/security directory exists. This directory should contain the boot.properties file. If this file does not exist, create it and include the following properties:

    username=<admin server user name>
    password=<admin server user password>
    
  4. If this domain does not have an Administration Server, ensure the DOMAIN_HOME/servers/serverName/security directory exists. This directory should contain the boot.properties file. If this file does not exist, create it and include the following properties:

    username=<admin server user name>
    password=<admin server user password>
    
  5. If DOMAIN_HOME/servers/serverName/data/nodemanager/startup.properties exists, ensure the property AutoRestart defined in this file is set to false.

  6. Repeat the previous steps for all servers, and then restart all servers from the WebLogic Server Administration Console.

  7. Shut down all WebLogic processes and stop the Node Manager.

Note:

To keep the setup consistently in sync, Step 3 must be performed whenever the Administration Server password is changed.
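The boot.properties handling in steps 3 and 4 can be scripted. The following is a minimal sketch assuming a POSIX shell; the function name, paths, and credentials are placeholders, and an existing boot.properties file is deliberately left untouched:

```shell
# Create DOMAIN_HOME/servers/<server>/security/boot.properties if it is missing.
# Usage: ensure_boot_properties <domain_home> <server_name> <user> <password>
ensure_boot_properties() {
    dir="$1/servers/$2/security"
    mkdir -p "$dir"
    if [ ! -f "$dir/boot.properties" ]; then
        printf 'username=%s\npassword=%s\n' "$3" "$4" > "$dir/boot.properties"
    fi
}

# Example call (all values are placeholders):
# ensure_boot_properties /oracle/domains/asdomain AdminServer weblogic welcome1
```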

Creating the Resource

After Basic Setup, Node Manager Setup, and Server Setup, create the CRS resource using the following command:

ascrsctl create -n mywls -type as -ch /cfcas -disk sharedisk -vip myvip

Note:

The WebLogic component home argument (-ch) must be valid. ASCRS does not perform extensive validation for this information.

Note:

By default, the ASCRS agent checks WebLogic Server health by periodically initiating a TCP connection to the server. To alter this behavior, refer to Section 13.4.8, "Configuring and Using Health Monitors."

13.4.2 Updating Resources

You can update resources created with ascrsctl using the update command. Depending on the resource type, you can update the resource profile by specifying the appropriate parameter through the update command line. You can perform updates only when the resource is in the offline state.

For example, to update the virtual IP resource created in the last section with a new IP address and a different interface, use the following command:

ascrsctl update -n myvip -type vip -ip 192.168.1.20 -if eth1
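Because updates are allowed only in the offline state, a typical update is bracketed by stop and start commands. The following sketch uses the resource names from the earlier examples:

```
ascrsctl stop -n ora.myvip.cfcvip
ascrsctl update -n myvip -type vip -ip 192.168.1.20 -if eth1
ascrsctl start -n ora.myvip.cfcvip
```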

Note:

If you want to change the set of nodes hosting a particular resource, you must stop all dependent resources and then update the cluster nodes for each resource with the same node set and ordering. To find the related resources, run the ascrsctl status command for the resource.

13.4.3 Starting Up Resources

When a resource is started, it is put under the control of CRS, and CRS monitors its runtime status continuously. If the resource depends on other resources, starting it automatically starts the resources it depends on. Refer to the Oracle Clusterware documentation for information about the role of the resource placement policy during resource startup. The ascrsctl start command maps to the CRS command crs_start.

For example, to start the virtual IP resource, use the following command:

ascrsctl start -n ora.myvip.cfcvip

Note:

If a resource depends on more than one resource, before starting it, be sure that any of those resources that are already online are targeted on the same node.

13.4.4 Shutting Down Resources

When a resource is stopped, it is brought down and put in the offline state, and CRS stops monitoring its runtime status. If the resource depends on other resources that are online, those resources are not stopped unless you confirm the prompt or specify the -np option. Refer to the Oracle Clusterware documentation for more information about the implications of resource dependency during a resource stop. The ascrsctl stop command maps to the CRS command crs_stop.

For example, to stop the virtual IP resource, run the following command:

ascrsctl stop -n ora.myvip.cfcvip

13.4.5 Resource Switchover

Resource switchover is the process of shutting down a resource on the node where it is running and restarting it on another node. If the new node is not specified, CRS determines it based on the placement policy. If the resource to be switched depends on other resources, or other online resources depend on it, the resource must be switched with the -np flag.

To switch over a resource to another available node in the cluster, run the following command:

ascrsctl switch -n ora.myvip.cfcvip
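If the virtual IP participates in dependencies, as in the status example in Section 13.4.7, the same switchover must carry the -np flag described above:

```
ascrsctl switch -n ora.myvip.cfcvip -np
```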

13.4.6 Deleting Resources

You can delete a resource from CRS control. After a resource is deleted, the functionality of the corresponding application or component is not affected, but CRS no longer monitors that resource. If a resource has dependent resources, it cannot be removed.

To delete a resource from CRS control, run the following command:

ascrsctl delete -n ora.myvip.cfcvip

Note:

If you delete a resource from CRS, the log directory and the log files for the deleted resource are NOT automatically removed. If you do not plan to reuse them in the future, you should delete them manually. The log files are located in the ORA_CRS_HOME/ascrs/log directory by default.

13.4.7 Checking Resource Status

You can check resource status with the ascrsctl status command, which shows the states of all resources and their dependents. If a particular resource is specified, the status command shows its CRS profile, its direct and indirect dependency relationships, and its current state information.

For example, to check the status of a resource, run the following command:

ascrsctl status -n ora.myvip.cfcvip

Assuming the virtual IP resource is used by a database listener resource and the listener resource is in turn required by a database instance resource, all the dependency information is shown in a tree structure, along with other status information in the following status output:

Basic information
------------------------+------------------------
  Name                  |  ora.myvip.cfcvip
  Type                  |  Virtual IP
  Target state          |  ONLINE
  Resource state        |  ONLINE on stajz11
  Restart count         |  0
  Failure count         |  0
  Hosting members       |  stajz11, stajz12
------------------------+------------------------
 
  Common CRS parameters
------------------------+------------------------
  Auto start            | Yes
  Check interval        | 7 sec
  Failover delay        | 5 sec
  Failure interval      | 50 sec
  Failure threshold     | 5
  Restart attempts      | 2
  Script timeout        | 30 sec
  Start timeout         | 30 sec
  Stop timeout          | 30 sec
------------------------+------------------------
 
  Resource specific parameters
------------------------+------------------------
  Interface(s)          | eth2
  Netmask               | 255.255.252.0
  Virtual IP address    | 140.87.27.48
------------------------+------------------------
 
  Resource dependency tree(s)
-------------------------------------------------
  ora.mydb.cfcdb
    |
    +->ora.mydisk.cfcdisk
    |
    +->ora.mylsnr.cfcdblsnr
         |
         +->ora.mydisk.cfcdisk
         |
         +->ora.myvip.cfcvip
 
  ora.myopmn.cfcas
    |
    +->ora.mydisk.cfcdisk
    |
    +->ora.myvip.cfcvip
 
  ora.dbc.cfcdb
    |
    +->ora.mydisk.cfcdisk
    |
    +->ora.myvip.cfcvip
 
  ora.mywls.cfcas
    |
    +->ora.mydisk.cfcdisk
    |
    +->ora.myvip.cfcvip

13.4.8 Configuring and Using Health Monitors

Resource health state is the most important input to a CRS restart or failover decision. However, precisely determining the true state of a server is not an easy task. For example, the server process may still be running while its service is actually down, so a simple TCP ping of a WebLogic server does not necessarily reveal its true functional state. As an alternative, ASCRS provides limited support for checking the WebLogic Server functional state through user-defined monitors.

All monitors are defined in ORA_CRS_HOME/ascrs/config/mconfig.xml. Here is an example of this file:

<monitors>
 
    <!-- HTTP code monitor -->
    <monitor name="http_cm">
        <method type="ping" protocol="http" url="/index.html"/>
        <result type="code" code="200"/>
    </monitor>
 
    <!-- HTTP response content monitor that checks the response of a particular URL -->
    <monitor name="http_rcm">
        <method type="ping" protocol="http" url="/index.html"/>
        <result type="exact_content" file="/crs/ascrs/public/index_content.txt"/>
    </monitor>
 
    <!-- HTTP response content monitor that checks the desired pattern in any line of the returned content -->
    <monitor name="http_regex">
        <method type="ping" protocol="http" url="/cqi"/>
        <result type="regex_content"><![CDATA[.+Welcome.+]]></result>
    </monitor>

    <!-- WebLogic callout script monitor -->
    <monitor name="wls_sm">
        <method type="script" command="/var/scripts/wlsrv3_checker.sh"/>
        <result type="code" code="0"/>
    </monitor>
</monitors>

For each monitor, both a "method" and a "result" must be defined. The method type can be either "ping" or "script." For a "ping" method, only the "http" protocol is supported and a valid URL must be present. The result type is one of "code," "exact_content," and "regex_content." For a ping result with the "http" protocol, "code" refers to the expected HTTP return code, while "exact_content" and "regex_content" specify the exact return content and the expected pattern in a line, respectively.
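To illustrate the "script" method type, here is a minimal sketch of a callout checker in the spirit of the wlsrv3_checker.sh entry in the example above. It assumes curl is available; the URL and port in the commented example call are placeholders. CRS treats exit code 0 as healthy, matching <result type="code" code="0"/>:

```shell
# check_http: succeed (exit 0) only when the URL answers with the expected HTTP code.
# Usage: check_http <url> <expected_code>
check_http() {
    url=$1
    expected=$2
    # -s: silent, -o: discard the body, -w: print only the HTTP status code
    code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url") || return 1
    [ "$code" = "$expected" ]
}

# Example (URL and port are assumptions for this sketch):
# check_http "http://localhost:7001/index.html" 200
```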

To use a monitor for a WebLogic Server, you only need to assign the monitor name to the server name with the -m option during its creation or update.

For example, assuming the WebLogic resource has two servers, AdminServer and wlsapp, you can customize their monitors with the command:

ascrsctl create -n mywls -type as -ch /cfcas -disk sharedisk -vip myvip
         -m AdminServer=http_cm wlsapp=http_rcm

Note:

When using the exact_content method, use the CRS_HOME/ascrs/bin/mu utility to generate the file for the expected response content, since the content saved from the browser is not usually exactly the same text as sent from the server.

13.5 Example Topologies

The following two examples illustrate how to use ASCRS to manage Fusion Middleware resources.

Figure 13-1 CRS Topology Example 1

CRS Topology Example 1
Description of "Figure 13-1 CRS Topology Example 1"

Figure 13-1 illustrates the CRS Example 1 topology. In this example, Oracle HTTP Server and SOA are installed in a two-node cluster. Oracle HTTP Server is managed by OPMN. The SOA installation has a running WebLogic server that hosts four Java EE applications.

Assumptions:

Under these assumptions, the following procedure describes the Cold Failover Clusters automation setup:

  1. Installing WebLogic software and enabling Cold Failover Clusters:

    1. If the shared disk /dev/sda1 is mounted on Node 2, unmount it. Mount the shared disk at /sharedisk1 on Node 1 if it is not yet mounted.

    2. Create an empty signature file .ascrssf in /sharedisk1. Create this file only after this shared disk is mounted.

    3. Bind the virtual IP to eth0 using /sbin/ifconfig.

    4. Install SOA and OHS on the shared disk. Perform the CFC enabling procedure for both SOA and OHS using the virtual IP. After CFC enabling, shut down all processes belonging to the SOA server and the OHS instance. To do a basic check of the CFC enabling, unmount the shared disk on Node 1, mount it on Node 2, and try to start SOA and OHS on Node 2. If startup fails, fix the problem before you proceed.

    5. After the checks in the previous step are done, shut down all OHS and SOA processes, unmount the disk on Node 2, and mount it on Node 1.

    6. Follow the procedure in Section 13.4.1.5.2, "Creating a Resource for WebLogic Servers," except for the "Creating the Resource" step, to configure the WebLogic Server in SOA install for ASCRS.

    7. Shut down any OHS and SOA processes.

    8. Unbind the virtual IP.

  2. Create the CRS resources.

    1. On Node 1, cd to the /CRS_HOME/ascrs/bin directory.

    2. Create the virtual IP resource:

      ascrsctl create -n asvip -t vip -if "eth0|eth1" -ip 192.168.1.10 -nm 255.255.255.0
      
    3. Create the shared disk resource:

      ascrsctl create -n asdisk -t disk -path /sharedisk1
                   -mc "/bin/mount /dev/sda1 /sharedisk1"
                   -umc "/bin/umount /sharedisk1"
      
    4. Create the SOA WebLogic server resource:

      ascrsctl create -n soa -t as -vip asvip -disk asdisk
                -ch /sharedisk1/fmw/user_projects/domains/asdomain
      
    5. Create the Oracle HTTP Server resource with SOA as its dependency:

      ascrsctl create -n ohs -t as -vip asvip -disk asdisk -as soa -ch /sharedisk1/ohsinst
      
    6. Start all resources on Node 1. Since the Oracle HTTP Server resource depends on all the other resources, starting this resource automatically starts the other resources as well.

      ascrsctl start -n ora.ohs.cfcas -node node1
      

Figure 13-2 CRS Topology Example 2

CRS Topology Example 2
Description of "Figure 13-2 CRS Topology Example 2"

Figure 13-2 illustrates the CRS Example 2 topology. In this example topology, WebLogic Server and the Oracle database are installed on a two-node cluster with the following characteristics:

The goal of this topology is to provide a failover solution for both the WebLogic Administration Server and the database instance.

Assumptions:

Under these assumptions, the following describes the procedure for automating Cold Failover Clusters:

  1. Install WebLogic software and enable Cold Failover Clusters:

    1. If the shared disk /dev/sda1 is mounted on Node 2, unmount it. Mount the shared disk at /sharedisk1 on Node 1 if it is not yet mounted.

    2. Create an empty signature file .ascrssf in /sharedisk1. Create this file only after this shared disk is mounted.

    3. Bind the virtual IP 192.168.1.10 to eth0 using /sbin/ifconfig.

    4. Install WebLogic on the shared disk. Perform the Cold Failover Clusters enabling procedure for this installation using this virtual IP.

    5. Follow the procedure in Section 13.4.1.5.2, "Creating a Resource for WebLogic Servers" to configure the WebLogic server for ASCRS.

    6. Shut down the WebLogic server.

    7. Unbind the virtual IP.

  2. Create the WebLogic-related resources.

    1. On Node 1, cd to the /CRS_HOME/ascrs/bin directory.

    2. Create the virtual IP resource:

      ascrsctl create -n asvip -t vip -if "eth0|eth1" -ip 192.168.1.10 -nm 255.255.255.0
      
    3. Create the shared disk resource:

      ascrsctl create -n asdisk -t disk -path /sharedisk1
                -mc "/bin/mount /dev/sda1 /sharedisk1" -umc "/bin/umount /sharedisk1"
      
    4. Create the WebLogic server resource:

      ascrsctl create -n adminserver -t as -vip asvip -disk asdisk
                -ch /sharedisk1/fmw/user_projects/domains/asdomain
      
    5. Start all WebLogic related resources on Node 1. Since the WebLogic resource depends on the disk and virtual IP resources, starting the WebLogic resource automatically starts the other two as well.

      ascrsctl start -n ora.adminserver.cfcas -node node1
      
  3. Install the Oracle database software and enable Cold Failover Clusters.

    1. If the shared disk /dev/sda2 is mounted on Node 1, unmount it. Mount the shared disk at /sharedisk2 on Node 2 if it is not yet mounted.

    2. Create an empty signature file .ascrssf in /sharedisk2. Create this file only after this shared disk is mounted.

    3. If the shared disk /dev/sda3 is mounted on Node 1, unmount it. Mount the shared disk at /sharedisk3 on Node 2 if it is not yet mounted.

    4. Create an empty signature file .ascrssf in /sharedisk3. Create this file only after this shared disk is mounted.

    5. Bind the virtual IP 192.168.1.20 to eth2 using /sbin/ifconfig.

    6. Install the Oracle database in the directory /sharedisk2/dbhome and put the data files in /sharedisk3/dbdata.

    7. Perform the CFC enabling procedure for this installation using this virtual IP.

    8. Shut down the database.

    9. Unbind the virtual IP.

  4. Create the Oracle database-related resources.

    1. On Node 2, cd to the /CRS_HOME/ascrs/bin directory.

    2. Create the virtual IP resource:

      ascrsctl create -n dbvip -t vip -if eth2 -ip 192.168.1.20 -nm 255.255.255.0
      
    3. Create the shared disk resource for the database software:

      ascrsctl create -n dbdisk -t disk -path /sharedisk2
                   -mc "/bin/mount /dev/sda2 /sharedisk2"
                   -umc "/bin/umount /sharedisk2"
      
    4. Create the shared disk resource for the database data files:

      ascrsctl create -n dfdisk -t disk -path /sharedisk3
                   -mc "/bin/mount /dev/sda3 /sharedisk3"
                   -umc "/bin/umount /sharedisk3"
      
    5. Create the database listener resource:

      ascrsctl create -n asdblsnr -t lsnr -vip dbvip 
                -disk dbdisk -loh /sharedisk2/dbhome -ln LISTENER 
      
    6. Create the database instance resource:

      ascrsctl create -n asdb -t db -lsnr asdblsnr -sid orcl
                        -oh /sharedisk2/dbhome -disk dbdisk dfdisk 
      
    7. Start all the database related resources on Node 2. Since the database resource depends on all the other resources directly or indirectly, starting the database resource automatically starts the others as well.

      ascrsctl start -n ora.asdb.cfcdb -node node2
      

13.6 Troubleshooting Oracle CRS

This section includes troubleshooting information for Oracle CRS.

13.6.1 OPMN Resource Depends on a Virtual IP Resource

The OPMN resource depends on a virtual IP resource, but during create or update operations, ASCRS does not validate whether the virtual IP resource that the OPMN instance depends on actually exists. If an incorrect virtual IP resource is provided, the OPMN resource startup fails, and the OPMN instance may be left in an inconsistent state. To fix this problem, follow these steps:

  1. Stop the OPMN resource using the -f option.

  2. Ensure that all relevant processes are stopped. If not, stop all remaining stray processes manually.

  3. Update the resource with the correct virtual IP resource.

  4. Start the resource.
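The recovery steps above map to a command sequence like the following sketch. The resource names are placeholders, and the update parameter shown assumes the virtual IP was the misconfigured dependency:

```
ascrsctl stop -n ora.myopmn.cfcas -f
# Verify that all OPMN processes are gone; kill any stray processes manually.
ascrsctl update -n myopmn -type as -vip correctvip
ascrsctl start -n ora.myopmn.cfcas
```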

13.6.2 ASCRS Logging

ASCRS relies on logging for diagnosing unexpected issues. To get more diagnostic information, you can increase the verbosity of the log level by changing the ASCRS configuration file config.xml.

In addition, you can also check CRS daemon logs for basic CRS issues.