Oracle® Enterprise Manager Grid Control Installation and Configuration Guide
10g Release 5 (10.2.0.5.0)

Part Number E10953-15

18 Configuring Enterprise Manager for Active and Passive Environments

Active and Passive environments, also known as Cold Failover Cluster (CFC) environments, refer to one type of high availability solution that allows an application to run on one node at a time. These environments generally use a combination of cluster software to provide a logical host name and IP address, along with interconnected host and storage systems to share information to provide a measure of high availability for applications.

This chapter contains the following sections:

  • Using Virtual Host Names for Active and Passive High Availability Environments in Enterprise Manager Database Control

  • Configuring Grid Control Repository in Active/Passive High Availability Environments

  • How to Configure Grid Control OMS in Active/Passive Environment for High Availability Failover Using Virtual Host Names

  • Configuring Targets for Failover in Active/Passive Environments

  • Configuring Additional Oracle Enterprise Management Agents for Use in Active and Passive Environments

Using Virtual Host Names for Active and Passive High Availability Environments in Enterprise Manager Database Control

This section provides information to database administrators about configuring Oracle Database release 10g in Cold Failover Cluster environments using Enterprise Manager Database Control.

For Database Control to service a database instance after it fails over to a different host in the cluster, the configuration and installation points described below must be addressed before you get started.

Set Up the Alias for the Virtual Host Name and Virtual IP Address

You can set up the alias for the virtual host name and virtual IP address by either allowing the clusterware to set it up automatically or setting it up manually before installation and startup of Oracle services. The virtual host name must be static and resolvable consistently on the network. All nodes participating in the setup must resolve the virtual IP address to the same host name. Standard TCP/IP tools such as nslookup and traceroute can be used to verify the setup.
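
For example, assuming the virtual host name lxdb.acme.com used in the installer example later in this section, a quick check from each cluster node might look like the following; both lookups should return the same virtual IP address and fully qualified host name on every node:

$ nslookup lxdb.acme.com
$ nslookup <virtual IP>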

Set Up Shared Storage

Shared storage can be managed by the clusterware that is in use or you can use any shared file system volume as long as it is supported. The most common shared file system is NFS. You can also use the Oracle Cluster File System software.

Set Up the Environment

Some operating system versions require specific operating system patches to be applied prior to installing release 10gR2 of the Oracle database. You must also have sufficient kernel resources available when you conduct the installation.

Before you launch the installer, specific environment variables must be verified. Each of the following variables must be set identically for the account you use to install the software, on all machines participating in the cluster (a short sketch follows this list).

  • Operating system variable TZ (the time zone setting). You should unset this variable prior to the installation.

  • PERL variables. Variables like PERL5LIB should be unset to prevent the installation and Database Control from picking up the incorrect set of PERL libraries.

  • Paths used for dynamic libraries. Based on the operating system, the variables can be LD_LIBRARY_PATH, LIBPATH, SHLIB_PATH, or DYLD_LIBRARY_PATH. These variables should only point to directories that are visible and usable on each node of the cluster.
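
For example, on Linux you might verify and clear these variables in the installation shell before starting the installer (a minimal sketch; substitute the dynamic library variable that applies to your platform):

$ unset TZ
$ unset PERL5LIB
$ echo $LD_LIBRARY_PATH    # should list only directories visible on every cluster node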

Ensure That the Oracle USERNAME, ID, and GROUP NAME Are Synchronized on All Cluster Members

The user and group of the software owner should be defined identically on all nodes of the cluster. You can verify this using the following command:

$ id -a
uid=1234(oracle) gid=5678(dba) groups=5678(dba)

Ensure That Inventory Files Are on the Shared Storage

To ensure that inventory files are on the shared storage, follow these steps:

  1. Create your new ORACLE_HOME directory.

  2. Create the Oracle Inventory directory under the new Oracle home:

    cd <shared oracle home>
    mkdir oraInventory
    
  3. Create the oraInst.loc file. This file contains the Inventory directory path information required by the Universal Installer:

    1. vi oraInst.loc

    2. Enter the path information to the Oracle Inventory directory and specify the group of the software owner (dba). For example:

      inventory_loc=/app/oracle/product/10.2/oraInventory
      inst_group=dba
      

      Depending on the type of operating system, the default directory for the oraInst.loc file is either /etc (for example, on Linux) or /var/opt/oracle (for example, on Solaris and HP-UX).

Start the Installer

To start the installer, point to the inventory location file oraInst.loc, and specify the host name of the virtual group. The debug parameter in the example below is optional:

$ export ORACLE_HOSTNAME=lxdb.acme.com
$ runInstaller -invPtrLoc /app/oracle/share1/oraInst.loc ORACLE_HOSTNAME=lxdb.acme.com -debug

Windows NT Specific Configuration Steps

On Windows, an additional step is required to copy over the services and registry keys required by the Oracle software.

  1. Using regedit on the first host, export each Oracle service from under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services.

  2. Using regedit on the first host, export HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE.

  3. Use regedit to import the files created in steps 1 and 2 to the failover host.

The Windows services must also be created on the failover host. For an Enterprise Manager release 10.2.0.5 Management Agent, the following command can be used:

emctl create service [-user <username>] [-pwd <password>] -name <servicename>

This must be done once on the failover host after a failover.

Start Services

You must start the services in the following order:

  1. Establish IP address on the active node

  2. Start the TNS listener

  3. Start the database

  4. Start dbconsole

  5. Test functionality

In the event of a failover, follow these steps on the failover node:

  1. Establish IP on failover box

  2. Start TNS listener

    lsnrctl start
    
  3. Start the database

    dbstart
    
  4. Start Database Control

    emctl start dbconsole
    
  5. Test functionality

To manually stop or shut down the services, follow these steps (a combined start/stop script sketch follows this list):

  1. Stop the application.

  2. Stop Database Control

    emctl stop dbconsole
    
  3. Stop TNS listener

    lsnrctl stop
    
  4. Stop the database

    dbshut
    
  5. Stop IP
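
The start and stop sequences above can be wrapped in a single action script that the cluster software invokes on whichever node is active. The following is a minimal ksh sketch, not a supported tool: it assumes the clusterware has already plumbed the virtual IP and mounted the shared storage, and the script name, ORACLE_HOME, and ORACLE_SID values are placeholders for your environment.

#!/bin/ksh
# dbctl.ksh -- example wrapper for the Database Control failover group (placeholder name)
export ORACLE_HOME=/oradbshare/app/oracle/product/10.2/db   # placeholder path
export ORACLE_SID=db1                                       # placeholder SID
export PATH=$ORACLE_HOME/bin:$PATH
# If Database Control was configured against the virtual host name, ORACLE_HOSTNAME
# may also need to be exported here.

case "$1" in
  start)
    lsnrctl start             # start the TNS listener
    dbstart                   # start the database
    emctl start dbconsole     # start Database Control
    ;;
  stop)
    emctl stop dbconsole      # stop Database Control
    lsnrctl stop              # stop the TNS listener
    dbshut                    # stop the database
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac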

Configuring Grid Control Repository in Active/Passive High Availability Environments

In order for the Grid Control repository to fail over to a different host, the conditions described in the following sections must be met.

Installation and Configuration

The following installation and configuration requirements should be noted:

  • To override the physical host name of the cluster member with a virtual host name, software must be installed using the parameter ORACLE_HOSTNAME.

  • For inventory pointer, software must be installed using the command line parameter -invPtrLoc to point to the shared inventory location file, which includes the path to the shared inventory location.

  • The database software, the configuration of the database, and data files are on a shared volume.

If you are using an NFS mounted volume for the installation, ensure that you specify rsize and wsize in your mount command to prevent I/O issues. See My Oracle Support note 279393.1, "Linux/NetApp: RHEL/SUSE Setup Recommendations for NetApp Filer Storage."

Example:

grid-repo.acme.com:/u01/app/share1 /u01/app/share1 nfs rw,bg,rsize=32768,wsize=32768,hard,nointr,tcp,noac,vers=3,timeo=600 0 0

Note:

Any reference to shared storage could also apply to non-shared failover volumes, which can be mounted on the active host after failover.

Set Up the Virtual Host Name/Virtual IP Address

You can set up the virtual host name and virtual IP address by either allowing the clusterware to set it up or manually setting it up before installation and startup of Oracle services. The virtual host name must be static and resolvable consistently on the network. All nodes participating in the setup must resolve the virtual IP address to the same host name. Standard TCP tools such as nslookup and traceroute can be used to verify the host name. Validate using the commands listed below:

nslookup <virtual hostname>

This command returns the virtual IP address and fully qualified host name.

nslookup <virtual IP>

This command returns the virtual IP address and fully qualified host name.

Be sure to try these commands on every node of the cluster to verify that the correct information is returned.

Set Up the Environment

Some operating system versions require specific patches to be applied prior to installing 10gR2. The user installing and using the 10gR2 software must also have sufficient kernel resources available. Refer to the operating system's installation guide for more details.

Before you launch the installer, certain environment variables must be verified. Each of these variables must be set up identically for the account installing the software on ALL machines participating in the cluster:

  • OS variable TZ (time zone setting)

    You should unset this variable prior to installation.

  • PERL variables

    Variables such as PERL5LIB should also be unset to prevent inadvertently picking up the wrong set of PERL libraries.

  • The same operating system, operating system patches, and kernel version must be used on all nodes. For example, mixing RHEL 3 and RHEL 4 nodes is not allowed for a CFC system.

  • System libraries

    For example, LIBPATH, LD_LIBRARY_PATH, SHLIB_PATH, and so on. The same system libraries must be present.

Synchronize Operating System User IDs

The user and group of the software owner should be defined identically on all nodes of the cluster. This can be verified using the id command:

$ id -a

uid=550(oracle) gid=50(oinstall) groups=501(dba)

Set Up Inventory

You can set up the inventory by using the following steps:

  1. Create your new ORACLE_HOME directory.

  2. Create the Oracle Inventory directory under the new oracle home

    cd <shared oracle home>

    mkdir oraInventory

  3. Create the oraInst.loc file. This file contains the Inventory directory path information needed by the Universal Installer.

    vi oraInst.loc

    Enter the path information to the Oracle Inventory directory, and specify the group of the software owner (oinstall). For example:

    inventory_loc=/app/oracle/product/10.2/oraInventory
    inst_group=oinstall

Install the Software

Follow these steps to install the software:

  1. Create the shared disk location on both nodes for the software binaries.

  2. Point to the inventory location file oraInst.loc (under the ORACLE_BASE in this case) and specify the host name of the virtual group. For example:

    $ export ORACLE_HOSTNAME=grid-repo.acme.com
    $ runInstaller -invPtrLoc /app/oracle/share1/oraInst.loc ORACLE_HOSTNAME=grid-repo.acme.com
    
  3. Install the repository DB software only on the shared location. For example:

    /oradbnas/app/oracle/product/oradb10203 using Host1

  4. Start DBCA and ensure that all the data files are created on the shared location. For example:

    /oradbnas/oradata

  5. Continue the rest of the installation normally.

  6. Once completed, copy the files oraInst.loc and oratab to /etc. Also copy /opt/oracle to all cluster member hosts (Host2, Host3, and so on).
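
    A minimal sketch of this copy, assuming Linux hosts (where oraInst.loc belongs in /etc; on Solaris and HP-UX use /var/opt/oracle instead), with host names and paths as placeholders:

      # Run from Host1 with sufficient privileges (root is typically required for /etc)
      scp /etc/oraInst.loc /etc/oratab host2:/etc/
      scp -r /opt/oracle host2:/opt/
      # Repeat for Host3 and any other cluster members.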

Windows NT Specific Configuration Steps

On Windows, an additional step is required to copy over the services and registry keys required by the Oracle software.

  1. Using regedit on the first host, export each Oracle service from under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services.

  2. Using regedit on the first host, export HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE.

  3. Use regedit to import the files created in steps 1 and 2 to the failover host.

The Windows services must also be created on the failover host. For an Enterprise Manager release 10.2.0.5 Management Agent, the following command can be used:

emctl create service [-user <username>] [-pwd <password>] -name <servicename>

This must be done once on the failover host after a failover.

Startup of Services

Be sure you start your services in the proper order:

  1. Establish IP address on the active node

  2. Start the TNS listener if it is part of the same failover group

  3. Start the database if it is part of the same failover group

In case of failover, follow these steps:

  1. Establish IP address on the failover box

  2. Start TNS listener (lsnrctl start) if it is part of the same failover group

  3. Start the database (dbstart) if it is part of the same failover group

Summary

The Grid Control Management Repository can now be deployed in a CFC environment that utilizes a floating host name.

To deploy the OMS midtier in a CFC environment, please see How to Configure Grid Control OMS in Active/Passive Environment for High Availability Failover Using Virtual Host Names.

How to Configure Grid Control OMS in Active/Passive Environment for High Availability Failover Using Virtual Host Names

This section provides a general reference for Grid Control administrators who want to configure Enterprise Manager 10g Grid Control in Cold Failover Cluster (CFC) environments.

Overview and Requirements

The following conditions must be met for Grid Control to fail over to a different host:

  • The installation must be done using a Virtual Host Name and an associated unique IP address.

  • Install on a shared disk/volume which holds the binaries, the configuration and the runtime data (including the recv directory).

  • Configuration data and metadata must also failover to the surviving node.

  • Inventory location must failover to the surviving node.

  • Software owner and time zone parameters must be the same on all cluster member nodes that will host this Oracle Management Service (OMS).

Installation and Configuration

To override the physical host name of the cluster member with a virtual host name, software must be installed using the parameter ORACLE_HOSTNAME. For inventory pointer, the software must be installed using the command line parameter -invPtrLoc to point to the shared inventory location file, which includes the path to the shared inventory location.

If you are using an NFS mounted volume for the installation, please ensure that you specify rsize and wsize in your mount command to prevent running into I/O issues.

For example:

oms.acme.com:/u01/app/share1 /u01/app/share1 nfs rw,bg,rsize=32768,wsize=32768,hard,nointr,tcp,noac,vers=3,timeo=600 0 0

Note:

Any reference to shared failover volumes could also apply to non-shared failover volumes, which can be mounted on the active host after failover.

Setting Up the Virtual Host Name/Virtual IP Address

You can set up the virtual host name and virtual IP address by either allowing the clusterware to set it up, or manually setting it up yourself before installation and startup of Oracle services. The virtual host name must be static and resolvable consistently on the network. All nodes participating in the setup must resolve the virtual IP address to the same host name. Standard TCP tools such as nslookup and traceroute can be used to verify the host name. Validate using the following commands:

nslookup <virtual hostname>

This command returns the virtual IP address and fully qualified host name.

nslookup <virtual IP>

This command returns the virtual IP address and fully qualified host name.

Be sure to try these commands on every node of the cluster and verify that the correct information is returned.

Setting Up Shared Storage

Storage can be managed by the clusterware that is in use, or you can use any shared file system (FS) volume as long as it is a supported type (OCFS V1, for example, is not supported). The most common shared file system is NFS.

Note:

Do not create the ssl.conf file on shared storage, otherwise there is a potential for locking issues. Create the ssl.conf file on local storage.

Setting Up the Environment

Some operating system versions require specific operating system patches to be applied prior to installing 10gR2. The user installing and using the 10gR2 software must also have sufficient kernel resources available. Refer to the operating system's installation guide for more details.

Before you launch the installer, certain environment variables need to be verified. Each of these variables must be identically set for the account installing the software on ALL machines participating in the cluster:

  • OS variable TZ

    Time zone setting. You should unset this variable prior to installation

  • PERL variables

    Variables such as PERL5LIB should also be unset to avoid association to the incorrect set of PERL libraries

Synchronizing Operating System IDs

The user and group of the software owner should be defined identically on all nodes of the cluster. This can be verified using the 'id' command:

$ id -a

uid=550(oracle) gid=50(oinstall) groups=501(dba)

Setting Up Shared Inventory

Use the following steps to set up shared inventory:

  1. Create your new ORACLE_HOME directory.

  2. Create the Oracle Inventory directory under the new oracle home:

    $ cd <shared oracle home>

    $ mkdir oraInventory

  3. Create the oraInst.loc file. This file contains the Inventory directory path information needed by the Universal Installer.

    1. vi oraInst.loc

    2. Enter the path information to the Oracle Inventory directory and specify the group of the software owner (oinstall). For example:

      inventory_loc=/app/oracle/product/10.2/oraInventory

      inst_group=oinstall

Installing the Software

Refer to the following steps when installing the software:

  1. Create the shared disk location on both nodes for the software binaries.

  2. Point to the inventory location file oraInst.loc (under the ORACLE_BASE in this case) and specify the host name of the virtual group. For example:

    $ export ORACLE_HOSTNAME=lxdb.acme.com
    $ runInstaller -invPtrLoc /app/oracle/share1/oraInst.loc ORACLE_HOSTNAME=lxdb.acme.com -debug
    
    
  3. Change the node name returned by the uname -n command by executing the UNIX command hostname <unqualified name of virtual host>. The new name is picked up immediately (for example, by V$SESSION).

    If you are unable to modify the host name, abort the installer when the OMS configuration assistant fails at emctl config emkey and execute the following commands to complete the installation:

    1. Replace all host name entries with the virtual host name in $OMS_HOME/sysman/config/emoms.properties.

    2. <OMS_HOME>/bin/emctl config emkey -repos -force

    3. <OMS_HOME>/bin/emctl secure oms

    4. <OMS_HOME>/bin/emctl secure lock

    5. <OMS_HOME>/perl/bin/perl <OMS_HOME>/sysman/install/precompilejsp.pl <OMS_HOME>/j2ee/OC4J_EM/config/global-web-application.xml

      Perform this step if you are using Grid Control 10.2.0.1. Grid Control must be installed before applying the 10.2.0.4 or 10.2.0.5 patchsets.

    6. <OMS_HOME>/bin/emctl config agent updateTZ

    7. <OMS_HOME>/opmn/bin/opmnctl stopall

    8. <OMS_HOME>/opmn/bin/opmnctl startall

    9. <AGENT_HOME>/bin/agentca -f

  4. Install Oracle Management Services on cluster member Host1 using the option, "EM install using the existing DB"

  5. Continue the remainder of the installation normally.

  6. Once completed, copy the files oraInst.loc and oratab to /etc. Also copy /opt/oracle to all cluster member hosts (Host2, Host3, and so on).

Windows Specific Configuration Steps

On Windows, an additional step is required to copy over the services and registry keys required by the Oracle software.

  1. Using regedit on the first host, export each Oracle service from under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services.

  2. Using regedit on the first host, export HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE.

  3. Use regedit to import the files created in steps 1 and 2 to the failover host.

The Windows services must also be created on the failover host. For an Enterprise Manager release 10.2.0.5 Management Agent, the following command can be used:

emctl create service [-user <username>] [-pwd <password>] -name <servicename>

This must be done once on the failover host after a failover.

Starting Up Services

Ensure that you start your services in the proper order. Use the order listed below:

  1. Establish IP address on the active node

  2. Start the TNS listener (if it is part of the same failover group)

  3. Start the database (if it is part of the same failover group)

  4. Start Grid Control using opmnctl startall

  5. Test functionality

In case of failover, refer to the following steps:

  1. Establish IP on failover box

  2. Start TNS listener using the command lsnrctl start if it is part of the same failover group

  3. Start the database using the command dbstart if it is part of the same failover group

  4. Start Grid Control using the command opmnctl startall

  5. Test the functionality

Summary

The OMS mid-tier component of Grid Control can now be deployed in a CFC environment that utilizes a floating host name.

To deploy the repository database in a CFC environment, see Configuring Grid Control Repository in Active/Passive High Availability Environments.

Configuring Targets for Failover in Active/Passive Environments

This section provides a general reference for Grid Control administrators who want to relocate Cold Failover Cluster (CFC) targets from one existing Management Agent to another. Although the targets are capable of running on multiple nodes, these targets run only on the active node in a CFC environment.

CFC environments generally use a combination of cluster software to provide a virtual host name and IP address along with interconnected host and storage systems to share information and provide high availability for applications. Automating failover of the virtual host name and IP, in combination with relocating the Enterprise Manager targets and restarting the applications on the passive node, requires the use of Oracle Enterprise Manager command-line interface (EM CLI) and Oracle Clusterware (running Oracle Database release 10g or 11g) or third-party cluster software. Several Oracle partner vendors provide clusterware solutions in this area.

The Enterprise Manager Command Line Interface (EM CLI) allows you to access Enterprise Manager Grid Control functionality from text-based consoles (terminal sessions) for a variety of operating systems. Using EM CLI, you can perform Enterprise Manager Grid Control console-based operations, like monitoring and managing targets, jobs, groups, blackouts, notifications, and alerts. See the Oracle Enterprise Manager Command Line Interface manual for more information.

Target Relocation in Active/Passive Environments

Beginning with Oracle Enterprise Manager 10g release 10.2.0.5, a single Oracle Management Agent running on each node in the cluster can monitor targets configured for active/passive high availability. Only one Management Agent is required on each of the physical nodes of the CFC cluster because, in case of a failover to the passive node, Enterprise Manager can move the HA monitored targets from the Management Agent on the failed node to another Management Agent on the newly activated node using a series of EM CLI commands.

If your application is running in an active/passive environment, the clusterware brings up the applications on the passive node in the event that the active node fails. For Enterprise Manager to continue monitoring the targets in this type of configuration, the existing Management Agent needs additional configuration.

The following sections describe how to prepare the environment to automate and restart targets on the new active node. Failover and fallback procedures are also provided.

Installation and Configuration

The following sections describe how to configure Enterprise Manager to support a CFC configuration using the existing Management Agents communicating with the Oracle Management Service processes:

Prerequisites

Prepare the Active/Passive environments as follows:

  • Ensure the operating system clock is synchronized across all nodes of the cluster. (Consider using Network Time Protocol (NTP) or another network synchronization method.)

  • Use the EM CLI RELOCATE_TARGETS command only with Enterprise Manager Release 10.2.0.5 (and higher) Management Agents.

Configuration Steps

The following steps show how to configure Enterprise Manager to support a CFC configuration using the existing Management Agents that are communicating with the OMS processes. The example that follows is based on a configuration with a two-node cluster that has one failover group. For additional information about targets running in CFC active/passive environments, see My Oracle Support note 406014.1.

  1. Configure EM CLI

    To set up and configure target relocation, use the Oracle Enterprise Manager command-line interface (EM CLI). See the Oracle Enterprise Manager Command Line Interface manual and the Oracle Enterprise Manager Extensibility manual for information about EM CLI and Management Plug-Ins.

  2. Install Management Agents

    Install the Management Agent on a local disk volume on each node in the cluster. Once installed, the Management Agents are visible in the Grid Control console.

  3. Discover Targets

    After the active/passive targets have been configured, use the Management Agent discovery screen in the Grid Control console to add the targets (such as database, listener, application server, and so on). Perform the discovery on the active node, which is the node that is currently hosting the new target.

Failover Procedure

To speed relocation of targets after a node failover, place the following steps in a script that contains the commands necessary to automatically initiate a failover of a target. Typically, the clusterware software provides a mechanism with which you can automatically execute the script to relocate the targets in Enterprise Manager. Also, see Script Examples for sample scripts.

  1. Shut down the target services on the failed active node.

    On the active node where the targets are running, shut down the target services running on the virtual IP.

  2. If required, disconnect the storage for this target on the active node.

    Shut down all the applications running on the virtual IP and shared storage.

  3. Enable the target's IP address on the new active node.

  4. If required, connect storage for the target on the currently active node.

  5. Relocate the targets in Grid Control using EM CLI.

    To relocate the targets to the Management Agent on the new active node, issue the EM CLI relocate_targets command for each target type (listener, application server, and so on) that you must relocate after the failover operation. For example:

    emcli relocate_targets
    -src_agent=<node 1>:3872 
    -dest_agent=<node 2>:3872
    -target_name=<database_name>
    -target_type=oracle_database
    -copy_from_src 
    -force=yes
    

    In the example, port 3872 is the default port for the Management Agent. To find the appropriate port number for your configuration, use the value of the EMD_URL parameter in the emd.properties file for this Management Agent, as shown in the sketch after this procedure.

    Note: In case of a failover event, the source agent will not be running. However, there is no need to have the source Management Agent running to accomplish the RELOCATE operation. EM CLI is an OMS client that performs its RELOCATE operations directly against the Management Repository.
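
    For example, you might read the port from the Management Agent's emd.properties file; the agent home path below is a placeholder and the output line is illustrative:

      $ grep EMD_URL $AGENT_HOME/sysman/config/emd.properties
      EMD_URL=https://host1.us.oracle.com:3872/emd/main/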

Fallback Procedure

To return the HA targets to the original active node or to any other cluster member node:

  1. Repeat the steps in Failover Procedure to return the HA targets to the active node.

  2. Verify the target status in the Grid Control console.

EM CLI Parameter Reference

Issue the same command for each target type that will be failed over (or switched over) during relocation operations. For example, issue the same EM CLI command to relocate the listener, the application servers, and so on; a listener example follows the table. Table 18-1 shows the EM CLI parameters you use to relocate targets:

Table 18-1 EM CLI Parameters

-src_agent: The Management Agent on which the target was running before the failover occurred.

-dest_agent: The Management Agent that will be monitoring the target after the failover.

-target_name: The name of the target to be failed over.

-target_type: The type of target to be failed over (the internal Enterprise Manager target type). For example, the Oracle database for a standalone database or an Oracle RAC instance, the Oracle listener for a database listener, and so on.

-copy_from_src: Use the same type of properties from the source Management Agent to identify the target. This is a mandatory parameter. If you do not supply this parameter, you can corrupt your target definition.

-force: Force dependencies (if needed) to fail over as well.
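
For example, relocating a listener target uses the same command with the listener's name and type (the host names and listener name are placeholders):

    emcli relocate_targets
    -src_agent=<node 1>:3872
    -dest_agent=<node 2>:3872
    -target_name=<listener_name>
    -target_type=oracle_listener
    -copy_from_src
    -force=yes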


Script Examples

The following sections provide script examples:

Relocation Script

#!/bin/ksh

# Get the status of the targets
emcli get_targets \
  -targets="db1:oracle_database;listener_db1:oracle_listener" -noheader
  if [[ $? != 0 ]]; then exit 1; fi

# Black out the targets to stop false errors. This blackout is set to expire in 30 minutes.
emcli create_blackout -name="relocating active passive test targets" \
  -add_targets="db1:oracle_database;listener_db1:oracle_listener" \
  -reason="testing failover" \
  -schedule="frequency:once;duration:0:30"
  if [[ $? != 0 ]]; then exit 1; fi

# Stop the listener target. An OS script is needed to use the
# 'lsnrctl set current_listener' function.
emcli execute_hostcmd -cmd="/bin/ksh" -osscript="FILE" \
  -input_file="FILE:/scratch/oraha/cfc_test/listener_stop.ksh" \
  -credential_set_name="HostCredsNormal" \
  -targets="host1.us.oracle.com:host"
  if [[ $? != 0 ]]; then exit 1; fi

# Now, stop the database
emcli execute_sql -sql="shutdown abort" \
  -targets="db1:oracle_database" \
  -credential_set_name="DBCredsSYSDBA"
  if [[ $? != 0 ]]; then exit 1; fi

# Relocate the targets to the new host
emcli relocate_targets \
  -src_agent=host1.us.oracle.com:3872 \
  -dest_agent=host2.us.oracle.com:3872 \
  -target_name=db1 -target_type=oracle_database \
  -copy_from_src -force=yes \
  -changed_param=MachineName:host1vip.us.oracle.com
  if [[ $? != 0 ]]; then exit 1; fi

emcli relocate_targets \
  -src_agent=host1.us.oracle.com:3872 \
  -dest_agent=host2.us.oracle.com:3872 \
  -target_name=listener_db1 -target_type=oracle_listener \
  -copy_from_src -force=yes \
  -changed_param=MachineName:host1vip.us.oracle.com
  if [[ $? != 0 ]]; then exit 1; fi

# Now, restart the database and listener on the new host
emcli execute_hostcmd -cmd="/bin/ksh" -osscript="FILE" \
  -input_file="FILE:/scratch/oraha/cfc_test/listener_start.ksh" \
  -credential_set_name="HostCredsNormal" \
  -targets="host2.us.oracle.com:host"
  if [[ $? != 0 ]]; then exit 1; fi

emcli execute_sql -sql="startup" \
  -targets="db1:oracle_database" \
  -credential_set_name="DBCredsSYSDBA"
  if [[ $? != 0 ]]; then exit 1; fi

# Time to end the blackout and let the targets become visible
emcli stop_blackout -name="relocating active passive test targets"
  if [[ $? != 0 ]]; then exit 1; fi

# And finally, recheck the status of the targets
emcli get_targets \
  -targets="db1:oracle_database;listener_db1:oracle_listener" -noheader
  if [[ $? != 0 ]]; then exit 1; fi

Start Listener Script

#!/bin/ksh

export ORACLE_HOME=/oradbshare/app/oracle/product/11.1.0/db
export PATH=$ORACLE_HOME/bin:$PATH

lsnrctl << EOF
set current_listener listener_db1
start
exit
EOF

Stop Listener Script

#!/bin/ksh
export ORACLE_HOME=/oradbshare/app/oracle/product/11.1.0/db
export PATH=$ORACLE_HOME/bin:$PATH

lsnrctl << EOF
set current_listener listener_db1
stop
exit
EOF

Configuring Additional Oracle Enterprise Management Agents for Use in Active and Passive Environments

In a Cold Failover Cluster environment, one host is considered the active node where applications are run, accessing the data contained on the shared storage. The second node is considered the standby node, ready to run the same applications currently hosted on the primary node in the event of a failure. The cluster software is configured to present a Logical Host Name and IP address. This address provides a generic location for running applications that is not tied to either the active node or the standby node.

In the event of a failure of the active node, applications can be terminated either by the hardware failure or by the cluster software. These applications can then be restarted on the passive node using the same logical host name and IP address to access the new node, resuming operations with little disruption. Automating failover of the virtual host name and IP, along with starting the applications on the passive node, requires the use of third-party cluster software. Several Oracle partner vendors provide high availability solutions in this area.

Installation and Configuration

Enterprise Manager can be configured to support Cold Failover Cluster configuration in this fashion using additional Management Agents communicating to the Oracle Management Service processes.

If your application is running in an Active and Passive environment, the clusterware does the job of bringing up the passive or standby database instance in case the active database goes down. For Enterprise Manager to continue monitoring the application instance in such a scenario, the existing Management Agents need additional configuration.

The additional configuration steps for this environment involve:

  • Installing an extra Management Agent using the logical host name and IP address generated through the cluster software.

  • Modifying the targets monitored by each Management Agent once the third Management Agent is installed.

In summary, this configuration results in the installation of three Management Agents, one for each hardware node and one for the IP address generated by the cluster software. Theoretically, if the cluster software supports the generation of multiple virtual IP addresses to support multiple high availability environments, the solution outlined here should scale to support the environment.

The following table documents the steps required to configure Management Agents in a CFC environment:

Table 18-2 Steps Required to Configure Management Agents in a Cold Failover Cluster Environment

Action: Install the vendor-specific cluster software.

  Method: The installation method varies depending on the cluster vendor.

  Description/Outcome: The minimal requirement is a 2-node cluster that supports virtual or floating IP addresses and shared storage.

  Verification: Use the ping command to verify the existence of the floating IP address. Use nslookup or an equivalent command to verify the IP address in your environment. Ensure the machine is reachable on the network by using tools like traceroute or tracert. (See the sketch after this table.)

Action: Install Management Agents to each physical node of the cluster, using the physical IP address or host name as the node name.

  Method: Use the Oracle Universal Installer (OUI) to install Management Agents to each node of the cluster. Change the property AgentListenOnAllNICS to FALSE in the local Management Agent emd.properties file.

  Description/Outcome: When complete, the OUI will have installed Management Agents on each node that will be visible through the Grid Control console.

  Verification: Check that the Management Agent, host, and targets are visible in the Enterprise Manager environment.

Action: Delete targets that will be configured for high availability using the cluster software.

  Method: Using the Grid Control console, delete all targets discovered during the previous installation step that are managed by the cluster software, except for the Management Agent and the host.

  Description/Outcome: The Grid Control console displays the Management Agent, hardware, and any target that is not configured for high availability.

  Verification: Inspect the Grid Control console and verify that all targets that will be assigned to the Management Agent running on the floating IP address have been deleted from the Management Agents monitoring the fixed IP addresses.

Action: Install a third Management Agent to the cluster, using the logical IP address or logical host name as the host specified in the OUI at install time. Note: This installation should not detect or install to more than one node.

  Method: This Management Agent must follow all the same conventions as any application using the cluster software to move between nodes (that is, it is installed on the shared storage using the logical IP address). This installation requires an additional option at the command line during installation; the HOSTNAME flag must be set as in the following example:

    ./runInstaller HOSTNAME=<Logical IP address or host name>

  Description/Outcome: A third Management Agent is installed, currently monitoring all targets discovered on the host running the physical IP.

  Verification: To verify that the Management Agent is configured correctly, type emctl status agent at the command line and verify the use of the logical IP virtual host name. Also, verify that the Management Agent is set to the correct Management Service URL and that the Management Agent is uploading the files. When the Management Agent is running and uploading data, use the Grid Control console to verify that it has correctly discovered the targets that will move to the standby node during a failover operation.

Action: Delete any targets from the Management Agent monitoring the logical IP that will not switch to the passive node during failover.

  Method: Use the Grid Control console to delete any targets that will not move between hosts in a switchover or failover scenario. These might be targets that are not attached to this logical IP address for failover or are not configured for redundancy.

  Description/Outcome: The Grid Control console is now running three Management Agents. Any target that is configured for switchover using cluster software will be monitored by a Management Agent that will transition during switchover or failover operations.

  Verification: Inspect the Grid Control console. All targets that will move between nodes should be monitored by the Management Agent running on the virtual host name. All remaining targets should be monitored by a Management Agent running on an individual node.

Action: Add the new logical host to the cluster definition.

  Method: Using the All Targets tab on the Grid Control console, find the cluster target and add the newly discovered logical host to the existing cluster target definition. It is also possible (but not required) to use the Add Cluster Target option on the All Targets tab, making a new composite target using the nodes of the cluster.

  Description/Outcome: The Grid Control console will now correctly display all the hosts associated with the cluster.

Action: Place the Management Agent process running on the logical IP under the control of the cluster software.

  Method: This will vary based on the cluster software vendor. A suggested order of operation is covered in the next section.

  Description/Outcome: The Management Agent will transition along with the applications.

  Verification: Verify that the Management Agent can be stopped and restarted on the standby node using the cluster software.
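
The verification commands referenced in Table 18-2 can be run from any host that should be able to reach the floating address; the logical host name below is a placeholder:

$ ping <logical host name>
$ nslookup <logical host name>
$ traceroute <logical host name>

# On the node currently hosting the logical IP, as the Management Agent software owner:
$ <AGENT_HOME>/bin/emctl status agent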


Switchover Steps

Each cluster vendor will implement the process of building a wrapper around the steps required to do a switchover or failover in a different fashion. The steps themselves are generic and are listed here:

  • Shut down the Management Agent

  • Shut down all the applications running on the virtual IP and shared storage

  • Switch the IP and shared storage to the new node

  • Restart the applications

  • Restart the Management Agent

Stopping the Management Agent first, and restarting it after the other applications have started, prevents Enterprise Manager from triggering any false target down alerts that would otherwise occur during a switchover or failover.
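
As an illustration only, a skeletal ksh wrapper for these switchover steps might look like the following. The cluster-specific commands for moving the IP address and storage are shown as placeholder comments, and AGENT_HOME is assumed to be the home of the Management Agent installed against the logical host name:

#!/bin/ksh
# switchover.ksh -- generic ordering for a switchover or failover (all values are placeholders)
AGENT_HOME=/shared/oracle/agent10g        # Management Agent installed on the logical host name

$AGENT_HOME/bin/emctl stop agent          # 1. Stop the Management Agent first

# 2. Stop all applications running on the virtual IP and shared storage
#    (vendor- or application-specific commands go here)

# 3. Switch the IP address and shared storage to the new node
#    (vendor-specific cluster commands go here)

# 4. Restart the applications
#    (vendor- or application-specific commands go here)

$AGENT_HOME/bin/emctl start agent         # 5. Restart the Management Agent last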

Performance Implications

While it is logical to assume that running two Management Agent processes on the active host may have some performance implications, this was not shown during testing. Keep in mind that if the Management Agents are configured as described in this chapter, the Management Agent monitoring the physical host IP will only have two targets to monitor. Therefore, the only additional overhead is the two Management Agent processes themselves and the commands they issue to monitor a Management Agent and the operating system. During testing, an overhead of between 1% and 2% of CPU usage was observed.

Summary

Generically, configuring Enterprise Manager to support Cold Cluster Failover environments encompasses the following steps.

  • Install a Management Agent for each virtual host name that is presented by the cluster and ensure that the Management Agent is correctly communicating to the Management Service.

  • Configure the Management Agent that will move between nodes to monitor the appropriate highly available targets.

  • Verify that the Management Agent can be stopped on the primary node and restarted on the secondary node automatically by the cluster software in the event of a switchover or failover.