Oracle® Clusterware Administration and Deployment Guide
11g Release 1 (11.1)

B28255-07

1 Introduction to Oracle Clusterware

This chapter introduces Oracle Clusterware and describes how to install, administer, and deploy it. This chapter includes the following topics:

What is Oracle Clusterware?

Oracle Clusterware is software that enables servers to operate together as if they are one server. Each server looks like any standalone server. However, each server has additional processes that communicate with each other so the separate servers appear as if they are one server to applications and end users.

Figure 1-1 shows a configuration that uses Oracle Clusterware to extend the basic single-instance Oracle Database architecture. In the figure, both Cluster 1 and Cluster 2 are connected to Oracle Database and are actively servicing applications and users. Using Oracle Clusterware, you can use the same high availability mechanisms to make your Oracle database and your custom applications highly available.

Figure 1-1 Oracle Clusterware Configuration


The benefits of using a cluster include:

You can program Oracle Clusterware to manage the availability of user applications and Oracle databases. In an Oracle Real Application Clusters (Oracle RAC) environment, Oracle Clusterware manages all of the Oracle Database processes automatically. Anything that Oracle Clusterware manages is known as a cluster resource, which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on.

Creating a cluster with Oracle Clusterware provides the ability to:

Oracle Clusterware is a requirement for using Oracle RAC and it is the only clusterware that you need for most platforms on which Oracle RAC operates. Although Oracle RAC continues to support select third-party clusterware products on specific platforms, you must also install and use Oracle Clusterware. Note that the servers on which you want to install and run Oracle Clusterware must be running the same operating system.

Using Oracle Clusterware eliminates the need for proprietary vendor clusterware and provides the benefit of using only Oracle software. Oracle provides an entire software solution, including everything from disk management with Oracle Automatic Storage Management (ASM) to data management with Oracle Database and Oracle RAC. In addition, Oracle Database features, such as Oracle Services, provide advanced features when used with the underlying Oracle Clusterware high availability framework.

Oracle Clusterware requires two components: a voting disk to record node membership information and the Oracle Cluster Registry (OCR) to record cluster configuration information. The voting disk and the OCR must reside on shared storage.

To use and install Oracle Clusterware, you need to understand the hardware and software concepts and requirements, as described in the following sections:

Oracle Clusterware Hardware Concepts and Requirements

Many hardware providers have validated cluster configurations that provide a single part number for a cluster. If you are new to clustering, the information in this section will make the hardware procurement easier when you work with hardware vendors to purchase the appropriate hardware to create a cluster.

A cluster is made up of one or more servers. A server in a cluster is similar to any standalone server, but a cluster requires a second network called the interconnect network. Therefore, the server minimally requires two network interface cards: one for the public network and one for the private network. The interconnect network is a private network using a switch (or multiple switches) that only the nodes in the cluster can access (see Footnote 1). Crossover cables are not supported for use with Oracle Clusterware interconnects.

The size of the server is dictated by the requirements of the workload you want to run on the cluster and the number of nodes you have chosen to configure in the cluster. If you are implementing the cluster for high availability, then configure redundancy for all components of the infrastructure. Therefore, you need to configure:

  • A network interface for the public network (generally this is an internal LAN)

  • A redundant network interface for the public network

  • A network interface for the private interconnect network

  • A redundant network interface for the private interconnect network

The cluster requires cluster-aware storage (see Footnote 2) that is connected to each server in the cluster. This may also be referred to as a multihost device. Oracle Database supports both Storage Area Network (SAN) storage and Network Attached Storage (NAS).

Similar to the network, there are generally at least two connections from each server to the cluster-aware storage to provide redundancy. There may be more connections depending on your I/O requirements. It is important to consider the I/O requirements of the entire cluster when choosing the storage subsystem.

Most servers have at least one local disk that is internal to the server. Often, this disk is used for the operating system binaries and you can also use this disk for the Oracle binaries. The benefit of each server having its own copy of the binaries is that it simplifies rolling upgrades.

Oracle Clusterware Software Concepts and Requirements

Oracle Clusterware uses a shared common disk for its configuration files.

Oracle Clusterware requires two configuration files: a voting disk to record node membership information and the OCR to record cluster configuration information. During the Oracle Clusterware installation, Oracle recommends that you configure multiple voting disks and the OCR:

  • Voting Disk

    Oracle Clusterware uses the voting disk to determine which instances are members of a cluster. The voting disk must reside on a shared disk. For high availability, Oracle recommends that you have a minimum of three voting disks. If you configure a single voting disk, then you should use external mirroring to provide redundancy. You can have up to 32 voting disks in your cluster.

  • Oracle Cluster Registry (OCR)

    Oracle Clusterware uses the OCR to store and manage information about the components that Oracle Clusterware controls, such as Oracle RAC databases, listeners, VIPs, and services and any applications. The OCR repository stores configuration information in a series of key-value pairs in a directory tree structure.

    Oracle recommends that you use a multiplexed OCR to ensure cluster high availability. Consider the following points regarding the OCR:

    • The OCR must reside on a shared disk that is accessible by all of the nodes in the cluster.

    • Oracle Clusterware can multiplex the OCR.

    • You can replace a failed OCR online.

    • You must update the OCR through supported APIs such as Oracle Enterprise Manager, the Server Control Utility (SRVCTL), or the Database Configuration Assistant (DBCA).

    • Oracle Clusterware requires that each node be connected to a private network by way of a private interconnect. For redundancy, you can have up to 32 voting disks and a mirror of the OCR.

    See Also:

    Chapter 2, "Administering Oracle Clusterware" for more information about voting disks and the OCR
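The voting disk guidance above (a recommended minimum of three voting disks, external mirroring if only one is configured, and an upper limit of 32) can be captured as a small planning check. The following is a minimal sketch, not an Oracle utility: the `VotingDiskPlan` class and `review_voting_disk_plan` function are hypothetical names introduced for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_VOTING_DISKS = 32             # upper limit stated in this chapter
RECOMMENDED_MIN_VOTING_DISKS = 3  # recommended minimum for high availability


@dataclass
class VotingDiskPlan:
    """Hypothetical description of a planned voting disk layout."""
    disk_count: int
    externally_mirrored: bool = False


def review_voting_disk_plan(plan: VotingDiskPlan) -> list[str]:
    """Return a list of findings; an empty list means no concerns."""
    findings = []
    if plan.disk_count < 1:
        findings.append("error: at least one voting disk is required")
    elif plan.disk_count > MAX_VOTING_DISKS:
        findings.append(f"error: more than {MAX_VOTING_DISKS} voting disks")
    elif plan.disk_count == 1 and not plan.externally_mirrored:
        # A single voting disk is acceptable only with external mirroring.
        findings.append("warning: single voting disk without external mirroring")
    elif 1 < plan.disk_count < RECOMMENDED_MIN_VOTING_DISKS:
        findings.append("warning: fewer than three voting disks")
    return findings


if __name__ == "__main__":
    for plan in (VotingDiskPlan(3), VotingDiskPlan(1), VotingDiskPlan(33)):
        print(plan, review_voting_disk_plan(plan))
```

A check like this belongs in planning scripts only; the authoritative validation is performed by the installer and the Cluster Verification Utility.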

Oracle Clusterware requires a virtual IP address for each node in the cluster. This IP address must be on the same subnet as the public IP address for the node and should be an address that is assigned a name in the Domain Name Service, but is unused and cannot be pinged in the network before installation of Oracle Clusterware. The VIP is a node application (nodeapp) defined in the OCR that is managed by Oracle Clusterware. The VIP is configured with the VIPCA utility. The root script calls the VIPCA utility in silent mode.
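The subnet requirement for the VIP can be checked before installation with a few lines of Python. This is a sketch under stated assumptions: the function name `vip_on_public_subnet` is hypothetical, the addresses in the usage lines are documentation-reserved examples, and the check covers only the subnet rule, not DNS registration or whether the address is already in use.

```python
import ipaddress


def vip_on_public_subnet(vip: str, public_ip: str, prefix_length: int) -> bool:
    """Return True if the VIP is a distinct address in the public IP's subnet.

    prefix_length is the prefix length of the public network (for example, 24).
    """
    # strict=False lets us derive the network from a host address.
    public_net = ipaddress.ip_network(f"{public_ip}/{prefix_length}", strict=False)
    vip_addr = ipaddress.ip_address(vip)
    # The VIP must not reuse the node's own public address.
    return vip_addr != ipaddress.ip_address(public_ip) and vip_addr in public_net


if __name__ == "__main__":
    print(vip_on_public_subnet("192.0.2.110", "192.0.2.10", 24))   # same subnet
    print(vip_on_public_subnet("198.51.100.5", "192.0.2.10", 24))  # different subnet
```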

Each server must first have an operating system that is certified with the Oracle Clusterware version you are installing. See the certification matrices available on OracleMetaLink (http://certify.oraclecorp.com/certifyv3/certify/cert_views.group_selection?p_html_source=0) for details. Once the operating system is installed and working, you can then install Oracle Clusterware to create the cluster.

Oracle Clusterware is installed independently of Oracle Database. Once Oracle Clusterware is installed, you can install ASM, Oracle Database, or Oracle RAC on any of the nodes in the cluster.

See Also:

Your platform-specific Oracle database installation documentation

Overview of Oracle Clusterware Platform-Specific Software Components

When Oracle Clusterware operates, several platform-specific processes or services also run on each node in the cluster. The UNIX, Linux, and Windows processes are described in the following sections:

Oracle Clusterware Processes on Linux and UNIX Systems

Oracle Clusterware processes on Linux and UNIX systems include the following:

  • crsd—Performs high availability recovery and management operations such as maintaining the OCR and managing application resources. This process runs as the root user. This process restarts automatically upon failure.

  • evmd—Event manager daemon. This process also starts the racgevt process to manage FAN server callouts.

  • ocssd—Manages cluster node membership and runs as the oracle user; failure of this process results in a node restart.

  • oprocd—Process monitor for the cluster. Note that this process only appears on platforms that do not use third-party vendor clusterware with Oracle Clusterware.

Note:

Oracle Clusterware on Linux platforms can have multiple threads that appear as separate processes with separate process identifiers.

Oracle Clusterware Services on Windows Systems

Oracle Clusterware services on Windows systems include the following:

  • OracleCRService—Performs high availability recovery and management operations such as maintaining the OCR and managing application resources. This process runs as the LocalSystem user on Windows. This process restarts automatically upon failure.

  • OracleCSService—Manages cluster node membership and runs as the oracle user who installed Oracle Clusterware; failure of this process results in node restart.

  • OracleEVMService—Event manager service. This process also starts the racgevt process to manage FAN server callouts.

  • OraFenceService—Process monitor for the cluster.

  • Oracle Process Manager Daemon (OPMD)—OPMD is registered with the Windows Service Control Manager (WSCM) and the startup of all Oracle Clusterware services is dependent on OPMD. On system startup, and after the default time period of 60 seconds has elapsed, OPMD automatically starts all of the registered Oracle Clusterware services. This startup delay enables other services to start that are outside of the scope of Oracle control, such as storage access, anti-virus, or firewall services. You can set OPMD to start manually. However, this will delay the startup of the rest of the affected Oracle Clusterware components.

Oracle Clusterware Subcomponent Processes and Background Processes

Oracle Clusterware comprises several processes that facilitate cluster operations. The Cluster Ready Services (CRS), Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Clusterware components communicate with other cluster component layers in the other instances in the same cluster database environment. These components are also the main communication links between Oracle Database, applications, and the Oracle Clusterware high availability components. In addition, these background processes monitor and manage database operations.

See Also:

Chapter 5, "Making Applications Highly Available Using Oracle Clusterware" for more detailed information about the Oracle Clusterware API

The following list describes some of the major Oracle Clusterware background processes. The list includes components that are processes on Linux and UNIX operating systems, or services on Windows.

  • Cluster Ready Services (CRS)—The primary program for managing high availability operations in a cluster. Anything that the CRS process manages is known as a cluster resource, which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on. The CRS process manages cluster resources based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. The CRS process generates events when a resource status changes. When you have installed Oracle RAC, the CRS process monitors the Oracle database instance, listener, and so on, and automatically restarts these components when a failure occurs. By default, the CRS process makes three attempts to start the Oracle Notification Service (ONS), one attempt to start an Oracle database, and five attempts to restart other database components. The CRS process does not attempt to restart the VIP. After these initial attempts, the CRS process does not make further restart attempts if the resource does not restart.

  • Cluster Synchronization Services (CSS)—Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then the CSS process interfaces with your clusterware to manage node membership information.

  • Event Management (EVM)—A background process that publishes events that Oracle Clusterware creates.

  • Oracle Notification Service (ONS)—A publish and subscribe service for communicating Fast Application Notification (FAN) events.

  • Oracle Process Monitor Daemon (OPROCD)—This process (the OraFenceService service in Windows) is locked in memory to monitor the cluster and provide I/O fencing. The OPROCD periodically wakes up and checks that the interval since it last awoke is within the expected time. If not, then OPROCD resets the processor and restarts the node. An OPROCD failure results in Oracle Clusterware restarting the node.

  • RACG—Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
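The wake-interval check described for OPROCD in the list above can be illustrated with a short sketch. This is not Oracle's implementation: the `HangCheckTimer` class, its parameters, and the numeric values in the usage lines are assumptions chosen for illustration. The idea is that a monitor that intends to sleep for a fixed interval treats a much longer gap between wake-ups as evidence that the node was hung.

```python
import time


class HangCheckTimer:
    """Illustrative wake-interval check; names and values are hypothetical."""

    def __init__(self, interval_s: float, margin_s: float, clock=time.monotonic):
        self.interval_s = interval_s  # how long the monitor intends to sleep
        self.margin_s = margin_s      # extra delay tolerated before fencing
        self._clock = clock           # injectable so the check can be tested
        self._last_wake = clock()

    def on_wake(self) -> bool:
        """Call after each sleep; return True if the node should be restarted."""
        now = self._clock()
        elapsed = now - self._last_wake
        self._last_wake = now
        return elapsed > self.interval_s + self.margin_s


if __name__ == "__main__":
    # Simulated clock readings: two on-time wake-ups, then a late one.
    readings = iter([0.0, 1.0, 2.0, 4.0])
    timer = HangCheckTimer(interval_s=1.0, margin_s=0.5, clock=lambda: next(readings))
    for _ in range(3):
        print("restart node" if timer.on_wake() else "ok")
```

Injecting the clock keeps the sketch deterministic; a real monitor would also need to run with locked memory and elevated scheduling priority, which a script like this does not attempt.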

In Table 1-1, if a UNIX or a Linux system process has an (r) beside it, then the process runs as the root user. If a Windows system service has an (A) beside it, then the service runs as the Administrative user. Otherwise the process or service runs as the oracle user.

Table 1-1 List of Processes and Services Associated with Oracle Clusterware

| Oracle Clusterware Component           | Linux/UNIX Process                  | Windows Services                         | Windows Processes          |
|----------------------------------------|-------------------------------------|------------------------------------------|----------------------------|
| Oracle Process Monitor Daemon          | oprocd (r)                          | OraFenceService                          |                            |
| RACG                                   | racgmain, racgimon                  |                                          | racgmain.exe, racgimon.exe |
| Oracle Notification Service (ONS)      | ons                                 |                                          | ons.exe                    |
| Event Manager                          | evmd (r), evmd.bin, evmlogger       | OracleEVMService                         | evmlogger.exe, evmd.exe    |
| Cluster Ready Services (CRS)           | crsd.bin (r)                        | OracleCRService (A), OracleCRSToken_user | crsd.exe                   |
| Cluster Synchronization Services (CSS) | init.cssd (r), ocssd (r), ocssd.bin | OracleCSService                          | ocssd.exe                  |


See Also:

"Clusterware Log Files and the Unified Log Directory Structure" for information about the location of log files created for processes

Overview of Installing Oracle Clusterware

Install Oracle Clusterware with the Oracle Universal Installer.

The following sections introduce the installation processes for Oracle Clusterware:

Oracle Clusterware Version Compatibility

You can install different releases of Oracle Clusterware, ASM, and Oracle Database software on your cluster. Follow these guidelines when installing different releases of software on your cluster:

  • There can be only one installation of Oracle Clusterware running in the cluster, and it must be installed into its own home (CRS_home). The release of Oracle Clusterware you use must be equal to or higher than the ASM and Oracle RAC releases running in the cluster; you cannot install a version of Oracle RAC that was released after the version of Oracle Clusterware that you are running on the cluster. That is:

    • Oracle Clusterware Release 11.1 supports ASM Release 11.1, 10.2, and 10.1.

    • Oracle Clusterware Release 11.1 supports Oracle Database 11g Release 1 (11.1), Oracle Database 10g Release 2 (10.2), and Release 1 (10.1).

    • ASM Release 11.1 requires Oracle Clusterware Release 11.1 and supports Oracle Database 11g Release 1 (11.1), Oracle Database 10g Release 2 (10.2), and Release 1 (10.1).

    • Oracle Database 11g Release 1 (11.1) requires Oracle Clusterware Release 11.1 and (if you are using ASM storage) you can run different releases of Oracle Database and ASM.

      For example:

      • If you have Oracle Clusterware Release 11.1 installed as your clusterware, then you can have an Oracle Database 10g Release 1 (10.1) single-instance database running on one node, and separate Oracle Real Application Clusters 10g Release 1, Release 2, and Oracle Real Application Clusters 11g Release 1 databases also running on the cluster. However, you cannot have Oracle Clusterware 10g Release 2 installed on your cluster, and install Oracle Real Application Clusters 11g. You can install Oracle Database 11g (single-instance) on a node in an Oracle Clusterware 10g Release 2 cluster.

      • When using different ASM and Oracle Database releases, the functionality of each is dependent on the functionality of the earlier software release. Thus, if you install Oracle Clusterware 11g and you later install ASM, and you use it to support an existing Oracle Database 10g Release 10.2.0.3 installation, then ASM functionality is equivalent only to that available in the 10.2 release version.

  • There can be multiple Oracle homes for the Oracle database (both single instance and Oracle RAC) in the cluster. Note that the Oracle RAC databases must be running Oracle Database 10g Release 1 (10.1) or higher.

  • You can use different users for the Oracle Clusterware and Oracle database homes as long as they belong to the same primary group.

  • There can be only one installation of ASM running in the cluster. Oracle recommends that ASM run the same release as, or a higher release than, the Oracle database.

  • For Oracle RAC running Oracle9i you must run an Oracle9i cluster. For UNIX systems, that is HACMP, Serviceguard, Sun Cluster, or Veritas SF. For Windows and Linux systems, that is the Oracle Cluster Manager. If you want to install Oracle RAC 10g, then you must also install Oracle Clusterware.

  • You cannot install Oracle9i RAC on an Oracle Database 10g cluster. If you have an Oracle9i RAC cluster, you can add Oracle RAC 10g and they will work together. However, once you have installed Oracle Clusterware 10g, you can no longer install any new Oracle9i RAC.

  • Oracle recommends that you do not run different cluster software on the same servers unless they are certified to work together. However, if you are adding Oracle RAC to servers that are part of a cluster, either migrate to Oracle Clusterware or ensure that:

    • The clusterware you are running is supported to run with Oracle RAC Release 10g.

    • You have installed the correct options for Oracle Clusterware and the other-vendor clusterware to work together.

See Also:

Your platform-specific Oracle Clusterware installation guide for more version compatibility information
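The central compatibility rule above (the Oracle Clusterware release must be equal to or higher than the ASM and Oracle RAC releases in the cluster) can be expressed as a simple comparison. The following sketch uses hypothetical helper names, and comparing only the first two components of a release string (for example, treating 10.2.0.3 as 10.2) is a simplifying assumption. It does not cover the single-instance and Oracle9i cases described in the guidelines.

```python
from typing import Optional, Sequence, Tuple


def parse_release(release: str) -> Tuple[int, ...]:
    """Reduce a release string such as '10.2.0.3' to (major, minor)."""
    return tuple(int(part) for part in release.split(".")[:2])


def clusterware_supports(
    clusterware: str,
    asm: Optional[str] = None,
    rac_releases: Sequence[str] = (),
) -> bool:
    """Return True if Clusterware is at least as new as ASM and every RAC release."""
    cw = parse_release(clusterware)
    others = [parse_release(r) for r in rac_releases]
    if asm is not None:
        others.append(parse_release(asm))
    return all(release <= cw for release in others)


if __name__ == "__main__":
    # Clusterware 11.1 with ASM 11.1 and a mix of RAC releases.
    print(clusterware_supports("11.1", asm="11.1", rac_releases=["10.1", "10.2", "11.1"]))
    # Clusterware 10.2 cannot host Oracle RAC 11.1.
    print(clusterware_supports("10.2", rac_releases=["11.1"]))
```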

About the Oracle Clusterware Installation

This section discusses Oracle Clusterware installations at a high level. For detailed installation instructions, see your platform-specific Oracle Clusterware installation guide.

Oracle Clusterware is distributed on the Oracle database installation media. The Oracle Universal Installer installs Oracle Clusterware into a directory structure referred to as CRS home. This home is separate from the home directories for other Oracle products installed on the same server. Oracle Universal Installer creates the Oracle Clusterware home directory for you. Before you start the installation, you must have sufficient disk space on a file system for the Oracle Clusterware directory. As a part of the installation and configuration, the CRS home and all of its parent directories are changed to be owned by the root user.

Because Oracle Clusterware works closely with the operating system, system administrator access is required for some of the installation tasks. In addition, some of the Oracle Clusterware processes must run as the system administrator, which is generally the root user on Linux and UNIX systems and the LocalSystem account on Windows systems.

Before you install Oracle Clusterware, Oracle recommends that you run the Cluster Verification Utility (CVU) to ensure that your environment meets the Oracle Clusterware installation requirements. Oracle Universal Installer also automatically runs CVU at the end of the clusterware installation to verify various clusterware components. The CVU simplifies the installation, configuration, and overall management of the Oracle Clusterware installation process by identifying problems in cluster environments.

During the Oracle Clusterware installation, you must identify three IP addresses for each node that is going to be part of your installation. One IP address is for the private interconnect, one is for the public network, and the third IP address is the virtual IP address that clients will use to connect to each instance.

The Oracle Clusterware installation process creates the voting disk and OCR on shared storage. When you use normal redundancy, Oracle Clusterware maintains two copies of the OCR file and three copies of the voting disk file. This prevents the files from becoming single points of failure. Normal redundancy also eliminates the need for third-party storage redundancy solutions.

Note:

If you choose external redundancy for the OCR and voting disk, then to enable redundancy, the disk subsystem must be configurable for RAID mirroring. Otherwise, your system may be vulnerable because the OCR and voting disk are single points of failure.
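The difference between the two redundancy choices can be summarized in a small lookup. This is an illustrative sketch only: the `Redundancy` enum and both functions are hypothetical names, and the single-copy counts for external redundancy are an assumption inferred from the note above, which describes the files as single points of failure unless the storage is mirrored.

```python
from enum import Enum


class Redundancy(Enum):
    NORMAL = "normal"
    EXTERNAL = "external"


def copies_maintained(mode: Redundancy) -> dict:
    """Return how many copies of each file Oracle Clusterware itself maintains."""
    if mode is Redundancy.NORMAL:
        # Counts stated in this section for normal redundancy.
        return {"ocr": 2, "voting_disk": 3}
    # Assumption: with external redundancy, one copy of each file is kept
    # and protection is delegated to the disk subsystem.
    return {"ocr": 1, "voting_disk": 1}


def needs_raid_mirroring(mode: Redundancy) -> bool:
    """External redundancy relies on RAID mirroring in the disk subsystem."""
    return mode is Redundancy.EXTERNAL


if __name__ == "__main__":
    for mode in Redundancy:
        print(mode.value, copies_maintained(mode), needs_raid_mirroring(mode))
```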

Overview of Managing Oracle Clusterware Environments

The following list describes the tools and utilities available to manage your Oracle Clusterware environment:

Overview of Extending or Removing Oracle Clusterware in Grid Environments

You can extend Oracle Clusterware in grid environments that have large numbers of nodes using cloned images of Oracle Clusterware homes. Oracle cloning is the preferred method of creating many new clusters by copying images of Oracle Clusterware software to other nodes that have similar hardware and software. Cloning is best suited to scenarios where you need to quickly create several clusters of the same configuration.

Oracle provides the following methods of extending Oracle Clusterware environments:

For new installations, or if you have to install only one cluster, use the traditional automated and interactive installation methods, such as Oracle Universal Installer or the Provisioning Pack feature of Oracle Enterprise Manager. If your goal is to add or delete Oracle Clusterware from nodes in the cluster, you can use the addNode.sh and rootdelete.sh scripts.

The cloning process assumes you successfully installed an Oracle Clusterware home on at least one node using the instructions in your platform-specific Oracle Clusterware installation guide. In addition, ensure that all root scripts run successfully on the node from which you are extending your cluster.

Overview of the Oracle Clusterware High Availability Framework and Application Programming Interface

Oracle Clusterware provides a high availability application programming interface (API) that you use to enable Oracle Clusterware to manage applications or processes that run in a cluster. The API enables you to provide high availability for all of your applications. Oracle Clusterware with ASM enables you to create a consolidated pool of storage to support both single-instance Oracle databases and the Oracle RAC databases that are running.

You can define a virtual IP address for an application so users can access the application independently of the node in the cluster where the application is running. This is referred to as the application VIP. You can define multiple application VIPs, with generally one application VIP defined for each application running. The application VIP is tied to the application by making it dependent on the application resource defined by Cluster Ready Services (CRS).

To maintain high availability, Oracle Clusterware components can respond to status changes to restart applications and processes according to defined high availability rules. You can use the Oracle Clusterware high availability framework by registering your applications with Oracle Clusterware and configuring the clusterware to start, stop, or relocate your application processes. That is, you can make custom applications highly available by using Oracle Clusterware to create profiles that monitor, relocate, and restart your applications.
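As an illustration of a high availability rule, the sketch below models a bounded restart policy using the default attempt counts given earlier in this chapter for the CRS process: three attempts for ONS, one for a database, five for other components, and none for the VIP. The resource type labels and function names are hypothetical; this is a conceptual model, not the Oracle Clusterware API.

```python
from typing import Callable

# Default attempt counts described for the CRS process in this chapter.
DEFAULT_RESTART_ATTEMPTS = {"ons": 3, "database": 1, "vip": 0}
OTHER_COMPONENT_ATTEMPTS = 5


def restart_attempts(resource_type: str) -> int:
    """Return how many restart attempts are made for a resource type."""
    return DEFAULT_RESTART_ATTEMPTS.get(resource_type, OTHER_COMPONENT_ATTEMPTS)


def try_restart(resource_type: str, start: Callable[[], bool]) -> bool:
    """Call start() until it succeeds or the attempt budget is exhausted."""
    for _ in range(restart_attempts(resource_type)):
        if start():
            return True
    # No further attempts are made once the budget is used up.
    return False


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # succeeds on the third attempt
    print(try_restart("ons", lambda: next(outcomes)))
    print(try_restart("vip", lambda: True))  # never attempted
```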

For Oracle RAC to respond consistently and quickly to a failure, the virtual IP address removes network timeouts from the recovery process. When a node fails, its virtual IP relocates to another node in the cluster.
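The VIP relocation described above can be modeled as an update to a mapping from VIP names to hosting nodes. This is a conceptual sketch: the function name, the node and VIP names, and the round-robin choice of target node are assumptions for illustration, since this chapter does not describe how the target node is selected.

```python
from typing import Dict, List


def relocate_vips(
    vip_hosts: Dict[str, str], failed_node: str, surviving_nodes: List[str]
) -> Dict[str, str]:
    """Return a new VIP-to-node mapping after failed_node leaves the cluster."""
    if not surviving_nodes:
        raise ValueError("no surviving nodes to host the VIPs")
    updated = dict(vip_hosts)  # leave the caller's mapping untouched
    displaced = sorted(vip for vip, node in vip_hosts.items() if node == failed_node)
    for index, vip in enumerate(displaced):
        # Assumption: spread displaced VIPs across survivors in order.
        updated[vip] = surviving_nodes[index % len(surviving_nodes)]
    return updated


if __name__ == "__main__":
    before = {"node1-vip": "node1", "node2-vip": "node2", "node3-vip": "node3"}
    print(relocate_vips(before, "node2", ["node1", "node3"]))
```

Because the relocated address answers on a surviving node, a client connecting to it is refused immediately instead of waiting for a network timeout, which is the behavior the paragraph above relies on.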

See Also:

Chapter 5, "Making Applications Highly Available Using Oracle Clusterware" for more detailed information about the Oracle Clusterware API


Footnote Legend

Footnote 1: Oracle Clusterware supports up to 100 nodes in a cluster on configurations running Oracle Database 10g Release 2 (10.2) and later releases.
Footnote 2: Cluster-aware storage may also be referred to as a multihost device.