4 Installing Oracle Clusterware

This chapter describes the procedures for installing Oracle Clusterware on Linux. If you are installing Oracle Database 10g Real Application Clusters, then this is phase one of a two-phase installation. The topics in this chapter are:

  • Verifying Oracle Clusterware Requirements with CVU

  • Preparing to Install Oracle Clusterware with OUI

  • Preparing to Install Oracle Clusterware on IBM zSeries Based Linux

  • Installing Oracle Clusterware with OUI

4.1 Verifying Oracle Clusterware Requirements with CVU

Using the following command syntax, start the Cluster Verification Utility (CVU) to check system requirements for installing Oracle Clusterware:

/mountpoint/crs/Disk1/cluvfy/runcluvfy.sh stage -pre crsinst -n node_list 

In the preceding syntax example, replace the variable mountpoint with the installation media mountpoint, and replace the variable node_list with the names of the nodes in your cluster, separated by commas.

For example, for a cluster with mountpoint /dev/dvdrom/, and with nodes node1, node2, and node3, enter the following command:

/dev/dvdrom/crs/Disk1/cluvfy/runcluvfy.sh stage -pre crsinst -n node1,node2,node3

The CVU Oracle Clusterware pre-installation stage check verifies the following:

  • Node Reachability: All of the specified nodes are reachable from the local node.

  • User Equivalence: Required user equivalence exists on all of the specified nodes.

  • Node Connectivity: Connectivity exists between all the specified nodes through the public and private network interconnections, and at least one subnet exists that connects each node and contains public network interfaces that are suitable for use as virtual IPs (VIPs).

  • Administrative Privileges: The oracle user has proper administrative privileges to install Oracle Clusterware on the specified nodes.

  • Shared Storage Accessibility: If specified, the OCR device and voting disk are shared across all the specified nodes.

  • System Requirements: All system requirements are met for installing Oracle Clusterware software, including kernel version, kernel parameters, memory, swap space, temporary directory space, and required users and groups.

  • Kernel Packages: All required operating system software packages are installed.

  • Node Applications: The virtual IP (VIP), Oracle Notification Service (ONS) and Global Service Daemon (GSD) node applications are functioning on each node.
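To spot-check some of these requirements manually before running CVU, you can use standard operating system commands similar to the following (a minimal sketch; the required kernel parameters and minimum values are listed in the pre-installation chapters in Part II):

# Check the kernel version, an example kernel parameter,
# memory and swap space, and temporary directory space
uname -r
/sbin/sysctl kernel.shmmax
free -m
df -k /tmp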

4.1.1 Troubleshooting Oracle Clusterware Setup

If the CVU report indicates that your system fails to meet the requirements for Oracle Clusterware installation, then use the topics in this section to correct the problem or problems indicated in the report, and run the CVU command again.

User Equivalence Check Failed
Cause: Failure to establish user equivalency across all nodes. This can be due to not creating the required users, or failing to complete secure shell (SSH) configuration properly.
Action: The CVU provides a list of nodes on which user equivalence failed. For each node listed as a failure node, review the oracle user configuration to ensure that the user configuration is properly completed, and that SSH configuration is properly completed.

Use the command su - oracle and check user equivalence manually by running the ssh command on the local node with the date command argument using the following syntax:

ssh node_name date

The output from this command should be the timestamp of the remote node identified by the value that you use for node_name. If ssh is in the default location, the /usr/bin directory, then use ssh to configure user equivalence. You can also use rsh to confirm user equivalence.
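For example, as the oracle user on the local node, you can confirm user equivalence to each cluster member node with a loop similar to the following (a sketch; substitute your own node names). Each command should print the date of the remote node without prompting for a password or passphrase:

# Check user equivalence from the local node to each cluster node
for node in node1 node2 node3
do
  ssh $node date
done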

If you have not attempted to use SSH to connect to the host node before running CVU, then CVU indicates a user equivalence error. If you see a message similar to the following when entering the date command with SSH, then this is the probable cause of the user equivalence error:

The authenticity of host 'node1 (140.87.152.153)' can't be established.
RSA key fingerprint is 7z:ez:e7:f6:f4:f2:4f:8f:9z:79:85:62:20:90:92:z9.
Are you sure you want to continue connecting (yes/no)?

Enter yes, and then run CVU again to determine if the user equivalency error is resolved.

If ssh is in a location other than the default, /usr/bin, then CVU reports a user equivalence check failure. To avoid this error, navigate to the directory $CV_HOME/cv/admin, open the file cvu_config with a text editor, and add or update the key ORACLE_SRVM_REMOTESHELL to indicate the ssh path location on your system. For example:

# Locations for ssh and scp commands
ORACLE_SRVM_REMOTESHELL=/usr/local/bin/ssh
ORACLE_SRVM_REMOTECOPY=/usr/local/bin/scp

Note the following rules for modifying the cvu_config file:

  • Key entries have the syntax name=value

  • Each key entry and the value assigned to the key defines one property only

  • Lines beginning with the number sign (#) are comment lines, and are ignored

  • Lines that do not follow the syntax name=value are ignored

When you have changed the path configuration, run the CVU check again. If ssh is in a location other than the default, then you must also start OUI with additional arguments to specify a different location for the remote shell and remote copy commands. Enter runInstaller -help to obtain information about how to use these arguments.

Note:

When you or OUI run ssh or rsh commands, including any login or other shell scripts they start, you may see errors about invalid arguments or standard input if the scripts generate any output. You should correct the cause of these errors.

To stop the errors, remove all commands from the oracle user's login scripts that generate output when you run ssh or rsh commands.
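For example, if a login script runs stty unconditionally, then you can guard the command so that it runs only in interactive sessions. The following fragment shows one common approach for a Bourne-style login script such as .bashrc (a sketch; adapt the test to your shell):

# Run stty only when standard input is a terminal
if [ -t 0 ]; then
  stty intr ^C
fi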

If you see messages about X11 forwarding, then perform step 6 in Chapter 2, "Enabling SSH User Equivalency on Cluster Member Nodes" to resolve this issue.

You might also see errors similar to the following:

stty: standard input: Invalid argument
stty: standard input: Invalid argument

These errors are produced if hidden files on the system (for example, .bashrc or .cshrc) contain stty commands. If you see these errors, then refer to Chapter 2, "Preventing Oracle Clusterware Installation Errors Caused by stty Commands" to correct the cause of these errors.

Node Reachability Check or Node Connectivity Check Failed
Cause: One or more nodes in the cluster cannot be reached using TCP/IP protocol, through either the public or private interconnects.
Action: Use the command /usr/sbin/ping address to check each node address. When you find an address that cannot be reached, check your list of public and private addresses to make sure that you have them correctly configured. Ensure that the public and private network interfaces have the same interface names on each node of your cluster.
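For example, to check the reachability of the three nodes used in the examples earlier in this chapter, you could enter the following commands (substitute your own node names or addresses):

/usr/sbin/ping -c 2 node1
/usr/sbin/ping -c 2 node2
/usr/sbin/ping -c 2 node3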
User Existence Check or User-Group Relationship Check Failed
Cause: The administrative privileges for users and groups required for installation are missing or incorrect.
Action: Use the id command on each node to confirm that the oracle user is created with the correct group membership. Ensure that you have created the required groups, and create or modify the user account on affected nodes to establish required group membership.
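For example, output similar to the following from the id command indicates that the oracle user is a member of the oinstall and dba groups (the group names and numeric IDs shown are illustrative; yours may differ):

$ id oracle
uid=501(oracle) gid=502(oinstall) groups=502(oinstall),503(dba)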

See Also:

"Creating Required Operating System Groups and User" in Chapter 2 for instructions about how to create required groups, and how to configure the oracle user

4.2 Preparing to Install Oracle Clusterware with OUI

Before you install Oracle Clusterware with Oracle Universal Installer (OUI), use the following checklist to ensure that you have all the information you will need during installation, and to ensure that you have completed all tasks that must be done before starting to install Oracle Clusterware. Mark the check box for each task as you complete it, and write down the information needed, so that you can provide it during installation.

  • Shut Down Running Oracle Processes

    If you are installing Oracle Clusterware on a node that already has a single-instance Oracle Database 10g installation, then stop the existing ASM instances. After Oracle Clusterware is installed, start the ASM instances again. When you restart the single-instance Oracle database and then the ASM instances, the ASM instances use the Cluster Synchronization Services daemon (CSSD) instead of the daemon for the single-instance Oracle database.

    You can upgrade some or all nodes of an existing Cluster Ready Services installation. For example, if you have a six-node cluster, then you can upgrade two nodes each in three upgrading sessions. Base the number of nodes that you upgrade in each session on the load the remaining nodes can handle. This is called a "rolling upgrade."

    If a Global Services Daemon (GSD) from Oracle9i Release 9.2 or earlier is running, then stop it before installing Oracle Database 10g Oracle Clusterware by running the following command:

    Oracle_home/bin/gsdctl stop
    

    where Oracle_home is the Oracle Database home that is running the GSD.

    Caution:

    If you have an existing Oracle9i Release 2 (9.2) Oracle Cluster Manager (Oracle CM) installation, then do not shut down the Oracle CM service. Doing so prevents the Oracle Clusterware 10g Release 2 (10.2) software from detecting the Oracle9i Release 2 nodelist, and causes failure of the Oracle Clusterware installation.

    Note:

    If you receive a warning to stop all Oracle services after starting OUI, then run the command
    Oracle_home/bin/localconfig delete
    

    where Oracle_home is the home that is running CSS.

  • Prepare for Clusterware Upgrade If You Have Existing Oracle Cluster Ready Services Software

    During an Oracle Clusterware installation, if OUI detects an existing Oracle Database 10g Release 1 (10.1) Cluster Ready Services (CRS), then you are given the option to perform a rolling upgrade by installing Oracle Database 10g Release 2 (10.2) Oracle Clusterware on a subset of cluster member nodes.

    If you intend to perform a rolling upgrade, then you should shut down the CRS stack on the nodes you intend to upgrade, and unlock the CRS home using the script mountpoint/clusterware/upgrade/preupdate.sh, which is available on the 10g Release 2 (10.2) installation media.

    If you intend to perform a standard upgrade, then shut down the CRS stack on all nodes, and unlock the CRS home using the script mountpoint/clusterware/upgrade/preupdate.sh.

    When you run OUI and select the option to install Oracle Clusterware on a subset of nodes, OUI installs Oracle Database 10g Release 2 (10.2) Oracle Clusterware software into the existing CRS home on the local and remote node subset. When you run the root script, it starts the Oracle Clusterware 10g Release 2 (10.2) stack on the subset cluster nodes, but lists it as an inactive version.

    When all member nodes of the cluster are running Oracle Clusterware 10g Release 2 (10.2), then the new clusterware becomes the active version.

    If you intend to install RAC, then you must first complete the upgrade to Oracle Clusterware 10g Release 2 (10.2) on all cluster member nodes before you install the Oracle Database 10g Release 2 (10.2) version of RAC.

  • Prevent Oracle Clusterware Installation Errors Caused by stty Commands

    During an Oracle Clusterware installation, OUI uses SSH (if available) to run commands and copy files to the other nodes. If you see errors similar to the following, then you have hidden files on the system (for example, .bashrc or .cshrc) that contain stty commands:

    stty: standard input: Invalid argument
    stty: standard input: Invalid argument
    

    If you see these errors, then stop the installation, and refer to Chapter 2, "Preventing Oracle Clusterware Installation Errors Caused by stty Commands" to correct the cause of these errors.

  • Determine the Oracle Inventory location

    If you have already installed Oracle software on your system, then OUI detects the existing Oracle Inventory directory from the /etc/oraInst.loc file, and uses this location.

    If you are installing Oracle software for the first time on your system, and your system does not have an Oracle inventory, then you are asked to provide a path for the Oracle inventory, and you are also asked the name of the Oracle Inventory group (typically, oinstall).

    See Also:

    The pre-installation chapters in Part II for information about creating the Oracle Inventory, and completing required system configuration
  • Obtain root account access

    During installation, you are asked to run configuration scripts as the root user. You must run these scripts as root, or be prepared to have your system administrator run them for you.

  • Decide if you want to install other languages

    During installation, you are asked if you want to install additional languages other than the default.

    Note:

    If the language set for the operating system is not supported by Oracle Universal Installer, then Oracle Universal Installer, by default, runs in the English language.
  • Determine your cluster name, public node names, private node names, and virtual node names for each node in the cluster

    If you install the clusterware during installation, then you are asked to provide a public node name and a private node name for each node.

    When you enter the public node name, use the primary host name of each node. In other words, use the name displayed by the hostname command. This node name can be either the permanent or the virtual host name.

    In addition, ensure that the following are true:

    • Determine a cluster name with the following characteristics:

      • It must be globally unique throughout your host domain.

      • It must be at least one character long and less than 15 characters long.

      • It must consist of the same character set used for host names: underscores (_), hyphens (-), and single-byte alphanumeric characters (a to z, A to Z, and 0 to 9).

    • Determine a private node name or private IP address for each node. The private IP address is an address that is accessible only by the other nodes in this cluster. Oracle Database uses private IP addresses for internode, or instance-to-instance Cache Fusion traffic. Oracle recommends that you provide a name in the format public_hostname-priv. For example: myclstr2-priv.

    • Determine a virtual host name for each node. A virtual host name is a public node name that is used to reroute client requests sent to the node if the node is down. Oracle Database uses VIPs for client-to-database connections, so the VIP address must be publicly accessible. Oracle recommends that you provide a name in the format public_hostname-vip. For example: myclstr2-vip.

      Note:

      The following is a list of additional information about node IP addresses:
      • For the local node only, OUI automatically fills in public, private, and VIP fields. If your system uses vendor clusterware, then OUI may fill additional fields.

      • Host names, private names, and virtual host names are not domain-qualified. If you provide a domain in the address field during installation, then OUI removes the domain from the address.

      • Private IP addresses should not be accessible as public interfaces. Using public interfaces for Cache Fusion can cause performance problems.
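    As an illustration of these naming conventions, the /etc/hosts entries for a node with the public host name myclstr2 might look similar to the following (the IP addresses shown are examples only):

    143.46.43.100   myclstr2        # public host name
    10.0.0.2        myclstr2-priv   # private node name
    143.46.43.102   myclstr2-vip    # virtual host name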

  • Identify shared storage for Oracle Clusterware files and prepare disk partitions if necessary

    During installation, you are asked to provide paths for two files that must be shared across all nodes of the cluster, either on a shared raw device, or a shared file system file:

    • The voting disk is a partition that Oracle Clusterware uses to verify cluster node membership and status.

      The voting disk must be owned by the oracle user, must be in the dba group, and must have permissions set to 644. Provide at least 256 MB of disk space for each voting disk.

    • The Oracle Cluster Registry (OCR) contains cluster and database configuration information for the RAC database and for Oracle Clusterware, including the node list, and other information about cluster configuration and profiles.

      The OCR disk must be owned by root, must be in the oinstall group, and must have permissions set to 640. Provide at least 256 MB disk space for the OCR.

    If your disks do not have external storage redundancy, then Oracle recommends that you provide one additional location for the OCR disk, and two additional locations for the voting disk. Creating redundant storage locations protects the OCR and voting disk in the event of a disk failure on the partitions you choose for the OCR and the voting disk.
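    For example, on a raw device configuration, you might set the required ownership and permissions with commands similar to the following as the root user (a sketch; the device names are placeholders, with /dev/raw/raw1 as the OCR device and /dev/raw/raw2 as a voting disk):

    # chown root:oinstall /dev/raw/raw1
    # chmod 640 /dev/raw/raw1
    # chown oracle:dba /dev/raw/raw2
    # chmod 644 /dev/raw/raw2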

    See Also:

    The pre-installation chapter in Part II for information about the minimum raw device sizes

4.3 Preparing to Install Oracle Clusterware on IBM zSeries Based Linux

Because zSeries systems do not support the direct attachment of DVD-ROM drives, you must copy the installation files from the discs to a hard disk on a system that does support a DVD-ROM drive, or download installation files to your system from the Oracle Technology Network Web site:

http://www.oracle.com/technology/software

If you copy the installation files to another system, then:

For each disc, create a directory named Diskn, where n is the disc number, and then copy the files from the disc to that directory.
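For example, to copy the contents of the first disc from a mounted DVD-ROM drive to a staging directory (the paths shown are examples only):

mkdir -p /stage/Disk1
cp -R /mnt/dvdrom/* /stage/Disk1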

After you have copied the installation files, you can use one of the following methods to access them on the zSeries based Linux system:

  • Copy the installation files to the zSeries based Linux system (for example, using FTP).

  • Use a method such as NFS or Samba to make the file system containing the installation files available on the zSeries based Linux system.

4.4 Installing Oracle Clusterware with OUI

This section provides you with information about how to use Oracle Universal Installer (OUI) to install Oracle Clusterware. It contains the following sections:

4.4.1 Running OUI to Install Oracle Clusterware

Complete the following steps to install Oracle Clusterware on your cluster. At any time during installation, if you have a question about what you are being asked to do, click the Help button on the OUI page.

  1. Start the runInstaller command from the clusterware directory on the Oracle Database 10g Release 2 (10.2) installation media. When OUI displays the Welcome page, click Next.

  2. Provide information or run scripts as root when prompted by OUI. If you need assistance during installation, click Help.

  3. After you run root.sh on all the nodes, OUI runs the Oracle Notification Server Configuration Assistant, Oracle Private Interconnect Configuration Assistant, and Cluster Verification Utility. These programs run without user intervention.

When you have verified that these programs completed successfully, your Oracle Clusterware installation is complete.

If you intend to install Oracle Database 10g with RAC, then continue to Chapter 5, "Installing Oracle Database 10g with Oracle Real Application Clusters". If you intend to use Oracle Clusterware by itself, then refer to the single-instance Oracle Database installation guide.

4.4.2 Installing Oracle Clusterware Using a Cluster Configuration File

During installation of Oracle Clusterware, on the Specify Cluster Configuration page, you are given the option either of providing cluster configuration information manually, or of using a cluster configuration file. A cluster configuration file is a text file that you can create before starting OUI, which provides OUI with information about the cluster name and node names that it needs to configure the cluster.

Oracle suggests that you consider using a cluster configuration file if you intend to perform repeated installations on a test cluster, or if you intend to perform an installation on many nodes.

To create a cluster configuration file:

  1. On the installation media, navigate to the directory Disk1/response.

  2. Using a text editor, open the response file crs.rsp, and find the section CLUSTER_CONFIGURATION_FILE.

  3. Follow the directions in that section for creating a cluster configuration file.

4.4.3 Troubleshooting Oracle Clusterware Installation Verification

If the CVU report indicates that your Oracle Clusterware installation has a component issue, then use the topics in this section to correct the problem or problems indicated in the report, and run the CVU command again.

CSS is probably working with a non-clustered, local-only configuration on nodes:
Cause: OCR configuration error. The error message specifies the nodes on which this error is found.

This error occurs when, for each node listed in the error message, either the contents of the OCR configuration file ocr.loc cannot be retrieved, or the configuration key local_only is set to TRUE in that file.

Action: Confirm that Oracle Clusterware was installed on the node. Correct the OCR configuration, if it is incorrect. Also, ensure that you have typed the node name correctly when entering the CVU command.
Unable to obtain OCR integrity details from nodes:
Cause: Unable to run the ocrcheck tool successfully on the nodes listed in the error message.
Action: If the ocrcheck tool indicates an error on only some nodes in the cluster, then OCR is not configured on that set of nodes. If the ocrcheck tool indicates that the OCR integrity check failed on all nodes, then the OCR storage area is corrupted. Refer to Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for instructions about how to use ocrconfig -repair to resolve this issue.

To repair the OCR configuration, you can use ocrconfig -repair, as described in Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide, or you can correct the OCR configuration manually.

To check the current OCR configuration manually, as the oracle user, enter the following command from the bin directory in the CRS home:

$ ./ocrcheck

To test whether the OCR storage area is corrupted, complete the following steps:

  1. Enter the following command:

    ocrconfig -showbackups
    
  2. View the contents of the OCR file using the following command syntax:

    ocrdump -backupfile OCR_filename
    
  3. Select a backup file, and use the following command to attempt to restore the file:

    ocrconfig -restore backupfile
    

    If the command returns a failure message, then both the primary OCR and the OCR mirror have failed.

    See Also:

    Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for additional information about testing and restoring the Oracle Cluster Registry
OCR version is inconsistent amongst the nodes.
Cause: The OCR version does not match on all the cluster member nodes. Either all nodes are not part of the same cluster, or nodes do not point to the same OCR, or an OCR configuration file has been changed manually to an invalid configuration on one or more nodes.
Action: Perform the following checks:
  1. Ensure that all listed nodes are part of the cluster.

  2. Use the ocrcheck utility (/crs/home/bin/ocrcheck) to find the location of OCR on each node. Start ocrcheck with one of the following commands:

    As root:

    # ocrcheck
    

    As the oracle user, or as a user with OSDBA group privileges, from the user home directory:

    $ /crs/home/bin/ocrcheck
    
  3. Repair invalid OCR configurations by logging into a node you suspect has a faulty configuration, stopping the CRS daemon, and entering the following command:

    ocrconfig -repair ocrmirror device_name
    

    Note that the ocrconfig -repair command changes the OCR configuration only on the node from which you run the command.

    See Also:

    Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for information about how to use the ocrconfig tool to repair OCR files
Incorrect OCR version found for nodes:
Cause: The OCR version on the specified nodes does not match the version required for Oracle Database 10g Release 2 (10.2).
Action: Follow the same actions described in the preceding error message, "OCR version is inconsistent amongst the nodes.".
OCR integrity is invalid.
Cause: The data integrity of the OCR is invalid, which indicates that OCR storage is corrupted.
Action: Follow the same actions described in the preceding error message, "Unable to obtain OCR integrity details from nodes:".
OCR ID is inconsistent amongst the nodes.
Cause: One or more nodes list the OCR in a different location.
Action: Follow the same actions described in the preceding error message, "OCR version is inconsistent amongst the nodes.".

4.4.4 Oracle Clusterware Background Processes

The following processes must be running in your environment after the Oracle Clusterware installation for Oracle Clusterware to function:

  • evmd: Event manager daemon that starts the racgevt process to manage callouts.

  • ocssd: Manages cluster node membership and runs as the oracle user; failure of this process results in a node restart.

  • crsd: Performs high availability recovery and management operations, such as maintaining the OCR. The crsd process also manages application resources, runs as the root user, and restarts automatically upon failure.
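To confirm that these processes are running on a node after installation, enter a command similar to the following:

ps -ef | egrep 'evmd|ocssd|crsd' | grep -v grep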