A Troubleshooting the Oracle Clusterware and Oracle Real Application Clusters Installation Process

This appendix provides troubleshooting information for installing Oracle Clusterware and Oracle Real Application Clusters (RAC).

See Also:

The Oracle Database 10g Oracle Real Application Clusters documentation set included with the installation media in the Documentation directory:
  • Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide

A.1 Troubleshooting the Oracle Real Application Clusters Installation

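This section contains these topics:
  • General Installation Issues
  • Real Application Clusters Installation Error Messages
  • Performing Cluster Diagnostics During Real Application Clusters Installations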

A.1.1 General Installation Issues

The following are examples of errors that can occur during installation:

An error occurred while trying to get the disks
Cause: There is an entry in /etc/oratab pointing to a nonexistent Oracle home. The OUI error file should show the following error: "java.io.IOException: /home/oracle/OraHome//bin/kfod: not found" (OracleMetalink bulletin 276454.1)
Action: Remove the entry in /etc/oratab that points to the nonexistent Oracle home.
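To locate the offending entry before editing the file, a check along the following lines may help. This is a minimal sketch, not an Oracle-supplied tool: the function name stale_oratab_entries is hypothetical, and the sketch assumes the usual oratab layout of SID:ORACLE_HOME:startup_flag with # marking comment lines.

```shell
# Hypothetical helper: list oratab entries whose Oracle home directory
# does not exist on this node.
# Usage: stale_oratab_entries [oratab_file]   (defaults to /etc/oratab)
stale_oratab_entries() {
    oratab=${1:-/etc/oratab}
    # Skip comment lines and blank lines, then split each entry on ":".
    # Assumed field layout: SID:ORACLE_HOME:startup_flag
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$oratab" |
    while IFS=: read -r sid home flag; do
        # Report the entry only when the home is not an existing directory.
        [ -d "$home" ] || printf '%s\n' "$sid:$home"
    done
}
```

Calling stale_oratab_entries with no argument reads /etc/oratab and prints one SID:ORACLE_HOME pair for each entry that should be reviewed. The function only reports entries; it does not modify the file.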
Nodes unavailable for selection from the OUI Node Selection screen
Cause: Oracle Clusterware is either not installed, or the Oracle Clusterware services are not up and running.
Action: Install Oracle Clusterware, or review the status of your Oracle Clusterware. Consider restarting the nodes, as doing so may resolve the problem.
Node nodename is unreachable
Cause: The IP host is unavailable.
Action: Attempt the following:
  1. Run the shell command ifconfig -a. Compare the output of this command with the contents of the /etc/hosts file to ensure that the node IP address is listed.

  2. Run the shell command nslookup with the node name to see whether the host name resolves to the expected IP address.

  3. As the oracle user, attempt to connect to the node with ssh or rsh. If you are prompted for a password, then user equivalence is not set up properly. Review the section "Configuring SSH on All Cluster Nodes".
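Steps 1 and 3 above can be partly scripted. The following is a minimal sketch under stated assumptions: the function names hosts_ip_for and has_user_equivalence are hypothetical, and the ssh options BatchMode and ConnectTimeout assume an OpenSSH client.

```shell
# Hypothetical helper: print the IP address(es) that a hosts-format file
# maps to a node name.
# Usage: hosts_ip_for node_name [hosts_file]   (defaults to /etc/hosts)
hosts_ip_for() {
    awk -v name="$1" '
        { sub(/#.*/, "") }   # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "${2:-/etc/hosts}"
}

# Hypothetical helper: return 0 if ssh can run a command on the node
# without prompting. BatchMode=yes makes ssh fail instead of asking for
# a password, so a nonzero status suggests that user equivalence is not
# set up (or that the node could not be reached within the timeout).
# Assumes an OpenSSH client.
has_user_equivalence() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

For example, run as the oracle user, hosts_ip_for node1 prints the address to compare against the ifconfig -a output on that node, and has_user_equivalence node1 || echo "check SSH setup for node1" flags a node that still prompts for a password. The name node1 is a placeholder.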

Time stamp is in the future
Cause: One or more nodes have a clock time that differs from that of the local node. If this is the case, then you may see output similar to the following:
time stamp 2005-04-04 14:49:49 is 106 s in the future
Action: Ensure that all member nodes of the cluster have the same clock time.
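One way to see how far each node has drifted is to compare epoch timestamps. The following is a rough sketch, not a substitute for a time synchronization service: the function names clock_skew and report_clock_skew are hypothetical, and the sketch assumes passwordless ssh to each node and a date command that supports the +%s format (as GNU date does).

```shell
# Hypothetical helper: print the absolute difference, in seconds,
# between two epoch timestamps.
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    printf '%s\n' "$diff"
}

# Hypothetical helper: print "node skew_in_seconds" for each node named.
# Assumes passwordless ssh and a date command that accepts +%s.
# The result includes ssh round-trip delay, so treat it as approximate.
report_clock_skew() {
    for node in "$@"; do
        # Skip nodes that cannot be reached without a password prompt.
        remote=$(ssh -o BatchMode=yes "$node" date +%s) || continue
        printf '%s %s\n' "$node" "$(clock_skew "$(date +%s)" "$remote")"
    done
}
```

For example, report_clock_skew node1 node2 (placeholder node names) prints one line for each reachable node. A skew of more than a second or two suggests that the clocks need to be synchronized.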
YPBINDPROC_DOMAIN: Domain not bound
Cause: This error has been seen during post-installation testing when the public network connection of a node is disconnected and the VIP does not fail over. Instead, the node hangs, and users are unable to log in to the system. This error occurs when the Oracle home, listener.ora, Oracle log files, or any action scripts are located on a NAS device or NFS mount, and the name service cache daemon (nscd) has not been activated.
Action: Enter the following command on all nodes in the cluster to start the nscd service:
/sbin/service nscd start

A.1.2 Real Application Clusters Installation Error Messages

Oracle Real Application Clusters management tools error messages are described in Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide.

A.1.3 Performing Cluster Diagnostics During Real Application Clusters Installations

If Oracle Universal Installer (OUI) does not display the Node Selection page, then perform clusterware diagnostics by running the olsnodes -v command from the binary directory in your Oracle Clusterware home (CRS_home) and analyzing its output. Refer to your clusterware documentation if the detailed output indicates that your clusterware is not running.

In addition, use the following command syntax to check the integrity of the Cluster Manager:

cluvfy comp clumgr -n node_list -verbose

In the preceding syntax example, the variable node_list is the list of nodes in your cluster, separated by commas.
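If the node names are kept one per line (for example, in a file), the comma-separated node_list can be built rather than typed. This is a minimal sketch: the function name join_node_list and the file name nodes.txt are hypothetical.

```shell
# Hypothetical helper: join node names (one per line on standard input)
# into the comma-separated form that the -n option expects.
join_node_list() {
    paste -s -d, -
}
```

With a hypothetical nodes.txt containing node1 and node2 on separate lines, the check could then be run as cluvfy comp clumgr -n "$(join_node_list < nodes.txt)" -verbose, which passes node1,node2 as the node list.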