Sun Cluster 2.2 Software Installation Guide

Troubleshooting the Installation

Table 3-1 describes some common installation problems and solutions.

Table 3-1 Common Sun Cluster Installation Problems and Solutions

| Problem Description | Solution |
| --- | --- |
| When you start a cluster node, it cannot join the cluster because the private net is not configured correctly. | Specify the correct private net interface by running the scconf(1M) command with the -i option. Then restart the cluster. |
| When you start a cluster node, it aborts after a failed reservation attempt, because of an incorrectly specified Ethernet address for one of the private nets. | Specify the correct Ethernet address of the node by running the scconf(1M) command with the -N option. Then restart the cluster. |
| If the cluster contains an invalid quorum device, the first node is unable to join the cluster because it cannot reserve the quorum device. | Specify a valid quorum device (controller or disk) by running the scconf(1M) command with the -q option. After configuring a valid quorum device, restart the cluster. |
| When you try to start the cluster, one node aborts after receiving signals from node 0 to do so. | The problem might be mismatched CDB files (/etc/opt/SUNWcluster/conf/clustername.cdb). Compare the CDB files on the different nodes using cksum. If they differ, copy the CDB file from the working node to the other node(s). You also might need to copy over the ccd.database.init file from the working node to the other nodes. |
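
The cksum comparison of CDB files could be scripted roughly as follows. This is a minimal sketch, not part of Sun Cluster: it assumes a POSIX shell and that the CDB file from each node has already been copied to one machine. The function name `cdb_files_match` and the example paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not a Sun Cluster command): exit 0 only if every
# file named on the command line has the same cksum checksum and size
# as the first one.
cdb_files_match() {
    ref=""
    for f in "$@"; do
        # cksum prints "<checksum> <size> <name>"; reading the file from
        # standard input omits the name, so only checksum and size are
        # compared.
        sum=$(cksum < "$f") || return 2
        if [ -z "$ref" ]; then
            ref=$sum
        elif [ "$sum" != "$ref" ]; then
            echo "mismatch: $f differs from $1" >&2
            return 1
        fi
    done
    return 0
}
```

For example, with copies saved under assumed names, `cdb_files_match /var/tmp/node0.cdb /var/tmp/node1.cdb` returns 0 when the files are identical and 1 when they differ, in which case the CDB file from the working node should be copied to the other node(s) as described above.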

Recovering From an Aborted Installation

If your scinstall(1M) session did not run to completion during either the client or server installation process, you can re-run scinstall(1M) after cleaning up the environment using this procedure.

How to Recover From an Aborted Client Installation
  1. On the administrative workstation, save the /etc/serialports and /etc/clusters files to a safe location, to be restored later.

  2. On the administrative workstation, use pkgrm to remove the client packages.

  3. Use scinstall(1M) to remove the Sun Cluster 2.2 client packages that have been installed already.


    # cd /cdrom/multi_suncluster_sc_2_2/Sun_Cluster_2_2/Sol_2.x/Tools
    # ./scinstall
    
    ============ Main Menu =================
     
    1) Install/Upgrade - Install or Upgrade Server Packages or Install Client Packages.
    2) Remove  - Remove Server or Client Packages.
    3) Change  - Modify cluster or data service configuration
    4) Verify  - Verify installed package sets.
    5) List    - List installed package sets.
     
    6) Quit    - Quit this program.
    7) Help    - The help screen for this menu.
     
    Please choose one of the menu items: [6]:  2
    

  4. Rerun scinstall(1M) using the procedure "How to Prepare the Administrative Workstation and Install the Client Software".

  5. Restore the /etc/serialports and /etc/clusters files you saved in Step 1.
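
Steps 1 and 5 of this procedure could be handled with a pair of small shell functions. This is a sketch under stated assumptions: it requires a POSIX shell, the function names are hypothetical, and the default `SAVE_DIR` location is an assumed example rather than a path defined by Sun Cluster.

```shell
#!/bin/sh
# Assumed save location; override SAVE_DIR to use a different directory.
SAVE_DIR=${SAVE_DIR:-/var/tmp/sc22-client-save}

# Hypothetical helper for Step 1: copy each existing file into SAVE_DIR,
# preserving its mode and timestamps.
save_client_files() {
    mkdir -p "$SAVE_DIR" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            cp -p "$f" "$SAVE_DIR/" || return 1
        else
            echo "skipping $f: not found" >&2
        fi
    done
}

# Hypothetical helper for Step 5: copy every saved file back into the
# directory given as the first argument.
restore_client_files() {
    dest=$1
    for f in "$SAVE_DIR"/*; do
        [ -f "$f" ] || continue
        cp -p "$f" "$dest/" || return 1
    done
}
```

With these definitions, Step 1 would be `save_client_files /etc/serialports /etc/clusters` and Step 5 would be `restore_client_files /etc`, both run as root on the administrative workstation.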

How to Recover From an Aborted Server Installation
  1. If dfstab.logicalhost and vfstab.logicalhost files exist already, save them to a safe location to be restored later.

    Look for the files in /etc/opt/SUNWcluster/conf/hanfs. You will restore these files after re-running scinstall(1M) and configuring the cluster.

  2. Use scinstall(1M) to remove the Sun Cluster 2.2 server packages that have been installed already.


    # cd /cdrom/multi_suncluster_sc_2_2/Sun_Cluster_2_2/Sol_2.x/Tools
    # ./scinstall
     
    ============ Main Menu =================
     
    1) Install/Upgrade - Install or Upgrade Server Packages or Install Client Packages.
    2) Remove  - Remove Server or Client Packages.
    3) Change  - Modify cluster or data service configuration
    4) Verify  - Verify installed package sets.
    5) List    - List installed package sets.
     
    6) Quit    - Quit this program.
    7) Help    - The help screen for this menu.
     
    Please choose one of the menu items: [6]:  2
    

  3. Manually remove the following Sun Cluster 2.2 directories and files from all nodes.


    Caution -

    The scinstall(1M) command will not remove the SUNWdid package. Do NOT remove the SUNWdid package manually. Removing the package can cause loss of data.


    Note that some of these directories and files might already have been removed by scinstall(1M).


    # rm /etc/pnmconfig
    # rm /etc/sci.ifconf
    # rm /etc/sma.config
    # rm /etc/sma.ip
    # rm -r /etc/opt/SUNWcluster
    # rm -r /etc/opt/SUNWpnm
    # rm -r /opt/SUNWcluster
    # rm -r /opt/SUNWpnm
    # rm -r /var/opt/SUNWcluster
    

  4. Restart scinstall(1M) to install Sun Cluster 2.2.

    Return to the procedure "How to Install the Server Software" and begin at Step 3.

  5. Configure the cluster.

    Use the procedure "How to Configure the Cluster".

  6. Restore the dfstab.logicalhost and vfstab.logicalhost files you saved in Step 1.

    Before starting the cluster, restore the dfstab.logicalhost and vfstab.logicalhost files to /etc/opt/SUNWcluster/conf/hanfs on all nodes.
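
Because scinstall(1M) might already have removed some of the paths listed in Step 3, the manual cleanup could be written as a loop that skips anything already absent. This is a minimal sketch for a POSIX shell; the function name `remove_cluster_leftovers` and its optional root-prefix argument are hypothetical, added so the loop can be tried against a scratch directory first. It removes only the paths listed in Step 3 and does not touch the SUNWdid package.

```shell
#!/bin/sh
# Hypothetical helper for Step 3: remove leftover Sun Cluster 2.2 files
# and directories on one node, skipping any that are already gone.
# $1 is an assumed optional root prefix; leave it empty to act on the
# real paths.
remove_cluster_leftovers() {
    root=${1:-}
    for p in \
        /etc/pnmconfig \
        /etc/sci.ifconf \
        /etc/sma.config \
        /etc/sma.ip \
        /etc/opt/SUNWcluster \
        /etc/opt/SUNWpnm \
        /opt/SUNWcluster \
        /opt/SUNWpnm \
        /var/opt/SUNWcluster
    do
        if [ -e "$root$p" ]; then
            rm -r "$root$p" && echo "removed $root$p"
        else
            echo "already absent: $root$p"
        fi
    done
}
```

Run the function as root on every node, and only after the files from Step 1 have been saved, because /etc/opt/SUNWcluster/conf/hanfs lies under one of the removed directories.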