Sun Cluster 2.2 System Administration Guide

Administering the Cluster Configuration Database

The ccdadm(1M) command is used to perform administrative procedures on the Cluster Configuration Database (CCD). Refer to the ccdadm(1M) man page for additional information.


Note -

As root, you can run the ccdadm(1M) command from any active node. This command updates all the nodes in your cluster.


It is good practice to checkpoint the CCD by using the -c (checkpoint) option to ccdadm(1M) each time the cluster configuration is updated. The Sun Cluster framework uses the CCD extensively to store configuration data related to logical hosts and HA data services. The CCD also stores the network adapter configuration data used by PNM. We strongly recommend that after any change to the HA or PNM configuration of the cluster, you capture a current, valid snapshot of the CCD by using the -c option, as insurance against problems that can occur under future fault scenarios. This is no different from expecting database administrators or system administrators to back up their data frequently to guard against unforeseen failures.

How to Verify CCD Global Consistency
  1. Run ccdadm(1M) with the -v option whenever you suspect a problem with the Dynamic CCD.

    This option compares the consistency record of each CCD copy on all the cluster nodes, enabling you to verify that the database is consistent across all the nodes. CCD queries are disabled while the verification is in progress.


    # ccdadm clustername -v
    

How to Back Up the CCD
  1. Run ccdadm(1M) with the -c option once a week and after any change to the cluster configuration.

    This option makes a backup copy of the Dynamic CCD. The backup copy subsequently can be used to restore the Dynamic CCD by using the -r option. See "How to Restore the CCD" for more information.


    Note -

    When backing up the CCD, put all logical hosts in maintenance mode before running the ccdadm -c command. The logical hosts must also be in maintenance mode when you restore the CCD, so a backup file taken in that same state prevents unnecessary errors or problems during the restore.



    # ccdadm clustername -c checkpoint-filename
    

    In this command, checkpoint-filename is the name of your backup copy.
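
    The checkpoint step lends itself to a small wrapper that gives each backup a dated filename. The following is a minimal sketch, not part of the Sun Cluster software: the function names, the DRY_RUN and CCDADM variables, and the filename layout are all assumptions made for illustration. By default it only prints the documented ccdadm clustername -c command so that you can review it first. Remember to put the logical hosts in maintenance mode before taking the checkpoint.

```shell
#!/bin/sh
# Hypothetical helper; names, variables, and paths are illustrative assumptions.
CCDADM=${CCDADM:-ccdadm}   # command to invoke; can be overridden
DRY_RUN=${DRY_RUN:-1}      # 1 = print the command only, 0 = run it

# Build a checkpoint filename from a directory, cluster name, and date stamp.
checkpoint_name() {
    # $1 = backup directory, $2 = cluster name, $3 = date stamp (YYYYMMDD)
    printf '%s/ccd.%s.%s\n' "$1" "$2" "$3"
}

# Print (or run) the documented form: ccdadm clustername -c checkpoint-filename
checkpoint_ccd() {
    # $1 = cluster name, $2 = checkpoint file
    if [ "$DRY_RUN" = 1 ]; then
        echo "$CCDADM $1 -c $2"
    else
        "$CCDADM" "$1" -c "$2"
    fi
}
```

    A typical call might be checkpoint_ccd clustername "$(checkpoint_name /var/tmp clustername "$(date +%Y%m%d)")", where /var/tmp is only a placeholder for a backup location of your choice.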

How to Restore the CCD

Run ccdadm(1M) with the -r option whenever the CCD has been corrupted. This option discards the current copy of the Dynamic CCD and replaces it with the contents of the restore file you supply. Also use this command to initialize or restore the Dynamic CCD when the ccdd(1M) reconfiguration algorithm fails to elect a valid CCD copy upon cluster restart. After the restore, the CCD is marked valid.

  1. If necessary, disable the quorum.

    See "How to Enable or Disable the CCD Quorum" for more information.


    # ccdadm clustername -q off
    

  2. Put the logical hosts in maintenance mode.


    # haswitch -m logicalhosts
    

  3. Restore the CCD.

    In this command, restore-filename is the name of the file you are restoring.


    # ccdadm clustername -r restore-filename
    

  4. If necessary, turn the CCD quorum back on.


    # ccdadm clustername -q on
    

  5. Bring the logical hosts back online.

    For example:


    # haswitch phys-host1 logicalhost1
    # haswitch phys-host2 logicalhost2
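
The ordering of steps 1 through 4 can be captured in a short script. The sketch below is illustrative only: restore_plan is a hypothetical function name, and it prints the commands from this procedure rather than running them, so that the sequence can be reviewed before anything is changed. Step 5 is left out because it depends on which physical host should master each logical host.

```shell
#!/bin/sh
# Sketch only: prints the documented restore sequence instead of executing it.
# restore_plan is a hypothetical name; all arguments are placeholders.
restore_plan() {
    # $1 = cluster name
    # $2 = restore file
    # $3 = logical hosts, as passed to haswitch -m
    # $4 = "yes" to disable the quorum before and re-enable it after the restore
    if [ "$4" = yes ]; then
        echo "ccdadm $1 -q off"
    fi
    echo "haswitch -m $3"
    echo "ccdadm $1 -r $2"
    if [ "$4" = yes ]; then
        echo "ccdadm $1 -q on"
    fi
}
```

For example, restore_plan clustername restore-filename "logicalhost1 logicalhost2" yes prints four commands in the order given in this procedure.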
    

How to Enable or Disable the CCD Quorum
  1. Typically, the cluster software requires a quorum before updating the CCD. The -q option enables you to disable this restriction and to update the CCD with any number of nodes.

    Run this option to enable or disable a quorum when updating or restoring the Dynamic CCD. The quorum_flag is a toggle: on (to enable) or off (to disable) a quorum. By default, the quorum is enabled.

    For example, if you have three physical nodes, you need at least two nodes to perform updates. Suppose that, because of a hardware failure, you can bring up only one node. The cluster software then does not allow you to update the CCD. If, however, you run the ccdadm -q off command, you disable this software control and can update the CCD.


    # ccdadm clustername -q on|off
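
The three-node example above can be expressed as simple arithmetic. The sketch below assumes that the quorum is a strict majority of the configured nodes. That assumption matches the three-node example (two of three) but is not stated as a general rule in this section, so treat the functions, whose names are hypothetical, as an illustration only.

```shell
#!/bin/sh
# Assumption: the CCD quorum is a strict majority of the configured nodes.
# Function names are hypothetical.

# Print the number of nodes needed for a majority of $1 configured nodes.
ccd_quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Succeed if $2 running nodes out of $1 configured nodes reach a majority.
can_update_ccd() {
    [ "$2" -ge "$(ccd_quorum_size "$1")" ]
}
```

Under this assumption, can_update_ccd 3 2 succeeds and can_update_ccd 3 1 fails, which is the situation in which ccdadm clustername -q off is needed.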
    

How to Purify the CCD
  1. The -p option enables you to purify (verify the contents and check the syntax of) the CCD database file. Run this option whenever there is a syntax error in the CCD database file.


    # ccdadm -p CCD-filename
    

    The -p option reports any format errors in the candidate file and writes a corrected version of the file into the file filename.pure. You can then restore this "pure" file as the new CCD database. See "How to Restore the CCD" for more information.

Troubleshooting the CCD

The system logs errors in the CCD to the /var/opt/SUNWcluster/ccd/ccd.log file. Critical error messages are also passed to the Cluster Console. Additionally, in the rare case of a crash, the software creates a core file under /var/opt/SUNWcluster/ccd.

The following is an example of the ccd.log file.


lpc204# cat ccd.log
Apr 16 14:54:05 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) starting `START' transition with time-out 10000
Apr 16 14:54:05 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) completed `START' transition with status 0
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) starting `STEP1' transition with time-out 20000
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1000]: (info) Nodeid = 0 Up = 0 Gennum = 0 Date = Feb 14 10h30m00 1997 Restore = 4
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1002]: (info) start reconfiguration elected CCD from Nodeid = 0
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1004]: (info) the init CCD database is consistent
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1001]: (info) Node is up as a one-node cluster after scadmin startcluster; skipping ccd quorum test
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) completed `STEP1' transition with status 0
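
When the log grows large, standard text tools can help narrow it down. The following sketch assumes that every entry follows the layout shown in the example above, with a numeric suffix inside ID[...] and a severity in parentheses. Only the (info) severity appears in this example, so the second filter simply shows everything that is not marked (info). Both function names are hypothetical.

```shell
#!/bin/sh
# Sketch for filtering ccd.log; assumes the line layout shown in the example.
# Function names are hypothetical.

# Read log lines on stdin and print the numeric suffix of each ccdd message ID.
ccd_msg_ids() {
    sed -n 's/.*ID\[SUNWcluster\.ccd\.ccdd\.\([0-9][0-9]*\)]:.*/\1/p'
}

# Read log lines on stdin and print only those not marked (info).
ccd_non_info() {
    grep -v '(info)' || true   # grep exits 1 when nothing is left; not an error here
}
```

For example, ccd_non_info < /var/opt/SUNWcluster/ccd/ccd.log prints only the entries that are not informational, and ccd_msg_ids < /var/opt/SUNWcluster/ccd/ccd.log | sort | uniq -c counts how often each message number occurs.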

The following table lists the most common error messages with suggestions for resolving the problem. Refer to the Sun Cluster 2.2 Error Messages Manual for the complete list of error messages.

Table 4-1 Common Error Messages for the Cluster Configuration Database

Number Range  Explanation                        Action
4200          Cannot open file                   Restore the CCD by running the ccdadm -r command.
4302          File not found                     Restore the CCD by running the ccdadm -r command.
4307          Inconsistent Init CCD              Remove, then reinstall, the Sun Cluster software.
4402          Error registering RPC server       Check your public network (networking problem).
4403          RPC client create failed           Check your public network (networking problem).
5000          System execution error             The synchronization script has an error. Check the permissions on the script.
5300          Invalid CCD, needs to be restored  Restore the CCD by running the ccdadm -r command.
5304          Error running freeze command       There are incorrect arguments in the executed synchronization script. Check that the format of the script is correct.
5306          Cluster pointer is null            This message indicates that the cluster does not exist (ccdadm cluster). Check that you typed the cluster name correctly.
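
For quick reference at the command line, the table can be encoded as a lookup function. This is a convenience sketch, not part of the Sun Cluster software: ccd_action is a hypothetical name, the action strings are shortened from Table 4-1, and any number not listed there falls through to a pointer to the Error Messages Manual.

```shell
#!/bin/sh
# Hypothetical lookup of the suggested action for a CCD error number,
# covering only the entries listed in Table 4-1.
ccd_action() {
    case $1 in
        4200|4302|5300) echo "Restore the CCD by running the ccdadm -r command." ;;
        4307)           echo "Remove, then reinstall, the Sun Cluster software." ;;
        4402|4403)      echo "Check your public network (networking problem)." ;;
        5000)           echo "Check the permissions on the synchronization script." ;;
        5304)           echo "Check the format and arguments of the synchronization script." ;;
        5306)           echo "Check that you typed the cluster name correctly." ;;
        *)              echo "Refer to the Sun Cluster 2.2 Error Messages Manual." ;;
    esac
}
```

For example, ccd_action 4302 prints the restore recommendation.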