Oracle Solaris Cluster System Administration Guide     Oracle Solaris Cluster 4.1

Overview of Administering the Cluster

This section describes how to perform administrative tasks for the entire global cluster or zone cluster. The following table lists these administrative tasks and the associated procedures. You generally perform cluster administrative tasks in the global zone. To administer a zone cluster, at least one machine that will host the zone cluster must be up in cluster mode. Not all zone-cluster nodes are required to be up and running; Oracle Solaris Cluster replays any configuration changes when a node that is currently out of the cluster rejoins the cluster.


Note - By default, power management is disabled so that it does not interfere with the cluster. If you enable power management for a single-node cluster, the cluster is still running but it can become unavailable for a few seconds. The power management feature attempts to shut down the node, but it does not succeed.


In this chapter, phys-schost# reflects a global-cluster prompt. The clzonecluster interactive shell prompt is clzc:schost>.

Table 9-1 Task List: Administering the Cluster

Task
Instructions
Add or remove a node from a cluster
Change the name of the cluster
List node IDs and their corresponding node names
Permit or deny new nodes to add themselves to the cluster
Change the time for a cluster by using the NTP
Shut down a node to the OpenBoot PROM ok prompt on a SPARC based system or to the Press any key to continue message in a GRUB menu on an x86 based system
Add or change the private hostname
Put a cluster node in maintenance state
Rename a node
Bring a cluster node out of maintenance state
Uninstall cluster software from a cluster node
Add and manage an SNMP Event MIB
Configure load limits for each node
Move a zone cluster, prepare a zone cluster for applications, or remove a zone cluster

How to Change the Cluster Name

If necessary, you can change the cluster name after initial installation.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume the root role on any node in the global cluster.
  2. Start the clsetup utility.
    phys-schost# clsetup

    The Main Menu is displayed.

  3. To change the cluster name, type the number for the option for Other Cluster Properties.

    The Other Cluster Properties menu is displayed.

  4. Make your selection from the menu and follow the onscreen instructions.
  5. If you want the service tag for Oracle Solaris Cluster to reflect the new cluster name, delete the existing Oracle Solaris Cluster tag and restart the cluster.

    To delete the Oracle Solaris Cluster service tag instance, complete the following substeps on all nodes in the cluster.

    1. List all of the service tags.
      phys-schost# stclient -x
    2. Find the Oracle Solaris Cluster service tag instance number, then run the following command.
      phys-schost# stclient -d -i service_tag_instance_number
    3. Reboot all the nodes in the cluster.
      phys-schost# reboot

Example 9-1 Changing the Cluster Name

The following example shows the cluster command generated from the clsetup utility to change to the new cluster name, dromedary.

phys-schost# cluster rename -c dromedary

For more information, see the cluster(1CL) and clsetup(1CL) man pages.

How to Map Node ID to Node Name

During Oracle Solaris Cluster installation, each node is automatically assigned a unique node ID number. The node ID number is assigned to a node in the order in which it joins the cluster for the first time. After the node ID number is assigned, the number cannot be changed. The node ID number is often used in error messages to identify which cluster node the message concerns. Use this procedure to determine the mapping between node IDs and node names.

You do not need to assume the root role to list configuration information for a global cluster or a zone cluster. One step in this procedure is performed from a node of the global cluster. The other step is performed from a zone-cluster node.

  1. Use the clnode command to list the cluster configuration information for the global cluster.
    phys-schost# clnode show | grep Node

    For more information, see the clnode(1CL) man page.

  2. You can also list the Node IDs for a zone cluster.

    The zone-cluster node has the same Node ID as the global-cluster node where it is running.

    phys-schost# zlogin sczone clnode -v | grep Node

Example 9-2 Mapping the Node ID to the Node Name

The following example shows the node ID assignments for a global cluster.

phys-schost# clnode show | grep Node
=== Cluster Nodes ===
Node Name:                phys-schost1
  Node ID:                1
Node Name:                phys-schost2
  Node ID:                2
Node Name:                phys-schost3
  Node ID:                3
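
The output above can be reduced to one line per node for use in scripts. The following is a minimal sketch, not part of the Oracle Solaris Cluster software. It assumes that each Node Name: line is followed by its Node ID: line, as in Example 9-2. The function name node_id_map is hypothetical.

```shell
#!/bin/sh
# Read "clnode show | grep Node"-style text on stdin and print one
# "<node-id> <node-name>" pair per line.
# Assumption: every "Node Name:" line precedes its "Node ID:" line.
node_id_map() {
    awk '
        $1 == "Node" && $2 == "Name:" { name = $3 }
        $1 == "Node" && $2 == "ID:"   { print $3, name }
    '
}

# Demonstration with the sample output from Example 9-2.
node_id_map <<'EOF'
=== Cluster Nodes ===
Node Name:                phys-schost1
  Node ID:                1
Node Name:                phys-schost2
  Node ID:                2
Node Name:                phys-schost3
  Node ID:                3
EOF
```

On a cluster node, the same function could be fed directly from the command in Step 1, for example clnode show | node_id_map.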

How to Work With New Cluster Node Authentication

Oracle Solaris Cluster enables you to determine if new nodes can add themselves to the global cluster and the type of authentication to use. You can permit any new node to join the cluster over the public network, deny new nodes from joining the cluster, or indicate a specific node that can join the cluster. New nodes can be authenticated by using either standard UNIX or Diffie-Hellman (DES) authentication. If you select DES authentication, you must also configure all necessary encryption keys before a node can join. See the keyserv(1M) and publickey(4) man pages for more information.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume the root role on any node in the global cluster.
  2. Start the clsetup utility.
    phys-schost# clsetup

    The Main Menu is displayed.

  3. To work with cluster authentication, type the number for the option for new nodes.

    The New Nodes menu is displayed.

  4. Make your selection from the menu and follow the onscreen instructions.

Example 9-3 Preventing a New Machine From Being Added to the Global Cluster

The clsetup utility generates the claccess command. The following example shows the claccess command that prevents new machines from being added to the cluster.

phys-schost# claccess deny -h hostname

Example 9-4 Permitting All New Machines to Be Added to the Global Cluster

The clsetup utility generates the claccess command. The following example shows the claccess command that enables all new machines to be added to the cluster.

phys-schost# claccess allow-all

Example 9-5 Specifying a New Machine to Be Added to the Global Cluster

The clsetup utility generates the claccess command. The following example shows the claccess command that enables a single new machine to be added to the cluster.

phys-schost# claccess allow -h hostname

Example 9-6 Setting the Authentication to Standard UNIX

The clsetup utility generates the claccess command. The following example shows the claccess command that resets to standard UNIX authentication for new nodes that are joining the cluster.

phys-schost# claccess set -p protocol=sys

Example 9-7 Setting the Authentication to DES

The clsetup utility generates the claccess command. The following example shows the claccess command that uses DES authentication for new nodes that are joining the cluster.

phys-schost# claccess set -p protocol=des

When using DES authentication, you must also configure all necessary encryption keys before a node can join the cluster. For more information, see the keyserv(1M) and publickey(4) man pages.
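
Examples 9-3 through 9-7 can be summarized in a small helper that prints the matching claccess command without running it. This is a sketch only. The choice keywords (deny, allow, allow-all, unix, des), the function name claccess_command, and the hostname phys-schost-4 are labels invented for this illustration. The command lines themselves are copied from the examples above.

```shell
#!/bin/sh
# Print (do not run) the claccess command for a node-authentication choice.
# The choice keywords are hypothetical labels used only by this sketch.
claccess_command() {
    choice=$1
    host=$2
    case "$choice" in
        deny|allow)
            # These two forms need a hostname (Examples 9-3 and 9-5).
            if [ -z "$host" ]; then
                echo "usage: claccess_command $choice hostname" >&2
                return 1
            fi
            echo "claccess $choice -h $host"
            ;;
        allow-all) echo "claccess allow-all" ;;
        unix)      echo "claccess set -p protocol=sys" ;;
        des)       echo "claccess set -p protocol=des" ;;
        *)
            echo "unknown choice: $choice" >&2
            return 1
            ;;
    esac
}

# Demonstration: print two of the commands for review.
claccess_command deny phys-schost-4
claccess_command des
```

Printing the command first lets you review it before you run it as the root role on a global-cluster node.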

How to Reset the Time of Day in a Cluster

Oracle Solaris Cluster software uses the Network Time Protocol (NTP) to maintain time synchronization between cluster nodes. Adjustments in the global cluster occur automatically as needed when nodes synchronize their time. For more information, see the Oracle Solaris Cluster Concepts Guide and the Network Time Protocol's User's Guide at http://download.oracle.com/docs/cd/E19065-01/servers.10k/.


Caution - When using NTP, do not attempt to adjust the cluster time while the cluster is up and running. Do not adjust the time by using the date, rdate, or svcadm commands interactively or within the cron scripts. For more information, see the date(1), rdate(1M), svcadm(1M), or cron(1M) man pages. The ntpd(1M) man page is delivered in the service/network/ntp Oracle Solaris 11 package.


The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume the root role on any node in the global cluster.
  2. Shut down the global cluster.
    phys-schost# cluster shutdown -g0 -y -i 0
  3. Verify that the node is showing the ok prompt on a SPARC based system or the Press any key to continue message on the GRUB menu on an x86 based system.
  4. Boot the node in noncluster mode.
    • On SPARC based systems, run the following command.

      ok boot -x
    • On x86 based systems, run the following commands.

      # shutdown -g0 -y -i0
      
      Press any key to continue
    1. In the GRUB menu, use the arrow keys to select the appropriate Oracle Solaris entry and type e to edit its commands.

      The GRUB menu appears.

      For more information about GRUB based booting, see Booting a System in Booting and Shutting Down Oracle Solaris 11.1 Systems.

    2. In the boot parameters screen, use the arrow keys to select the kernel entry and type e to edit the entry.

      The GRUB boot parameters screen appears.

    3. Add -x to the command to specify system boot into noncluster mode.
      [ Minimal BASH-like line editing is supported. For the first word, TAB
      lists possible command completions. Anywhere else TAB lists the possible
      completions of a device/filename. ESC at any time exits. ]
      
      grub edit> kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -x
    4. Press the Enter key to accept the change and return to the boot parameters screen.

      The screen displays the edited command.

    5. Type b to boot the node into noncluster mode.

      Note - This change to the kernel boot parameter command does not persist over the system boot. The next time you reboot the node, it will boot into cluster mode. To boot into noncluster mode instead, perform these steps again to add the -x option to the kernel boot parameter command.


  5. On a single node, set the time of day by running the date command.
    phys-schost# date HHMM.SS
  6. On the other machines, synchronize the time to that node by running the rdate(1M) command.
    phys-schost# rdate hostname
  7. Boot each node to restart the cluster.
    phys-schost# reboot
  8. Verify that the change occurred on all cluster nodes.

    On each node, run the date command.

    phys-schost# date
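
The date command in Step 5 takes the time in HHMM.SS form. The following sketch checks that a value has that shape before it is passed to date. The function name is_hhmmss is hypothetical, and the check assumes a 24-hour clock with seconds from 00 to 59.

```shell
#!/bin/sh
# Return 0 if the argument is a 24-hour time in HHMM.SS form, 1 otherwise.
is_hhmmss() {
    case "$1" in
        [01][0-9][0-5][0-9].[0-5][0-9]) return 0 ;;   # hours 00-19
        2[0-3][0-5][0-9].[0-5][0-9])    return 0 ;;   # hours 20-23
        *)                              return 1 ;;
    esac
}

# Print the current time of this machine in the same form, for reference.
date +%H%M.%S
```

For example, is_hhmmss 1432.07 succeeds, while is_hhmmss 2460.00 fails.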

SPARC: How to Display the OpenBoot PROM (OBP) on a Node

Use this procedure if you need to configure or change OpenBoot™ PROM settings.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Connect to the console on the node to be shut down.
    # telnet tc_name tc_port_number
    tc_name

    Specifies the name of the terminal concentrator.

    tc_port_number

    Specifies the port number on the terminal concentrator. Port numbers are configuration dependent. Typically, ports 2 and 3 (5002 and 5003) are used for the first cluster installed at a site.

  2. Shut down the cluster node gracefully by using the clnode evacuate command, then the shutdown command.

    The clnode evacuate command switches over all device groups from the specified node to the next-preferred node. The command also switches all resource groups from the global cluster's specified node to the next-preferred node.

    phys-schost# clnode evacuate node
    # shutdown -g0 -y

    Caution - Do not use send brk on a cluster console to shut down a cluster node.


  3. Execute the OBP commands.

How to Change the Node Private Hostname

Use this procedure to change the private hostname of a cluster node after installation has been completed.

Default private hostnames are assigned during initial cluster installation. The default private hostname takes the form clusternode<nodeid>-priv, for example: clusternode3-priv. Change a private hostname only if the name is already in use in the domain.


Caution - Do not attempt to assign IP addresses to new private host names. The clustering software assigns them.


The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Disable, on all nodes in the cluster, any data service resources or other applications that might cache private host names.
    phys-schost# clresource disable resource[,...]

    Include the following in the applications you disable.

    • HA-DNS and HA-NFS services, if configured

    • Any application that has been custom-configured to use the private hostname

    • Any application that is being used by clients over the private interconnect

    For information about using the clresource command, see the clresource(1CL) man page and the Oracle Solaris Cluster Data Services Planning and Administration Guide.

  2. If your NTP configuration file refers to the private hostname that you are changing, bring down the NTP daemon on each node of the cluster.

    Use the svcadm command to shut down the NTP daemon. See the svcadm(1M) man page for more information about the NTP daemon.

    phys-schost# svcadm disable ntp
  3. Run the clsetup utility to change the private hostname of the appropriate node.

    Run the utility from only one of the nodes in the cluster. For more information, see the clsetup(1CL) man page.


    Note - When selecting a new private hostname, ensure that the name is unique to the cluster node.


    You can also run the clnode command instead of the clsetup utility to change the private hostname. In the example below, the cluster node name is pred1. After you run the clnode command below, go to Step 6.

    phys-schost# /usr/cluster/bin/clnode set -p privatehostname=New-private-nodename pred1
  4. In the clsetup utility, type the number for the option for the private hostname.
  5. In the clsetup utility, type the number for the option for changing a private hostname.

    Answer the questions when prompted. You are asked the name of the node whose private hostname you are changing (clusternode<nodeid>-priv), and the new private hostname.

  6. Flush the name service cache.

    Perform this step on each node in the cluster. Flushing prevents the cluster applications and data services from trying to access the old private hostname.

    phys-schost# nscd -i hosts
  7. If you changed a private hostname in your NTP configuration or include file, update the NTP file on each node. If you changed a private hostname in your NTP configuration file (/etc/inet/ntp.conf) and you have peer host entries or a pointer to the include file for the peer hosts in your NTP configuration file (/etc/inet/ntp.conf.include), update the file on each node. If you changed a private hostname in your NTP include file, update the /etc/inet/ntp.conf.sc file on each node.
    1. Use the editing tool of your choice.

      If you perform this step at installation, also remember to remove names for nodes that are configured. Typically, the ntp.conf.sc file is identical on each cluster node.

    2. Verify that you can successfully ping the new private hostname from all cluster nodes.
    3. Restart the NTP daemon.

      Perform this step on each node of the cluster.

      Use the svcadm command to restart the NTP daemon.

      # svcadm enable svc:/network/ntp:default
  8. Enable all data service resources and other applications that were disabled in Step 1.
    phys-schost# clresource enable resource[,...]

    For information about using the clresource command, see the clresource(1CL) man page and the Oracle Solaris Cluster Data Services Planning and Administration Guide.

Example 9-8 Changing the Private Hostname

The following example changes the private hostname from clusternode2-priv to clusternode4-priv, on node phys-schost-2. Perform this action on each node.

[Disable all applications and data services as necessary.]
phys-schost-1# svcadm disable ntp
phys-schost-1# clnode show | grep node
 ...
 private hostname:                           clusternode1-priv
 private hostname:                           clusternode2-priv
 private hostname:                           clusternode3-priv
 ...
phys-schost-1# clsetup
phys-schost-1# nscd -i hosts
phys-schost-1# vi /etc/inet/ntp.conf.sc
 ...
 peer clusternode1-priv
 peer clusternode4-priv
 peer clusternode3-priv
phys-schost-1# ping clusternode4-priv
phys-schost-1# svcadm enable ntp
[Enable all applications and data services disabled at the beginning of the procedure.]
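
The manual edit of the peer entries in Example 9-8 can also be scripted. The following sketch rewrites matching peer lines in text read from standard input. The function name rename_ntp_peer is hypothetical. Only lines whose first field is peer and whose second field equals the old name exactly are changed, and a changed line is written back with single-space separators and no leading whitespace.

```shell
#!/bin/sh
# Rewrite "peer <old>" lines to "peer <new>" in NTP peer-file text on stdin.
# Lines that do not match are printed unchanged.
rename_ntp_peer() {
    old=$1
    new=$2
    awk -v old="$old" -v new="$new" '
        $1 == "peer" && $2 == old { $2 = new }   # exact whole-name match only
        { print }
    '
}

# Demonstration with the peer entries from Example 9-8.
rename_ntp_peer clusternode2-priv clusternode4-priv <<'EOF'
peer clusternode1-priv
peer clusternode2-priv
peer clusternode3-priv
EOF
```

To apply the change to a real file, one cautious approach is to write the output to a temporary file, compare it with the original, and then replace the original on each node.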

How to Rename a Node

You can change the name of a node that is part of an Oracle Solaris Cluster configuration. You must rename the Oracle Solaris hostname before you can rename the node. Use the clnode rename command to rename the node.

The following instructions apply to any application that is running in a global cluster.

  1. On the global cluster, assume a role that provides solaris.cluster.modify RBAC authorization.
  2. If you are renaming a node in an Oracle Solaris Cluster Geographic Edition cluster that is in a partnership of an Oracle Solaris configuration, you must perform additional steps.

    For more information on Geographic Edition clusters and nodes, see Chapter 5, Administering Cluster Partnerships, in Oracle Solaris Cluster Geographic Edition System Administration Guide.

    If the cluster where you are performing the rename procedure is primary for the protection group, and you want to have the application in the protection group online, you can switch the protection group to the secondary cluster during the rename procedure.

  3. Rename the Oracle Solaris host names by completing the steps in How to Change a System’s Identity in Managing System Information, Processes, and Performance in Oracle Solaris 11.1, except do not perform a reboot at the end of the procedure.

    Instead, perform a cluster shutdown after you complete these steps.

  4. Boot all cluster nodes into noncluster mode.
    ok boot -x
  5. In noncluster mode on the node where you renamed the Oracle Solaris hostname, rename the node by running the following command on each renamed host.

    Rename one node at a time.

    # clnode rename -n newnodename oldnodename
  6. Update any existing references to the previous hostname in the applications that run on the cluster.
  7. Confirm that the node was renamed by checking the command messages and log files.
  8. Reboot all nodes into cluster mode.
    # sync;sync;sync;reboot
  9. Verify the node displays the new name.
    # clnode status -v
  10. If you are renaming a node on a Geographic Edition cluster node and the partner cluster of the cluster that contains the renamed node still references the previous nodename, the protection group's synchronization status will appear as an Error.

    You must update the protection group from one node of the partner cluster that contains the renamed node by using the geopg update <pg> command. After you complete that step, run the geopg start -e global <pg> command. At a later time, you can switch the protection group back to the cluster with the renamed node.

  11. You can choose to change the logical hostname resources' hostnamelist property.

    See How to Change the Logical Hostnames Used by Existing Oracle Solaris Cluster Logical Hostname Resources for instructions on this optional step.

How to Change the Logical Hostnames Used by Existing Oracle Solaris Cluster Logical Hostname Resources

You can choose to change the logical hostname resource's hostnamelist property either before or after you rename the node by following the steps in How to Rename a Node. This step is optional.

  1. On the global cluster, assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Optionally, you can change the logical hostnames used by any of the existing Oracle Solaris Cluster Logical Hostname resources.

    The following steps show how to configure the apache-lh-res resource to work with the new logical hostname, and must be executed in cluster mode.

    1. In cluster mode, take the Apache resource groups that contain the logical hostnames offline.
      # clrg offline apache-rg
    2. Disable the Apache logical hostname resources.
      # clrs disable apache-lh-res
    3. Provide the new hostname list.
      # clrs set -p HostnameList=test-2 apache-lh-res
    4. Change the application's references for previous entries in the hostnamelist property to reference the new entries.
    5. Enable the new Apache logical hostname resources.
      # clrs enable apache-lh-res
    6. Bring the Apache resource groups online.
      # clrg online -eM apache-rg
    7. Confirm that the application started correctly by running the following command and checking a client.
      # clrs status apache-rs

How to Put a Node Into Maintenance State

Put a global-cluster node into maintenance state when taking the node out of service for an extended period of time. This way, the node does not contribute to the quorum count while it is being serviced. To put a node into maintenance state, the node must be shut down with the clnode evacuate and cluster shutdown commands. For more information, see the clnode(1CL) and cluster(1CL) man pages.


Note - Use the Oracle Solaris shutdown command to shut down a single node. Use the cluster shutdown command only when shutting down an entire cluster.


When a cluster node is shut down and put in maintenance state, all quorum devices that are configured with ports to the node have their quorum vote counts decremented by one. The node and quorum device vote counts are incremented by one when the node is removed from maintenance mode and brought back online.

Use the clquorum disable command to put a cluster node into maintenance state. For more information, see the clquorum(1CL) man page.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization on the global-cluster node that you are putting into maintenance state.
  2. Evacuate any resource groups and device groups from the node.

    The clnode evacuate command switches over all resource groups and device groups from the specified node to the next-preferred node.

    phys-schost# clnode evacuate node
  3. Shut down the node that you evacuated.
    phys-schost# shutdown -g0 -y -i 0
  4. Assume a role that provides solaris.cluster.modify RBAC authorization on another node in the cluster and put the node that you shut down in Step 3 in maintenance state.
    phys-schost# clquorum disable node
    node

    Specifies the name of a node that you want to put into maintenance mode.

  5. Verify that the global-cluster node is now in maintenance state.
    phys-schost# clquorum status node

    The node that you put into maintenance state should have a Status of offline and 0 (zero) for Present and Possible quorum votes.

Example 9-9 Putting a Global-Cluster Node Into Maintenance State

The following example puts a cluster node into maintenance state and verifies the results. The clquorum status output shows the Node votes for phys-schost-1 to be 0 (zero) and the status to be Offline. The Quorum Summary should also show reduced vote counts. Depending on your configuration, the Quorum Votes by Device output might indicate that some quorum disk devices are offline as well.

[On the node to be put into maintenance state:]
phys-schost-1# clnode evacuate phys-schost-1
phys-schost-1# shutdown -g0 -y -i0

[On another node in the cluster:]
phys-schost-2# clquorum disable phys-schost-1
phys-schost-2# clquorum status phys-schost-1

-- Quorum Votes by Node --

Node Name           Present       Possible       Status
---------           -------       --------       ------
phys-schost-1       0             0              Offline
phys-schost-2       1             1              Online
phys-schost-3       1             1              Online
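
The verification in Step 5 can be automated by reading the Quorum Votes by Node table. The following sketch assumes the four-column layout shown above (node name, Present, Possible, Status). The function name in_maintenance_state is hypothetical.

```shell
#!/bin/sh
# Exit 0 if the named node shows 0 Present votes, 0 Possible votes, and a
# Status of Offline in "clquorum status"-style text on stdin; else exit 1.
in_maintenance_state() {
    awk -v node="$1" '
        $1 == node {
            found = 1
            ok = ($2 == 0 && $3 == 0 && tolower($4) == "offline")
        }
        END { exit !(found && ok) }
    '
}

# Demonstration with the sample table from Example 9-9.
if in_maintenance_state phys-schost-1 <<'EOF'
Node Name           Present       Possible       Status
---------           -------       --------       ------
phys-schost-1       0             0              Offline
phys-schost-2       1             1              Online
phys-schost-3       1             1              Online
EOF
then
    echo "phys-schost-1 is in maintenance state"
fi
```

On a cluster node, the input would come from the command in Step 5, for example clquorum status phys-schost-1 | in_maintenance_state phys-schost-1.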

See Also

To bring a node back online, see How to Bring a Node Out of Maintenance State.

How to Bring a Node Out of Maintenance State

Use the following procedure to bring a global-cluster node back online and reset the quorum vote count to the default. For cluster nodes, the default quorum count is one. For quorum devices, the default quorum count is N-1, where N is the number of nodes with nonzero vote counts that have ports to the quorum device.

When a node has been put in maintenance state, the node's quorum vote count is decremented by one. All quorum devices that are configured with ports to the node will also have their quorum vote counts decremented. When the quorum vote count is reset and a node removed from maintenance state, both the node's quorum vote count and the quorum device vote count are incremented by one.
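
The default vote count for a quorum device follows directly from the N-1 rule above. The following sketch shows the arithmetic. The function name default_device_votes is hypothetical, and the argument is the number of nodes with nonzero vote counts that have ports to the device.

```shell
#!/bin/sh
# Default vote count of a quorum device: N-1, where N is the number of
# nodes with nonzero vote counts that have ports to the device.
default_device_votes() {
    n=$1
    if [ "$n" -lt 1 ]; then
        echo "need at least one connected node" >&2
        return 1
    fi
    echo $((n - 1))
}

default_device_votes 2    # two connected voting nodes
default_device_votes 1    # one of the two nodes is in maintenance state
```

With two connected nodes the device holds one vote. While one of those nodes is in maintenance state, N drops to one and the device vote count drops to zero, which matches the decrement described above.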

Run this procedure any time a global-cluster node has been put in maintenance state and you are removing it from maintenance state.


Caution - If you do not specify either the globaldev or node options, the quorum count is reset for the entire cluster.


The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization on any node of the global cluster other than the one in maintenance state.
  2. Depending on the number of nodes that you have in your global cluster configuration, perform one of the following steps:
    • If you have two nodes in your cluster configuration, go to Step 4.

    • If you have more than two nodes in your cluster configuration, go to Step 3.

  3. If the node that you are removing from maintenance state will have quorum devices, reset the cluster quorum count from a node other than the one in maintenance state.

    You must reset the quorum count from a node other than the node in maintenance state before rebooting the node, or the node might hang while waiting for quorum.

    phys-schost# clquorum reset
    reset

    The change flag that resets quorum.

  4. Boot the node that you are removing from maintenance state.
  5. Verify the quorum vote count.
    phys-schost# clquorum status

    The node that you removed from maintenance state should have a status of online and show the appropriate vote count for Present and Possible quorum votes.

Example 9-10 Removing a Cluster Node From Maintenance State and Resetting the Quorum Vote Count

The following example resets the quorum count for a cluster node and its quorum devices to their defaults and verifies the result. The clquorum status output shows the Node votes for phys-schost-1 to be 1 and the status to be online. The Quorum Summary should also show an increase in vote counts.

phys-schost-2# clquorum reset
phys-schost-1# clquorum status

--- Quorum Votes Summary ---

            Needed   Present   Possible
            ------   -------   --------
            4        6         6


--- Quorum Votes by Node ---

Node Name        Present       Possible      Status
---------        -------       --------      ------
phys-schost-2    1             1             Online
phys-schost-3    1             1             Online


--- Quorum Votes by Device ---

Device Name           Present      Possible      Status
-----------           -------      --------      ------
/dev/did/rdsk/d3s2    1            1             Online
/dev/did/rdsk/d17s2   0            1             Online
/dev/did/rdsk/d31s2   1            1             Online
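
The Quorum Votes Summary can be checked in a script after the vote count is reset. The following sketch reads the first numeric row after the Needed/Present/Possible header and succeeds when the Present count is at least the Needed count. The function name quorum_met is hypothetical, and the sketch assumes the summary layout shown above.

```shell
#!/bin/sh
# Exit 0 if the Quorum Votes Summary on stdin shows Present >= Needed.
quorum_met() {
    awk '
        /Needed/ && /Present/ && /Possible/ { header = 1; next }
        header && $1 ~ /^[0-9]+$/ {
            needed = $1; present = $2; seen = 1
            exit                      # jump to END with the values found
        }
        END { exit !(seen && present + 0 >= needed + 0) }
    '
}

# Demonstration with the summary from Example 9-10.
if quorum_met <<'EOF'
--- Quorum Votes Summary ---

            Needed   Present   Possible
            ------   -------   --------
            4        6         6
EOF
then
    echo "quorum votes present"
fi
```

On a cluster node, the input would come from clquorum status | quorum_met.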

How to Uninstall Oracle Solaris Cluster Software From a Cluster Node

Perform this procedure to unconfigure Oracle Solaris Cluster software from a global-cluster node before you disconnect it from a fully established cluster configuration. You can use this procedure to uninstall software from the last remaining node of a cluster.


Note - To uninstall Oracle Solaris Cluster software from a node that has not yet joined the cluster or is still in installation mode, do not perform this procedure. Instead, go to How to Unconfigure Oracle Solaris Cluster Software to Correct Installation Problems in Oracle Solaris Cluster Software Installation Guide.


The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Ensure that you have correctly completed all prerequisite tasks in the task map to remove a cluster node.

    See Table 8-2.

    Ensure that you have removed the node from the cluster configuration by using clnode remove before you continue with this procedure. Other steps might include adding the node you plan to uninstall to the cluster's node-authentication list, uninstalling a zone cluster, and so on.


    Note - To unconfigure the node but leave Oracle Solaris Cluster software installed on the node, do not proceed further after you run the clnode remove command.


  2. Assume the root role on the node to uninstall.
  3. If your node has a dedicated partition for the global devices namespace, reboot the global-cluster node into noncluster mode.
    • On a SPARC based system, run the following command.

      # shutdown -g0 -y -i0
      ok boot -x
    • On an x86 based system, run the following commands.

      # shutdown -g0 -y -i0
      ...
                            <<< Current Boot Parameters >>>
      Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/
      sd@0,0:a
      Boot args:
      
      Type    b [file-name] [boot-flags] <ENTER>  to boot with options
      or      i <ENTER>                           to enter boot interpreter
      or      <ENTER>                             to boot with defaults
      
                        <<< timeout in 5 seconds >>>
      Select (b)oot or (i)nterpreter: b -x
  4. In the /etc/vfstab file, remove all globally mounted file-system entries except the /global/.devices global mounts.
  5. Reboot the node into noncluster mode.
    • On SPARC based systems, perform the following command:
      ok boot -x
    • On x86 based systems, perform the following commands:
      1. In the GRUB menu, use the arrow keys to select the appropriate Oracle Solaris entry and type e to edit its commands.

        For more information about GRUB based booting, see Booting a System in Booting and Shutting Down Oracle Solaris 11.1 Systems.

      2. In the boot parameters screen, use the arrow keys to select the kernel entry and type e to edit the entry.
      3. Add -x to the command to specify that the system boot into noncluster mode.
      4. Press Enter to accept the change and return to the boot parameters screen.

        The screen displays the edited command.

      5. Type b to boot the node into noncluster mode.

        Note - This change to the kernel boot parameter command does not persist over the system boot. The next time you reboot the node, it will boot into cluster mode. To boot into noncluster mode instead, perform these steps to again add the -x option to the kernel boot parameter command.


  6. Change to a directory, such as the root (/) directory, that does not contain any files that are delivered by the Oracle Solaris Cluster packages.
    phys-schost# cd /
  7. To unconfigure the node and remove Oracle Solaris Cluster software, run the following command.
    phys-schost# scinstall -r [-b bename]
    -r

    Removes cluster configuration information and uninstalls Oracle Solaris Cluster framework and data-service software from the cluster node. You can then reinstall the node or remove the node from the cluster.

    -b bename

    Specifies the name of a new boot environment, which is where you boot into after the uninstall process completes. Specifying a name is optional. If you do not specify a name for the boot environment, one is automatically generated.

    See the scinstall(1M) man page for more information.

  8. If you intend to reinstall the Oracle Solaris Cluster software on this node after the uninstall completes, reboot the node to boot into the new boot environment.
  9. If you do not intend to reinstall the Oracle Solaris Cluster software on this cluster, disconnect the transport cables and the transport switch, if any, from the other cluster devices.
    1. If the uninstalled node is connected to a storage device that uses a parallel SCSI interface, install a SCSI terminator to the open SCSI connector of the storage device after you disconnect the transport cables.

      If the uninstalled node is connected to a storage device that uses Fibre Channel interfaces, no termination is necessary.

    2. Follow the documentation that shipped with your host adapter and server for disconnection procedures.

    Tip - For more information about migrating a global-devices namespace to a lofi, see Migrating the Global-Devices Namespace.


Troubleshooting a Node Uninstallation

This section describes error messages that you might receive when you run the clnode remove command and the corrective actions to take.

Unremoved Cluster File System Entries

The following error messages indicate that the global-cluster node you removed still has cluster file systems referenced in its vfstab file.

Verifying that no unexpected global mounts remain in /etc/vfstab ... failed
clnode:  global-mount1 is still configured as a global mount.
clnode:  global-mount1 is still configured as a global mount.
clnode:  /global/dg1 is still configured as a global mount.
 
clnode:  It is not safe to uninstall with these outstanding errors.
clnode:  Refer to the documentation for complete uninstall instructions.
clnode:  Uninstall failed.

To correct this error, return to How to Uninstall Oracle Solaris Cluster Software From a Cluster Node and repeat the procedure. Ensure that you successfully complete Step 4 in the procedure before you rerun the clnode remove command.
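
Before you rerun the command, a quick scan of the vfstab file can show which entries still need to be removed. The following is a minimal sketch: list_global_mounts is a hypothetical helper, and it assumes that cluster file systems are marked by the global mount option in the seventh vfstab field.

```shell
#!/bin/sh
# Hypothetical helper: list mount points in a vfstab-format file whose
# mount options include "global", ignoring /global/.devices entries.
# vfstab fields: device, fsck device, mount point, FS type, fsck pass,
# mount at boot, mount options.
list_global_mounts() {
    awk '$1 !~ /^#/ && NF >= 7 &&
         $7 ~ /(^|,)global(,|$)/ &&
         $3 !~ /^\/global\/\.devices/ { print $3 }' "${1:-/etc/vfstab}"
}
```

Run with no argument, the function reads /etc/vfstab. Any mount point that it prints corresponds to an entry that Step 4 of the procedure asks you to remove.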

Unremoved Listing in Device Groups

The following error messages indicate that the node you removed is still listed with a device group.

Verifying that no device services still reference this node ... failed
clnode:  This node is still configured to host device service "service".
clnode:  This node is still configured to host device service "service2".
clnode:  This node is still configured to host device service "service3".
clnode:  This node is still configured to host device service "dg1".
 
clnode:  It is not safe to uninstall with these outstanding errors.          
clnode:  Refer to the documentation for complete uninstall instructions.
clnode:  Uninstall failed.
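
If the list of device services is long, it can help to extract just the names from the saved error output so that each device group can be reviewed in turn. The sketch below is one way to do that. The helper name list_device_services is hypothetical, and it assumes that the only double-quoted strings in the output are device service names.

```shell
#!/bin/sh
# Hypothetical helper: print each double-quoted device service name found
# in clnode error text on stdin. Line breaks inside the quotes are tolerated.
list_device_services() {
    tr '\n' ' ' |
    awk -F'"' '{
        for (i = 2; i <= NF; i += 2) {   # even fields lie inside quotes
            name = $i
            gsub(/^ +| +$/, "", name)    # trim stray spaces
            print name
        }
    }'
}
```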

Creating, Setting Up, and Managing the Oracle Solaris Cluster SNMP Event MIB

This section describes how to create, set up, and manage the Simple Network Management Protocol (SNMP) event Management Information Base (MIB). This section also describes how to enable, disable, and change the Oracle Solaris Cluster SNMP event MIB.

The Oracle Solaris Cluster software currently supports one MIB, the event MIB. The SNMP manager software traps cluster events in real time. When enabled, the SNMP manager automatically sends trap notifications to all hosts that are defined by the clsnmphost command. The MIB maintains a read-only table of the most current 50 events. Because clusters generate numerous notifications, only events with a severity of warning or greater are sent as trap notifications. This information does not persist across reboots.

The SNMP event MIB is defined in the sun-cluster-event-mib.mib file and is located in the /usr/cluster/lib/mib directory. You can use this definition to interpret the SNMP trap information.

The default port number for the event SNMP module is 11161, and the default port for the SNMP traps is 11162. These port numbers can be changed by modifying the Common Agent Container property file, which is /etc/cacao/instances/default/private/cacao.properties.

Creating, setting up, and managing an Oracle Solaris Cluster SNMP event MIB can involve the following tasks.

Table 9-2 Task Map: Creating, Setting Up, and Managing the Oracle Solaris Cluster SNMP Event MIB

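Task                                           Instructions
----                                           ------------
Enable an SNMP event MIB                       How to Enable an SNMP Event MIB
Disable an SNMP event MIB                      How to Disable an SNMP Event MIB
Change an SNMP event MIB                       How to Change an SNMP Event MIB
Add an SNMP host to the list of hosts that     How to Enable an SNMP Host to Receive SNMP Traps on a Node
will receive trap notifications for the MIBs
Remove an SNMP host                            How to Disable an SNMP Host From Receiving SNMP Traps on a Node
Add an SNMP user                               How to Add an SNMP User on a Node
Remove an SNMP user                            How to Remove an SNMP User From a Node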

How to Enable an SNMP Event MIB

This procedure shows how to enable an SNMP event MIB.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Enable the SNMP event MIB.
    phys-schost-1# clsnmpmib enable [-n node] MIB
    [-n node]

    Specifies the node on which the event MIB that you want to enable is located. You can specify a node ID or a node name. If you do not specify this option, the current node is used by default.

    MIB

    Specifies the name of the MIB that you want to enable. In this case, the MIB name must be event.

How to Disable an SNMP Event MIB

This procedure shows how to disable an SNMP event MIB.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Disable the SNMP event MIB.
    phys-schost-1# clsnmpmib disable -n node MIB
    -n node

    Specifies the node on which the event MIB that you want to disable is located. You can specify a node ID or a node name. If you do not specify this option, the current node is used by default.

    MIB

    Specifies the name of the MIB that you want to disable. In this case, you must specify event.

How to Change an SNMP Event MIB

This procedure shows how to change the protocol for an SNMP event MIB.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Change the protocol of the SNMP event MIB.
    phys-schost-1# clsnmpmib set -n node -p version=value MIB
    -n node

    Specifies the node on which the event MIB that you want to change is located. You can specify a node ID or a node name. If you do not specify this option, the current node is used by default.

    -p version=value

    Specifies the version of SNMP protocol to use with the MIBs. You specify value as follows:

    • version=SNMPv2

    • version=snmpv2

    • version=2

    • version=SNMPv3

    • version=snmpv3

    • version=3

    MIB

    Specifies the name of the MIB or MIBs to which to apply the subcommand. In this case, you must specify event.
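
To make the accepted version values concrete, the following sketch checks a value against the six spellings listed above and prints, rather than runs, the resulting command line. The function name snmp_version_cmd and the node name in the usage note are illustrative assumptions only.

```shell
#!/bin/sh
# Hypothetical helper: accept only the version values listed in this
# procedure and print (not run) the matching clsnmpmib command line.
# Usage: snmp_version_cmd NODE VERSION
snmp_version_cmd() {
    node=$1 version=$2
    case $version in
        SNMPv2|snmpv2|2|SNMPv3|snmpv3|3) ;;
        *) echo "unsupported SNMP version: $version" >&2; return 1 ;;
    esac
    echo "clsnmpmib set -n $node -p version=$version event"
}
```

Calling snmp_version_cmd phys-schost-2 SNMPv3 prints a command that you can review before running it as a role with the solaris.cluster.modify authorization.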

How to Enable an SNMP Host to Receive SNMP Traps on a Node

This procedure shows how to add an SNMP host on a node to the list of hosts that will receive trap notifications for the MIBs.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Add the host to the SNMP host list of a community on another node.
    phys-schost-1# clsnmphost add -c SNMPcommunity [-n node] host
    -c SNMPcommunity

    Specifies the SNMP community name that is used in conjunction with the hostname.

    You must specify the SNMP community name SNMPcommunity when you add a host to a community other than public. If you use the add subcommand without the -c option, the subcommand uses public as the default community name.

    If the specified community name does not exist, this command creates the community.

    -n node

    Specifies the name of the node of the SNMP host that is provided access to the SNMP MIBs in the cluster. You can specify a node name or a node ID. If you do not specify this option, the current node is used by default.

    host

    Specifies the name, IP address, or IPv6 address of a host that is provided access to the SNMP MIBs in the cluster.

How to Disable an SNMP Host From Receiving SNMP Traps on a Node

This procedure shows how to remove an SNMP host on a node from the list of hosts that will receive trap notifications for the MIBs.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Remove the host from the SNMP host list of a community on the specified node.
    phys-schost-1# clsnmphost remove -c SNMPcommunity -n node host
    remove

    Removes the specified SNMP host from the specified node.

    -c SNMPcommunity

    Specifies the name of the SNMP community from which the SNMP host is removed.

    -n node

    Specifies the name of the node on which the SNMP host is removed from the configuration. You can specify a node name or a node ID. If you do not specify this option, the current node is used by default.

    host

    Specifies the name, IP address, or IPv6 address of the host that is removed from the configuration.

    To remove all hosts in the specified SNMP community, use a plus sign (+) for host with the -c option. To remove all hosts, use the plus sign (+) for host.

How to Add an SNMP User on a Node

This procedure shows how to add an SNMP user to the SNMP user configuration on a node.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Add the SNMP user.
    phys-schost-1# clsnmpuser create -n node -a authentication \
                  -f password user
    -n node

    Specifies the node on which the SNMP user is added. You can specify a node ID or a node name. If you do not specify this option, the current node is used by default.

    -a authentication

    Specifies the authentication protocol that is used to authorize the user. The value of the authentication protocol can be SHA or MD5.

    -f password

    Specifies a file that contains the SNMP user passwords. If you do not specify this option when you create a new user, the command prompts for a password. This option is valid only with the create subcommand.

    You must specify user passwords on separate lines in the following format:

    user:password

    Passwords cannot contain the following characters or a space:

    • ; (semicolon)

    • : (colon)

    • \ (backslash)

    • \n (newline)

    user

    Specifies the name of the SNMP user that you want to add.
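
Because a malformed password file is easy to produce, it might be worth checking the file before passing it to the -f option. The sketch below applies the format rules stated above. The helper check_snmp_passfile is hypothetical, and it reports only line numbers so that passwords are not echoed.

```shell
#!/bin/sh
# Hypothetical checker for a clsnmpuser password file.
# Each line must be user:password, and the password must not contain
# a semicolon, colon, backslash, or space.
check_snmp_passfile() {
    awk '{
        sep = index($0, ":")          # position of the first colon
        pass = substr($0, sep + 1)    # everything after it
        if (sep < 2 || pass == "" || pass ~ /[;:\\ ]/) {
            printf "line %d: invalid entry\n", NR
            bad = 1
        }
    }
    END { exit bad }' "$1"
}
```

The function exits with a nonzero status if any line breaks the rules, so it can guard a later clsnmpuser create call in a script.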

How to Remove an SNMP User From a Node

This procedure shows how to remove an SNMP user from the SNMP user configuration on a node.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization.
  2. Remove the SNMP user.
    phys-schost-1# clsnmpuser delete -n node user
    -n node

    Specifies the node from which the SNMP user is removed. You can specify a node ID or a node name. If you do not specify this option, the current node is used by default.

    user

    Specifies the name of the SNMP user that you want to remove.

Configuring Load Limits

You can enable the automatic distribution of resource group load across nodes by setting load limits. You can configure a set of load limits for each cluster node. You assign load factors to resource groups, and the load factors correspond to the defined load limits of the nodes. The default behavior is to distribute resource group load evenly across all the available nodes in the resource group's node list.

The resource groups are started on a node from the resource group's node list by the RGM so that the node's load limits are not exceeded. As resource groups are assigned to nodes by the RGM, the resource groups' load factors on each node are summed up to provide a total load. The total load is then compared against that node's load limits.
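
The summation described above can be illustrated with a small worked example. The sketch below is not the RGM's placement algorithm. It only adds up a set of load factors for one limit and compares the total with a soft and a hard limit, assuming that a limit counts as exceeded when the total is strictly greater. The function name load_check and the sample numbers are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of summing load factors for one load limit.
# Usage: load_check SOFT HARD FACTOR...
load_check() {
    soft=$1 hard=$2
    shift 2
    total=0
    for factor in "$@"; do
        total=$((total + factor))
    done
    if [ "$total" -gt "$hard" ]; then
        echo "$total exceeds hard limit $hard"
    elif [ "$total" -gt "$soft" ]; then
        echo "$total exceeds soft limit $soft"
    else
        echo "$total within limits"
    fi
}
```

With a soft limit of 11 and a hard limit of 20, load_check 11 20 5 4 reports a total of 9 within limits, while adding a third resource group with a factor of 6 brings the total to 15, which is over the soft limit but under the hard limit.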

A load limit consists of a limit name, a soft limit value, and a hard limit value.

You can set both the hard limit and the soft limit in a single command. If one of the limits is not explicitly set, the default value is used. Hard and soft load limits for each node are created and modified with the clnode create-loadlimit, clnode set-loadlimit, and clnode delete-loadlimit commands. See the clnode(1CL) man page for more information.

You can configure a resource group to have a higher priority so that it is less likely to be displaced from a specific node. You can also set a preemption_mode property to determine if a resource group will be preempted from a node by a higher-priority resource group because of node overload. A concentrate_load property also allows you to concentrate the resource group load onto as few nodes as possible. The default value of the concentrate_load property is FALSE.


Note - You can configure load limits on nodes in a global cluster or a zone cluster. You can use the command line, the clsetup utility, or the Oracle Solaris Cluster Manager interface to configure load limits. The following procedure illustrates how to configure load limits using the command line.


How to Configure Load Limits on a Node

  1. Assume a role that provides solaris.cluster.modify RBAC authorization on any node of the global cluster.
  2. Create and set a load limit for the nodes that you want to use load balancing.
    # clnode create-loadlimit -p limitname=mem_load -Z zc1 \
                  -p softlimit=11 -p hardlimit=20 node1 node2 node3

    In this example, the zone cluster name is zc1. The sample property is called mem_load and has a soft limit of 11 and a hard limit of 20. Hard and soft limits are optional arguments and default to unlimited if you do not specifically define them. See the clnode(1CL) man page for more information.

  3. Assign load factor values to each resource group.
    # clresourcegroup set -p load_factors=mem_load@50,factor2@1 rg1 rg2

    In this example, the load factors are set on the two resource groups, rg1 and rg2. The load factor settings correspond to the defined load limits of the nodes. You can also perform this step during the creation of the resource group with the clresourcegroup create command. See the clresourcegroup(1CL) man page for more information.

  4. If desired, you can redistribute the existing load (clrg remaster).
    # clresourcegroup remaster rg1 rg2

    This command can move resource groups off their current master to other nodes to achieve uniform load distribution.

  5. If desired, you can give some resource groups a higher priority than others.
    # clresourcegroup set -p priority=600 rg1

    The default priority is 500. Resource groups with higher priority values get precedence in node assignment over resource groups with lower priorities.

  6. If desired, you can set the Preemption_mode property.
    # clresourcegroup set -p Preemption_mode=No_cost rg1

    See the clresourcegroup(1CL) man page for more information on the HAS_COST, NO_COST, and NEVER options.

  7. If desired, you can also set the Concentrate_load flag.
    # cluster set -p Concentrate_load=TRUE
  8. If desired, you can specify an affinity between resource groups.

    A strong positive or negative affinity takes precedence over load distribution. A strong affinity can never be violated, nor can a hard load limit. If you set both strong affinities and hard load limits, some resource groups might be forced to remain offline if both constraints cannot be satisfied.

    The following example specifies a strong positive affinity between resource group rg1 in zone cluster zc1 and resource group rg2 in zone cluster zc2.

    # clresourcegroup set -p RG_affinities=++zc2:rg2 zc1:rg1
  9. Verify the status of all global-cluster nodes and zone-cluster nodes in the cluster.
    # clnode status -Z all -v

    The output includes any load limit settings that are defined on the node.
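
Because load factors are meant to correspond to load limits defined on the nodes, it can be useful to cross-check the names in a load_factors value against the limit names you created. The following sketch does only that name comparison. The helper unmatched_factors is hypothetical, and a name that it reports is not necessarily an error.

```shell
#!/bin/sh
# Hypothetical helper: print each load factor name in a load_factors value
# (format name@value,name@value) that has no matching limit name.
# Usage: unmatched_factors "name@value,..." LIMITNAME...
unmatched_factors() {
    factors=$1
    shift
    printf '%s\n' "$factors" | tr ',' '\n' |
    while IFS='@' read -r name value; do   # value is read but not used
        found=no
        for limit in "$@"; do
            [ "$name" = "$limit" ] && found=yes
        done
        [ "$found" = yes ] || echo "$name"
    done
}
```

For the values used in this procedure, unmatched_factors 'mem_load@50,factor2@1' mem_load prints factor2, because only the mem_load limit was created in Step 2.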

Changing Port Numbers for Services or Management Agents

The common agent container is started automatically when you boot the cluster.


Note - If you receive a System Error message when you try to view information about a node, check whether the common agent container network-bind-address parameter is set to the correct value of 0.0.0.0.

Perform these steps on each node of the cluster.

  1. Display the value of the network-bind-address parameter.
    # cacaoadm get-param network-bind-address
    network-bind-address=0.0.0.0
  2. If the parameter value is anything other than 0.0.0.0, change the parameter value.
    # cacaoadm stop
    # cacaoadm set-param network-bind-address=0.0.0.0
    # cacaoadm start
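
When you repeat this check on several nodes, the comparison can be scripted. The sketch below reads the get-param output on standard input and succeeds only when the value is 0.0.0.0. The helper name bind_address_ok is hypothetical, and the parsing assumes the name=value output format shown in the note.

```shell
#!/bin/sh
# Hypothetical helper: read "network-bind-address=VALUE" on stdin and
# succeed only when VALUE is 0.0.0.0.
bind_address_ok() {
    value=$(sed -n 's/^network-bind-address=//p' | head -n 1)
    [ "$value" = "0.0.0.0" ]
}
```

A typical use would be to pipe cacaoadm get-param network-bind-address into bind_address_ok and perform the stop, set-param, and start sequence only when the function fails.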


How to Use the Common Agent Container to Change the Port Numbers for Services or Management Agents

If the default port numbers for your common agent container services conflict with other running processes, you can use the cacaoadm command to change the port number of the conflicting service or management agent on each node of the cluster.

  1. On all cluster nodes, stop the common agent container management daemon.
    # /opt/bin/cacaoadm stop
  2. Retrieve the port number currently used by the common agent container service with the get-param subcommand.
    # /opt/bin/cacaoadm get-param parameterName

    You can use the cacaoadm command to change the port numbers for the following common agent container services. The following list provides some examples of services and agents that can be managed by the common agent container, along with corresponding parameter names.

    JMX connector port

    jmxmp-connector-port

    SNMP port

    snmp-adapter-port

    SNMP trap port

    snmp-adapter-trap-port

    Command stream port

    commandstream-adapter-port




  3. Change a port number.
    # /opt/bin/cacaoadm set-param parameterName=parameterValue
  4. Repeat Step 3 on each node of the cluster.
  5. Restart the common agent container management daemon on all cluster nodes.
    # /opt/bin/cacaoadm start
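
Steps 2 and 3 can be wrapped in a small function so that the same change is applied identically on each node. This is a sketch under stated assumptions: the CACAOADM and DRY_RUN variables and the function names are conventions of this example only, and the daemon must still be stopped beforehand and restarted afterward as described in Steps 1 and 5.

```shell
#!/bin/sh
# Sketch of Steps 2 and 3 for one node. CACAOADM defaults to the path used
# in this procedure. Setting DRY_RUN=1 prints the commands instead of
# running them. Both variables are conventions of this sketch only.
CACAOADM=${CACAOADM:-/opt/bin/cacaoadm}

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: change_cacao_port PARAMETER-NAME PORT
change_cacao_port() {
    param=$1 port=$2
    case $port in
        ''|*[!0-9]*) echo "port must be a number: $port" >&2; return 1 ;;
    esac
    run "$CACAOADM" get-param "$param" &&     # show the current value
    run "$CACAOADM" set-param "$param=$port"  # apply the new value
}
```

With DRY_RUN=1 set, change_cacao_port snmp-adapter-port 11165 prints the get-param and set-param command lines for review. The port 11165 is an arbitrary illustrative value.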