Administering Disk-Path Monitoring

Disk path monitoring (DPM) administration commands enable you to receive notification of secondary disk-path failure. Use the procedures in this section to perform administrative tasks that are associated with monitoring disk paths. Refer to Chapter 3, Key Concepts for System Administrators and Application Developers, in Oracle Solaris Cluster Concepts Guide for conceptual information about the disk-path monitoring daemon. Refer to the cldevice(1CL) man page for a description of the command options and related commands. For more information about tuning the scdpmd daemon, see the scdpmd.conf(4) man page. Also see the syslogd(1M) man page for logged errors that the daemon reports.


Note - Disk paths are automatically added to the monitoring list when I/O devices are added to a node by using the cldevice command. Disk paths are also automatically unmonitored when devices are removed from a node by using Oracle Solaris Cluster commands.


Table 5-5 Task Map: Administering Disk-Path Monitoring

Task                                                     Instructions
----                                                     ------------
Monitor a disk path.                                     How to Monitor a Disk Path
Unmonitor a disk path.                                   How to Unmonitor a Disk Path
Print the status of faulted disk paths for a node.       How to Print Failed Disk Paths
Monitor disk paths from a file.                          How to Monitor Disk Paths From a File
Enable or disable the automatic rebooting of a node      How to Enable the Automatic Rebooting of a Node When All
when all monitored shared-disk paths fail.               Monitored Shared-Disk Paths Fail and How to Disable the
                                                         Automatic Rebooting of a Node When All Monitored
                                                         Shared-Disk Paths Fail
Resolve an incorrect disk-path status. An incorrect      How to Resolve a Disk-Path Status Error
disk-path status can be reported when the monitored
DID device is unavailable at boot time, and the DID
instance is not uploaded to the DID driver.

The procedures in this section that issue the cldevice command include the disk-path argument. The disk-path argument consists of a node name and a disk name. The node name is optional; if you omit it, the command operates on all nodes by default.
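
For example, both of the following invocations (drawn from the examples later in this section) are valid. The first restricts monitoring to the node schost-1, while the second, with no node specified, applies to every node that has a path to the device.

# cldevice monitor -n schost-1 /dev/did/dsk/d1
# cldevice monitor /dev/did/dsk/d1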

How to Monitor a Disk Path

Perform this task to monitor disk paths in your cluster.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
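
For example, assuming cldev as the short form of the cldevice command, Step 2 below could equivalently be entered as follows.

# cldev monitor -n node disk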

  1. Assume a role that provides solaris.cluster.modify RBAC authorization on any node in the cluster.
  2. Monitor a disk path.
    # cldevice monitor -n node disk
  3. Verify that the disk path is monitored.
    # cldevice status device

Example 5-24 Monitoring a Disk Path on a Single Node

The following example monitors the schost-1:/dev/did/rdsk/d1 disk path from a single node. Only the DPM daemon on the node schost-1 monitors the path to the disk /dev/did/dsk/d1.

# cldevice monitor -n schost-1 /dev/did/dsk/d1
# cldevice status d1

Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             phys-schost-1      Ok

Example 5-25 Monitoring a Disk Path on All Nodes

The following example monitors the schost-1:/dev/did/dsk/d1 disk path from all nodes. DPM starts on all nodes for which /dev/did/dsk/d1 is a valid path.

# cldevice monitor /dev/did/dsk/d1
# cldevice status /dev/did/dsk/d1

Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             phys-schost-1      Ok

Example 5-26 Rereading the Disk Configuration From the CCR

The following example forces the daemon to reread the disk configuration from the CCR and prints the monitored disk paths with status.

# cldevice monitor +
# cldevice status
Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             schost-1           Ok
/dev/did/rdsk/d2             schost-1           Ok
/dev/did/rdsk/d3             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d4             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d5             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d6             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d7             schost-2           Ok
/dev/did/rdsk/d8             schost-2           Ok

How to Unmonitor a Disk Path

Use this procedure to unmonitor a disk path.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization on any node in the cluster.
  2. Determine the state of the disk path to unmonitor.
    # cldevice status device
  3. On each node, unmonitor the appropriate disk paths.
    # cldevice unmonitor -n node disk

Example 5-27 Unmonitoring a Disk Path

The following example unmonitors the schost-2:/dev/did/rdsk/d1 disk path and prints the status of that disk path.

# cldevice unmonitor -n schost-2 /dev/did/rdsk/d1
# cldevice status -n schost-2 /dev/did/rdsk/d1

Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             schost-2           Unmonitored

How to Print Failed Disk Paths

Use the following procedure to print the faulted disk paths for a cluster.

  1. Assume the root role on any node in the cluster.
  2. Print the faulted disk paths throughout the cluster.
    # cldevice status -s fail

Example 5-28 Printing Faulted Disk Paths

The following example prints faulted disk paths for the entire cluster.

# cldevice status -s fail
     
Device Instance               Node              Status
---------------               ----              ------
/dev/did/dsk/d4               phys-schost-1     fail

How to Resolve a Disk-Path Status Error

DPM might not update the status of a failed path when it comes back online if both of the following events occur: a monitored-path failure causes the node to reboot, and the device under the monitored DID path does not come back online until after the rebooted node is back online.

The incorrect disk-path status is reported because the monitored DID device is unavailable at boot time, and therefore the DID instance is not uploaded to the DID driver. When this situation occurs, manually update the DID information.

  1. From one node, update the global-devices namespace.
    # cldevice populate
  2. On each node, verify that command processing has completed before you proceed to the next step.

    The command executes remotely on all nodes, even though it is run from just one node. To determine whether the command has completed processing, run the following command on each node of the cluster. One way to wait for completion is shown in the sketch after this procedure.

    # ps -ef | grep "cldevice populate"
  3. Verify that, within the DPM polling time frame, the status of the faulted disk path is now Ok.
    # cldevice status disk-device
    
    Device Instance               Node                  Status
    ---------------               ----                  ------
    /dev/did/dsk/dN               phys-schost-1         Ok
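
One way to wait for the cldevice populate processing to finish on a node is a small polling loop such as the following. This is a minimal sketch: the five-second interval is arbitrary, and the extra grep -v grep simply filters the grep process itself out of the ps output.

# while ps -ef | grep "cldevice populate" | grep -v grep > /dev/null
> do
>     sleep 5
> done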

How to Monitor Disk Paths From a File

Use the following procedure to monitor or unmonitor disk paths from a file.

To change your cluster configuration by using a file, you must first export the current configuration. This export operation creates an XML file that you can then modify to set the configuration items you are changing. The instructions in this procedure describe this entire process.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.modify RBAC authorization on any node in the cluster.
  2. Export your device configuration to an XML file.
    # cldevice export -o configurationfile
    -o configurationfile

    Specify the file name for your XML file.

  3. Modify the configuration file so that device paths are monitored.

    Find the device paths that you want to monitor, and set the monitored attribute to true.

  4. Monitor the device paths.
    # cldevice monitor -i configurationfile
    -i configurationfile

    Specify the file name of the modified XML file.

  5. Verify that the device path is now monitored.
    # cldevice status

Example 5-29 Monitor Disk Paths From a File

In the following example, the device path between the node phys-schost-2 and device d3 is monitored by using an XML file.

The first step is to export the current cluster configuration.

# cldevice export -o deviceconfig

The deviceconfig XML file shows that the path between phys-schost-2 and d3 is not currently monitored.

<?xml version="1.0"?>
<!DOCTYPE cluster SYSTEM "/usr/cluster/lib/xml/cluster.dtd">
<cluster name="brave_clus">
.
.
.
   <deviceList readonly="true">
    <device name="d3" ctd="c1t8d0">
      <devicePath nodeRef="phys-schost-1" monitored="true"/>
      <devicePath nodeRef="phys-schost-2" monitored="false"/>
    </device>
  </deviceList>
</cluster>

To monitor that path, set the monitored attribute to true, as follows.

<?xml version="1.0"?>
<!DOCTYPE cluster SYSTEM "/usr/cluster/lib/xml/cluster.dtd">
<cluster name="brave_clus">
.
.
.
   <deviceList readonly="true">
    <device name="d3" ctd="c1t8d0">
      <devicePath nodeRef="phys-schost-1" monitored="true"/>
      <devicePath nodeRef="phys-schost-2" monitored="true"/>
    </device>
  </deviceList>
</cluster>

Use the cldevice command to read the file and turn on monitoring.

# cldevice monitor -i deviceconfig

Use the cldevice command to verify that the device is now monitored.

# cldevice status

See Also

For more detail about exporting cluster configuration and using the resulting XML file to set cluster configuration, see the cluster(1CL) and the clconfiguration(5CL) man pages.

How to Enable the Automatic Rebooting of a Node When All Monitored Shared-Disk Paths Fail

When you enable this feature, a node automatically reboots when all monitored shared-disk paths on the node fail, provided that at least one of the disks is accessible from another node in the cluster.

Rebooting the node causes all resource groups and device groups that are mastered on that node to restart on another node.

If all monitored shared-disk paths on a node remain inaccessible after the node automatically reboots, the node does not automatically reboot again. However, if any disk paths become available after the node reboots but then fail, the node automatically reboots again.

When you enable the reboot_on_path_failure property, the states of local-disk paths are not considered when determining if a node reboot is necessary. Only monitored shared disks are affected.

  1. On any node in the cluster, assume a role that provides solaris.cluster.modify RBAC authorization.
  2. For all nodes in the cluster, enable the automatic rebooting of a node when all monitored shared-disk paths to it fail.
    # clnode set -p reboot_on_path_failure=enabled +
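
To confirm the new setting, you can display the property on each node. This sketch assumes the -p option of the clnode show subcommand, which limits the output to the named property.

# clnode show -p reboot_on_path_failure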

How to Disable the Automatic Rebooting of a Node When All Monitored Shared-Disk Paths Fail

When you disable this feature and all monitored shared-disk paths on a node fail, the node does not automatically reboot.

  1. On any node in the cluster, assume a role that provides solaris.cluster.modify RBAC authorization.
  2. For all nodes in the cluster, disable the automatic rebooting of a node when all monitored shared-disk paths to it fail.
    # clnode set -p reboot_on_path_failure=disabled +