Oracle Solaris Cluster System Administration Guide, Oracle Solaris Cluster 4.1
Administering Disk-Path Monitoring
Disk-path monitoring (DPM) administration commands enable you to receive notification of secondary disk-path failure. Use the procedures in this section to perform administrative tasks that are associated with monitoring disk paths. Refer to Chapter 3, Key Concepts for System Administrators and Application Developers, in Oracle Solaris Cluster Concepts Guide for conceptual information about the disk-path monitoring daemon. Refer to the cldevice(1CL) man page for a description of the command options and related commands. For more information about tuning the scdpmd daemon, see the scdpmd.conf(4) man page. See the syslogd(1M) man page for information about the errors that the daemon logs.
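To confirm that the disk-path monitoring daemon is running on a node, or to review the messages that it has logged, you can use standard Oracle Solaris commands. The following sketch assumes the default syslog destination of /var/adm/messages; it is illustrative only and does not replace the man pages cited above.

# ps -ef | grep scdpmd
# grep scdpmd /var/adm/messages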
Note - Disk paths are automatically added to the monitoring list when I/O devices are added to a node by using the cldevice command. Disk paths are also automatically unmonitored when devices are removed from a node by using Oracle Solaris Cluster commands.
Table 5-5 Task Map: Administering Disk-Path Monitoring

Monitor a disk path: cldevice monitor
Unmonitor a disk path: cldevice unmonitor
Print the status of faulted disk paths: cldevice status -s fail
Resolve a disk-path status error: cldevice populate
Monitor or unmonitor disk paths from a file: cldevice export, cldevice monitor -i
Enable or disable the automatic rebooting of a node when all monitored shared-disk paths fail: clnode set -p reboot_on_path_failure
The procedures in this section that issue the cldevice command take a disk-path argument. The disk-path argument consists of a node name and a disk name. The node name is not required and defaults to all if you do not specify it.
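For illustration, the same monitor request can therefore be scoped in three ways. The node schost-1 and device d1 below are placeholders borrowed from the examples later in this section.

Monitor the path from a single node:
# cldevice monitor -n schost-1 /dev/did/dsk/d1

Monitor the path from all nodes that have a valid path to the device:
# cldevice monitor /dev/did/dsk/d1

Monitor all disk paths in the cluster:
# cldevice monitor +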
How to Monitor a Disk Path

Perform this task to monitor disk paths in your cluster.
The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
Monitor a disk path.

# cldevice monitor -n node disk

Verify that the disk path is monitored.

# cldevice status device
Example 5-24 Monitoring a Disk Path on a Single Node
The following example monitors the schost-1:/dev/did/rdsk/d1 disk path from a single node. Only the DPM daemon on the node schost-1 monitors the path to the disk /dev/did/dsk/d1.
# cldevice monitor -n schost-1 /dev/did/dsk/d1
# cldevice status d1

Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             phys-schost-1      Ok
Example 5-25 Monitoring a Disk Path on All Nodes
The following example monitors the schost-1:/dev/did/dsk/d1 disk path from all nodes. DPM starts on all nodes for which /dev/did/dsk/d1 is a valid path.
# cldevice monitor /dev/did/dsk/d1
# cldevice status /dev/did/dsk/d1

Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             phys-schost-1      Ok
Example 5-26 Rereading the Disk Configuration From the CCR
The following example forces the daemon to reread the disk configuration from the CCR and prints the monitored disk paths with status.
# cldevice monitor +
# cldevice status

Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             schost-1           Ok
/dev/did/rdsk/d2             schost-1           Ok
/dev/did/rdsk/d3             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d4             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d5             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d6             schost-1           Ok
                             schost-2           Ok
/dev/did/rdsk/d7             schost-2           Ok
/dev/did/rdsk/d8             schost-2           Ok
How to Unmonitor a Disk Path

Use this procedure to unmonitor a disk path.
The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
Determine the state of the disk path to unmonitor.

# cldevice status device

Unmonitor the disk path.

# cldevice unmonitor -n node disk
Example 5-27 Unmonitoring a Disk Path
The following example unmonitors the schost-2:/dev/did/rdsk/d1 disk path and then prints the status of that disk path.
# cldevice unmonitor -n schost-2 /dev/did/rdsk/d1
# cldevice status -n schost-2 /dev/did/rdsk/d1

Device Instance              Node               Status
---------------              ----               ------
/dev/did/rdsk/d1             schost-2           Unmonitored
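Because the node name defaults to all, omitting the -n option unmonitors the path on every node for which it is valid. This variant is inferred from the argument rules described earlier in this section rather than taken from the documented steps; verify the result with cldevice status.

# cldevice unmonitor /dev/did/rdsk/d1
# cldevice status /dev/did/rdsk/d1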
How to Print Failed Disk Paths

Use the following procedure to print the faulted disk paths for a cluster.
# cldevice status -s fail
Example 5-28 Printing Faulted Disk Paths
The following example prints faulted disk paths for the entire cluster.
# cldevice status -s fail

Device Instance              Node               Status
---------------              ----               ------
/dev/did/dsk/d4              phys-schost-1      fail
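The failed-path query can presumably be narrowed to a single node by combining -s fail with the -n option that appears in the unmonitoring example; confirm in the cldevice(1CL) man page that these options can be combined before relying on this form.

# cldevice status -s fail -n phys-schost-1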
How to Resolve a Disk-Path Status Error

If the following events occur, DPM might not update the status of a failed path when it comes back online:
A monitored-path failure causes a node reboot.
The device under the monitored DID path does not come back online until after the rebooted node is back online.
The incorrect disk-path status is reported because the monitored DID device is unavailable at boot time, and therefore the DID instance is not uploaded to the DID driver. When this situation occurs, manually update the DID information.
# cldevice populate
The command executes remotely on all nodes, even though the command is run from just one node. To determine whether the command has completed processing, run the following command on each node of the cluster.
# ps -ef | grep "cldevice populate"
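To check every node without logging in to each one interactively, you can wrap the check in a loop. This is a minimal sketch that assumes passwordless ssh between cluster nodes; pgrep -f matches the pattern against the full command line.

# for node in $(clnode list); do echo "=== $node ==="; ssh $node 'pgrep -fl "cldevice populate"'; done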
Verify that the status of the device is now Ok.

# cldevice status disk-device

Device Instance              Node               Status
---------------              ----               ------
/dev/did/dsk/dN              phys-schost-1      Ok
How to Monitor Disk Paths From a File

Use the following procedure to monitor or unmonitor disk paths from a file.
To change your cluster configuration by using a file, you must first export the current configuration. This export operation creates an XML file that you can then modify to set the configuration items you are changing. The instructions in this procedure describe this entire process.
The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
Export your device configuration to an XML file.

# cldevice export -o configurationfile

-o configurationfile
Specify the file name for your XML file.

Modify the configuration file: find the device paths that you want to monitor, and set the monitored attribute to true.

Monitor the device paths.

# cldevice monitor -i configurationfile

-i configurationfile
Specify the file name of the modified XML file.

Verify that the device paths are now monitored.

# cldevice status
Example 5-29 Monitoring Disk Paths From a File
In the following example, the device path between the node phys-schost-2 and the device d3 is monitored by using an XML file.
The first step is to export the current cluster configuration.
# cldevice export -o deviceconfig
The deviceconfig XML file shows that the path between phys-schost-2 and d3 is not currently monitored.
<?xml version="1.0"?>
<!DOCTYPE cluster SYSTEM "/usr/cluster/lib/xml/cluster.dtd">
<cluster name="brave_clus">
.
.
.
  <deviceList readonly="true">
    <device name="d3" ctd="c1t8d0">
      <devicePath nodeRef="phys-schost-1" monitored="true"/>
      <devicePath nodeRef="phys-schost-2" monitored="false"/>
    </device>
  </deviceList>
</cluster>
To monitor that path, set the monitored attribute to true, as follows.
<?xml version="1.0"?>
<!DOCTYPE cluster SYSTEM "/usr/cluster/lib/xml/cluster.dtd">
<cluster name="brave_clus">
.
.
.
  <deviceList readonly="true">
    <device name="d3" ctd="c1t8d0">
      <devicePath nodeRef="phys-schost-1" monitored="true"/>
      <devicePath nodeRef="phys-schost-2" monitored="true"/>
    </device>
  </deviceList>
</cluster>
Use the cldevice command to read the file and turn on monitoring.
# cldevice monitor -i deviceconfig
Use the cldevice command to verify that the device is now monitored.
# cldevice status
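If you prefer to script the attribute change instead of editing the exported file by hand, a simple substitution works when each devicePath element is on its own line, as in the export above. The file names below are the ones used in this example; note that this sed command flips every unmonitored path for phys-schost-2, not just the path to d3, and it is offered only as an illustration, not as part of the documented procedure.

# sed '/nodeRef="phys-schost-2"/s/monitored="false"/monitored="true"/' deviceconfig > deviceconfig.new
# cldevice monitor -i deviceconfig.new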
See Also
For more detail about exporting cluster configuration and using the resulting XML file to set cluster configuration, see the cluster(1CL) and the clconfiguration(5CL) man pages.
How to Enable the Automatic Rebooting of a Node When All Monitored Shared-Disk Paths Fail

When you enable this feature, a node automatically reboots, provided that the following conditions are met:
All monitored shared-disk paths on the node fail.
At least one of the disks is accessible from a different node in the cluster.
Rebooting the node restarts all resource groups and device groups that are mastered on that node on another node.
If all monitored shared-disk paths on a node remain inaccessible after the node automatically reboots, the node does not automatically reboot again. However, if any disk paths become available after the node reboots but then fail, the node automatically reboots again.
When you enable the reboot_on_path_failure property, the states of local-disk paths are not considered when determining if a node reboot is necessary. Only monitored shared disks are affected.
For all nodes in the cluster, enable the automatic rebooting of a node when all monitored shared-disk paths to the node fail.

# clnode set -p reboot_on_path_failure=enabled +
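To confirm the setting, display the node properties; clnode show reports the current value of reboot_on_path_failure for each node. The output below is abbreviated and illustrative.

# clnode show
...
  reboot_on_path_failure:                          enabled
...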
How to Disable the Automatic Rebooting of a Node When All Monitored Shared-Disk Paths Fail

When you disable this feature and all monitored shared-disk paths on a node fail, the node does not automatically reboot.
For all nodes in the cluster, disable the automatic rebooting of a node when all monitored shared-disk paths to the node fail.

# clnode set -p reboot_on_path_failure=disabled +