CHAPTER 7

Disk Control and Monitor Utility (DCMU) for SLES 10

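This chapter describes how to use the Disk Control and Monitor Utility (DCMU) on a SUSE Linux Enterprise Server 10 (SLES 10) 64-bit operating system. It gives an overview of the utility, explains how to install and uninstall it, describes the diskmond and cfgdisk commands, and shows how to view the system and service processor logs.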


Overview of the Disk Control and Monitor Utility for SLES 10

The Disk Control and Monitor Utility (DCMU) controls and monitors all 48 disk drives on the Sun Fire X4500 server and provides the following features:


DCMU Installation Procedure

To use the Disk Control and Monitor Utility (DCMU), you must first install the application by performing the following steps:

Installing DCMU

The installation of DCMU consists of one step because the package is in RPM format. The DCMU package comes with two RPM files: one is the source RPM and the other is the binary RPM.


To Install DCMU

Enter the following command:

# rpm -ivh dcmu-1.3-7.x86_64.rpm 

The following files are installed as components of the DCMU installation:

IPMI Service Must be Running to Use DCMU Utilities

The initial installation of the DCMU components prepares the system for running the DCMU utilities described in this chapter. However, since the DCMU utilities also require that the IPMI service is running, you have two options before you can start using the DCMU utilities: manually start the IPMI service, or reboot the server (which automatically starts diskmond and IPMI).

If rebooting the server after the initial DCMU installation is not possible, and you wish to run DCMU utilities, you must first start the IPMI service by entering the following command:

# service ipmi start


Note - After the initial installation of DCMU, rebooting the server starts both IPMI and diskmond.
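
The check-then-start sequence described above can be sketched as a small shell function. This is a minimal sketch and not part of DCMU: the function name ensure_ipmi is hypothetical, and the sketch assumes that the ipmi init script supports a status action that exits with 0 when the service is running. That is the usual init script convention, but it is not stated in this chapter.

```shell
# Start the IPMI service only if it is not already running.
# ASSUMPTION: 'service ipmi status' exits with 0 when the service is running.
# ensure_ipmi is a hypothetical helper name, not a DCMU command.
ensure_ipmi() {
    if service ipmi status >/dev/null 2>&1; then
        echo "ipmi is already running"
    else
        # Same command as shown in this section.
        service ipmi start
    fi
}
```

Call ensure_ipmi as root before running cfgdisk or other DCMU utilities on a server that has not been rebooted since the installation.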


Uninstalling DCMU

To uninstall DCMU, perform the following procedure.


To Uninstall DCMU

Enter the following command:

# rpm -e dcmu-1.3-7


diskmond Command

The Disk Control and Monitor Utility (DCMU) for SLES 10 has one primary utility, diskmond. diskmond is started at boot time with a default polling interval of 60 minutes. It updates the FRU (Field Replaceable Unit), SDR (Sensor Data Record), SEL (System Event Log), and service processor logs.

diskmond spawns one thread to monitor hotplug events and another thread to monitor pending drive failures, and it reports both kinds of events to the service processor (SP). diskmond performs the following functions:

diskmond Command Options

Use the diskmond command with the options shown in TABLE 7-1. The following options are supported for the functions shown:


TABLE 7-1 diskmond Command Options

Option        Description
-h            Displays help information
-V            Displays utility version information
-D            Displays disk drive information
-t minutes    Displays polling interval information (in minutes) in the syslog



Examples Using the diskmond Command

This section contains examples of common diskmond commands issued from the command line. For more information and options, refer to the diskmond man page.

Starting diskmond From the Command Line

To start diskmond, enter the following command:

# service diskmond start

Stopping diskmond From the Command Line

To stop diskmond, enter the following command:

# service diskmond stop

Finding the Status of diskmond From the Command Line

To obtain the status of diskmond, enter the following command:

# service diskmond status


cfgdisk Command

cfgdisk queries and reports the status of all 48 disk drives in the Sun Fire X4500 server. It also lets you connect disk drives to the OS, disconnect them from the OS, and monitor the disks connected to the server.

Use the cfgdisk command to connect, disconnect, and determine disk drive status by using the parameters shown in TABLE 7-2. The following options are supported for the functions shown:


TABLE 7-2 cfgdisk Command Options

Option        Description
-h            Displays help information
-V            Displays utility version information
-o            Connects and disconnects disk drive(s)
-d            Displays disk drive information



Examples Using the cfgdisk Command

This section contains examples of common cfgdisk commands issued from the command line. For more information and options, refer to the cfgdisk man page.

Displaying Disk, Device Nodes, Slots and Status

The following command displays a map of all disk drives:

# cfgdisk

Here is an example of cfgdisk command output listing physical slot number, logical name, and status information:


CODE EXAMPLE 7-1 cfgdisk Command Output
Device        Slot Number  Device Node                      Status
sata0/0       10           /dev/sda                         Connected
sata0/1       22           /dev/sdl                         Connected
sata0/2       34           /dev/sdx                         Connected
sata0/3       46           /dev/sdam                        Connected
sata0/4       11           /dev/sde                         Connected
sata0/5       23           /dev/sdn                         Connected
sata1/0        8           /dev/sdi                         Connected
sata1/1       20           /dev/sdj                         Connected
sata1/2       32           /dev/sdv                         Connected
sata1/3       44           /dev/sdak                        Connected
sata1/4        9           /dev/sdm                         Connected
sata1/5       21           /dev/sdk                         Connected
sata1/7       45           /dev/sdal                        Connected
sata2/1       14           /dev/sdd                         Connected
sata2/2       26           /dev/sdr                         Connected
sata2/6       27           /dev/sds                         Connected
sata2/7       39           /dev/sdae                        Connected
sata3/0        0           /dev/sdy                         Connected
sata3/1       12           /dev/sdb                         Connected
sata3/2       24           /dev/sdo                         Connected
sata3/3       36           /dev/sdaa                        Connected
sata3/4        1           /dev/sdac                        Connected
sata3/5       13           /dev/sdc                         Connected
sata4/0        6                                            Disconnected or not present
sata4/1       18                                            Disconnected or not present
sata4/3       42           /dev/sdaf                        Connected
sata4/4        7                                            Disconnected or not present
sata4/5       19           /dev/sdg                         Connected
sata4/6       31                                            Disconnected or not present
sata4/7       43           /dev/sdag                        Connected
sata5/0        4           /dev/sdaj                        Connected
sata5/1       16           /dev/sdh                         Connected
sata5/2       28           /dev/sdt                         Connected
sata5/4        5                                            Disconnected or not present
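
On a server with 48 drives, it can help to filter this map down to the slots that need attention. The following sketch assumes that cfgdisk output has the column layout shown in CODE EXAMPLE 7-1, with the device name in the first column, the slot number in the second, and the status at the end of the line. The function name list_unconnected is hypothetical and not part of DCMU.

```shell
# Print "device slot" for every drive whose status is not "Connected".
# ASSUMPTION: input uses the column layout of CODE EXAMPLE 7-1.
# list_unconnected is a hypothetical helper name.
list_unconnected() {
    # Keep only rows that start with a sataN/M device name, then
    # report those whose last field is anything other than "Connected".
    awk '$1 ~ /^sata[0-9]+\/[0-9]+$/ && $NF != "Connected" { print $1, $2 }'
}
```

The function reads the map on standard input, so it would typically be used as cfgdisk | list_unconnected.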
 

Disconnecting a Disk Using cfgdisk

Use the cfgdisk command to disconnect a disk before you physically remove it from the running system (a hot plug event). The following command shows an example of how to use cfgdisk to disconnect a disk drive.

# cfgdisk -o disconnect -d sata5/1

The command returns the following prompts; enter y at both:


Are you sure (y/n)? y
Are you sure sata5/1 device is not in use(y/n)? y
Device sata5/1 has been successfully disconnected
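
Before answering y to the second prompt, you may want to confirm that the device really is not in use. The following sketch shows one partial check: it looks up the device node that cfgdisk reports for a device and tests whether that node, or a partition on it, appears in a mount table. Both function names are hypothetical, the column layout of CODE EXAMPLE 7-1 is assumed, and a drive that is not mounted can still be in use (for example, as swap or as a member of a RAID or LVM volume), so treat a negative result as a hint rather than proof.

```shell
# Print the device node (for example /dev/sdh) that cfgdisk reports
# for a connected device. Reads cfgdisk output on standard input.
# $1: device name as shown by cfgdisk, for example sata5/1
# ASSUMPTION: input uses the column layout of CODE EXAMPLE 7-1.
node_for_device() {
    awk -v dev="$1" '$1 == dev && $NF == "Connected" { print $3 }'
}

# Succeed if the node, or a numbered partition on it, is listed in a
# mount table. $1: device node; $2: mount table (default /proc/mounts)
node_is_mounted() {
    awk -v node="$1" '
        # Match /dev/sdh and /dev/sdh1, but not /dev/sdha1.
        index($1, node) == 1 && substr($1, length(node) + 1) ~ /^[0-9]*$/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}
```

For example, node=$(cfgdisk | node_for_device sata5/1) followed by node_is_mounted "$node" reports, through its exit status, whether a file system on that drive is currently mounted.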

Connecting a Disk Using cfgdisk

After you physically add a disk to the running system (a hot plug event), use the cfgdisk command to connect it. The following command shows an example of how to use cfgdisk to connect a disk drive.

# cfgdisk -o connect -d sata5/1

The command returns the following:


Command has been issued to connect sata5/1 device, it may take few seconds to connect sata5/1,check status by re-running 'cfgdisk' command.
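
Because the connect request completes asynchronously, a script can re-run cfgdisk until the device shows as connected instead of checking by hand. The following is a minimal sketch under stated assumptions: the function name wait_connected and the POLL_SECONDS variable are hypothetical, the retry count and delay are arbitrary defaults, and the status is read from the last column as in CODE EXAMPLE 7-1.

```shell
# Re-run cfgdisk until a device reports "Connected" or the tries run out.
# $1: device name, for example sata5/1
# $2: maximum number of tries (default 10, an arbitrary choice)
# POLL_SECONDS: delay between tries (default 2, an arbitrary choice)
# ASSUMPTION: cfgdisk output uses the column layout of CODE EXAMPLE 7-1.
wait_connected() {
    dev=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # awk exits 0 only if a row for this device ends in "Connected".
        if cfgdisk | awk -v dev="$dev" \
            '$1 == dev && $NF == "Connected" { ok = 1 } END { exit !ok }'; then
            echo "$dev connected"
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_SECONDS:-2}"
    done
    echo "$dev not connected" >&2
    return 1
}
```

A typical sequence would be cfgdisk -o connect -d sata5/1 followed by wait_connected sata5/1.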

Displaying cfgdisk Help Information

The following command shows how to use the cfgdisk command to display help information:

# cfgdisk -h

For additional information about cfgdisk or diskmond, refer to the man pages.


Viewing System and Service Processor Logs

As described above, DCMU monitors hotplug events, pending drive failures, and controlled connect/disconnect events, and logs these events in syslog and, more importantly, in the service processor logs (SDR, FRU, SEL). You can access these logs individually for specific information to aid in the administration or troubleshooting of the disk array. This section describes how to view individual log file information from the command line.

Viewing the SDR Log

The following commands show how to view the SDR log file, either at the server:

# ipmitool -I open sdr elist

or over the network:

# ipmitool -I lan -H SP-IP -U root -P SP-password sdr elist

Where SP-IP represents the IP address of the service processor and SP-password represents the password for the service processor.

Viewing the FRU Log

The following commands show how to view the FRU log file, either at the server:

# ipmitool -I open fru

or over the network:

# ipmitool -I lan -H SP-IP -U root -P SP-password fru

Where SP-IP represents the IP address of the service processor and SP-password represents the password for the service processor.



Note - When viewing the FRU log of a server running Linux, hard disk drive FRU information stored in the Service Processor FRU log may display a Product Name attribute. This attribute is meaningless, and should be ignored. Here’s an example of what you might see when viewing logged FRU data (via the ipmitool command or the server’s management tool) if this erroneous attribute were present:

FRU Device Description : hdd40.fru (ID 58)
Product Manufacturer : HITACHI
Product Name : 232VDDF12872G-40 <-- Ignore this line
Product Part Number : HDS7225SBSUN250G
Product Version : V44OA81A
Product Serial : VDK41BT4CAD0GE


Viewing the SEL Log

The following commands show how to view the SEL log file, either at the server:

# ipmitool -I open sel elist

or over the network:

# ipmitool -I lan -H SP-IP -U root -P SP-password sel elist

Where SP-IP represents the IP address of the service processor and SP-password represents the password for the service processor.
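
The SDR, FRU, and SEL commands above differ only in the ipmitool subcommand and in whether the local (open) or network (lan) interface is used. The following sketch captures that pattern in one place. The function name sp_log_cmd is hypothetical. It only prints the command line for review rather than running it, and it uses no ipmitool options beyond those shown in this section.

```shell
# Print the ipmitool command line for viewing a service processor log.
# $1: log to view: sdr, fru, or sel
# $2: (optional) SP IP address; if given, the lan interface is used
# $3: SP password, required when $2 is given
# sp_log_cmd is a hypothetical helper name.
sp_log_cmd() {
    case $1 in
        sdr) sub="sdr elist" ;;
        fru) sub="fru" ;;
        sel) sub="sel elist" ;;
        *)   echo "unknown log: $1" >&2; return 1 ;;
    esac
    if [ -n "$2" ]; then
        # Over the network, as in the examples above.
        echo "ipmitool -I lan -H $2 -U root -P $3 $sub"
    else
        # At the server.
        echo "ipmitool -I open $sub"
    fi
}
```

For example, sp_log_cmd sel prints the local SEL command, and sp_log_cmd fru SP-IP SP-password prints the network form of the FRU command.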

Viewing the System Log

All events and error information from DCMU are logged in syslog (default: /var/log/messages). These include hard drive hotplug events, drive disconnect and connect events, and drive fault polling events.
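
To pick DCMU entries out of a busy system log, you can filter on the component names. The following sketch is built on an assumption that this chapter does not confirm: that DCMU log lines contain one of the strings diskmond, cfgdisk, or dcmu. Check a few real entries on your system and adjust the pattern if the messages are tagged differently. The function name dcmu_log_lines is hypothetical.

```shell
# Print syslog lines that mention a DCMU component.
# $1: (optional) log file; defaults to /var/log/messages
# ASSUMPTION: DCMU messages contain "diskmond", "cfgdisk", or "dcmu".
# dcmu_log_lines is a hypothetical helper name.
dcmu_log_lines() {
    grep -i -E 'diskmond|cfgdisk|dcmu' "${1:-/var/log/messages}"
}
```

Run dcmu_log_lines with no argument to search the default log, or pass the path of a rotated log file as the first argument.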