Sun StorageTek MPIO Device Specific Module
The Sun StorageTek Multipath I/O (MPIO) Device Specific Module (DSM) for 6000 and 2500 Series Arrays is available for the following Windows OS versions:
Only one version of the DSM driver can exist on the server because the driver versions are not compatible.
You can use MPIO for all controllers that run array controller firmware version 06.19 or later.
|Note - The array controller firmware is delivered with the Sun StorageTek Common Array Manager (CAM) software. For the latest patches available for your system, check Sun Solve at: http://www.sunsolve.sun.com.|
Microsoft Multipath I/O (MPIO) provides an infrastructure to build highly available solutions for the Windows 2000 operating system (OS), the Windows Server 2003 OS, and the Windows Server 2008 OS. MPIO uses Device Specific Modules (DSMs) to provide I/O routing decisions, error analysis, and failover.
TABLE 1 lists the features provided by the DSM driver.
If multiple paths exist to a single controller and a SCSI-2 reserve/release (R/R) request is received for a volume, the DSM driver selects one path to each controller, called a Reservation Path, and repeats the request on that path. This function is necessary because the controllers cannot accept SCSI-2 R/R requests through multiple paths for a given volume. After the reservation path has been established, subsequent I/O requests for a volume are restricted to that path until a SCSI-2 release command is received. If multiple volumes are mapped to the host, the DSM driver distributes the reservation paths, which spreads the load across multiple paths to the same controller.
The DSM driver can also translate the SCSI-2 R/R commands into SCSI-3 Persistent Reservations (PR). This allows a volume to use all of the available controller paths rather than being restricted to a single Reservation Path as described above. This feature requires the DSM driver to establish a unique “reservation key” for each host. This key is stored in the Registry and is named “S2toS3Key.” If this key is present, translations are performed; otherwise, the Reservation Path (“cloning”) method described above is used.
Clustering for Windows Server 2008 uses SCSI-3 Persistent Reservations natively, so the DSM driver does not perform any of the methods listed for SCSI-2 R/R. Translations still occur if the DSM driver is running in a Windows 2000 or 2003 server-based environment.
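The choice between the reservation methods described above can be summarized in a short Python sketch. It is only an illustration of the documented decision order, not the driver's code: the function name, the version strings, and the returned labels are hypothetical, and the Registry contents are modeled as a plain dictionary.

```python
def reservation_method(windows_version, registry_values):
    """Return a label for how SCSI-2 reserve/release requests are handled.

    windows_version: "2000", "2003", or "2008" (hypothetical labels).
    registry_values: dict standing in for the DSM driver's Registry values.
    """
    if windows_version == "2008":
        # Clustering uses SCSI-3 Persistent Reservations natively,
        # so neither SCSI-2 method applies.
        return "native-scsi3"
    if "S2toS3Key" in registry_values:
        # The reservation key is present: translate SCSI-2 R/R to SCSI-3 PR.
        return "translate-to-scsi3"
    # No key: fall back to a single Reservation Path per controller.
    return "reservation-path"
```

For example, a Windows Server 2003 host without the S2toS3Key value would fall back to the Reservation Path method in this model.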
The MPIO solution relies on the DSM drivers to provide I/O routing, error resolution, and failover resolution. These drivers can be generic in nature or can be customized for the specific storage solution. As a result, multiple DSMs can be installed on a single host, each one potentially managing different storage arrays. Windows allows multiple copies of the same driver to run at the same time; however, the driver’s binary name must be different.
The SynchTimeout parameter determines the I/O timeout for synchronous requests generated by the DSM driver. Examples include the SCSI-2 to SCSI-3 PR translations and Inquiry commands used during device discovery.
If the SynchTimeout value is defined in the Registry key of the DSM driver and the TimeOutValue of the MS Disk Driver is also defined in the Registry, the higher of the two values is used as the SynchTimeout value.
For example, if the SynchTimeout is 120 seconds and the TimeOutValue is 60 seconds, use 120 seconds for the value. If the SynchTimeout is 120 seconds and the TimeOutValue is 180 seconds, use 180 seconds for the value of the synchronous I/O requests for the DSM driver.
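The rule in the two examples above amounts to taking the larger of the two values. A minimal Python sketch, assuming both values are defined and expressed in seconds (the function name is hypothetical):

```python
def effective_synch_timeout(synch_timeout, time_out_value):
    """Return the timeout, in seconds, applied to synchronous DSM requests.

    Both arguments are assumed to be defined; the higher value wins.
    """
    return max(synch_timeout, time_out_value)
```

With the values from the text, `effective_synch_timeout(120, 60)` gives 120 and `effective_synch_timeout(120, 180)` gives 180.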
The following storage arrays are supported by the DSM driver.
For a failover driver to claim ownership over the default Disk.sys driver, you must modify two items during installation:
Installing the failover drivers takes very little time compared with making the required modifications to the device nodes in the Registry and restarting the devices.
Set the hardware and compatible IDs so that their IDs return an identifier known only to the failover driver’s INF configuration file. Use a filter driver that resides on the HBA bus driver. The HBA bus driver intercepts the identification process.
The plug-and-play (PnP) process is asynchronous, and, because it is event driven, no concept of completion exists. As a result, even though an installation has completed, PnP itself might continue through the driver re-initialization process.
This type of behavior can be due to a target device that has all possible support items entered in the Registry. A DSM driver that is looking for a device that is not attached has to wait for an I/O timeout before it moves to the next device. Also, long delays might occur if you restart a host immediately after failover driver installation, because this process has not yet completed.
Multipath DSM drivers are available from the Sun Download Center. Packages are available for the Sun StorageTek 6000 array series (includes Sun Storage 6180, 6580, and 6780 arrays) and Sun StorageTek 2500 array series.
|Caution - MPIO and RDAC cannot coexist on the same server.|
1. To download the MPIO driver, go to:
2. Select the package for the appropriate platform version and array family.
3. Open the installation program.
The installation window appears.
4. Click Next.
The copyright page is displayed.
5. Click Next.
6. Accept the terms of the license agreement, and click Next.
7. Click Install.
8. Select one of these options, and click Done.
The MPIO failover driver contains configuration settings that can modify the behavior of the driver. Any changes to the settings take effect on the next reboot of the host. The default values listed here are the platform-independent settings. Many of these values are overridden by the failover installer for Windows. For Windows, the configuration settings can be found in the Registry under:
where DSM_Driver is the name of the OEM-specific driver. The default driver is named mppdsm.sys.
|Caution - Modifying Registry entries should only be done by experienced users.|
1. To view or modify Registry settings, right-click Start and select Run.
2. Type regedit and click OK.
|Caution - You might lose access to the storage array if you change these settings from their configured values.|
The number of times a selection timeout is retried for an I/O request before the path fails. If another path to the same controller exists, the I/O is retried. If no other path to the same controller exists, a failover takes place. If no valid paths exist to the alternate controller, the I/O fails.
The number of times a command timeout is retried for an I/O request before the path fails. If another path to the same controller exists, the I/O is retried. If another path does not exist, a failover takes place. If no valid paths exist to the alternate controller, the I/O fails.
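The escalation described for both retry counts follows the same order: retry, try another path to the same controller, fail over, then fail the I/O. The following Python sketch models that order only; the function name, argument names, and returned labels are hypothetical and do not correspond to driver settings.

```python
def next_action(retries_used, retry_limit,
                other_paths_to_controller, paths_to_alternate):
    """Return the next step after a selection or command timeout."""
    if retries_used < retry_limit:
        # The retry count is not exhausted, so the path has not failed yet.
        return "retry on current path"
    if other_paths_to_controller > 0:
        # The path failed, but the same controller is reachable another way.
        return "retry on another path to the same controller"
    if paths_to_alternate > 0:
        # No path to the owning controller remains.
        return "fail over to the alternate controller"
    # No valid path to either controller.
    return "fail the I/O"
```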
The timeout, in seconds, for synchronous I/O requests generated internally by the failover driver. Examples of internal requests include those related to rebalancing, path validation, and issuing of failover commands.
When the failover driver receives an I/O request for the first time, the failover driver logs timestamp information for the request. If a request returns an error and the failover driver retries the request, the current time is compared with the original timestamp information. Depending on the error and the amount of time that has elapsed, the request is retried to the current owning controller for the LUN or a failover is performed and the request sent to the alternate controller. This process is known as a "Wait Time."
For the Windows OS, the configuration settings can be found in the Registry under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\<DSM_Driver>, where <DSM_Driver> is the name of the OEM-specific driver. The default driver is named mppdsm.sys. Any changes to the settings take effect on the next reboot of the host.
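As an alternative to browsing with regedit, the settings could be read with a short script. This is a sketch under several assumptions: the service key is named mppdsm (the default driver name without the .sys extension), the values sit directly under that key, and the script runs on the Windows host with sufficient privileges. The function names are hypothetical; `winreg` is the standard Python module for Registry access. The script only reads values and does not modify them.

```python
def dsm_settings_key(driver_name="mppdsm"):
    """Build the Registry path (below HKEY_LOCAL_MACHINE) for a DSM driver."""
    return "System\\CurrentControlSet\\Services\\" + driver_name


def read_dsm_settings(driver_name="mppdsm"):
    """Return all values under the DSM driver key as a dict (Windows only)."""
    import winreg  # imported here so the path helper works on any platform

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                        dsm_settings_key(driver_name)) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # number of values in key
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            settings[name] = data
    return settings
```

Calling `read_dsm_settings()` on a host where the driver is installed would return the configured values, which can then be compared with the documented defaults before any change is considered.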
|Caution - Possible loss of data access. If you change the Wait Time settings from their configured values, you might lose access to the storage array.|
Provides an upper-bound limit, in seconds, for how long an I/O is retried on a controller, regardless of retry status, before a failover is performed. If the limit is also exceeded on the alternate controller, the I/O is attempted again on the original controller. This process continues until the ArrayIoWaitTime limit is reached.
Provides an upper-bound limit, in seconds, for how long an I/O is retried to the array, regardless of which controller the request is sent to. Once this limit is exceeded, the I/O is returned with a failure status.
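The interaction of the two Wait Time limits can be sketched in Python as follows. This is an illustrative model only: the function and argument names are hypothetical, `controller_wait` stands for the per-controller limit and `array_wait` for the ArrayIoWaitTime limit, and the elapsed times are assumed to be derived from the timestamp that the driver logs when it first receives the request.

```python
def wait_time_action(seconds_on_controller, seconds_total,
                     controller_wait, array_wait):
    """Decide what happens to an I/O request that has returned an error."""
    if seconds_total > array_wait:
        # The array-wide limit is exceeded: return the I/O with a failure.
        return "fail"
    if seconds_on_controller > controller_wait:
        # The per-controller limit is exceeded: try the other controller.
        return "failover"
    # Both limits still hold: retry on the current controller.
    return "retry"
```

In this model, a request that keeps failing alternates between the two controllers each time the per-controller limit expires, until the array-wide limit ends the cycle.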
The path congestion detection feature allows the DSM driver to place a path offline based on the I/O latency of a path. You can change the settings for this feature by using the dsmUtil utility. Any settings that you change and save with the SaveSettings option will then be found in the Registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\<DSM_Driver> where <DSM_Driver> is the name of the OEM-specific driver. Any changes to the settings take effect on the next reboot of the host.
A Boolean value that indicates whether the path congestion detection is enabled. If this parameter is not defined or is set to 0x0, the value is false, the path congestion feature is disabled, and all of the other parameters are ignored. If set to 0x1, the path congestion feature is enabled. The allowed values are 0x0 or 0x1.
If CongestionIoCount is 0x0 or not defined, this parameter represents an average response time in seconds allowed for an I/O request. If the value of the CongestionIoCount parameter is nonzero, then this parameter is the absolute time allowed for an I/O request. The allowed values range from 0x1 to 0x10000 (approximately 18 hours).
The number of I/O requests that have exceeded the value of the CongestionResponseTime parameter within the value of the CongestionTimeFrame parameter. The allowed values range from 0x0 to 0x1000 (approximately 4000 requests).
The number of I/O requests that must be sent to a path before the nth request is used in the average response time calculation. For example, if this parameter is set to 100, every 100th request sent to a path is used in the average response time calculation. If this parameter is set to 0x0 or not defined, the path congestion feature is disabled for performance reasons, because every I/O request would otherwise incur a calculation. The allowed values range from 0x1 to 0xFFFFFFFF (approximately 4 billion requests).
A Boolean value that indicates whether the DSM driver will take the last path available to the storage array offline if the congestion thresholds have been exceeded. If this parameter is not defined or is set to 0x0, the value is false. The allowed values are 0x0 or 0x1.
A Boolean value that indicates whether the DSM driver will place a path into an offline state. If this parameter is not defined or set to 0x0, the value is false, and a warning message will be sent to the logs, but the path is not taken offline. If the parameter is set to 0x1, an error message is sent to the logs and the path is taken offline. The allowed values are 0x0 or 0x1.
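Before changing path congestion settings, it can help to check candidate values against the allowed ranges listed above. The following Python sketch does only that. The mapping of ranges to parameter names is partly an assumption: CongestionDetectionEnabled, CongestionResponseTime, and CongestionIoCount follow from the descriptions, while pairing the 0x1 to 0xFFFFFFFF range with CongestionSamplingInterval is inferred from the parameter names mentioned elsewhere in this document. The helper name is hypothetical.

```python
# Allowed (minimum, maximum) values, taken from the descriptions above.
ALLOWED_RANGES = {
    "CongestionDetectionEnabled": (0x0, 0x1),
    "CongestionResponseTime": (0x1, 0x10000),
    "CongestionIoCount": (0x0, 0x1000),
    "CongestionSamplingInterval": (0x1, 0xFFFFFFFF),
}


def out_of_range(settings):
    """Return the sorted names of settings whose values fall outside range.

    Settings without a known range are ignored rather than rejected.
    """
    bad = []
    for name, value in settings.items():
        if name in ALLOWED_RANGES:
            low, high = ALLOWED_RANGES[name]
            if not low <= value <= high:
                bad.append(name)
    return sorted(bad)
```

An empty result means that every checked value is within its documented range; it does not mean that the combination of values is appropriate for a given array.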
The time period during which failed or lost paths can be recovered before a failover is performed to hold the alternate controller in reset. The value is set in seconds. The allowed range of values is 0x3C (60 seconds) to 0x12C (5 minutes).
The maximum number of failover events allowed during the value set in the FailoverEventThresholdTimeFrame parameter. If the number of failover events exceeds the value of this parameter, a failover is performed, holding the failing controller in reset. The allowed range of values is 0x1 to 0x0A.
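One way to picture an event threshold that applies within a time frame is a sliding window of event timestamps. The Python sketch below is a plausible reading only: how the driver actually counts failover events is not described here, and the class and method names are hypothetical. Only the 0x1 to 0x0A range for the event count is taken from the text.

```python
from collections import deque


class FailoverEventWindow:
    """Count failover events that fall inside a sliding time frame."""

    def __init__(self, time_frame_s, max_events):
        if not 0x1 <= max_events <= 0x0A:
            raise ValueError("max_events must be between 0x1 and 0x0A")
        self.time_frame_s = time_frame_s
        self.max_events = max_events
        self._events = deque()

    def record(self, now_s):
        """Record an event at time now_s; return True if the limit is exceeded."""
        self._events.append(now_s)
        # Drop events that are older than the time frame.
        while now_s - self._events[0] > self.time_frame_s:
            self._events.popleft()
        return len(self._events) > self.max_events
```

With a 60-second frame and a limit of 2, a third event inside the same 60 seconds would exceed the threshold, while an event that arrives after the earlier ones have aged out would not.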
For the best performance of a redundant controller system, divide I/O activity between the two RAID controllers. You can use the Sun StorageTek Common Array Manager graphical user interface (GUI) or the command line interface (CLI).
To divide I/O activity between two RAID controllers in the management software, perform one of these steps:
You can change the preferred path setting for a volume or a set of volumes online without stopping the applications. The driver uses the new preferred path immediately. Refer to the CAM online help for more details about changing ownership.
An event and an associated alert notification are delivered when a volume is not on its preferred path.
When installing the DSM driver, you see the message "1633 This installation package is not supported by this processor type."
The installers are processor dependent. The appropriate file will be of the form: SMIA-<processor chip>-<version>.exe. The codes are:
WS32 = All 32-bit processors
WS64 = Intel Itanium 64-bit processors only
WSX64 = Intel XEON 64-bit /AMD 64-bit (Opteron) processors
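To avoid the 1633 error, the installer code can be matched to the host processor before downloading. The following Python sketch maps an architecture string to one of the three codes above and builds a file name of the documented form. The architecture strings are assumptions about what `platform.machine()` typically reports, the function names are hypothetical, and the version is whatever the download page lists.

```python
def installer_code(machine):
    """Map an architecture string to the installer processor code."""
    name = machine.lower()
    if name == "ia64":
        return "WS64"    # Intel Itanium 64-bit
    if name in ("amd64", "x86_64"):
        return "WSX64"   # Intel Xeon 64-bit / AMD 64-bit (Opteron)
    if name in ("x86", "i386", "i686"):
        return "WS32"    # all 32-bit processors
    raise ValueError("unrecognized architecture: " + machine)


def installer_name(machine, version):
    """Build a file name of the form SMIA-<processor chip>-<version>.exe."""
    return "SMIA-{}-{}.exe".format(installer_code(machine), version)
```

On the host itself, `installer_code(platform.machine())` (after `import platform`) would return the code to look for in the package list.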
Disk devices or HBAs show a yellow exclamation point.
When you use Windows OS Device Manager, you might observe that a disk device or an HBA icon has a yellow exclamation point on it. If new volumes have been mapped to the host, the exclamation point might appear on the icon for a few seconds. This action occurs because the Plug and Play Manager is reconfiguring the device, and, during this time, the device or the HBA might not be used. If the exclamation point stays for more than one minute, a configuration error has occurred.
Disk devices or HBAs show a red X.
When you use Windows OS Device Manager, you might notice that a disk device icon or an HBA icon has a red X on it. This X indicates that the device has been disabled. A disabled device cannot be used or communicated with until it is re-enabled. If the disabled device is an adapter, any disk devices that were connected to that adapter are removed from Device Manager.
Cannot see any volumes.
If the host has not detected the volumes, an HBA problem or a controller problem has occurred. Make sure that the HBAs are logging into the switch or the controller. If they are not logging in, the problem is probably HBA related.
If the HBAs have logged into the controller, a controller issue might be the problem.
Perform these steps:
1. Make sure that all fiber cables are seated correctly.
2. Make sure that all gigabit interface connectors (GBICs) are seated correctly.
3. Determine the HBA BIOS and driver versions that the system uses, and make sure that the HBA BIOS and driver versions are correct.
4. Make sure that your host-to-volume mappings are correctly mapped in CAM. Do not use any HBA mapping tools.
5. Use WinObj to determine if the host has detected the volumes.
How do I know if a host has detected my volumes?
Use WinObj to determine if the host has seen the volumes.
1. If the host does not see the volumes, an HBA problem or a controller problem has occurred.
2. Make sure that the HBAs log into the switch or the controller. If they are not logging correctly, the problem is probably HBA related.
3. If the HBAs have logged into the controller, the problem might be a controller issue.
When I boot my system, I get a "Registry Corrupted" message.
Refer to the Microsoft Knowledge Base article 277222 at
Registry limitations can result in devices and paths that are not recognizable by the host OS and the failover driver.
My controller failover test does not fail over.
Make sure that you have looked through the rest of this document for the problem. If you think that the problem is still DSM related, contact a Sun Customer Support representative.
After I install MPIO, why does my system take a long time to start?
You might still experience long start times after you install MPIO because the Windows OS is completing its configuration for each device.
For example, you install MPIO on a host with no storage attached, and you restart the host. Before the Windows OS actually starts, you plug in a cable to a storage array with 32 volumes. In the start-up process, Plug and Play detects the configuration change and starts to process it. After the configuration change has completed, subsequent restarts do not experience any delays unless additional configuration changes are detected. The same process can occur even if the host has already started.
What host type must I use for MPIO/DSM solutions?
If you use Microsoft Cluster Server, select a host type of the Windows 2000/Server 2003 clustered OS. If you do not use Microsoft Cluster Server, select a host type of the Windows 2000/Server 2003 non-clustered OS. You set the host type using the Sun StorageTek Common Array Manager (CAM). Select the array you want to configure, go to Administration > Details, and select the appropriate default host type setting.
|Note - The Windows 2000/Server 2003 host type settings can also be used for Windows 2008 platforms.|
When I install MPIO on the Windows 2000 OS, the installation fails.
If you have recently installed Service Pack 4 or Critical Update QFE 813044 on a host, you must install a HotFix (KB822831) to fix a known problem with these updates. After the HotFix has been installed, you can try to install or update MPIO again. To determine whether Service Pack 4 or the QFE update is installed on your system, go to the Control Panel on the Start menu, and then select Add/Remove Programs.
How can I tell if MPIO is installed?
Perform these steps:
1. Go to the Control Panel on the Start menu, and double-click Administrative Tools.
2. Select Computer Management > Device Manager > SCSI and RAID controllers.
3. Look for Multi-Path Support. If it is present, MPIO is installed.
How can I tell if the DSM driver is installed?
Perform these steps:
1. Go to the Control Panel on the Start menu, and then double-click Administrative Tools.
2. Select Computer Management > Device Manager > System Devices.
3. Look for the installed DSM. The name ends with the text “Device-Specific Module for Multi-Path.”
If it is present, DSM is installed.
Why is the path down at the time of registration?
As part of the SCSI reservation handling, a path might be down at the time that the driver tries to register all paths to a volume. If this is the case, the driver registers all working paths. The failback task performs the registration for the down path after it is working again.
What should I do if I receive this message? "Warning: Changing the storage array name can cause host applications to lose access to the storage array if the host is running certain path failover drivers. If any of your hosts are running path failover drivers, please update the storage array name in your path failover driver’s configuration file before rebooting the host machine to insure uninterrupted access to the storage array. Refer to your path failover driver documentation for more details."
In the Windows OS, you do not need to update files. The information is dynamically created only when the storage array is found initially. Use one of these two options to correct this behavior:
With the failover driver, two cases determine if a path has failed:
An entry is made in the OS system log that shows that the failover driver has detected a path failure. CAM does not generate an alarm because no internal problems exist for the array.
CAM generates the “Volume Not on Preferred Path” alarm for all volumes affected by this scenario. If the array administrator has configured notifications in CAM, the administrator will receive email from CAM or a configured SNMP server. You also have the option of opening a service request using the Auto Service Request (ASR) feature of CAM. The resultant message and alarm will provide information about the fault, along with possible recovery instructions.
The failover driver has five error levels for messages that are logged to the Windows System Event log:
TABLE 4 lists the possible Windows OS fatal driver errors.
The following items are examples of failover driver controller events and path failover events:
You can use WinObj to view the Object Manager namespace that is maintained by the operating system. Every Windows OS driver that creates objects in the system can associate a name with the object that can be viewed from WinObj. With WinObj, you can view the volumes and paths that the HBAs have identified. You can also view what the failover driver identifies from a storage array.
For more information about WinObj, see: http://www.microsoft.com/technet/sysinternals/SystemInformation/WinObj.mspx
The directory structures for the DSM driver include these paths:
A named device object that represents a drive. The <adapter> value represents the HBA vendor. For QLogic, this value is based on the HBA model number (for example, ql2300). The <port> value, the <path> value, and the <target> value represent the location of the volume on the HBA.
With this information, you can reach these conclusions:
Device Manager is part of the Windows operating system. To view information about the driver:
1. Select Control Panel from the Start menu.
2. Select Administrative Tools > Computer Management > Device Manager.
The Device Manager tree for MPIO bases the volume names on the vendor information and product ID information of the underlying physical device, along with the text Multi-Path Disk Device.
3. Scroll down to System Devices to view information about the DSM driver itself.
The Disk Drives section shows both the drives identified by the HBA drivers and the volumes created by MPIO.
4. Select one of the MPIO volumes, and right-click it.
5. Select Properties to show the Multi-Path Disk Device Properties window.
This properties window shows if the device is working correctly.
6. Select the Driver tab to view the driver information.
The dsmUtil utility is a general purpose command-line driven utility that works only with MPIO-based DSM solutions. The utility is used primarily as a way to instruct the DSM driver to perform various maintenance tasks, but the utility can also serve as a troubleshooting tool when necessary.
To use the dsmUtil utility, type this command, and press Enter.
dsmUtil [[-a [target_id]] [-c array_name | missing ] [-d debug_level] [-e error_level] [-g virtual_target_id] [-o feature_action_name[=value][, SaveSettings]]
[-M] [-P [GetMpioParameters | MpioParameter=value | ...]] [-R] [-s "failback" | "avt" | "busscan" | "forcerebalance"] [-w target_wwn, controller_index]
|Note - The quotation marks must surround the parameters.|
Typing dsmUtil without any parameters shows the usage information.
Shows a summary of all storage arrays seen by the DSM driver. The summary displays the target_id, storage array WWID, and storage array name. If target_id is specified, DSM point-in-time state information appears for the storage array. On Unix OSes, the virtual HBA specifies unique target IDs for each storage array. The Windows MPIO virtual HBA driver does not use target IDs. The parameter for this option can be viewed as an offset into the DSM driver information structures, with each offset representing a different storage array.
Clears the WWN file entries. This file is located in the Program Files\DSMDrivers\mppdsm\WWN_FILES directory with the extension .wwn. If the array_name keyword is specified, the WWN file for the specific storage array is deleted. If the missing keyword is used, all WWN files for previously attached storage arrays are deleted. If neither keyword is used, all of the WWN files, for both currently attached and previously attached storage arrays are deleted.
Sets the current debug reporting level. This option only works if the RDAC driver has been compiled with debugging enabled. Debug reporting is comprised of two segments. The first segment refers to a specific area of functionality, and the second segment refers to the level of reporting within that area. The debug_level is one of these hexadecimal numbers:
Troubleshoots a feature or changes a configuration setting. Without the SaveSettings keyword, the changes affect only the in-memory state of the variable. The SaveSettings keyword changes both the in-memory state and the persistent state. Some example commands are:
Extended Path Recovery Options--For a description of this feature, see Windows DSM Configuration Settings.
dsmUtil -o ControllerPathRecoveryTimeFrame=70, SaveSettings--Sets the time during which the path can be recovered before a failover occurs to 70 seconds. Saves the setting in the Registry so it persists after host reboots.
Path Congestion Detection Options--For a description of this feature, see Windows DSM Configuration Settings.
dsmUtil -o CongestionDetectionEnabled=0x1--Enables the path congestion feature. Before it can be enabled, the CongestionResponseTime parameter, the CongestionTimeFrame parameter, and the CongestionSamplingInterval parameter must be set to valid values.
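When the same -o settings are applied to several hosts, the command lines can be generated rather than typed. The Python sketch below only assembles command strings in the form shown in the examples above; it does not run dsmUtil. The function name is hypothetical.

```python
def feature_command(name, value, save=False):
    """Build a dsmUtil -o command line for one parameter.

    With save=True, the SaveSettings keyword is appended so that the
    setting persists in the Registry after host reboots.
    """
    command = "dsmUtil -o {}={}".format(name, value)
    if save:
        command += ", SaveSettings"
    return command
```

For instance, `feature_command("ControllerPathRecoveryTimeFrame", 70, save=True)` reproduces the first example command above.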
Manually initiates one of the DSM driver’s scan tasks. A failback scan causes the DSM driver to reattempt communications with any failed controllers. An avt scan causes the DSM driver to check whether AVT has been enabled or disabled for an entire storage array. A busscan scan causes the DSM driver to go through its unconfigured devices list to see if any of them have become configured. A forcerebalance scan causes the DSM driver to move storage array volumes to their preferred controllers and ignores the value of the DisableLunRebalance configuration parameter of the DSM driver.
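The four scan tasks can be summarized in a small lookup that also builds the matching -s command line, with the quotation marks that the usage note requires. This Python sketch is illustrative; the function name and the one-line summaries are not part of the utility, and the sketch does not execute dsmUtil.

```python
# Scan task names accepted by -s, with a one-line summary of each.
SCAN_TASKS = {
    "failback": "reattempt communications with failed controllers",
    "avt": "check whether AVT is enabled or disabled for an array",
    "busscan": "recheck the unconfigured devices list",
    "forcerebalance": "move volumes to their preferred controllers",
}


def scan_command(task):
    """Build the dsmUtil -s command line for a known scan task."""
    if task not in SCAN_TASKS:
        raise ValueError("unknown scan task: " + task)
    # The task name is wrapped in quotation marks, as the usage note requires.
    return 'dsmUtil -s "{}"'.format(task)
```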