Sun Storage 2500-M2 Arrays Hardware Release Notes, October 2013 Update

This document contains important release information about Oracle’s Sun Storage 2500-M2 arrays managed by Sun Storage Common Array Manager (CAM), Version 6.9.0. Read this document so that you are aware of issues or requirements that can affect the installation and operation of the array.

The release notes consist of the following sections:


What’s New in This Release



Note - It is recommended that you avoid using disk drives with different spindle speeds in the same volume group. Doing so will impact system performance.


Platform and Firmware Patch IDs

Refer to the latest Sun Storage Common Array Manager release notes for the most recent patch versions.


TABLE 1 Platform and Firmware Patch IDs

Platform patch   Operating System   Firmware patch   Operating System
147416-xx[1]     Solaris SPARC      147660-xx        Solaris
147417-xx        Windows            147661-xx        Windows
147418-xx        Linux              147662-xx        Linux
147419-xx        Solaris X86


Upgrading Controller Firmware

Before you perform an online controller firmware upgrade, read the article Recommended Setting for the "fcp_offline_delay" Variable When Upgrading a Sun Storage 6000, 2500 or 2500-M2 Controller Firmware (Doc ID 1569976.1) located on MOS. This article describes how to modify fibre channel timeout values for Solaris SPARC and x86 hosts.


Product Overview

The Sun Storage 2500-M2 arrays are a family of storage products that provide high-capacity, high-reliability storage in a compact configuration. The controller tray, with two controller modules, provides the interface between a data host and the disk drives. Two array models and one expansion tray are offered:

The Sun Storage 2500-M2 arrays are modular and rack-mountable in industry-standard cabinets. The arrays are scalable from a single controller tray configuration to a maximum configuration of one controller tray and seven expansion trays. The maximum configuration creates a storage array with a total of 96 drives connected to 2530-M2 or 2540-M2 controllers or a total of 192 drives connected to 4GB 2540-M2 controllers (available as an upgrade or with new 2540-M2 controllers).

Use the latest version of Sun Storage Common Array Manager to manage the arrays. See About the Management Software for more information.


About the Management Software

Oracle’s Sun Storage Common Array Manager (CAM) software is a key component for the initial configuration, operation, and monitoring of Sun Storage 2500-M2 arrays hardware. It is installed on a management host cabled to the array via out-of-band Ethernet. Note: In-band management is also supported.

To download CAM, follow the procedure in the section Downloading Patches and Updates. Then, review the latest Sun Storage Common Array Manager Quick Start Guide and Sun Storage Common Array Manager Installation and Setup Guide to begin installation. CAM documentation can be found here:

http://www.oracle.com/technetwork/documentation/disk-device-194280.html


Downloading Patches and Updates

Download the latest platform and firmware patch (see TABLE 1) from My Oracle Support (MOS).

For detailed patch download steps, see the Knowledge article 1296274.1 available on MOS.



Note - Each array should be managed by one CAM management host only. Installing the management software on more than one host to manage the same array can cause discrepancies in the information reported by CAM.



System Requirements

The software and hardware products that have been tested and qualified to work with Sun Storage 2500-M2 arrays are described in the following sections. Sun Storage 2500-M2 arrays require Sun Storage Common Array Manager, Version 6.9.0 (or higher) software.

Firmware Requirements

The Sun Storage 2500-M2 arrays firmware version 07.84.47.10 is installed on the array controllers prior to shipment and is also delivered with the latest platform and firmware patches for Sun Storage Common Array Manager (CAM), Version 6.9.0.

Firmware is bundled with the CAM software download package. To download CAM, follow the procedure in Downloading Patches and Updates.

Supported Disk Drives and Tray Capacity

See the Sun System Handbook for the latest disk drive information:

https://support.oracle.com/handbook_partner/Systems/2530_M2/2530_M2.html

https://support.oracle.com/handbook_partner/Systems/2540_M2/2540_M2.html

Array Expansion Module Support

The Sun Storage 2530-M2 and 2540-M2 arrays can be expanded by adding Sun Storage 2501-M2 array expansion trays. To add capacity to an array, refer to the following Service Advisor procedures:



Caution - To add trays with existing stored data, contact My Oracle Support for assistance to avoid data loss.



TABLE 1 IOM Code for Sun Storage 2501-M2 Expansion Tray

Array Controller      Firmware      Supported Expansion Tray   IOM Code
Sun Storage 2500-M2   07.84.47.10   2501-M2[2]                 0366


Data Host Requirements

Multipathing Software

You must install multipathing software on each data host that communicates with the Sun Storage 2500-M2 arrays.

Supported Host Bus Adaptors (HBAs)

SAS-1 HBA Settings

See the release notes for your HBA hardware for firmware support information.

Configuration: Firmware 01.29.06.00-IT with NVDATA 2DC5, BIOS 6.28.00.00, FCode 1.00.49.


TABLE 8 SAS-1 HBA Settings

Host OS                           Settings
Solaris 10u9, SPARC               HBA defaults
Solaris 10u9, x86                 IODeviceMissingDelay 20
                                  ReportDeviceMissingDelay 20
Oracle Linux 5.8, 5.7, 5.6, 5.5   IODeviceMissingDelay 8
RHEL 5.8, 5.7, 5.6, 5.5           ReportDeviceMissingDelay 144
Oracle Linux 6.3, 6.2, 6.1, 6.0   IODeviceMissingDelay 8
RHEL 6.3, 6.2, 6.1, 6.0           ReportDeviceMissingDelay 144


Supported FC and Multilayer Switches

The following FC fabric and multilayer switches are compatible for connecting data hosts and the Sun Storage 2540-M2 array. See the release notes for your switch hardware for firmware support information.


Expansion Tray Specifications

The following information updates the specifications published in the Sun Storage 2500-M2 Arrays Site Preparation Guide.


TABLE 2 Physical Specifications

Expansion Tray   Height           Width            Depth               Maximum Weight
2501-M2          3.4" (8.64 cm)   19" (48.26 cm)   21.75" (55.25 cm)   59.52 lb (27 kg)


TABLE 3 Maximum Power and Cooling Expansion Tray Specifications

Expansion Tray   KVA     Watts (AC)   BTU/Hr
2501-M2          0.276   276          945



ALUA/TPGS with VMware

The following procedures describe how to add ALUA/TPGS support with VMware. Starting with firmware 07.84.44.10, ALUA/TPGS-enabled arrays are managed by the VMW_SATP_ALUA plug-in. Arrays with firmware earlier than 07.84.44.10 are managed by the current VMW_SATP_LSI plug-in.
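As a rough illustration of the firmware threshold described above, the following sketch compares a controller firmware version against 07.84.44.10 and prints the plug-in that would be expected to claim the array. The function names and the sample version 07.83.22.10 are hypothetical, and the result applies only to arrays that have the TPGS bit enabled.

```sh
#!/bin/sh
# Sketch: pick the SATP plug-in expected to claim a TPGS-enabled array,
# based on the controller firmware version (threshold taken from the text).
# Function names are illustrative and not part of any Oracle or VMware tool.

ALUA_MIN_FW="07.84.44.10"

# Print a numeric field without leading zeros ("07" -> "7", "00" -> "0").
strip_zeros() {
    sz_n=$(printf '%s' "$1" | sed 's/^0*//')
    printf '%s' "${sz_n:-0}"
}

# Return 0 when the four-field dotted version $1 is >= $2.
version_ge() {
    vg_a=$1
    vg_b=$2
    for vg_i in 1 2 3 4; do
        vg_x=$(strip_zeros "${vg_a%%.*}")
        vg_y=$(strip_zeros "${vg_b%%.*}")
        if [ "$vg_x" -gt "$vg_y" ]; then return 0; fi
        if [ "$vg_x" -lt "$vg_y" ]; then return 1; fi
        # Drop the field just compared; pad with 0 when no fields remain.
        case $vg_a in *.*) vg_a=${vg_a#*.} ;; *) vg_a=0 ;; esac
        case $vg_b in *.*) vg_b=${vg_b#*.} ;; *) vg_b=0 ;; esac
    done
    return 0
}

# $1 = controller firmware version, for example 07.84.47.10
expected_satp() {
    if version_ge "$1" "$ALUA_MIN_FW"; then
        echo "VMW_SATP_ALUA"
    else
        echo "VMW_SATP_LSI"
    fi
}

expected_satp "07.84.47.10"   # firmware shipped with this release
expected_satp "07.83.22.10"   # hypothetical older firmware
```

The comparison is done field by field rather than as a string so that a field such as 9 is not ordered after 10.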

Prerequisites:

1. A firmware version earlier than 07.84.44.10 is loaded on the controllers.

2. Current storage array devices are managed by the standard VMW_SATP_LSI plug-in.

3. Management host is available.

4. Starting with firmware 07.84.44.10, the ALUA/TPGS enabled arrays will be managed by the VMW_SATP_ALUA plug-in.

5. Non-TPGS array will be managed by the current standard VMW_SATP_LSI plug-in.

6. Path policy supported is still Round-Robin (RR) or Most Recently Used (MRU).


Procedure (Offline)--for ESX4.1U2, ESXi5.0 and prior

1. Upgrade to firmware 07.84.44.10 (minimum) on the management host.

Currently, VMware releases ESXi5.0 and ESX4.1U1/U2 do not automatically set the claim rules that select VMW_SATP_ALUA to claim arrays that have the TPGS bit enabled. You must manually add the claim rule in ESX.

The following example adds the claim rule for the 2530-M2 using VID/PID = SUN/LCSM100_S. For 2540-M2 arrays, use VID/PID SUN/LCSM100_F.

a. To manually add the SATP rule in ESX 4.1Ux:

Open a terminal to the ESX host and run the following commands:

# esxcli nmp satp deleterule -s VMW_SATP_LSI -V SUN -M LCSM100_S

# esxcli nmp satp addrule -V SUN -M LCSM100_S -c tpgs_off -s VMW_SATP_LSI

Reboot the ESX host.

b. To manually add the SATP rule in ESXi 5.0:

Open a terminal to the ESX host and run the following command:

# esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V SUN -M LCSM100_S -c tpgs_on

Reboot the ESX host.

2. Verify the claim rule is added in ESX:

For ESX 4.1Ux:

a. To show a list of all the claim rules:

# esxcli nmp satp listrules

b. To list only the claim rules for VMW_SATP_LSI:

# esxcli nmp satp listrules -s VMW_SATP_LSI

c. Verify that the claim rule for VMW_SATP_LSI has VID/PID SUN/LCSM100_S (for 2530-M2) or SUN/LCSM100_F (for 2540-M2) and that the "tpgs_off" flag is specified under Claim Options.

For ESXi 5.0:

a. To show a list of all the claim rules:

# esxcli storage nmp satp rule list

b. To list only the claim rules for VMW_SATP_ALUA:

# esxcli storage nmp satp rule list -s VMW_SATP_ALUA

c. Verify that the claim rule for VMW_SATP_ALUA has VID/PID SUN/LCSM100_S (for 2530-M2) or SUN/LCSM100_F (for 2540-M2) and that the "tpgs_on" flag is specified under Claim Options.

3. Upgrade the storage array controllers to firmware 07.84.44.10 (minimum) and the corresponding NVSRAM version.

4. From the host management client, verify that the host OS type is set to ’VMWARE’. Starting with firmware 07.84.44.10, ’VMWARE’ host type will have the ALUA and TPGS bits enabled by default.

5. Perform a manual re-scan and verify from the ESX host that the TPGS/ALUA enabled devices are claimed by the VMW_SATP_ALUA plug-in.

To confirm that the host is using the ALUA plug-in:

For ESX 4.1Ux:

a. Run the command:

# esxcli nmp device list

b. The value for Storage Array Type should be "VMW_SATP_ALUA" on every device from an array with firmware 07.84.44.10 (or later). On arrays with firmware earlier than 07.84.44.10, the value should be "VMW_SATP_LSI".

For ESXi 5.0:

a. Run the command:

# esxcli storage nmp device list

b. The value for Storage Array Type should be "VMW_SATP_ALUA" on every device from an array with firmware 07.84.44.10 (or later). On arrays with firmware earlier than 07.84.44.10, the value should be "VMW_SATP_LSI".
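The manual claim-rule commands in step 1 of this procedure differ only in the product ID (LCSM100_S for 2530-M2, LCSM100_F for 2540-M2). The sketch below prints the commands for a given ESX family and array model so that they can be reviewed before being run on the ESX host. It is a dry-run helper only: the function names and the "4.1" and "5.0" keys are hypothetical, and the script does not invoke esxcli itself.

```sh
#!/bin/sh
# Sketch: print (do not run) the manual SATP claim-rule commands for a given
# ESX release family and array model. Helper names are illustrative only.

# Map an array model to the product ID given in the text.
array_pid() {
    case $1 in
        2530-M2) echo "LCSM100_S" ;;
        2540-M2) echo "LCSM100_F" ;;
        *) echo "unknown model: $1" >&2; return 1 ;;
    esac
}

# $1 = ESX family ("4.1" or "5.0"), $2 = array model ("2530-M2" or "2540-M2")
claim_rule_cmds() {
    cr_pid=$(array_pid "$2") || return 1
    case $1 in
        4.1)
            # ESX 4.1Ux: restrict VMW_SATP_LSI to devices with TPGS off.
            echo "esxcli nmp satp deleterule -s VMW_SATP_LSI -V SUN -M $cr_pid"
            echo "esxcli nmp satp addrule -V SUN -M $cr_pid -c tpgs_off -s VMW_SATP_LSI"
            ;;
        5.0)
            # ESXi 5.0: let VMW_SATP_ALUA claim devices with TPGS on.
            echo "esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V SUN -M $cr_pid -c tpgs_on"
            ;;
        *) echo "unsupported ESX family: $1" >&2; return 1 ;;
    esac
}

claim_rule_cmds 5.0 2540-M2
```

After running the printed commands on the host, a reboot is still required, as stated in step 1.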


Procedure (Offline)--starting at ESX4.1U3, ESXi5.0U1 and above

1. Upgrade to firmware 07.84.44.10 (minimum) on the management station.

2. Starting with ESXi5.0 U1 and ESX4.1U3, VMware will automatically have the claim rules to select the VMW_SATP_ALUA plug-in to manage arrays that have the TPGS bit enabled. All arrays with the TPGS bit disabled will continue to be managed by the VMW_SATP_LSI plug-in.

3. Upgrade the storage array controllers to firmware 07.84.44.10 (minimum) and the corresponding NVSRAM version.

4. From the Host Management client, verify that the host OS type is set to ’VMWARE’. Starting with firmware 07.84.44.10, ’VMWARE’ host type will have the ALUA and TPGS bits enabled by default.

5. Perform a manual re-scan and verify from the ESX host that the TPGS/ALUA enabled devices are claimed by the VMW_SATP_ALUA plug-in.

To confirm that the host is using the ALUA plug-in:

For ESX 4.1U3:

a. Run the command:

# esxcli nmp device list

b. The value for Storage Array Type should be "VMW_SATP_ALUA" on every device from an array with firmware 07.84.44.10 (or later) installed. On arrays with firmware earlier than 07.84.44.10, the value should be "VMW_SATP_LSI".

For ESXi 5.0U1 and above:

a. Run the command:

# esxcli storage nmp device list

b. The value for Storage Array Type should be "VMW_SATP_ALUA" on every device from an array with firmware 07.84.44.10 (or later) installed. On arrays with firmware earlier than 07.84.44.10, the value should be "VMW_SATP_LSI".
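The Storage Array Type check in step 5 can be tedious on hosts with many devices. The following sketch counts the devices in saved device-list output whose Storage Array Type differs from the expected plug-in. It assumes each device reports a line of the form "Storage Array Type: <plug-in>", and the sample text is a hypothetical, shortened stand-in for real output. The count covers every device in the input, including local disks, so filter the input to devices from the array first if needed.

```sh
#!/bin/sh
# Sketch: count devices whose "Storage Array Type" is not the expected plug-in.
# Reads device-list output on stdin; the line format is an assumption.

# $1 = expected plug-in name, for example VMW_SATP_ALUA
count_mismatched() {
    awk -v want="$1" '
        /Storage Array Type:/ {
            # The plug-in name is the last whitespace-separated field.
            if ($NF != want) bad++
        }
        END { print bad + 0 }
    '
}

# Hypothetical sample in the assumed format; real output has more fields.
sample='naa.example0001
   Storage Array Type: VMW_SATP_ALUA
naa.example0002
   Storage Array Type: VMW_SATP_LSI'

printf '%s\n' "$sample" | count_mismatched VMW_SATP_ALUA   # prints 1
```

On a host, the same function could be fed directly, for example by piping the output of the device list command for that ESX release into count_mismatched VMW_SATP_ALUA.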


Notable Fixes

For a list of bug fixes, see the latest firmware patch README file.


Restrictions and Known Issues

The following are restrictions and known issues applicable to this product release.

Restrictions

Single Path Data Connections

In a single path data connection, a group of heterogeneous servers is connected to an array through a single connection. Although this connection is technically possible, there is no redundancy, and a connection failure will result in loss of access to the array.



Caution - Because of the single point of failure, single path data connections are not recommended.


SAS Host Ports on the Sun Storage 2540-M2

Although SAS host ports are physically present on the Sun Storage 2540-M2 array controller tray, they are not for use, not supported, and are capped at the factory. FIGURE 1 shows the location of these ports. The Sun Storage 2540-M2 only supports Fibre Channel host connectivity.

FIGURE 1 SAS Host Ports on the 2540-M2


Controller Issues

Log Events Using SLES 11.1 With smartd Monitoring Enabled

Bug 7014293 - When volumes are mapped to a SLES 11.1 host with smartd monitoring enabled, on either a Sun Storage 2500-M2 or 6780 array, it is possible to receive “IO FAILURE” and “Illegal Request ASC/ASCQ” log events.

Workaround - Either disable smartd monitoring or disregard the messages. This is an issue with the host OS.

After Re-Installing the Oracle Virtual Machine (OVM) Manager, International Standards Organizations (ISO) Files Are Listed by Universally Unique Identifier (UUID) Rather Than by Friendly Name

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you re-install the OVM manager on the host using the same ID as the previous installation. ISO file systems that were imported with the previous OVM manager are now renamed with their UUIDs rather than their friendly names. This makes it difficult to identify the ISO file systems.

Workaround

None.

After Un-Mapping a Volume from an Oracle Virtual Machine (OVM) Server, the Volume Continues to Appear in the Storage Database on the Server

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you un-map a volume on an OVM server. The OVM manager continues to show the volume along with those that are still mapped to the server. When you try to assign one of the affected volumes to a virtual machine, you see this error message:

disk doesn’t exist

Workaround

After you un-map the volumes, use the OVM manager to remove those volumes from the storage database on the server.

In the Oracle Virtual Machine (OVM) Manager User Interface, Only One Drive at a Time Can Be Selected for Deletion

Operating System

Hardware/Software/Firmware

Problem or Restriction

In the OVM user interface, only one drive at a time can be selected for deletion.

Workaround

None.

Kernel Panics During Controller Firmware (CFW) Download

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you upgrade CFW. The kernel panics on an attached host when downloading the CFW and shows the following message:

Kernel panic - not syncing: Fatal exception
BUG: unable to handle kernel NULL pointer dereference at 0000000000000180
IP: [<ffffffff8123450a>] kref_get+0xc/0x2a
PGD 3c275067 PUD 3c161067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/block/sdc/dev

Workaround

To avoid this problem, do not perform a CFW upgrade on a storage array that is attached to hosts running the affected operating system version. If the problem occurs, power cycle the host.

BCM Driver Fails to Load

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you attempt to install the BCM driver on a server. The driver installs, but the component reports one of the following errors:

This device is not configured correctly. (Code 1) The system cannot find the file specified.

or

The drivers for this device are not installed. (Code 28) The system cannot find the file specified.

Workaround

None.

Kernel Panics During Controller Firmware Download

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you upgrade controller firmware. A host with the affected kernel with UEK support experiences a devloss error for one of the world-wide port numbers (WWPNs) followed by a kernel panic.

Workaround

To avoid this problem, upgrade the host kernel to release 2.6.32-300.23.1.

If the problem occurs, power cycle the host.
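Before starting a controller firmware upgrade, it may help to confirm that attached hosts already run kernel release 2.6.32-300.23.1. The sketch below compares a kernel release string against that release. It assumes that later UEK releases also carry the fix, that `sort -V` (GNU coreutils version sort) is available, and that the release string has the usual numeric prefix. The function names and sample release strings are hypothetical, and the comparison is meaningful only for UEK kernels.

```sh
#!/bin/sh
# Sketch: check whether a kernel release string is at or above 2.6.32-300.23.1.
# Relies on `sort -V` (version sort) from GNU coreutils.

MIN_KERNEL="2.6.32-300.23.1"

# Keep only the leading numeric part, for example
# "2.6.32-300.23.1.el6uek.x86_64" -> "2.6.32-300.23.1".
numeric_release() {
    printf '%s\n' "$1" | sed -E 's/^([0-9]+(\.[0-9]+)*-[0-9]+(\.[0-9]+)*).*/\1/'
}

# Return 0 when release $1 (default: the running kernel) is >= MIN_KERNEL.
kernel_is_fixed() {
    kf_rel=$(numeric_release "${1:-$(uname -r)}")
    # After a version sort, the first line is the lower of the two releases.
    kf_lowest=$(printf '%s\n%s\n' "$MIN_KERNEL" "$kf_rel" | sort -V | head -n 1)
    [ "$kf_lowest" = "$MIN_KERNEL" ]
}

if kernel_is_fixed "2.6.32-200.19.1.el6uek.x86_64"; then
    echo "kernel is at or above $MIN_KERNEL"
else
    echo "upgrade the host kernel before upgrading controller firmware"
fi
```

Called without an argument, kernel_is_fixed checks the kernel reported by uname -r on the host where it runs.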

Network Interface on Device eth0 Fails to Come Online When Booting a Host

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs during a host boot process when a large number (112+) of volumes are mapped to the host. At the point in the boot process where the network interface should be brought online, the host displays the following message:

Bringing up interface eth0:  Device eth0 has different MAC address than expected. [FAILED]

The network interface does not come online during the boot process, and cannot subsequently be brought online.

Workaround

To avoid this problem, reduce the number of volumes mapped to a host with the affected version of Oracle Linux. You can map additional volumes to the host after it boots.

When Over 128 Volumes are Mapped to a Host, Paths to Only the First 128 Volumes are Restored after the Controller is Reset

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you have more than 128 volumes mapped to a host, both controllers reboot, and only one controller comes back online. Only the first 128 volumes mapped to the host are accessible to the host for input/output (I/O) operations after the reboot. During the controller reboot, there might be a delay before any of the volumes are accessible to the host. I/O timeouts occur when the host tries to communicate with the inaccessible volumes.

Workaround

You can avoid this problem by mapping no more than 128 volumes to a host with the affected operating system release. If the problem occurs, run the multipath command again after the controller comes back online.

Task Aborts Are Logged During a Controller Firmware Upgrade

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs during a controller firmware upgrade. The operating system logs task abort messages similar to those shown below.

May  3 21:30:51 ictc-eats kernel: [118114.764601] sd 0:0:101:3: task abort: SUCCESS scmd(ffff88012383c6c0)
May  3 21:30:51 ictc-eats kernel: [118114.764606] sd 0:0:101:1: attempting task abort! scmd(ffff88022705c0c0)
May  3 21:30:51 ictc-eats kernel: [118114.764609] sd 0:0:101:1: CDB: Test Unit Ready: 00 00 00 00 00 00
May  3 21:30:51 ictc-eats kernel: [118114.764617] scsi target0:0:101: handle(0x000c), sas_address(0x50080e51b0bae000), phy(4)
May  3 21:30:51 ictc-eats kernel: [118114.764620] scsi target0:0:101: enclosure_logical_id(0x500062b10000a8ff), slot(4)
May  3 21:30:51 ictc-eats kernel: [118114.767084] sd 0:0:101:1: task abort: SUCCESS scmd(ffff88022705c0c0)

You might experience input/output (I/O) timeouts or read/write errors after the upgrade.

Workaround

If this problem occurs, restart input/output operations. The affected resources will come back online without further intervention.

Unable to Add More Than 117 Volumes to the Oracle Virtual Machine (OVM) Manager Database

Operating System

Hardware/Software/Firmware

All controllers

Problem or Restriction

This problem occurs when you attempt to add more than 117 volumes to the database of the OVM manager. When the OVM manager scans for the additional volumes, it returns the following error:

OSCPlugin.OperationFailedEx:'Unable to query ocfs2 devices'

Workaround

You can avoid this problem by deleting volumes from the OVM manager database when those volumes are no longer mapped to the OVM server.

Write-Back Cache is Disabled after Controllers Reboot with Multiple Failed Volumes in a Storage Array

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when power is turned off and then back on to a controller-drive tray while there are failed volumes in the storage array. When the controllers reboot after the power cycle, they attempt to flush restored cache data to disk. If the controllers are unable to flush the cache data because of failed volumes, all of the volumes in the storage array remain in write-through mode after the controllers reboot. This will cause a substantial reduction in performance on input/output operations.

Workaround

None.

During Multiple Node Failover/Failback Events, Input/Output (I/O) Operations Time Out Because a Resource is Not Available to a Cluster

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when a cluster loses access to a file system resource. A message similar to the following appears in the cluster log:

Device /dev/mapper/mpathaa not found. Will retry wait to see if it appears.
The device node /dev/mapper/mpathaa was not found or did not appear in the udev create time limit of 60 seconds
Fri Apr 27 18:45:08 CDT 2012 restore: END restore of file system /home/smashmnt11 (err=1)
ERROR: restore action failed for resource /home/smashmnt11
/opt/LifeKeeper/bin/lcdmachfail: restore in parallel of resource "dmmp19021
"has failed; will re-try serially
END vertical parallel recovery with return code -1

You might experience I/O timeouts.

Workaround

If this problem occurs, restart I/O operations on the storage array.

After an NVSRAM Download, a Controller Reboots a Second Time when the NVSRAM is Activated

Operating System

Hardware/Software/Firmware

All controllers

Problem or Restriction

This problem occurs when a controller detects corruption in the signature of the NVSRAM loaded on the controller. The controller restores the NVSRAM from the physical drive, and then reboots.

Workaround

The controller recovers and continues normal operations.

When a Controller is Not Set Offline Before Being Replaced, an Exception Occurs when the Replacement Controller is Brought Online

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you fail to follow standard procedures when replacing a controller. If you do not set a controller offline before you replace it, and the replacement controller has a different firmware level from the remaining controller, the firmware mismatch is not properly detected.

Workaround

You can avoid this problem by following the standard procedure for replacing a controller. If this problem occurs, the replacement controller reboots after the exception and the storage array returns to normal operations.

Input/Output (I/O) Errors Occur when Disconnection of Devices from a SAS Switch Is Not Detected

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when there is a heavy load of I/O operations between hosts and storage arrays that are connected through a SAS switch. The switch fails to notify the host when a volume is no longer available. A host experiences I/O errors or application timeouts.

Workaround

To avoid this problem, reduce some or all of the following factors:

A Path Failure and Premature Failover Occur when a Cable is Disconnected between a Host and a Controller

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you disconnect a SAS cable between a controller and a host. Even if you reconnect the cable before the normal failover timeout, the path fails and the controller fails over to the alternate.

Workaround

If this problem occurs, reconnect the cable. The path will be restored.

Input/Output (I/O) Errors Occur when a Cable is Disconnected between a Host and a Controller, and the Alternate Controller is Unavailable

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when the maximum number of volumes (256) is mapped to a host. If you disconnect the cable between a controller and a host, and then reconnect the cable, I/O errors occur if the alternate controller becomes unavailable before the host can rediscover all of the volumes on the connection.

Workaround

After some delay, the host will rediscover all of the volumes and normal operations will resume.

With 3 Gb/s SAS Host Bus Adapters (HBAs) and Heavy Input/Output (I/O), I/O Timeouts Occur During a Controller Firmware Upgrade

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when you upgrade controller firmware during a heavy load of I/O operations. The host experiences I/O timeouts during firmware activation.

Workaround

Do not perform an online controller firmware upgrade while the system is under heavy I/O load. If this problem occurs, restart I/O operations on the host.

Host Operating System Logs "Hung Task" During a Path Failure

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when there is a path failure through a host connection. The operating system logs a "Hung Task" message in /var/log/messages before the MPP driver marks the path failed and fails over to the alternate path.

Workaround

The logging of this message does not affect normal operation. You can disable the log message by entering the following command on the host command line:

echo 0 > /proc/sys/kernel/hung_task_timeout_secs
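Because the command above overwrites the current setting, it can be useful to record the previous value so that hung-task reporting can be turned back on later. The following sketch wraps the same write in two hypothetical helper functions. The HUNG_TASK_FILE variable is an assumption added so that the path can be overridden for testing. Writing to /proc requires root, and a value written this way does not persist across a reboot.

```sh
#!/bin/sh
# Sketch: disable the "Hung Task" log message and keep the previous value
# so that it can be restored later. Helper names are illustrative only.

HUNG_TASK_FILE=${HUNG_TASK_FILE:-/proc/sys/kernel/hung_task_timeout_secs}

# Print the current timeout, then set it to 0 to disable the message.
disable_hung_task_msg() {
    cat "$HUNG_TASK_FILE" || return 1
    echo 0 > "$HUNG_TASK_FILE"
}

# $1 = value previously printed by disable_hung_task_msg
restore_hung_task_msg() {
    echo "$1" > "$HUNG_TASK_FILE"
}

# Typical use (as root):
#   previous=$(disable_hung_task_msg)
#   ... perform the path failover test or maintenance ...
#   restore_hung_task_msg "$previous"
```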

Backup Failure or I/O Errors with Snapshot Creation or Mounting Failure During Backup of Cluster Shared Volumes (CSV)

Operating System

Problem or Restriction

This problem occurs when a backup operation of CSVs begins. The backup application talks to the VSS provider and initiates the backup operation. The creation of a snapshot volume or mounting of a snapshot volume fails. The backup application then tries to back up the CSVs instead of a snapshot of the CSVs. If the Retry option is set with lock, the application hosted on the CSVs, or data written to or read from these volumes, might throw an error. If the Retry option is set without lock, the backup skips files. This error occurs because the backup application and the application hosted on the CSVs (or the data being written to or read from the CSVs) both try to "lock" the volume or file, which results in a conflict.

Users encounter this issue whenever there is a resource conflict between the backup operation and the application trying to perform write or read operations to the volume undergoing a backup operation.

Depending on the option the customers choose, the backup operation reports one of these conditions:

Workaround

Run the backup operation at a time when the application is not doing write or read intensive work on the CSV undergoing backup.

Also, when using the option "Without Lock," files will be skipped and the user can then create another backup operation with the skipped files. For more information, see http://www.symantec.com/docs/TECH195868

With Multiple SAS Hosts Using Single-PHY, a Host Cable Pull During Input/Output (I/O) Operations Causes a Controller Reboot

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem rarely occurs when multiple hosts are connected by a quadfurcated cable to a single wide port on the controller. If the cable is disconnected, the controller reboots.

Workaround

The controller reboots and returns to normal operations when the cable is reconnected.

Data is Misread when a Physical Drive Has an Unreadable Sector

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when issuing a read to a location where the length of the read includes an unreadable sector. The host operating system assumes that data up to the unreadable sector was read correctly, but this might not be the case. A bug has been opened with Red Hat: http://bugzilla.redhat.com/show_bug.cgi?id=845135

Workaround

Replace any drives that have media errors.

Solaris 10 Guest in Fault Tolerant Mode Is Unable to Relocate Secondary Virtual Machine (VM) Upon Host Failure

Operating System

Hardware/Software/Firmware

Problem or Restriction

This problem occurs when the host fails while it is running a secondary VM for a Solaris 10 (u10) guest. The message in the event log for that VM reads as follows:

No compatible host for the Fault Tolerant secondary VM

When this problem occurs, the secondary VM for the guest is stuck in an Unknown status, and you cannot re-enable Fault Tolerance for this VM. An attempt to disable and then re-enable Fault Tolerance fails because the secondary VM cannot be relocated from a host that is not responding. Also, Fault Tolerance cannot be completely turned off on the VM for the same reason.

The main problem is that the HA service reports that there are not enough resources available to restart the secondary VM. However, even after reducing all used resources in the cluster to a level so that there is an overabundance of resources, the HA service still reports that there are not enough and therefore no available host in the cluster on which to run the secondary VM. After the VM fails completely, however, the VM can be restarted and put into Fault Tolerance mode again.

The shutdown of the VM is something that always happens if a Fault Tolerance enabled VM is running unprotected without a linked secondary VM and the host on which the primary VM is running fails for any reason. The failure of the secondary VM in a node failure scenario for Solaris 10 guests can be regularly reproduced.

When a node failure happens, the customer sees that Solaris 10 guests can have issues restoring a secondary VM for Fault Tolerance enabled VMs. This is seen by reviewing the vSphere client in the cluster VM view as well as in the event log for the VM.

Workaround

In most cases, the customer can correct the problem by performing one of the following actions in the order shown. Perform one action and if that does not work, proceed to the next until the problem is resolved.

1. Disable and re-enable fault tolerance on the affected VM.

2. Turn off fault tolerance for the VM altogether and turn it back on.

3. Attempt to live vMotion the VM and try action 1 and action 2 again.

It is possible that either the host CPU model is not compatible with turning Fault Tolerance off and on for running VMs, or that, even after performing the previous action, a secondary VM still does not start. If the secondary VM does not start, the customer needs to briefly shut down the affected VM, perform action 2, and then restart the VM.

Documentation Bugs

Hardware Installation Guide

Page 38 of the Sun Storage 2500-M2 Arrays Hardware Installation Guide mistakenly refers to AIX and HP-UX as supported data host platforms. Disregard HP-UX and AIX referenced in the following note:

"The data host multipathing software for Red Hat Linux, HP-UX, AIX, and Windows platforms is Sun Redundant Dual Array Controller (RDAC), also known as MPP."


Related Documentation

Product documentation for Sun Storage 2500-M2 arrays is available at:

http://www.oracle.com/technetwork/documentation/oracle-unified-ss-193371.html

Product documentation for Sun Storage Common Array Manager is available at:

http://www.oracle.com/technetwork/documentation/disk-device-194280.html


TABLE 9 Related Documentation

Application

Title

Review safety information

Sun Storage 2500-M2 Arrays Safety and Compliance Manual

Important Safety Information for Sun Hardware Systems

Review known issues and workarounds

Sun Storage Common Array Manager Release Notes

Prepare the site

Sun Storage 2500-M2 Arrays Site Preparation Guide

Install the support rails

Sun Storage 2500-M2 Arrays Support Rail Installation Guide

Install the array

Sun Storage 2500-M2 Arrays Hardware Installation Guide

Get started with the management software

Sun Storage Common Array Manager Quick Start Guide

Install the management software

Sun Storage Common Array Manager Installation and Setup Guide

Manage the array

Sun Storage Common Array Manager Array Administration Guide

Sun Storage Common Array Manager CLI Guide

Install and configure Multipath failover drivers

Sun StorageTek MPIO Device Specific Module Installation Guide For Microsoft Windows OS

Sun StorageTek RDAC Multipath Failover Driver Installation Guide For Linux OS



Documentation, Support, and Training

These web sites provide additional resources:

 


[1] xx indicates the most recent patch revision.
[2] Only 2501-M2 expansion trays are supported with a 2500-M2 controller tray.
[3] Oracle recommends installing the latest Solaris update.
[4] For generic HBA support, contact the HBA manufacturer.
[5] For generic HBA support, contact the HBA manufacturer.
[6] See SAS-1 HBA Settings.

Copyright

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related software documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:

U.S. GOVERNMENT END USERS. Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications which may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.


Copyright © 2011-2013 Oracle and/or its affiliates. All rights reserved.