Known Issues and Bugs

The following known issues and bugs affect the operation of the Oracle Solaris Cluster and Oracle Solaris Cluster Geographic Edition 4.0 software as of the time of release. Bugs and issues are grouped into the following categories:

Administration

Data Services

Installation

Runtime

Check with Oracle support services to see if a code fix becomes available.

Administration

x86: clzonecluster export Command Fails (7066586)

Problem Summary: The following command might fail on x86 machines.

# clzonecluster export zonename
usage:
export [-f output-file]

Workaround: Use the following command instead:

# zonecfg -z zone-cluster-name export
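
If you also want to capture the configuration in a file, the export subcommand accepts the -f option; the output path shown here is only an example:

# zonecfg -z zone-cluster-name export -f /var/tmp/zone-cluster-name.cfg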

Using chmod to setuid Returns Error in Non-Global Zone on PxFS Secondary Server (7020380)

Problem Summary: The chmod command run from a non-global zone might fail on a cluster file system. The chmod operation succeeds from a non-global zone on a node where the PxFS primary is located but fails from a non-global zone on a node where the PxFS secondary is located. For example:

# chmod 4755 /global/oracle/test-file

Workaround: Do one of the following:

Perform the chmod operation from the global zone or from a non-global zone on the node that currently hosts the PxFS primary.

Switch the PxFS primary to the node from which you need to run the chmod command, and then retry the operation.

Cannot Create a Resource From a Configuration File With Non-Tunable Extension Properties (6971632)

Problem Summary: When you use an XML configuration file to create resources, the command fails if any of the resources have extension properties that are not tunable, that is, extension properties whose Tunable attribute is set to None.

Workaround: Edit the XML configuration file to remove the non-tunable extension properties from the resource.

Cluster.CCR: libpnm system error: Failed to resolve pnm proxy pnm_server.2.zonename (6942090)

Problem Summary: When solaris10 branded non-global zones with exclusive IP are used on an Oracle Solaris Cluster host, running the clnode status command with the -m or -v option reports an error in the /var/adm/messages file similar to the following:

Cluster.CCR: [ID 544775 daemon.error] libpnm system error: Failed to resolve pnm proxy zonename

This error does not affect the running of the non-global zone or the cluster. The solaris10 branded zone does not have to be under cluster control for the errors to be seen.

The issue is seen only on solaris10 branded zones that are configured with exclusive IP; it does not occur when the zone uses shared IP or a different brand.

Workaround: There is no workaround. The error messages do not affect the running of the non-global zone or the global cluster.

Missing /dev/rmt Causes Incorrect Reservation Usage When Policy Is pathcount (6920996)

Problem Summary: When a new storage device is added to a cluster and is configured with three or more DID paths, the node on which the cldevice populate command is run might fail to register its PGR key on the device.

Workaround: Run the cldevice populate command on all cluster nodes, or run the cldevice populate command twice from the same node.
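
For example, the simplest form of the workaround is to run the command twice in succession from a single cluster node:

# cldevice populate
# cldevice populate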

Disabling Device Fencing While Cluster Is Under Load Results in Reservation Conflict (6908466)

Problem Summary: Turning off fencing for a shared device with an active I/O load might result in a reservation conflict panic for one of the nodes that is connected to the device.

Workaround: Quiesce I/O to a device before you turn off fencing for that device.
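
As a sketch, after application I/O to the device has been quiesced, fencing for that device can be turned off with a command such as the following; the DID device name d5 is only a placeholder:

# cldevice set -p default_fencing=nofencing d5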

Removing Nodes From the Cluster Configuration Can Result in Node Panics (6735924)

Problem Summary: Changing a cluster configuration from a three-node cluster to a two-node cluster might result in complete loss of the cluster, if one of the remaining nodes leaves the cluster or is removed from the cluster configuration.

Workaround: Immediately after removing a node from a three-node cluster configuration, run the cldevice clear command on one of the remaining cluster nodes.
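
For example, on one of the two remaining nodes, run the following command immediately after the node removal completes:

# cldevice clear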

Data Services

Share Mount Point Matching Is Incorrect for Combination of UFS and ZFS Starting With a Common Pattern (7093237)

Problem Summary: If an NFS resource is created for a ZFS mount point and this mount-point prefix matches a UFS file system entry in the vfstab file, the HA for NFS data service will fail validation if the UFS file system is not mounted on the node.

Workaround: Mount the UFS file system on the node where the HAStoragePlus resource pertaining to the ZFS file system is online. This is necessary only while the resource is being created or updated. At any other time there is no requirement that the UFS file system be mounted; the resource group can be brought online, taken offline, or switched to any node at will.
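
As a hypothetical example, if /global/ora is the UFS mount point listed in the vfstab file that shares the common prefix, mount it on the node where the HAStoragePlus resource is online before you create or update the NFS resource:

# mount /global/ora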

'Unable to Determine Oracle CRS Version' Error After Applying Patch 145333-09 (7090390)

Problem Summary: The Oracle Solaris Cluster code is unable to determine the Oracle CRS version when the su user is using the csh shell.

Workaround: Ensure that the user who owns ${CRS_HOME}/bin/srvctl does not use the csh shell.
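
For example, assuming the CRS software owner is a user named oracle (a hypothetical name) and bash is an acceptable alternative, the login shell could be changed as follows:

# usermod -s /usr/bin/bash oracle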

SPARC: HA for Oracle VM Server for SPARC Default STOP_TIMEOUT is Too Low - Need Better Monitoring Of Domain Migration Progress (7069269)

Problem Summary: The STOP_TIMEOUT value in the HA for Oracle VM Server for SPARC data service is too low to complete the migration of guest domains.

Workaround: Increase the default value of STOP_TIMEOUT to at least 900 seconds, or to four times the expected migration time.
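
As a sketch, assuming the HA for Oracle VM Server for SPARC resource is named ldom-rs (a hypothetical name), the property can be raised as follows:

# clresource set -p STOP_TIMEOUT=900 ldom-rs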

Scalable Applications Are Not Isolated Between Zone Clusters (6911363)

Problem Summary: If scalable applications configured to run in different zone clusters bind to INADDR_ANY and use the same port, then scalable services cannot distinguish between the instances of these applications that run in different zone clusters.

Workaround: Either do not configure the scalable applications to bind to INADDR_ANY as the local IP address, or bind them to a port that does not conflict with any other scalable application.

Running clnas add or clnas remove Command on Multiple Nodes at the Same Time Could Cause Problem (6791618)

Problem Summary: When adding or removing a NAS device, running the clnas add or clnas remove command on multiple nodes at the same time might corrupt the NAS configuration file.

Workaround: Run the clnas add or clnas remove command on one node at a time.

Installation

cluster check Fails for cacaoadm With Insufficient Data Before Node Is Configured in Cluster (7104375)

Problem Summary: The cluster check command uses common agent container (CAC) services for communication between nodes and requires CAC to be running. If an administrator runs check S6979686 while the node is not yet a cluster member and CAC services are not running, the following message is displayed:

Insufficient Data: 1; /usr/sbin/cacaoadm status: Unable to check
SMF status 

Workaround: This error is safe to ignore. Select the option to ignore the error and continue when you install the Oracle Solaris Cluster software.

Some Cluster Services Might Be Missing After Configuring Cluster on a Boot Environment That Previously Had the Cluster Software Installed (7103721)

Problem Summary: If you uninstall Oracle Solaris Cluster and then reinstall and configure it in the same boot environment, the cluster will boot successfully, but some of the cluster services might be missing. Run the svcs -x command and check for any services beginning with svc:/system/cluster.

# svcs -x
svc:/system/cluster/rgm-starter:default (Resource Group Manager Daemon)
 State: offline since Fri Oct 28 18:30:36 2011
Reason: Dependency svc:/system/cluster/rpc-fed:default is absent.
   See: http://sun.com/msg/SMF-8000-E2
Impact: 5 dependent services are not running.  (Use -v for list.)

Workaround: Use the following commands to add the absent service. This example adds the svc:/system/cluster/rpc-fed:default service:

# service=svc:/system/cluster/rpc-fed:default 
# svccfg -s ${service%:*} add ${service##*:} 
# svccfg -s ${service} addpg general framework 
# svccfg -s ${service} delcust -M  
# svcadm enable ${service}

Then rerun the svcs -x command to check for any other missing cluster services.

scinstall Tries to Create an IPMP Group on a Standby Interface (7095759)

Problem Summary: If the cluster nodes have IPMP groups created with an active-standby configuration before Oracle Solaris Cluster configuration is performed, the scinstall command will fail with the following error messages during Oracle Solaris Cluster configuration:

Configuring IP multipathing groups ...failed 
scinstall: Failed to retrieve the broadcast value for this adapter

If the standby adapter does not have a broadcast value, the scinstall command prints the preceding error message and skips creation of that IPMP group. The scinstall command nevertheless continues with the rest of the configuration without any issues.

Workaround: No workaround is required and the message is safe to ignore.

The Command clnode remove -F nodename Fails to Remove the Node nodename From Solaris Volume Manager Device Groups (6471834)

Problem Summary: When a node is removed from the cluster by using the command clnode remove -F nodename, a stale entry for the removed node might remain in Solaris Volume Manager device groups.

Workaround: Remove the node from the Solaris Volume Manager device group by using the metaset command before you run the clnode remove -F nodename command.

If you ran the clnode remove -F nodename command before you removed the node from the Solaris Volume Manager device group, run the metaset command from an active cluster node to remove the stale node entry from the Solaris Volume Manager device group. Then run the clnode clear -F nodename command to completely remove all traces of the node from the cluster.
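
As a sketch, assuming the disk set is named oraset and the removed node is phys-schost-3 (both hypothetical names), the recovery sequence from an active cluster node might look like the following; the -f option to metaset might additionally be needed if the stale node is recorded as an owner:

# metaset -s oraset -d -h phys-schost-3
# clnode clear -F phys-schost-3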

Autodiscovery Should Find Only One Interconnect Path for Each Adapter (6299097)

Problem Summary: If there are redundant paths in the network hardware between interconnect adapters, the scinstall utility might fail to configure the interconnect path between them.

Workaround: If autodiscovery discovers multiple interconnect paths, manually specify the adapter pairs for each path.

Runtime

Failure of Logical Hostname to Fail Over Caused by getnetmaskbyaddr() (7075347)

Problem Summary: Logical hostname failover requires getting the netmask from the network if nis is enabled for the netmasks name service. This call to getnetmaskbyaddr() hangs for a while due to CR 7051511, possibly long enough for the Resource Group Manager (RGM) to put the resource in the FAILED state. This occurs even though the correct netmask entries are in the local /etc/netmasks file. This issue affects only multihomed clusters, that is, clusters whose nodes reside on multiple subnets.

Workaround: Configure the /etc/nsswitch.conf file, which is handled by an SMF service, to use only files for netmasks lookups:

# /usr/sbin/svccfg -s svc:/system/name-service/switch setprop config/netmask = astring:\"files\"
# /usr/sbin/svcadm refresh svc:/system/name-service/switch
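
To confirm that the change took effect, you can read the property back; the expected output is simply files:

# svcprop -p config/netmask svc:/system/name-service/switch
files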

ssm_start Fails Due to Unrelated IPMP Down (6938555)

Problem Summary: A scalable resource that depends on a SUNW.SharedAddress resource fails to come online because of the failure of an IPMP group on a subnet that is not used by the shared-address resource. Messages similar to the following are seen in the syslog of the cluster nodes:

Mar 22 12:37:51 schost1 SC SUNW.gds:5,Traffic_voip373,Scal_service_voip373,SSM_START: 
ID 639855 daemon.error IPMP group sc_ipmp1 has status DOWN. Assuming this
node cannot respond to client requests.

Workaround: Repair the failed IPMP group and restart the failed scalable resource.
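
As a sketch, after the IPMP group reports OK again, the failed scalable resource can be restarted by disabling and re-enabling it; the resource name voip-rs is only a placeholder:

# clresource disable voip-rs
# clresource enable voip-rs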