Sun HPC ClusterTools 8.2.1 Software Release Notes
This document describes late-breaking news about the Sun HPC ClusterTools 8.2.1 (ClusterTools 8.2.1) software. The information is organized into the following sections:
The following feature has been added to Sun HPC ClusterTools software:
Sun HPC ClusterTools 8.2.1 software works with the following versions of related software:
Note - When TotalView is used to debug applications compiled with the Intel compiler, the stack trace feature is unable to display the full execution stack.
To improve ClusterTools, Sun collects anonymous information about your cluster during installation. If you want to turn this feature off, use the -w option with ctinstall.
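For example (the invocation details are illustrative; ctinstall is run as superuser from the directory in the distribution that contains the installation utilities):

# ./ctinstall -w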
The communication between ctinstall and Sun works only if the Sun HPC ClusterTools software installation process completes successfully. It does not work if the installation fails for any reason.
Sun HPC ClusterTools 8.2.1 software requires the Solaris OS to have the latest InfiniBand updates to support use of the Mellanox ConnectX IB HCA.
This download is available at:
http://www.sun.com/download/index.jsp?cat=Hardware%20Drivers&tab=3&subcat=InfiniBand
For more information about Mellanox HCA support, contact the ClusterTools 8.2.1 software development alias at ct-feedback@sun.com.
This section highlights some of the outstanding CRs (Change Requests) for the ClusterTools 8.2.1 software components. A CR might be a defect, or it might be an RFE (request for enhancement).
Each CR has an identifying number assigned to it. To avoid ambiguity when inquiring about a CR, include its CR number in any communications. The heading for each CR description includes the associated CR number.
Running the default ClusterTools 8.2.1 on a system with Hyper-Threading enabled (such as one based on the Intel Xeon Processor X5570) could cause multiple processes to be bound to the same core, resulting in poor performance.
Workaround: Unless you are an expert user, you may want to avoid binding in this situation: either use the default behavior or explicitly specify -bind-to-none. If you are an expert user, you can specify the exact binding behavior you want with rankfiles. See the mpirun man page for more information about rankfiles.
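For illustration, a minimal rankfile that pins two ranks to separate cores might look like the following (the host name, slot numbers, and program name are assumptions):

rank 0=node1 slot=0
rank 1=node1 slot=1

You would then pass the rankfile to mpirun with the -rf option:

% mpirun -rf myrankfile -np 2 ./a.out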
Analyzer experiments may not contain ClusterTools MPI State profiling data on some Linux systems when the application is compiled with GNU or Intel compilers. This issue is exhibited on the Linux variants RHEL 5.3 and CentOS 5.3.
Workaround: Supply the option -Wl,--enable-new-dtags to ClusterTools mpi* link commands. This flag causes the compiled executable to define RUNPATH in addition to RPATH, allowing ClusterTools MPI State libraries to be enabled via the LD_LIBRARY_PATH environment variable.
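For example, when linking through a ClusterTools compiler wrapper (the program and file names are illustrative):

% mpicc -o myprog myprog.c -Wl,--enable-new-dtags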
The PathScale and PGI environments in which ClusterTools 8.2.1 was built did not include OFED 1.3.1 or higher. Consequently, XRC support is not available in ClusterTools 8.2.1 built with either of these two compilers.
Workaround: Use ClusterTools 8.2.1 software with Sun Studio or GCC compiled libraries.
The Open MPI library does not currently support thread-safe operations. Applications that rely on thread-safe MPI operations might fail.
If you run an MPI program using the udapl BTL in a local (non-global) zone in the Solaris OS, your program might fail and display an error message.
Workarounds: Either run the udapl BTL in the Solaris global zone only, or use another interconnect (such as tcp) in the local zone.
This condition happens when the udapl BTL is not available on one node in a cluster. The InfiniBand adapter on the node could be unavailable or misconfigured, or the node might not have an InfiniBand adapter at all.
When you run an Open MPI program using the udapl BTL under such conditions, the program might hang or fail without displaying an error message. When a similar operation fails under the tcp BTL, the failure results in an error message.
Workaround: Add the following MCA parameter to your command line to exclude the udapl BTL:
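Assuming the standard Open MPI "^" prefix syntax for excluding a component:

--mca btl ^udapl

For instance, appended to an otherwise ordinary invocation (the process count and program name are illustrative):

% mpirun --mca btl ^udapl -np 4 ./a.out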
For more information about MCA parameters and how to exclude functions at the command line, refer to the Sun HPC ClusterTools 8.2.1 Software User’s Guide.
If an MPI job exhausts the resources of the CPUs, the program can fail or show segmentation faults. This might happen when nodes are oversubscribed (that is, when more processes are started on a node than it has CPUs).
Workaround: Avoid oversubscribing the nodes.
For more information about oversubscribing nodes and the --nooversubscribe option, refer to the Sun HPC ClusterTools 8.2.1 Software User’s Guide.
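For example, the following command refuses to start more processes than there are available slots (the process count and program name are assumptions):

% mpirun --nooversubscribe -np 4 ./a.out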
When you set up non-global zones in the Solaris OS, the Solaris OS packages propagate from the global zone to the new zones.
However, if you installed Sun HPC ClusterTools software on the system before setting up the zones, SUNWompiat (the Open MPI installer package) does not get propagated to the new non-global zone. As a result, the Install_Utilities directory is not available in non-global zones after zone creation, and the links to /opt/SUNWhpc do not get propagated to the local zone either.
Workaround: There are two workarounds for this issue.
1. From the command line, use the full path to the Sun HPC ClusterTools executable you want to use. For example, type /opt/SUNWhpc/HPC8.2.1/bin/mpirun instead of /opt/SUNWhpc/bin/mpirun.
2. Reinstall Sun HPC ClusterTools 8.2.1 software in the non-global zone. This process allows you to activate Sun HPC ClusterTools 8.2.1 software (thus creating the links to the executables) in non-global zones.
When using a peer-to-peer connection with the udapl BTL (byte-transfer layer), the udapl BTL allocates a free list of fragments. This free list is used for send and receive operations between the peers. The free list does not have a specified maximum size, so a high amount of communication traffic at one peer might increase the size of the free list until it interferes with the ability of the other peers to communicate.
This issue might appear as a memory resource issue to an Open MPI application. This problem has only been observed on large jobs where the number of uDAPL connections exceeds the default value of btl_udapl_max_eager_rdma_peers.
Workaround: If an Open MPI application running over uDAPL/IB (InfiniBand) reports an out-of-memory error for alloc or for privileged memory, and if those two values have already been increased, the following steps might allow the program to run successfully.
1. At the command line, add the following MCA parameter to your mpirun command:
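--mca btl_udapl_max_eager_rdma_peers x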
where x is equal to the number of peer uDAPL connections that the Open MPI job will establish.
2. If the setting in Step 1 does not fix the problem, then set the following MCA parameter with the mpirun command at the command line:
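One plausible setting, given that the free list described above is otherwise unbounded, is to cap its maximum size (the parameter choice and value here are assumptions, not confirmed for this release):

--mca btl_udapl_free_list_max 32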
The TotalView debugger might not be able to determine whether an MPI_Comm_spawn operation has occurred, and might not be able to locate the new processes that the operation creates. This is because the current version of the Open MPI message dumping library (ompi/debuggers/ompi_dll.c) does not implement the functions and interfaces needed to support MPI-2 debugging and message dumping.
The Open MPI DLL for the TotalView debugger does not support handling of unexpected messages. Only pending send and receive queues are supported.
On a large SMP (symmetric multiprocessor) with many CPUs, ORTE might take a long time to start up before the MPI job runs. This is a known issue with the MPI layer.
Note - This behavior has improved in the ClusterTools 8.2 release as a result of changes in shared memory use, but the CR remains open.
Workaround: Reduce the mpool_sm_min_size and btl_sm_eager_limit settings. This may shorten startup time. For more information, see the Open MPI FAQ entry at:
http://www.open-mpi.org/faq/?category=sm#decrease-sm
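As an illustration, both settings can be lowered on the mpirun command line (the values, process count, and program name are assumptions; tune them for your system):

% mpirun --mca mpool_sm_min_size 33554432 --mca btl_sm_eager_limit 2048 -np 4 ./a.out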
When using the Allinea DDT debugger to debug an application compiled in 64-bit mode on a SPARC-based system, the program might not run when loaded into the DDT debugger. In addition, if you try to use the View -> Message Queue command, the debugger displays a popup dialog box with the message Gathering Data and never finishes the operation.
Workaround: Set the environment variable DDT_DONT_GET_RANK to 1.
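For example, depending on your shell:

% setenv DDT_DONT_GET_RANK 1     (C shell)
$ export DDT_DONT_GET_RANK=1     (Bourne shell or bash)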
When using MPI_Comm_spawn or other spawn commands in Open MPI, the uDAPL BTL might hang and return timeout messages similar to the following:
[btl_udapl_component.c:1051:mca_btl_udapl_component_progress] WARNING: connection event not handled : DAT_CONNECTION_EVENT_TIMED_OUT
Workaround: Use the TCP BTL with the spawn commands instead of the uDAPL BTL. For example:
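% mpirun --mca btl tcp,sm,self -np 4 ./a.out

(The process count and program name are illustrative; the btl list selects TCP between nodes while keeping the shared-memory and self loopback components.)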
Copyright © 2009 Sun Microsystems, Inc. All rights reserved.