Sun HPC ClusterTools 3.0 software is an integrated ensemble of parallel development tools that extend Sun's network computing solutions to high-end distributed-memory applications. The Sun HPC ClusterTools products are teamed with LSF Suite 3.2.3, Platform Computing Corporation's resource management software.
The Sun HPC ClusterTools 3.0 software runs under Solaris 2.6 or Solaris 7 (32-bit or 64-bit).
LSF Suite 3.2.3 is a collection of resource-management products that provide distributed batch scheduling, load balancing, job execution, and job termination services across a network of computers. The LSF products required by Sun HPC ClusterTools 3.0 software are: LSF Base, LSF Batch, and LSF Parallel.
LSF Base - Provides the fundamental services upon which LSF Batch and LSF Parallel depend. It supplies cluster configuration information as well as the up-to-date resource and load information needed for efficient job allocation. It also supports interactive job execution.
LSF Batch - Performs batch job processing, load balancing, and policy-based resource allocation.
LSF Parallel - Extends the LSF Base and Batch services with support for parallel jobs.
Refer to the LSF Administrator's Guide for a fuller description of LSF Base and LSF Batch and to the LSF Parallel User's Guide for more information about LSF Parallel.
LSF supports interactive batch execution of Sun HPC jobs as well as the conventional batch method. Interactive batch mode allows users to submit jobs through the LSF Batch system and remain attached to the job throughout execution.
Sun MPI is a highly optimized version of the Message-Passing Interface (MPI) communications library. Sun MPI implements all of the MPI 1.2 standard as well as a significant subset of the MPI 2.0 feature list. For example, Sun MPI provides the following features:
Integration with Platform Computing's Load Sharing Facility (LSF).
Support for multithreaded programming.
Seamless use of different network protocols; for example, code compiled on a Sun HPC system that has a Scalable Coherent Interface (SCI) network can be run without change on a cluster that has an ATM network.
Multiprotocol support such that MPI picks the fastest available medium for each type of connection (such as shared memory, SCI, or ATM).
Communication via shared memory for fast performance on clusters of SMPs.
Finely tunable shared-memory communication.
Optimized collectives for symmetric multiprocessors (SMPs).
Prism support - Users can develop, run, and debug programs in the Prism programming environment.
MPI I/O support for parallel file I/O.
Sun MPI is a dynamic library.
Sun MPI and MPI I/O provide full F77, C, and C++ support and basic F90 support.
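As a concrete illustration of the message-passing model described in the feature list above, the following minimal sketch uses only standard MPI calls that Sun MPI implements (MPI_Init, MPI_Send, MPI_Recv, MPI_Allreduce). The program name, message contents, and output text are arbitrary examples rather than part of the product; compile and link the program against Sun MPI as described in the Sun MPI documentation.

    /* Minimal Sun MPI sketch: point-to-point message passing plus one
       collective operation. Uses only standard MPI calls. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, token, sum;
        MPI_Status status;

        MPI_Init(&argc, &argv);               /* start the MPI run time */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's rank */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* number of processes */

        if (size > 1) {
            if (rank == 0) {
                token = 42;                   /* arbitrary payload */
                MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            } else if (rank == 1) {
                MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
                printf("rank 1 received %d from rank 0\n", token);
            }
        }

        /* Collective sum of all ranks; within an SMP node Sun MPI can
           carry this traffic over shared memory. */
        MPI_Allreduce(&rank, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum of ranks 0..%d = %d\n", size - 1, sum);

        MPI_Finalize();
        return 0;
    }

Because protocol selection happens inside the library, the same program can run over shared memory, SCI, or ATM without source changes.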
The Parallel File System (PFS) component of Sun HPC ClusterTools provides high-performance file I/O for multiprocess applications running in a cluster-based, distributed-memory environment.
PFS file systems closely resemble UFS file systems, but provide significantly higher file I/O performance by striping files across multiple PFS I/O server nodes. This means the time required to read or write a PFS file can be reduced by an amount roughly proportional to the number of file server nodes in the PFS file system.
PFS is optimized for the large files and complex data access patterns that are characteristic of parallel scientific applications.
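PFS files are typically reached from parallel applications through the MPI I/O interface noted in the Sun MPI feature list. The sketch below, offered only as an assumption-laden illustration, shows each process writing a disjoint block of one shared file; the path /pfs/demo.dat and the 64-byte block size are hypothetical, and the striping of the file across I/O server nodes is configured administratively rather than in the program.

    /* Hedged sketch of parallel file I/O through standard MPI I/O calls.
       Each rank writes its own block at a disjoint offset, so the writes
       can be serviced in parallel by the file system's I/O server nodes. */
    #include <string.h>
    #include <mpi.h>

    #define BLOCK 64                      /* arbitrary example block size */

    int main(int argc, char *argv[])
    {
        int rank;
        char block[BLOCK];
        MPI_File fh;
        MPI_Offset offset;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Fill this rank's block with an identifiable pattern. */
        memset(block, 'A' + (rank % 26), BLOCK);

        /* All processes open the same (hypothetical) PFS file together. */
        MPI_File_open(MPI_COMM_WORLD, "/pfs/demo.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);

        /* Disjoint offsets: rank r owns bytes [r*BLOCK, (r+1)*BLOCK). */
        offset = (MPI_Offset) rank * BLOCK;
        MPI_File_write_at(fh, offset, block, BLOCK, MPI_CHAR, &status);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }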
Prism is the Sun HPC graphical programming environment. It allows you to develop, execute, debug, and visualize data in message-passing programs. With Prism you can
Control program execution, such as:
Start and stop execution.
Set breakpoints and traces.
Print values of variables and expressions.
Display the call stack.
Visualize data in various formats.
Analyze performance of MPI programs.
Control entire multiprocess parallel jobs, aggregating processes into meaningful groups, called process sets or psets.
Prism can be used with applications written in F77, F90, C, and C++.
The Sun Scalable Scientific Subroutine Library (Sun S3L) provides a set of parallel and scalable functions and tools that are used widely in scientific and engineering computing. It is built on top of MPI and provides the following functionality for Sun MPI programmers:
Vector and dense matrix operations (level 1, 2, 3 Parallel BLAS).
Iterative solvers for sparse systems.
Matrix-vector multiply for sparse systems.
FFT.
LU factor and solve.
Autocorrelation.
Convolution/deconvolution.
Tridiagonal solvers.
Banded solvers.
Eigensolvers.
Singular value decomposition.
Least squares.
One-dimensional sort.
Multidimensional sort.
Selected ScaLAPACK and BLACS application program interfaces.
Conversion between ScaLAPACK and S3L.
Matrix transpose.
Random number generators (linear congruential and lagged Fibonacci).
Random number generator and I/O for sparse systems.
Matrix inverse.
Array copy.
Safety mechanism.
An array syntax interface callable from message-passing programs.
Toolkit functions for operations on distributed data.
Support for the multiple instance paradigm (allowing an operation to be applied concurrently to multiple, disjoint data sets in a single call).
Thread safety.
Detailed programming examples and support documentation provided online.
Sun S3L routines can be called from applications written in F77, F90, C, and C++.
The Sun HPC ClusterTools 3.0 release supports the following Sun compilers:
Sun WorkShop Compilers C/C++ 4.2 (also included in Sun Visual WorkShop C++ 3.0)
Sun WorkShop Compilers Fortran 4.2 (also included in Sun Performance WorkShop Fortran 3.0)
Sun Visual WorkShop C++ 5.0
Sun Performance WorkShop Fortran 5.0
The Cluster Console Manager is a suite of applications (cconsole, ctelnet, and crlogin) that simplify cluster administration by enabling the administrator to initiate commands on all nodes in the cluster simultaneously. When invoked, the selected Cluster Console Manager application opens a master window and a set of terminal windows, one for each node in the cluster. Any command entered in the master window is broadcast to all the nodes in the cluster. The commands are echoed in the terminal windows, as are messages received from the respective nodes.
The Switch Management Agent (SMA) supports management of the Scalable Coherent Interface (SCI), including SCI session management and various link and switch states.