CHAPTER 1
Introduction to Sun HPC ClusterTools Software
Sun HPC ClusterTools 6 software is a set of parallel development tools that extend the Sun network computing solutions to high-end distributed-memory applications. This chapter summarizes its required configuration and principal components. It contains the following sections:
Sun HPC ClusterTools 6 software requires the Solaris 10 Operating System (Solaris 10 OS). All programs that execute under the Solaris 10 OS will execute in the Sun HPC ClusterTools environment.
Sun HPC ClusterTools 6 software supports Sun Studio 8, 9, 10, and 11 C, C++, and Fortran compilers.
Sun HPC ClusterTools 6 software can run MPI jobs of up to 2048 processes on as many as 256 nodes. It also provides load balancing and support for spawning MPI processes.
The Sun HPC ClusterTools software runs on clusters connected by any TCP/IP-capable interconnect, such as high-speed Ethernet, Gigabit Ethernet, and InfiniBand.
Sun HPC ClusterTools 6 software provides a command-line interface, called the CRE (Cluster Runtime Environment), that starts jobs and provides status information. It performs four primary operations:
Each of these operations is summarized below. Subsequent chapters contain the procedures.
Sun HPC ClusterTools 6 software can start both serial and parallel jobs. It is particularly useful for balancing computing load among serial jobs executed across shared partitions, where multiple processes may compete for the same node resources. The syntax and use of mprun are described in Chapter 4.
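For illustration, a minimal mprun invocation might look like the following; the executable name and process count are placeholders, and the full set of options is described in Chapter 4.

```sh
# Launch a parallel job with four processes under the CRE
# (a.out stands in for your own executable)
% mprun -np 4 a.out
```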
The runtime environment uses the mpkill command to kill jobs in progress and send signals to those jobs. Its syntax and use are described in Chapter 6.
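As a sketch, terminating a running job might look like the following; the job ID shown is hypothetical, and the exact signal-option syntax is covered in Chapter 6.

```sh
# Kill the job whose CRE job ID is 57
% mpkill 57
```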
The runtime environment uses the mpps command to display information about jobs and their processes. Its syntax and use are described in Chapter 7.
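For example, a status query might look like the following; the output columns and options vary by release and are described in Chapter 7.

```sh
# List the current user's jobs and their states
% mpps
```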
The runtime environment uses the mpinfo command to display information about nodes and their partitions. Its syntax and use are described in Chapter 10.
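As an illustrative sketch (the option letters shown are assumptions; confirm them against the mpinfo syntax in Chapter 10):

```sh
# Display information about the nodes in the cluster
% mpinfo -N
# Display information about the configured partitions
% mpinfo -P
```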
Sun HPC ClusterTools 6 software provides new integration facilities with three selected distributed resource management (DRM) systems. These systems provide proper resource allocation, parallel job control and monitoring, and proper job accounting. They are:
Support for DRM systems other than those listed above is possible through the use of open APIs. Contact your Sun representative for more information.
You can launch parallel jobs directly from these distributed resource management systems. The DRM interacts closely with the Sun CRE, both for proper resource description and for managing the multiple processes that comprise the requested parallel job.
For a description of the scalable and open architecture of the DRM integration facilities, see How the CRE Environment Is Integrated With Distributed Resource Management Systems. For instructions, see Chapter 5.
Sun MPI is a highly optimized version of the Message Passing Interface (MPI) communications library. It implements all of the MPI 1.2 Standard and the MPI 2.0 Standard. Its highlights are:
TotalView is a third-party multiprocess debugger from Etnus that runs on many platforms. Support for using the TotalView debugger on Sun MPI applications includes:
Refer to the TotalView documentation at http://www.etnus.com for more information about using TotalView.
MPProf is a message-passing profiler intended for use with Sun MPI programs. It extracts information about calls to Sun MPI routines, storing the data in a set of intermediate files, one file per process. It then uses the intermediate data to generate a report profiling the program's message-passing activity.
MPProf's data-gathering operations are enabled by setting an environment variable before running the user program. If this environment variable is not set, program execution proceeds without generating profiling data. The MPProf report generator is invoked with the command-line utility, mpprof. The report is an ASCII text file that provides the following types of information:
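As a sketch of the typical workflow (the variable name MPI_PROFILE and the index-file argument are assumptions here; consult the MPProf documentation for the exact settings in your release):

```sh
# Enable profiling data collection, then run the program
% MPI_PROFILE=1 mprun -np 4 a.out
# Generate a report from the index file written at run time
# (index-file stands in for the actual file name produced)
% mpprof index-file
```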
MPProf also includes a data conversion utility, mpdump, which converts the intermediate data to user-readable ASCII files with the data in a raw (unanalyzed) state. You can then use the mpdump output files as input to a report generator, which you would supply in place of mpprof.
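A hedged sketch of that conversion step follows; the index-file argument is a placeholder for the file produced during the profiled run.

```sh
# Convert the intermediate profiling data to raw ASCII files,
# suitable as input to a user-supplied report generator
% mpdump index-file
```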
The MPProf tool is best suited for code analysis situations where message-passing behavior is of primary interest and where simplicity and ease-of-use are also important. For a comprehensive analysis of a complex MPI program, you would need to use MPProf in combination with other profiling tools. For example,
Copyright © 2006, Sun Microsystems, Inc. All Rights Reserved.