CHAPTER 2
Fundamental Concepts
This chapter summarizes a few basic concepts that you should understand to get the most out of Sun’s HPC ClusterTools software.
High performance computing clusters[1] are groups of servers interconnected by any Sun-supported, TCP/IP-capable interconnect. Each server in a cluster is called a node. A cluster can consist of a single node.
When using ORTE, you can select the cluster and nodes on which your MPI programs will run and how your processes will be distributed among them. For instructions, see Chapter 4, “Running Programs With the mpirun Command.”
For more information about how Open MPI allocates computing resources, see the FAQ entitled “Running MPI Jobs” at:
http://www.open-mpi.org/faq/?category=running
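The effect of process distribution can be pictured with a small model. The following Python sketch is illustrative only and is not Open MPI code. It assumes two scheduling policies of the kind described in the Open MPI FAQ (filling each node slot by slot, or placing processes round-robin across nodes). The function names, node names, and slot counts are hypothetical, and the wrap-around behavior when processes outnumber slots is a simplification.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, int]  # (hypothetical node name, number of slots)


def _check(nodes: List[Node]) -> None:
    # Guard against an endless loop when no slots are available.
    if sum(slots for _, slots in nodes) <= 0:
        raise ValueError("at least one slot is required")


def map_by_slot(nodes: List[Node], nprocs: int) -> Dict[int, str]:
    """Fill every slot on one node before moving on to the next node."""
    _check(nodes)
    placement: Dict[int, str] = {}
    rank = 0
    while rank < nprocs:
        # Each pass walks all nodes; extra passes model oversubscription.
        for name, slots in nodes:
            for _ in range(slots):
                if rank == nprocs:
                    return placement
                placement[rank] = name
                rank += 1
    return placement


def map_by_node(nodes: List[Node], nprocs: int) -> Dict[int, str]:
    """Place one process per node in turn, skipping nodes that are full."""
    _check(nodes)
    placement: Dict[int, str] = {}
    used = {name: 0 for name, _ in nodes}  # assumes unique node names
    rank = 0
    while rank < nprocs:
        progressed = False
        for name, slots in nodes:
            if rank == nprocs:
                break
            if used[name] < slots:
                placement[rank] = name
                used[name] += 1
                rank += 1
                progressed = True
        if not progressed:
            # Every slot is taken: start counting again (oversubscription).
            used = {name: 0 for name, _ in nodes}
    return placement


if __name__ == "__main__":
    cluster = [("node0", 2), ("node1", 2)]
    print("by slot:", map_by_slot(cluster, 4))
    print("by node:", map_by_node(cluster, 4))
```

With two nodes of two slots each and four processes, the slot-by-slot policy puts ranks 0 and 1 on the first node, whereas the round-robin policy alternates ranks between the two nodes. Chapter 4 describes the actual mpirun options that control this behavior.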
Open MPI allows you to control several aspects of job and process execution. For tasks and instructions, see Chapter 4.
As described in Chapter 1, the Open MPI/Sun HPC ClusterTools 7.1 environment provides close integration between ORTE and several different distributed resource management (DRM) systems.
The integration process is similar for all DRM systems, with some individual differences. At run time, mpirun calls the specified DRM system (launcher), which in turn launches the job.
For information on the ways in which mpirun interacts with DRM systems, see Chapter 4. In addition, see the FAQ on running MPI jobs at:
http://www.open-mpi.org/faq/?category=running
Instructions for script-based and interactive job launching are provided in Chapter 5.
The exact instructions vary from one resource manager to another, and are affected by your Open MPI configuration, but they all follow these general guidelines:
1. You can launch the job either interactively or through a script. Instructions for both are provided in Chapter 4 and Chapter 5.
2. You can enter the DRM processing environment (for example, Sun Grid Engine) before launching jobs with mpirun.
3. You can reserve resources for the parallel job and set other job control parameters from within the DRM, or use a hosts file to specify the parameters.
For more information about launching programs using ORTE or Sun Grid Engine, see Chapters 4 and 5.
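As a rough illustration of the hosts-file approach mentioned in the guidelines above, the following Python sketch parses a hosts file and assembles an mpirun command line. It assumes the common Open MPI hosts-file layout of one host per line with an optional `slots=N` field and `#` comments, and it assumes the `-np` and `--hostfile` options of mpirun. The helper names, host names, file name, and program name are hypothetical; see Chapter 4 for the options your installation supports.

```python
from typing import List, Tuple


def parse_hostfile(text: str) -> List[Tuple[str, int]]:
    """Return (host, slots) pairs from hosts-file text.

    Assumed format: 'hostname [slots=N] ...' per line, '#' starts a comment.
    A host with no slots field is treated as having one slot.
    """
    hosts: List[Tuple[str, int]] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        slots = 1
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key == "slots":
                slots = int(value)
        hosts.append((fields[0], slots))
    return hosts


def build_mpirun_command(hostfile_path: str, nprocs: int, program: str) -> List[str]:
    """Assemble an argument list; nothing is executed here."""
    return ["mpirun", "-np", str(nprocs), "--hostfile", hostfile_path, program]


if __name__ == "__main__":
    sample = "# hypothetical cluster\nnode0 slots=2\nnode1 slots=2\n"
    hosts = parse_hostfile(sample)
    total_slots = sum(slots for _, slots in hosts)
    print(hosts)
    print(" ".join(build_mpirun_command("my_hosts", total_slots, "./a.out")))
```

Building the command as a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to a process launcher.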
The Solaris 10 Operating System (Solaris 10 OS) enables you to create secure, isolated areas within a single instance of the Solaris 10 OS. These areas, called zones, provide secure environments for running applications. Applications that execute in one zone cannot monitor or affect activity in another zone. You can create multiple non-global zones to run as virtual instances of the Solaris OS on the same hardware.
The global zone is the default zone for the Solaris system. You install Sun HPC ClusterTools software into the global zone. Any non-global zones running under that Solaris system “inherit” that installation. This means that you can install and configure Sun HPC ClusterTools software, and compile, run, and debug your programs, in either a global or a non-global zone.