Sun MPI 4.0 User's Guide: With CRE

Partitions

The system administrator can configure the nodes in a Sun HPC cluster into one or more logical sets, called partitions.


Note -

The CPUs in a Sun HPC 10000 server can be configured into logical nodes, called domains. These domains can be logically grouped to form partitions, which the CRE treats in the same way as partitions containing other types of Sun HPC nodes.


Any job launched on a partition will run on one or more nodes in that partition, but not on nodes in any other partition. Partitioning a cluster allows multiple jobs to execute concurrently on different partitions, without any risk of jobs on different partitions interfering with each other. This ability to isolate jobs can be beneficial in various ways.

If you want your job to execute on a specific partition, the CRE provides several methods for selecting the partition.

It is possible for cluster nodes to not belong to any partition. If you log in to one of these independent nodes and do not request a particular partition, the CRE will launch your job on the cluster's default partition. This is a partition whose name is specified by the SUNHPC_PART environment variable or is defined by an internal attribute that the system administrator can set.
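
The fallback described above can be sketched in Python. This is only an illustrative model, not CRE code: the function name default_partition and the admin_default parameter are hypothetical, and checking SUNHPC_PART before the administrator-defined attribute is an assumption, since the text names both sources without stating an order.

```python
import os


def default_partition(admin_default, env=None):
    """Return the partition name a job would fall back to.

    Hypothetical model: use SUNHPC_PART when it is set and non-empty,
    otherwise use the administrator-defined default (assumed order).
    """
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    name = env.get("SUNHPC_PART", "").strip()
    return name if name else admin_default
```

For example, default_partition("part0", env={"SUNHPC_PART": "part1"}) selects part1, while an empty environment yields part0. Both partition names are made up for illustration.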

The system administrator can also selectively enable and disable partitions. Jobs can be executed only on enabled partitions. This restriction makes it possible to define many partitions in a cluster while keeping only a few active at any one time.


Note -

It is also possible for a node to belong to more than one partition, as long as only one of those partitions is enabled at a time.
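
The rule in this note can be expressed as a small consistency check. The sketch below is a hypothetical model of a cluster configuration, not a CRE interface: the function name overlapping_enabled, the dictionary layout, and the partition and node names in the usage note are all assumptions made for illustration.

```python
from itertools import combinations


def overlapping_enabled(partitions):
    """Find pairs of enabled partitions that share nodes.

    partitions maps a partition name to a tuple (enabled, nodes),
    where enabled is a bool and nodes is an iterable of node names.
    Returns a list of (name_a, name_b, shared_nodes) tuples.
    """
    # Disabled partitions may overlap freely, so only enabled ones are kept.
    enabled = {name: set(nodes)
               for name, (on, nodes) in partitions.items() if on}
    conflicts = []
    for a, b in combinations(sorted(enabled), 2):
        shared = enabled[a] & enabled[b]
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts
```

With partitions p1 = (True, {"n0", "n1"}) and p2 = (False, {"n1", "n2"}), the check returns an empty list, because the shared node n1 belongs to only one enabled partition. Enabling p2 as well would report the pair (p1, p2) with n1 as the shared node.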


In addition to enabling and disabling partitions, the system administrator can set and unset other partition attributes that influence various aspects of how the partition functions. For example, if you have an MPI job that requires dedicated use of a set of nodes, you could run it on a partition that the system administrator has configured to accept only one job at a time.

The administrator could configure a different partition to allow multiple jobs to execute concurrently. This shared partition would be used for code development or other jobs that do not require exclusive use of their nodes.
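
The difference between a dedicated and a shared partition can be modeled as a simple admission check. This is a minimal sketch under stated assumptions, not the CRE's scheduler: the Partition class, the max_jobs limit, and the method names are hypothetical and do not correspond to actual CRE attribute names.

```python
class Partition:
    """Toy model of a partition with an optional concurrent-job limit."""

    def __init__(self, name, enabled=True, max_jobs=None):
        self.name = name
        self.enabled = enabled
        self.max_jobs = max_jobs  # None means no limit (shared use)
        self.running = set()

    def try_start(self, job_id):
        """Admit the job if the partition is enabled and below its limit."""
        if not self.enabled:
            return False
        if self.max_jobs is not None and len(self.running) >= self.max_jobs:
            return False
        self.running.add(job_id)
        return True

    def finish(self, job_id):
        """Mark a job as done, freeing its slot."""
        self.running.discard(job_id)
```

In this model, Partition("dedicated", max_jobs=1) admits a second job only after the first one finishes, while Partition("shared") admits both at once. A partition created with enabled=False rejects every job.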


Note -

Although a job cannot be run across partition boundaries, it can be run on a partition plus independent nodes.