Edit the Netif section to specify various characteristics of the network interfaces that are used by the nodes in the cluster. Example 3-4 illustrates the default Netif section that is in hpc.conf.template. This section discusses the various network interface attributes that are defined in the Netif section.
Begin Netif
NAME     RANK  MTU    STRIPE  PROTOCOL  LATENCY  BANDWIDTH
midnn    0     16384  0       tcp       20       150
idn      10    16384  0       tcp       20       150
scin     20    32768  1       tcp       20       150
:        :     :      :       :         :        :
scid     40    32768  1       tcp       20       150
:        :     :      :       :         :        :
scirsm   45    32768  1       rsm       20       150
:        :     :      :       :         :        :
:        :     :      :       :         :        :
smc      220   4096   0       tcp       20       150
End Netif
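As an illustration of the layout shown in the example, the following minimal Python sketch reads a Netif section into one record per interface. It is not part of the product. The names `NetifEntry` and `parse_netif` are hypothetical, and the sketch assumes that every interface row carries exactly seven whitespace-separated fields, as in the example above.

```python
from typing import NamedTuple


class NetifEntry(NamedTuple):
    """One row of the Netif section (hypothetical helper type)."""
    name: str
    rank: int
    mtu: int
    stripe: int
    protocol: str
    latency: int
    bandwidth: int


def parse_netif(text: str) -> list[NetifEntry]:
    """Collect the interface rows found between 'Begin Netif' and 'End Netif'."""
    entries = []
    in_section = False
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if fields[:2] == ["Begin", "Netif"]:
            in_section = True
        elif fields[:2] == ["End", "Netif"]:
            break
        elif in_section and len(fields) == 7 and fields[1].isdigit():
            name, rank, mtu, stripe, protocol, latency, bandwidth = fields
            entries.append(NetifEntry(name, int(rank), int(mtu), int(stripe),
                                      protocol, int(latency), int(bandwidth)))
        # Any other line (the column header, ':' elision rows) is skipped.
    return entries


SAMPLE = """\
Begin Netif
NAME     RANK  MTU    STRIPE  PROTOCOL  LATENCY  BANDWIDTH
midnn    0     16384  0       tcp       20       150
scirsm   45    32768  1       rsm       20       150
:        :     :      :       :         :        :
smc      220   4096   0       tcp       20       150
End Netif
"""

entries = parse_netif(SAMPLE)
for entry in entries:
    print(entry.name, entry.rank, entry.protocol)
```

The header row and the colon rows are skipped because their second field is not a number.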
Add to the first column the names of the network interfaces that are used in your cluster. The supplied Netif section contains an extensive list of commonly used interface types to simplify this task.
By convention, network interface names include a trailing number as a way to distinguish multiple interfaces of the same type. For example, if your cluster includes two 100 Mbit/second Ethernet networks, include the names hme0 and hme1 in the Netif section.
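The trailing-number convention can be sketched in a few lines of Python. The helpers `split_interface_name` and `group_by_type` are hypothetical names used only for illustration. They assume that an interface name consists of a type prefix followed by one or more digits.

```python
import re
from collections import defaultdict


def split_interface_name(name: str) -> tuple[str, int]:
    """Split a name such as 'hme1' into its type ('hme') and instance number (1)."""
    match = re.fullmatch(r"(.*?\D)(\d+)", name)
    if match is None:
        raise ValueError(f"{name!r} does not end in an instance number")
    return match.group(1), int(match.group(2))


def group_by_type(names: list[str]) -> dict[str, list[int]]:
    """Group interface names by type, listing the instance numbers of each type."""
    groups = defaultdict(list)
    for name in names:
        kind, index = split_interface_name(name)
        groups[kind].append(index)
    return {kind: sorted(indexes) for kind, indexes in groups.items()}


# Two Ethernet interfaces of the same type, as in the example above.
print(group_by_type(["hme0", "hme1"]))  # {'hme': [0, 1]}
```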
Decide the order in which you want the networks in your cluster to be preferred for use and then edit the RANK column entries to implement that order.
Network preference is based on the relative value of a network interface's ranking, with higher preference being given to interfaces with lower rank values. In other words, an interface with a rank of 10 will be selected for use over interfaces with ranks of 11 or higher, but interfaces with ranks of 9 or less will have a higher preference.
These ranking values are relative; their absolute values have no significance. The default rankings leave gaps for this reason: a new interface can be given an unused rank value without changing any existing values.
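The ranking rule can be sketched in Python as follows. This is an illustration, not the software's actual selection code. The function `preferred_interface` and the interface name `newif` are hypothetical. The text does not describe how equal ranks are handled, so the sketch assumes that ties are broken alphabetically by name.

```python
def preferred_interface(ranks: dict[str, int], available: set[str]) -> str:
    """Return the available interface with the lowest rank value."""
    candidates = [name for name in available if name in ranks]
    if not candidates:
        raise ValueError("no ranked interface is available")
    # Lower rank wins; the alphabetical tie-break is an assumption of this sketch.
    return min(candidates, key=lambda name: (ranks[name], name))


# Rank values taken from the default Netif section.
ranks = {"idn": 10, "scin": 20, "scid": 40, "smc": 220}

print(preferred_interface(ranks, {"scid", "smc"}))  # scid

# A new interface fits into the gap between 20 and 40 without renumbering.
ranks["newif"] = 30  # hypothetical interface name
print(preferred_interface(ranks, {"scid", "smc", "newif"}))  # newif
```

Only the ordering of the values matters. Replacing 10, 20, 30, 40, and 220 with 1, 2, 3, 4, and 5 would produce the same choices.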
Decisions about how to rank two or more dissimilar network types are usually based on site-specific conditions and requirements. Ordinarily, a cluster's fastest network is given preferential ranking over slower networks. However, raw network bandwidth is only one consideration. For example, an administrator might dedicate a network that offers very low latency, but not the highest bandwidth, to all intra-cluster communication and use a higher-capacity network for connecting the cluster to systems outside the cluster.
The MTU column is a placeholder. Its contents are not used at this time.
If your cluster includes an SCI (Scalable Coherent Interface) network, you can implement scalable communication between cluster nodes by striping MPI messages over the SCI interfaces. In striped communication, a message is split into smaller packets and transmitted in two or more parallel streams over a set of network interfaces that have been logically combined into a stripe-group.
The STRIPE column allows the administrator to include individual SCI network interfaces in a stripe-group pool. Members of this pool are available to be included in logical stripe groups. These stripe groups are formed on an as-needed basis, selecting interfaces from this stripe-group pool.
To include an SCI interface in a stripe-group pool, set its STRIPE value to 1. To exclude an interface from the pool, specify 0. Up to four SCI network interface cards per node can be configured for stripe-group membership.
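The STRIPE settings can be checked with a short Python sketch such as the following. The function `stripe_pool` and the interface names `scid0` through `scid3` are hypothetical. The limit of four is the per-node figure stated above.

```python
MAX_STRIPED_PER_NODE = 4  # per-node limit stated in the text


def stripe_pool(stripe_flags: dict[str, int]) -> list[str]:
    """Return the interfaces whose STRIPE value is 1, after validating the settings."""
    for name, flag in stripe_flags.items():
        if flag not in (0, 1):
            raise ValueError(f"{name}: STRIPE must be 0 or 1, got {flag}")
    pool = sorted(name for name, flag in stripe_flags.items() if flag == 1)
    if len(pool) > MAX_STRIPED_PER_NODE:
        raise ValueError(f"{len(pool)} interfaces exceed the per-node limit")
    return pool


# Hypothetical SCI interfaces on one node; scid3 is kept out of the pool.
node_flags = {"scid0": 1, "scid1": 1, "scid2": 1, "scid3": 0}
print(stripe_pool(node_flags))  # ['scid0', 'scid1', 'scid2']
```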
When a message is submitted for transmission over the SCI network, an MPI protocol module distributes the message over as many SCI network interfaces as are available.
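The idea of striping can be illustrated with a small Python sketch. The text does not describe how the MPI protocol module assigns packets to interfaces, so the round-robin policy, the packet size, and the names `stripe_message`, `reassemble`, `scid0`, and `scid1` are all assumptions made for illustration.

```python
def stripe_message(message: bytes, interfaces: list[str],
                   packet_size: int) -> dict[str, list[bytes]]:
    """Split a message into packets and deal them round-robin over the interfaces."""
    if not interfaces:
        raise ValueError("at least one interface is required")
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    streams = {name: [] for name in interfaces}
    for i, offset in enumerate(range(0, len(message), packet_size)):
        name = interfaces[i % len(interfaces)]
        streams[name].append(message[offset:offset + packet_size])
    return streams


def reassemble(streams: dict[str, list[bytes]], interfaces: list[str]) -> bytes:
    """Interleave the per-interface streams back into the original byte order."""
    packets = []
    longest = max(len(stream) for stream in streams.values())
    for i in range(longest):
        for name in interfaces:
            if i < len(streams[name]):
                packets.append(streams[name][i])
    return b"".join(packets)


stripe_group = ["scid0", "scid1"]  # hypothetical interface names
message = bytes(range(10))
streams = stripe_message(message, stripe_group, packet_size=4)
print({name: len(stream) for name, stream in streams.items()})
print(reassemble(streams, stripe_group) == message)  # True
```

With ten bytes and a packet size of four, the first and third packets travel over one interface and the second packet over the other.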
Stripe-group membership is made optional so you can reserve some SCI network bandwidth for non-striped use. To do so, simply set STRIPE = 0 on the SCI network interface(s) you wish to reserve in this way.
The PROTOCOL column identifies the communication protocol used by the interface. The scirsm interface employs the RSM (Remote Shared Memory) protocol. The other interfaces in the default list all use TCP (Transmission Control Protocol).
If you add a network interface of a type not represented in hpc.conf.template, you must specify the protocol that the new interface uses.
The LATENCY column is a placeholder. Its contents are not used at this time.
The BANDWIDTH column is a placeholder. Its contents are not used at this time.