Figures

FIGURE 4-1 Basic Ring Sending Algorithm

FIGURE 4-2 Basic Ring Sending Algorithm With Synchronization

FIGURE 4-3 Bandwidth as a function of message size for Algorithms 1 and 2.

FIGURE 4-4 Bandwidth as a function of message size with directed polling.

FIGURE 4-5 Bandwidth as a function of message size with nonblocking operations.

FIGURE 4-6 Bandwidth as a function of message size with highly synchronized processes.

FIGURE 4-7 Bandwidth as a function of message size with load imbalance.

FIGURE 4-8 Bandwidth as a function of message size with MPI_Testall calls.

FIGURE 4-9 Bandwidth as a function of computation between MPI_Testall calls.

FIGURE 5-1 Matrix Transposition

FIGURE 5-2 Matrix Transposition, Distributed Over Two Processes

FIGURE 5-3 Matrix Transposition, Distributed Over Two Processes (Fortran Perspective)

FIGURE 5-4 Matrix Transposition, Maximal Aggregation for 4x4 Transposition

FIGURE 6-1 Array Distribution Examples for a One-Dimensional Array

FIGURE 6-2 Array Distribution Examples for a Two-Dimensional Array

FIGURE 6-3 LOCAL,BLOCK Distribution of a 16x16 Array Across 4 Processes

FIGURE 6-4 LOCAL,CYCLIC Distribution of a 16x16 Array Across 4 Processes

FIGURE 6-5 Examples of Column- and Row-Major Ordering for a 4x3 Process Grid

FIGURE 6-6 Process Grid and Runtime Mapping Phases (Column-Major Process Grid)

FIGURE 8-1 Relationship Between Bisection Bandwidth and Number of Nodes

FIGURE 9-1 Performance Analyzer--Main View

FIGURE 9-2 Performance Analyzer--Main View With Tracing Enabled

FIGURE 9-3 Performance Analyzer--Source View

FIGURE 9-4 Performance Analyzer--Callers-Callees View

FIGURE 9-5 Performance Analyzer--Timeline View

FIGURE 9-6 Performance Analyzer--Timeline View With Callstack

FIGURE 9-7 Examples of Functions That Might Appear in Profiles

FIGURE A-1 Blocking Sends Interrupt Computation

FIGURE A-2 Nonblocking Operations Overlap With Computation

FIGURE A-3 Computational Resources Devoted Either to Computation or to MPI Operations

FIGURE A-4 Progress Made on Multiple Messages by a Single MPI Call That Does Not Explicitly Reference the Other Messages

FIGURE A-5 Snapshot of a Pipelined Message

FIGURE A-6 A Medium-Size Message Using Only One Postbox

FIGURE A-7 A Short Message Squeezing Data Into the Postbox -- No Buffers Used

FIGURE A-8 First Snapshot of a Cyclic Message

FIGURE A-9 Second Snapshot of a Cyclic Message

FIGURE A-10 Shared-Memory Resources Dedicated per Connection

FIGURE A-11 Shared-Memory Resources per Sender -- Example of Send-Buffer Pools

FIGURE A-12 Eager Message-Passing Protocol

FIGURE A-13 Rendezvous Message-Passing Protocol

FIGURE A-14 A Short RSM Message

FIGURE A-15 A Pipelined RSM Message

FIGURE A-16 Broadcast With Binary Fan-Out, First Example

FIGURE A-17 Broadcast With Binary Fan-Out, Second Example

FIGURE A-18 Broadcast With Binary Fan-Out, Third Example

FIGURE A-19 Broadcast With Binary Fan-Out, Fourth Example

FIGURE A-20 Broadcast Over Shared Memory With Binary Fan-Out, First Case

FIGURE A-21 Broadcast Over Shared Memory With Binary Fan-Out, Second Case

FIGURE A-22 Tree Broadcast versus Pipelined Broadcast of a Large Message

FIGURE B-1 Message of B Bytes Sent Over Shared Memory