As a general rule, porting an application program from one system and compiler to another is easier once nonstandard coding has been eliminated. Optimizations or work-arounds that were successful on one system might only confuse compilers on other systems. In particular, hand-tuning for one architecture can degrade performance elsewhere. This is discussed later in the chapters on performance and tuning. However, the following issues are worth considering with regard to porting in general.
Some systems automatically initialize local and COMMON variables to zero or some “not-a-number” (NaN) value. However, there is no standard practice, and programs should not make assumptions regarding the initial value of any variable. To assure maximum portability, a program should initialize all variables.
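As a minimal sketch of explicit initialization (the program name, variable names, and array size are illustrative and do not come from the text), every variable below is assigned before its first use, so the result does not depend on how a particular system fills uninitialized storage:

```fortran
program init_demo
  implicit none
  integer, parameter :: n = 5      ! illustrative size (assumption)
  real    :: total
  real    :: work(n)
  integer :: i

  ! Do not rely on the system zeroing storage:
  ! assign every variable before it is first referenced.
  total = 0.0
  work  = 0.0

  do i = 1, n
     work(i) = real(i)
     total   = total + work(i)
  end do

  print *, 'total =', total
end program init_demo
```

Executable assignments are used here instead of initializers in the declarations because, in Fortran 90 and later, an initializer on a local variable also gives it the SAVE attribute, which changes its behavior across calls.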
Aliasing occurs when the same storage address is referenced by more than one name. This typically happens with pointers, or when actual arguments to a subprogram overlap one another or overlap COMMON variables used within the subprogram. In the following example, dummy arguments X and Z refer to the same storage locations, as do the COMMON array B and the dummy argument H:

      COMMON /INS/B(100)
      REAL S(100), T(100)
      ...
      CALL SUB(S,T,S,B,100)
      ...
      SUBROUTINE SUB(X,Y,Z,H,N)
      REAL X(N),Y(N),Z(N),H(N)
      COMMON /INS/B(100)
      ...

Many “dusty deck” Fortran programs utilized this sort of aliasing as a way of providing some kind of dynamic memory management that was not available in the language at that time.
Avoid aliasing in all portable code. The results could be unpredictable on some platforms and when compiled with optimization levels higher than -O2.
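One way to remove this kind of aliasing, sketched below under the assumption that an extra copy of the data is acceptable, is to give each dummy argument its own storage. The routine `shift` and the array `s_in` are hypothetical names introduced only for this illustration:

```fortran
program no_alias_demo
  implicit none
  integer, parameter :: n = 5      ! illustrative size (assumption)
  real    :: s(n), s_in(n)
  integer :: i

  s = (/ (real(i), i = 1, n) /)

  ! Non-portable: "call shift(s, s, n)" would make both dummy
  ! arguments name the same storage while one of them is defined.
  ! Portable: copy the input to separate storage first.
  s_in = s
  call shift(s_in, s, n)

  print *, s

contains

  ! Hypothetical helper: z receives x shifted up by one element.
  subroutine shift(x, z, m)
    integer, intent(in)  :: m
    real,    intent(in)  :: x(m)
    real,    intent(out) :: z(m)
    z(1)   = x(1)
    z(2:m) = x(1:m-1)
  end subroutine shift

end program no_alias_demo
```

The copy costs memory and time, but the call then conforms to the standard's argument rules and no aliasing option should be needed for it.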
The f95 compiler assumes it is compiling a standard-conforming program. Programs that do not conform strictly to the Fortran standard can introduce ambiguous situations that interfere with the compiler’s analysis and optimization strategies. Some situations can produce erroneous results.
For example, overindexing arrays, use of pointers, or passing global variables as subprogram arguments when also used directly, can result in ambiguous situations that limit the compiler’s ability to generate optimal code that will be correct in all situations.
If you know that your program contains some apparent aliasing situations, you can use the -xalias option to specify the degree to which the compiler should be concerned. In some cases the program will not execute properly when compiled at optimization levels higher than -O2 unless the appropriate -xalias option is specified.
The option flag takes a comma-separated list of keywords that indicate a type of aliasing situation. Each keyword can be prefixed by no% to indicate an aliasing that is not present.
Table 7–2 -xalias Keywords and What They Mean
Here are some examples of typical aliasing situations. At the higher optimization levels (-O3 and above) the f95 compiler can generate better code if your program does not contain the aliasing syndromes shown below and you compile with -xalias=no%keyword.
In some cases you will need to compile with -xalias=keyword to ensure that the generated code produces the correct results.
The following example needs to be compiled with -xalias=dummy:

      parameter (n=100)
      integer a(n)
      common /qq/z(n)
      call sub(a,a,z,n)
      ...
      subroutine sub(a,b,c,n)
      integer a(n), b(n)
      common /qq/z(n)
      a(2:n) = b(1:n-1)
      c(2:n) = z(1:n-1)

The compiler must assume that the dummy variables and the common variable may overlap.
This example works only when compiled with -xalias=craypointer, which is the default:

      parameter (n=20)
      integer a(n)
      integer v1(*), v2(*)
      pointer (p1,v1)
      pointer (p2,v2)
      p1 = loc(a)
      p2 = loc(a)
      a = (/ (i,i=1,n) /)
      ...
      v1(2:n) = v2(1:n-1)

The compiler must assume that these locations can overlap.
Here is an example of Cray pointers that do not overlap. In this case, compile with -xalias=no%craypointer for possibly better performance:

      parameter (n=10)
      integer a(n+n)
      integer v1(n), v2(n)
      pointer (p1,v1)
      pointer (p2,v2)
      p1 = loc(a(1))
      p2 = loc(a(n+1))
      ...
      v1(:) = v2(:)

The Cray pointers do not point to overlapping memory areas.
Compile the following example with -xalias=ftnpointer:

      parameter (n=20)
      integer, pointer :: a(:)
      integer, target :: t(n)
      interface
        subroutine sub(a,b,n)
          integer, pointer :: a(:)
          integer, pointer :: b(:)
        end subroutine
      end interface
      a => t
      a = (/ (i, i=1,n) /)
      call sub(a,a,n)
      ....
      end
      subroutine sub(a,b,n)
      integer, pointer :: a(:)
      real, pointer :: b(:)
      integer i, mold
      forall (i=2:n) a(i) = transfer(b(i-1), mold)

The compiler must assume that a and b can overlap.
Note that in this example the compiler must assume that a and b may overlap, even though they point to data of different data types. This is illegal in standard Fortran. The compiler gives a warning if it can detect this situation.
Compile the following example with -xalias=overindex:

      integer a,z
      common // a(100),z
      z = 1
      call sub(a)
      print*, z
      subroutine sub(x)
      integer x(10)
      x(101) = 2

The compiler may assume that the call to sub can write to z. The program prints 2, and not 1, when compiled with -xalias=overindex.
Overindexing appears in many legacy Fortran 77 programs and should be avoided. In many cases the result will be unpredictable. To ensure correctness, programs should be compiled and tested with the -C (runtime array bounds checking) option to flag any array subscripting problems.
In general, the overindex flag should only be used with legacy Fortran 77 programs. -xalias=overindex does not apply to array syntax expressions, array sections, WHERE, and FORALL statements.
Fortran 95 programs should always conform to the subscripting rules in the Fortran standard to ensure correctness of the generated code. For example, the following code uses ambiguous subscripting in an array syntax expression that will always produce an incorrect result due to the overindexing of the array:
This example of array syntax overindexing DOES NOT GIVE CORRECT RESULTS!

      parameter (n=10)
      integer a(n),b(n)
      common /qq/a,b
      integer c(n)
      integer m, k
      a = (/ (i,i=1,n) /)
      b = a
      c(1) = 1
      c(2:n) = (/ (i,i=1,n-1) /)
      m = n
      k = n + n
C
C     the reference to a is actually a reference into b
C     so this should really be b(2:n) = b(1:n-1)
C
      a(m+2:k) = b(1:n-1)
C     or doing it in reverse
      a(k:m+2:-1) = b(n-1:1:-1)

Intuitively the user might expect array b to now look like array c, but the result is unpredictable.
The -xalias=overindex flag will not help in this situation since the overindex flag does not extend to array syntax expressions. The example compiles, but will not give the correct results. Rewriting this example by replacing the array syntax with the equivalent DO loop will work when compiled with -xalias=overindex. But this kind of programming practice should be avoided entirely.
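For comparison, here is a sketch of a standard-conforming way to express the intended shift, following the hint in the example's own comment that the assignment should really be b(2:n) = b(1:n-1). The program name and the final check are additions for illustration only:

```fortran
program shift_demo
  implicit none
  integer, parameter :: n = 10
  integer :: b(n), c(n)
  integer :: i

  b = (/ (i, i = 1, n) /)
  c(1)   = 1
  c(2:n) = (/ (i, i = 1, n-1) /)

  ! Subscript b directly and stay inside its declared bounds.
  ! The right-hand side of an array assignment is evaluated in
  ! full before any element of the left-hand side is stored.
  b(2:n) = b(1:n-1)

  if (all(b == c)) then
     print *, 'b matches c'
  else
     print *, 'b differs from c'
  end if
end program shift_demo
```

Because every subscript stays within the declared bounds of b, this version should not need any -xalias setting and should report that b matches c.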
The compiler looks ahead to see how local variables are used and then makes assumptions about variables that will not change over a subprogram call. In the following example, pointers used in the subprogram defeat the compiler’s optimization strategy and the results are unpredictable. To make this work properly you need to compile with the -xalias=actual flag:

      program foo
        integer i
        call take_loc(i)
        i = 1
        print * , i
        call use_loc()
        print * , i
      end

      subroutine take_loc(i)
        integer i
        common /loc_comm/ loc_i
        loc_i = loc(i)
      end subroutine take_loc

      subroutine use_loc()
        integer vi
        pointer (pi,vi)
        common /loc_comm/ loc_i
        pi = loc_i
        vi = 3
      end subroutine use_loc

take_loc takes the address of i and saves it away. use_loc uses it. This is a violation of the Fortran standard.
Compiling with the -xalias=actual flag informs the compiler that all arguments to subprograms should be considered global within the compilation unit, causing the compiler to be more cautious with its assumptions about variables appearing as actual arguments.
Programming practices like this that violate the Fortran standard should be avoided.
Specifying -xalias without a list assumes that your program does not violate the Fortran aliasing rules. It is equivalent to asserting no% for all the aliasing keywords.
The compiler default, when compiling without specifying -xalias, is:
-xalias=no%dummy,craypointer,no%actual,no%overindex,no%ftnpointer
If your program uses Cray pointers but conforms to the Fortran aliasing rules whereby the pointer references cannot result in aliasing, even in ambiguous situations, compiling with -xalias may result in generating better optimized code.
Legacy codes may contain source-code restructurings of ordinary computational DO loops intended to cause older vectorizing compilers to generate optimal code for a particular architecture. In most cases, these restructurings are no longer needed and may degrade the portability of a program. Two common restructurings are strip-mining and loop unrolling.
Fixed-length vector registers on some architectures led programmers to manually “strip-mine” the array computations in a loop into segments:

      REAL TX(0:63)
      ...
      DO IOUTER = 1,NX,64
        DO IINNER = 0,63
          TX(IINNER) = AX(IOUTER+IINNER) * BX(IOUTER+IINNER)/2.
          QX(IOUTER+IINNER) = TX(IINNER)**2
        END DO
      END DO

Strip-mining is no longer appropriate with modern compilers; the loop can be written much less obscurely as:

      DO I = 1,NX
        TX = AX(I)*BX(I)/2.
        QX(I) = TX**2
      END DO

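The following sketch runs both forms of the loop on the same data and compares the results. It assumes NX is a multiple of 64, which the strip-mined form requires; the test data and the names q_strip, q_plain, and t are illustrative:

```fortran
program strip_demo
  implicit none
  integer, parameter :: nx = 128   ! assumed to be a multiple of 64
  real    :: ax(nx), bx(nx), q_strip(nx), q_plain(nx)
  real    :: tx(0:63), t
  integer :: i, iouter, iinner

  ! Illustrative test data.
  do i = 1, nx
     ax(i) = real(i)
     bx(i) = 2.0
  end do

  ! Strip-mined form, as written for fixed-length vector registers.
  do iouter = 1, nx, 64
     do iinner = 0, 63
        tx(iinner) = ax(iouter+iinner) * bx(iouter+iinner) / 2.
        q_strip(iouter+iinner) = tx(iinner)**2
     end do
  end do

  ! Plain form: one loop and a scalar temporary.
  do i = 1, nx
     t = ax(i) * bx(i) / 2.
     q_plain(i) = t**2
  end do

  if (all(q_strip == q_plain)) then
     print *, 'both forms agree'
  else
     print *, 'forms differ'
  end if
end program strip_demo
```

Both loops perform the same arithmetic on each element, so the plain form loses nothing and leaves the choice of strip length to the compiler.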
Unrolling loops by hand was a typical source-code optimization technique before compilers were available that could perform this restructuring automatically. A loop written as:

      DO K = 1, N-5, 6
        DO J = 1, N
          DO I = 1,N
            A(I,J) = A(I,J) + B(I,K  ) * C(K  ,J)
     *                      + B(I,K+1) * C(K+1,J)
     *                      + B(I,K+2) * C(K+2,J)
     *                      + B(I,K+3) * C(K+3,J)
     *                      + B(I,K+4) * C(K+4,J)
     *                      + B(I,K+5) * C(K+5,J)
          END DO
        END DO
      END DO
      DO KK = K,N
        DO J =1,N
          DO I =1,N
            A(I,J) = A(I,J) + B(I,KK) * C(KK,J)
          END DO
        END DO
      END DO

should be rewritten the way it was originally intended:

      DO K = 1,N
        DO J = 1,N
          DO I = 1,N
            A(I,J) = A(I,J) + B(I,K) * C(K,J)
          END DO
        END DO
      END DO

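In Fortran 90 and later, the same update can also be written with the standard MATMUL intrinsic, which leaves all restructuring to the compiler and its libraries. The sketch below compares the simple loop nest with the intrinsic on small illustrative data; the array a2 and the test values are assumptions made for the example:

```fortran
program matmul_demo
  implicit none
  integer, parameter :: n = 4      ! illustrative size (assumption)
  real    :: a(n,n), a2(n,n), b(n,n), c(n,n)
  integer :: i, j, k

  ! Illustrative test data.
  do j = 1, n
     do i = 1, n
        b(i,j) = real(i + j)
        c(i,j) = real(i - j)
     end do
  end do
  a  = 1.0
  a2 = a

  ! Simple loop nest, as in the text.
  do k = 1, n
     do j = 1, n
        do i = 1, n
           a(i,j) = a(i,j) + b(i,k) * c(k,j)
        end do
     end do
  end do

  ! Same update written with the standard intrinsic.
  a2 = a2 + matmul(b, c)

  print *, 'max difference:', maxval(abs(a - a2))
end program matmul_demo
```

The reported difference should be zero or within rounding error. Whether the intrinsic or the loop nest runs faster depends on the compiler and options, so treat this as a portability sketch and not as a tuning recommendation.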