This chapter discusses the some issues that may arise when porting “dusty deck” Fortran programs from other platforms to Fortran 95.
Fortran 95 extensions and Fortran 77 compatibility features are described in the Fortran User’s Guide.
Fortran carriage-control grew out of the limited capabilities of the equipment used when Fortran was originally developed. For similar historical reasons, operating systems derived from the UNIX do not have Fortran carriage control, but you can simulate it with the Fortran 95 compiler in two ways.
Use the asa filter to transform Fortran carriage-control conventions into the UNIX carriage-control format (see the asa (1) man page) before printing files with the lpr command.
The FORTRAN 77 compiler f77 allowed OPEN(N, FORM=’PRINT’) to enable single or double spacing, formfeed, and stripping of column one. This is still available by compiling programs using FORM=’PRINT’with the f95 -f77 compatibility flag. The compiler allows you to reopen unit 6 to change the form parameter to PRINT, when compiling with -f77. For example:
OPEN( 6, FORM=’PRINT’) |
You can use lp(1) to print a file that is opened in this manner.
Early Fortran systems did not use named files, but did provide a command line mechanism to equate actual file names with internal unit numbers. This facility can be emulated in a number of ways, including standard UNIX redirection.
Example: Redirecting stdin to redir.data (using csh(1)):
demo% cat redir.data The data file 9 9.9 demo% cat redir.f The source file read(*,*) i, z The program reads standard input print *, i, z stop end demo% f95 -o redir redir.f The compilation step demo% redir < redir.data Run with redirection reads data file 9 9.90000 demo% |
If the application code was originally developed for 64-bit (or 60-bit) mainframes such as CRAY or CDC, you might want to compile these codes with the following options when porting to an UltraSPARC platform, for example:
-fast -m64 -xtypemap=real:64,double:64,integer:64
These options automatically promote all default REAL variables and constants to REAL*8, and COMPLEX to COMPLEX*16. Only undeclared variables or variables declared as simply REAL or COMPLEX are promoted; variables declared explicitly (for example, REAL*4) are not promoted. All single-precision REAL constants are also promoted to REAL*8. (Set -xarch and -xchip appropriately for the target platform.) To also promote default DOUBLE PRECISION data to REAL*16, change the double:64 to double:128 in the -xtypemap example.
See the Fortran User’s Guide or the f95(1) man page for details.
The Fortran User’s Guide, and the Numerical Computation Guide discuss in detail the hardware representation of data objects in Fortran. Differences between data representations across systems and hardware platforms usually generate the most significant portability problems.
The following issues should be noted:
Sun adheres to the IEEE Standard 754 for floating-point arithmetic. Therefore, the first four bytes in a REAL*8 are not the same as in a REAL*4.
The default sizes for reals, integers, and logicals are described in the Fortran 95 standard, except when these default sizes are changed by the -xtypemap option.
Character variables can be freely mixed and equivalenced to variables of other types, but be careful of potential alignment problems.
f95 IEEE floating-point arithmetic will raise exceptions on overflow or divide by zero and signal SIGFPE or trap by default (-ftrap=common is the default with f95). It does deliver IEEE indeterminate forms in cases where exceptions would otherwise be signaled. This is explained in Chapter Chapter 6, Floating-Point Arithmetic.
The extreme finite, normalized values can be determined. See libm_single(3F) and libm_double(3F). The indeterminate forms can be written and read, using formatted and list-directed I/O statements.
Many “dusty-deck” Fortran applications store Hollerith ASCII data into numerical data objects. With the 1977 Fortran standard (and Fortran 95), the CHARACTER data type was provided for this purpose and its use is recommended. You can still initialize variables with the older Fortran Hollerith (nH) feature, but this is not standard practice. The following table indicates the maximum number of characters that will fit into certain data types. (In this table, boldfaced data types indicate default types subject to promotion by the -xtypemap command-line flag.)
Table 7–1 Maximum Characters in Data Types
|
Maximum Number of Standard ASCII Characters |
|
|
|
---|---|---|---|---|
Data Type |
Default |
INTEGER:64 |
REAL:64 |
DOUBLE:128 |
BYTE |
1 |
1 |
1 |
1 |
COMPLEX |
8 |
8 |
16 |
16 |
COMPLEX*16 |
16 |
16 |
16 |
16 |
COMPLEX*32 |
32 |
32 |
32 |
32 |
DOUBLE COMPLEX |
16 |
16 |
32 |
32 |
DOUBLE PRECISION |
8 |
8 |
16 |
16 |
INTEGER |
4 |
8 |
4 |
8 |
INTEGER*2 |
2 |
2 |
2 |
2 |
INTEGER*4 |
4 |
4 |
4 |
4 |
INTEGER*8 |
8 |
8 |
8 |
8 |
LOGICAL |
4 |
8 |
4 |
8 |
LOGICAL*1 |
1 |
1 |
1 |
1 |
LOGICAL*2 |
2 |
2 |
2 |
2 |
LOGICAL*4 |
4 |
4 |
4 |
4 |
LOGICAL*8 |
8 |
8 |
8 |
8 |
REAL |
4 |
4 |
8 |
8 |
REAL*4 |
4 |
4 |
4 |
4 |
REAL*8 |
8 |
8 |
8 |
8 |
REAL*16 |
16 |
16 |
16 |
16 |
Example: Initialize variables with Hollerith:
demo% cat FourA8.f double complex x(2) data x /16Habcdefghijklmnop, 16Hqrstuvwxyz012345/ write( 6, ’(4A8, "!")’ ) x end demo% f95 -o FourA8 FourA8.f demo% FourA8 abcdefghijklmnopqrstuvwxyz012345! demo% |
If needed, you can initialize a data item of a compatible type with a Hollerith and then pass it to other routines.
If you pass Hollerith constants as arguments, or if you use them in expressions or comparisons, they are interpreted as character-type expressions. Use the compiler option -xhasc=no to have the compiler treat Hollerith constants as typeless data in arguments on subprogram calls. This may be needed when porting older Fortran programs.
As a general rule, porting an application program from one system and compiler to another can be made easier by eliminating any nonstandard coding. Optimizations or work-arounds that were successful on one system might only obscure and confuse compilers on other systems. In particular, optimized hand-tuning for one particular architecture can cause degradations in performance elsewhere. This is discussed later in the chapters on performance and tuning. However, the following issues are worth considering with regards to porting in general.
Some systems automatically initialize local and COMMON variables to zero or some “not-a-number” (NaN) value. However, there is no standard practice, and programs should not make assumptions regarding the initial value of any variable. To assure maximum portability, a program should initialize all variables.
Aliasing occurs when the same storage address is referenced by more than one name. This typically happens with pointers, or when actual arguments to a subprogram overlap between themselves or between COMMON variables within the subprogram. For example, arguments X and Z refer to the same storage locations, as do B and H:
COMMON /INS/B(100) REAL S(100), T(100) ... CALL SUB(S,T,S,B,100) ... SUBROUTINE SUB(X,Y,Z,H,N) REAL X(N),Y(N),Z(N),H(N) COMMON /INS/B(100) ... |
Many “dusty deck” Fortran programs utilized this sort of aliasing as a way of providing some kind of dynamic memory management that was not available in the language at that time.
Avoid aliasing in all portable code. The results could be unpredictable on some platforms and when compiled with optimization levels higher than -O2.
The f95 compiler assumes it is compiling a standard-conforming program. Programs that do not conform strictly to the Fortran standard can introduce ambiguous situations that interfere with the compiler’s analysis and optimization strategies. Some situations can produce erroneous results.
For example, overindexing arrays, use of pointers, or passing global variables as subprogram arguments when also used directly, can result in ambiguous situations that limit the compiler’s ability to generate optimal code that will be correct in all situations.
If you know that your program does contain some apparent aliasing situations you can use the -xalias option to specify the degree to which the compiler should be concerned. In some cases the program will not execute properly when compiled at optimization levels higher than -O2 unless the appropriate -xalias option is specified.
The option flag takes a comma-separated list of keywords that indicate a type of aliasing situation. Each keyword can be prefixed by no% to indicate an aliasing that is not present.
Table 7–2 -xalias Keywords and What They Mean
Here some examples of typical aliasing situations. At the higher optimization levels (-O3 and above) the f95 compiler can generate better code if your program does not contain the aliasing syndromes shown below and you compile with -xalias=no%keyword.
In some cases you will need to compile with -xalias=keyword to insure that the code generate will produce the correct results.
The following example needs to be compiled with -xalias=dummy
parameter (n=100) integer a(n) common /qq/z(n) call sub(a,a,z,n) ... subroutine sub(a,b,c,n) integer a(n), b(n) common /qq/z(n) a(2:n) = b(1:n-1) c(2:n) = z(1:n-1) The compiler must assume that the dummy variables and the common variable may overlap. |
This example works only when compiled with -xalias=craypointer, which is the default:
parameter (n=20) integer a(n) integer v1(*), v2(*) pointer (p1,v1) pointer (p2,v2) p1 = loc(a) p2 = loc(a) a = (/ (i,i=1,n) /) ... v1(2:n) = v2(1:n-1) The compiler must assume that these locations can overlap. |
Here is an example of Cray pointers that do not overlap. In this case, compile with -xalias=no%craypointer for possibly better performance:
parameter (n=10) integer a(n+n) integer v1(n), v2(n) pointer (p1,v1) pointer (p2,v2) p1 = loc(a(1)) p2 = loc(a(n+1)) ... v1(:) = v2(:) The Cray pointers to not point to overlapping memory areas. |
Compile the following example with -xalias=ftnpointer
parameter (n=20) integer, pointer :: a(:) integer, target :: t(n) interface subroutine sub(a,b,n) integer, pointer :: a(:) integer, pointer :: b(:) end subroutine end interface a => t a = (/ (i, i=1,n) /) call sub(a,a,n) .... end subroutine sub(a,b,n) integer, pointer :: a(:) real, pointer :: b(:) integer i, mold forall (i=2:n) a(i) = transfer(b(i-1), mold) The compiler must assume that a and b can overlap. |
Note that in this example the compiler must assume that a and b may overlap, even though they point to data of different data types. This is illegal in standard Fortran. The compiler gives a warning if it can detect this situation.
Compile the following example with -xalias=overindex
integer a,z common // a(100),z z = 1 call sub(a) print*, z subroutine sub(x) integer x(10) x(101) = 2 The compiler may assume that the call to sub may write to z The program prints 2, and not 1, when compiled with -xalias=overindex |
Overindexing appears in many legacy Fortran 77 programs and should be avoided. In many cases the result will be unpredictable. To insure correctness, programs should be compiled and tested with the -C (runtime array bounds checking) option to flag any array subscripting problems.
In general, the overindex flag should only be used with legacy Fortran 77 programs. -xalias=overindex does not apply to array syntax expressions, array sections, WHERE, and FORALL statements.
Fortran 95 programs should always conform to the subscripting rules in the Fortran standard to insure correctness of the generated code. For example, the following example uses ambiguous subscripting in an array syntax expression that will always produce an incorrect result due to the overindexing of the array:
This example of array syntax overindexing DOES NOT GIVE CORRECT RESULTS! parameter (n=10) integer a(n),b(n) common /qq/a,b integer c(n) integer m, k a = (/ (i,i=1,n) /) b = a c(1) = 1 c(2:n) = (/ (i,i=1,n-1) /) m = n k = n + n C C the reference to a is actually a reference into b C so this should really be b(2:n) = b(1:n-1) C a(m+2:k) = b(1:n-1) C or doing it in reverse a(k:m+2:-1) = b(n-1:1:-1) Intuitively the user might expect array b to now look like array c, but the result is unpredictable |
The xalias=overindex flag will not help in this situation since the overindex flag does not extend to array syntax expressions. The example compiles, but will not give the correct results. Rewriting this example by replacing the array syntax with the equivalent DO loop will work when compiled with -xalias=overindex. But this kind of programming practice should be avoided entirely.
The compiler looks ahead to see how local variables are used and then makes assumptions about variables that will not change over a subprogram call. In the following example, pointers used in the subprogram defeat the compiler’s optimization strategy and the results are unpredictable. To make this work properly you need to compile with the -xalias=actual flag:
program foo integer i call take_loc(i) i = 1 print * , i call use_loc() print * , i end subroutine take_loc(i) integer i common /loc_comm/ loc_i loc_i = loc(i) end subroutine take_loc subroutine use_loc() integer vi1 pointer (pi,vi) common /loc_comm/ loc_i pi = loc_i vi1 = 3 end subroutine use_loc |
take_loc takes the address of i and saves it away. use_loc uses it. This is a violation of the Fortran standard.
Compiling with the -xalias=actual flag informs the compiler that all arguments to subprograms should be considered global within the compilation unit, causing the compiler to be more cautious with its assumptions about variables appearing as actual arguments.
Programming practices like this that violate the Fortran standard should be avoided.
Specifying -xalias without a list assumes that your program does not violate the Fortran aliasing rules. It is equivalent to asserting no% for all the aliasing keywords.
The compiler default, when compiling without specifying -xalias, is:
-xalias=no%dummy,craypointer,no%actual,no%overindex,no%ftnpointer
If your program uses Cray pointers but conforms to the Fortran aliasing rules whereby the pointer references cannot result in aliasing, even in ambiguous situations, compiling with -xalias may result in generating better optimized code.
Legacy codes may contain source-code restructurings of ordinary computational DO loops intended to cause older vectorizing compilers to generate optimal code for a particular architecture. In most cases, these restructurings are no longer needed and may degrade the portability of a program. Two common restructurings are strip-mining and loop unrolling.
Fixed-length vector registers on some architectures led programmers to manually “strip-mine” the array computations in a loop into segments:
REAL TX(0:63) ... DO IOUTER = 1,NX,64 DO IINNER = 0,63 TX(IINNER) = AX(IOUTER+IINNER) * BX(IOUTER+IINNER)/2. QX(IOUTER+IINNER) = TX(IINNER)**2 END DO END DO |
Strip-mining is no longer appropriate with modern compilers; the loop can be written much less obscurely as:
DO IX = 1,N TX = AX(I)*BX(I)/2. QX(I) = TX**2 END DO |
Unrolling loops by hand was a typical source-code optimization technique before compilers were available that could perform this restructuring automatically. A loop written as:
DO K = 1, N-5, 6 DO J = 1, N DO I = 1,N A(I,J) = A(I,J) + B(I,K ) * C(K ,J) * + B(I,K+1) * C(K+1,J) * + B(I,K+2) * C(K+2,J) * + B(I,K+3) * C(K+3,J) * + B(I,K+4) * C(K+4,J) * + B(I,K+5) * C(K+5,J) END DO END DO END DO DO KK = K,N DO J =1,N DO I =1,N A(I,J) = A(I,J) + B(I,KK) * C(KK,J) END DO END DO END DO |
should be rewritten the way it was originally intended:
DO K = 1,N DO J = 1,N DO I = 1,N A(I,J) = A(I,J) + B(I,K) * C(K,J) END DO END DO END DO |
Library functions that return the time of day or elapsed CPU time vary from system to system.
The time functions supported in the Fortran library are listed in the following table:
Table 7–3 Fortran Time Functions
Name |
Function |
Man Page |
---|---|---|
time |
Returns the number of seconds elapsed since January, 1, 1970 |
time(3F) |
date |
Returns date as a character string |
date(3F) |
fdate |
Returns the current time and date as a character string |
fdate(3F) |
idate |
Returns the current month, day, and year in an integer array |
idate(3F) |
itime |
Returns the current hour, minute, and second in an integer array |
itime(3F) |
ctime |
Converts the time returned by the time function to a character string |
ctime(3F) |
ltime |
Converts the time returned by the time function to the local time |
ltime(3F) |
gmtime |
Converts the time returned by the time function to Greenwich time |
gmtime(3F) |
etime |
Single processor: Returns elapsed user and system time for program execution Multiple processors: Returns the wall clock time |
etime(3F) |
dtime |
Returns the elapsed user and system time since last call to dtime |
dtime(3F) |
date_and_time |
Returns date and time in character and numeric form |
date_and_time(3F) |
For details, see Fortran Library Reference Manual or the individual man pages for these functions. Here is a simple example of the use of these time functions (TestTim.f):
subroutine startclock common / myclock / mytime integer mytime, time mytime = time() return end function wallclock() integer wallclock common / myclock / mytime integer mytime, time, newtime newtime = time() wallclock = newtime - mytime mytime = newtime return end integer wallclock, elapsed character*24 greeting real dtime, timediff, timearray(2) c print a heading call fdate( greeting ) print*, " Hello, Time Now Is: ", greeting print*, "See how long ’sleep 4’ takes, in seconds" call startclock call system( ’sleep 4’ ) elapsed = wallclock() print*, "Elapsed time for sleep 4 was: ", elapsed," seconds" c now test the cpu time for some trivial computing timediff = dtime( timearray ) q = 0.01 do 30 i = 1, 100000 q = atan( q ) 30 continue timediff = dtime( timearray ) print*, "atan(q) 100000 times took: ", timediff ," seconds" end |
Running this program produces the following results:
demo% TimeTest Hello, Time Now Is: Thu Feb 8 15:33:36 2001 See how long ’sleep 4’ takes, in seconds Elapsed time for sleep 4 was: 4 seconds atan(q) 100000 times took: 0.01 seconds demo% |
The routines listed in the following table provide compatibility with VMS Fortran system routines idate and time. To use these routines, you must include the -lV77 option on the f95 command line, in which case you also get these VMS versions instead of the standard f95 versions.
Table 7–4 Summary: Nonstandard VMS Fortran System Routines
Name |
Definition |
Calling Sequence |
Argument Type |
---|---|---|---|
idate |
Date as day, month, year |
call idate( d, m, y ) |
integer |
time |
Current time as hhmmss |
call time( t ) |
character*8 |
The date(3F) routine and the VMS version of idate(3F) cannot be Year 2000 safe because they return 2-digit values for the year. Programs that compute time duration by subtracting dates returned by these routines will compute erroneous results after December 31, 1999. The Fortran 95 routine date_and_time(3F) should be used instead. See the Fortran Library Reference Manual for details.
Here are a few suggestions for what to try when programs ported to Fortran 95 do not run as expected.
Pay attention to the size and the engineering units. Numbers very close to zero can appear to be different, but the difference is not significant, especially if this number is the difference between two large numbers. For example, 1.9999999e-30 is very near -9.9992112e-33, even though they differ in sign.
VAX math is not as accurate as IEEE math, and even different IEEE processors may differ. This is especially true if the mathematics involves many trigonometric functions. These functions are much more complicated than one might think, and the standard defines only the basic arithmetic functions. There can be subtle differences, even between IEEE machines. Review Chapter Chapter 6, Floating-Point Arithmetic regarding floating-point issues.
Try running with a call nonstandard_arithmetic(). Doing so can also improve performance considerably, and make your Sun workstation behave more like a VAX system. If you have access to a VAX or some other system, run it there also. It is quite common for many numerical applications to produce slightly different results on each floating-point implementation.
Check for NaN, +Inf, and other signs of probable errors. See Chapter Chapter 6, Floating-Point Arithmetic, or the man page ieee_handler(3m) for instructions on how to trap the various exceptions. On most machines, these exceptions simply abort the run.
Two numbers can differ by 6 x 1029 and still have the same floating-point form. Here is an example of different numbers, with the same representation:
real*4 x,y x=99999990e+29 y=99999996e+29 write (*,10) x, x 10 format(’99,999,990 x 10^29 = ’, e14.8, ’ = ’, z8) write(*,20) y, y 20 format(’99,999,996 x 10^29 = ’, e14.8, ’ = ’, z8) end |
The output is:
99,999,990 x 10^29 = 0.99999993E+37 = 7CF0BDC1 99,999,996 x 10^29 = 0.99999993E+37 = 7CF0BDC1 |
In this example, the difference is 6 x 1029. The reason for this indistinguishable, wide gap is that in IEEE single-precision arithmetic, you are guaranteed only six decimal digits for any one decimal-to-binary conversion. You may be able to convert seven or eight digits correctly, but it depends on the number.
If the program fails without warning and runs different lengths of time between failures, then:
Compile with minimal optimization (–O1). If the program then works, compile only selective routines with higher optimization levels.
Understand that optimizers must make assumptions about the program. Nonstandard coding or constructs can cause problems. Almost no optimizer handles all programs at all levels of optimization. (See 7.6.2 Aliasing and the -xalias Option)