Sun Studio 12: Fortran Programming Guide

Chapter 7 Porting

This chapter discusses the some issues that may arise when porting “dusty deck” Fortran programs from other platforms to Fortran 95.

Fortran 95 extensions and Fortran 77 compatibility features are described in the Fortran User’s Guide.

7.1 Carriage-Control

Fortran carriage-control grew out of the limited capabilities of the equipment used when Fortran was originally developed. For similar historical reasons, operating systems derived from the UNIX do not have Fortran carriage control, but you can simulate it with the Fortran 95 compiler in two ways.

Use the asa filter to transform Fortran carriage-control conventions into the UNIX carriage-control format (see the asa (1) man page) before printing files with the lpr command.
The FORTRAN 77 compiler f77 allowed OPEN(N, FORM=’PRINT’) to enable single or double spacing, formfeed, and stripping of column one. This is still available by compiling programs using FORM=’PRINT’with the f95 -f77 compatibility flag. The compiler allows you to reopen unit 6 to change the form parameter to PRINT, when compiling with -f77. For example:

      OPEN( 6, FORM=’PRINT’)

You can use lp(1) to print a file that is opened in this manner.

7.2 Working With Files

Early Fortran systems did not use named files, but did provide a command line mechanism to equate actual file names with internal unit numbers. This facility can be emulated in a number of ways, including standard UNIX redirection.

Example: Redirecting stdin to redir.data (using csh(1)):

demo% cat redir.data           The data file
 9 9.9

demo% cat redir.f             The source file
      read(*,*) i, z          The program reads standard input
      print *, i, z
      stop
      end

demo% f95 -o redir redir.f   The compilation step
demo% redir < redir.data            Run with redirection reads data file
  9 9.90000
demo%

7.3 Porting From Scientific Mainframes

If the application code was originally developed for 64-bit (or 60-bit) mainframes such as CRAY or CDC, you might want to compile these codes with the following options when porting to an UltraSPARC platform, for example:

-fast -m64 -xtypemap=real:64,double:64,integer:64

These options automatically promote all default REAL variables and constants to REAL*8, and COMPLEX to COMPLEX*16. Only undeclared variables or variables declared as simply REAL or COMPLEX are promoted; variables declared explicitly (for example, REAL*4) are not promoted. All single-precision REAL constants are also promoted to REAL*8. (Set -xarch and -xchip appropriately for the target platform.) To also promote default DOUBLE PRECISION data to REAL*16, change the double:64 to double:128 in the -xtypemap example.

See the Fortran User’s Guide or the f95(1) man page for details.

7.4 Data Representation

The Fortran User’s Guide, and the Numerical Computation Guide discuss in detail the hardware representation of data objects in Fortran. Differences between data representations across systems and hardware platforms usually generate the most significant portability problems.

The following issues should be noted:

Sun adheres to the IEEE Standard 754 for floating-point arithmetic. Therefore, the first four bytes in a REAL*8 are not the same as in a REAL*4.
The default sizes for reals, integers, and logicals are described in the Fortran 95 standard, except when these default sizes are changed by the -xtypemap option.
Character variables can be freely mixed and equivalenced to variables of other types, but be careful of potential alignment problems.
f95 IEEE floating-point arithmetic will raise exceptions on overflow or divide by zero and signal SIGFPE or trap by default (-ftrap=common is the default with f95). It does deliver IEEE indeterminate forms in cases where exceptions would otherwise be signaled. This is explained in Chapter Chapter 6, Floating-Point Arithmetic.
The extreme finite, normalized values can be determined. See libm_single(3F) and libm_double(3F). The indeterminate forms can be written and read, using formatted and list-directed I/O statements.

7.5 Hollerith Data

Many “dusty-deck” Fortran applications store Hollerith ASCII data into numerical data objects. With the 1977 Fortran standard (and Fortran 95), the CHARACTER data type was provided for this purpose and its use is recommended. You can still initialize variables with the older Fortran Hollerith (nH) feature, but this is not standard practice. The following table indicates the maximum number of characters that will fit into certain data types. (In this table, boldfaced data types indicate default types subject to promotion by the -xtypemap command-line flag.)

Table 7–1 Maximum Characters in Data Types


	Maximum Number of Standard ASCII Characters
Data Type	Default	`INTEGER:64`	`REAL:64`	`DOUBLE:128`
`BYTE`	1	1	1	1
`COMPLEX`	8	8	16	16
*`COMPLEX16`**	16	16	16	16
*`COMPLEX32`**	32	32	32	32
`DOUBLE COMPLEX`	16	16	32	32
`DOUBLE PRECISION`	8	8	16	16
`INTEGER`	4	8	4	8
*`INTEGER2`**	2	2	2	2
*`INTEGER4`**	4	4	4	4
*`INTEGER8`**	8	8	8	8
`LOGICAL`	4	8	4	8
*`LOGICAL1`**	1	1	1	1
*`LOGICAL2`**	2	2	2	2
*`LOGICAL4`**	4	4	4	4
*`LOGICAL8`**	8	8	8	8
`REAL`	4	4	8	8
*`REAL4`**	4	4	4	4
*`REAL8`**	8	8	8	8
*`REAL16`**	16	16	16	16

Example: Initialize variables with Hollerith:

demo% cat FourA8.f
      double complex x(2)
      data x /16Habcdefghijklmnop, 16Hqrstuvwxyz012345/
      write( 6, ’(4A8, "!")’ ) x
      end

demo% f95 -o FourA8 FourA8.f
demo% FourA8
abcdefghijklmnopqrstuvwxyz012345!
demo%

If needed, you can initialize a data item of a compatible type with a Hollerith and then pass it to other routines.

If you pass Hollerith constants as arguments, or if you use them in expressions or comparisons, they are interpreted as character-type expressions. Use the compiler option -xhasc=no to have the compiler treat Hollerith constants as typeless data in arguments on subprogram calls. This may be needed when porting older Fortran programs.

7.6 Nonstandard Coding Practices

As a general rule, porting an application program from one system and compiler to another can be made easier by eliminating any nonstandard coding. Optimizations or work-arounds that were successful on one system might only obscure and confuse compilers on other systems. In particular, optimized hand-tuning for one particular architecture can cause degradations in performance elsewhere. This is discussed later in the chapters on performance and tuning. However, the following issues are worth considering with regards to porting in general.

7.6.1 Uninitialized Variables

Some systems automatically initialize local and COMMON variables to zero or some “not-a-number” (NaN) value. However, there is no standard practice, and programs should not make assumptions regarding the initial value of any variable. To assure maximum portability, a program should initialize all variables.

7.6.2 Aliasing and the `-xalias` Option

Aliasing occurs when the same storage address is referenced by more than one name. This typically happens with pointers, or when actual arguments to a subprogram overlap between themselves or between COMMON variables within the subprogram. For example, arguments X and Z refer to the same storage locations, as do B and H:

  COMMON /INS/B(100)
  REAL S(100), T(100)
  ...
  CALL SUB(S,T,S,B,100)
  ...
  SUBROUTINE SUB(X,Y,Z,H,N)
  REAL X(N),Y(N),Z(N),H(N)
  COMMON /INS/B(100)
  ...

Many “dusty deck” Fortran programs utilized this sort of aliasing as a way of providing some kind of dynamic memory management that was not available in the language at that time.

Avoid aliasing in all portable code. The results could be unpredictable on some platforms and when compiled with optimization levels higher than -O2.

The f95 compiler assumes it is compiling a standard-conforming program. Programs that do not conform strictly to the Fortran standard can introduce ambiguous situations that interfere with the compiler’s analysis and optimization strategies. Some situations can produce erroneous results.

For example, overindexing arrays, use of pointers, or passing global variables as subprogram arguments when also used directly, can result in ambiguous situations that limit the compiler’s ability to generate optimal code that will be correct in all situations.

If you know that your program does contain some apparent aliasing situations you can use the -xalias option to specify the degree to which the compiler should be concerned. In some cases the program will not execute properly when compiled at optimization levels higher than -O2 unless the appropriate -xalias option is specified.

The option flag takes a comma-separated list of keywords that indicate a type of aliasing situation. Each keyword can be prefixed by no% to indicate an aliasing that is not present.

Table 7–2 -xalias Keywords and What They Mean


-xalias= `keyword`	Aliasing situation
`dummy`	Dummy subprogram arguments can alias each other and global variables.
`no%dummy`	The Fortran standard is followed and dummy arguments do not alias each other or global variables in the actual call. (This is the default.)
`craypointer`	The program uses Cray pointers that can point anywhere. (This is the default.)
`no%craypointer`	Cray pointers always point at distinct memory areas, or are not used.
`ftnpointer`	Any Fortran 95 pointer can point to any target variable, regardless of type, kind, or rank.
`no%ftnpointer`	Fortran 95 pointers follow the rules of the standard. (This is the default.)
`overindex`	There are four overindexing situations that can be caused by violating the subscript bounds in an array reference, and any one or more of these may appear in the program: A reference to an element of an array in a COMMON block could refer to any element in a COMMON block or equivalence group. Passing an element of a COMMON block or equivalence group as an actual argument to a subprogram gives access to any element of that COMMON block or equivalence group. Variables of a sequence derived type are treated as if they were COMMON blocks, and elements of a such a variable may alias other elements of that variable. Individual array subscript bounds may be violated, even though the array reference stays within the array. `overindex` does not apply to array syntax, `WHERE`, and `FORALL` statements. If overindexing occurs in these constructs, they should be rewritten as `DO` loops.
`no%overindex`	Array bounds are not violated. Array references do not reference other variables. (This is the default.)
`actual`	The compiler treats actual subprogram arguments as if they were global variables. Passing an argument to a subprogram may result in aliasing through Cray pointers.
`no%actual`	Passing an argument to a subprogram does not cause further aliasing. (This is the default.)

Here some examples of typical aliasing situations. At the higher optimization levels (-O3 and above) the f95 compiler can generate better code if your program does not contain the aliasing syndromes shown below and you compile with -xalias=no%keyword.

In some cases you will need to compile with -xalias=keyword to insure that the code generate will produce the correct results.

7.6.2.1 Aliasing Through Dummy Arguments and Global Variables

The following example needs to be compiled with -xalias=dummy

parameter (n=100)
integer a(n)
common /qq/z(n)
call sub(a,a,z,n)
...
subroutine sub(a,b,c,n)
integer a(n), b(n)
common /qq/z(n)
a(2:n) = b(1:n-1)
c(2:n) = z(1:n-1)
The compiler must assume that the dummy variables and the common variable may overlap.

7.6.2.2 Aliasing Introduced With Cray Pointers

This example works only when compiled with -xalias=craypointer, which is the default:

parameter (n=20)
integer a(n)
integer v1(*), v2(*)
pointer (p1,v1)
pointer (p2,v2)
p1 = loc(a)
p2 = loc(a)
a = (/ (i,i=1,n) /)
...
v1(2:n) = v2(1:n-1)
The compiler must assume that these locations can overlap.

Here is an example of Cray pointers that do not overlap. In this case, compile with -xalias=no%craypointer for possibly better performance:

parameter (n=10)
integer a(n+n)
integer v1(n), v2(n)
pointer (p1,v1)
pointer (p2,v2)
p1 = loc(a(1))
p2 = loc(a(n+1))
...
v1(:) = v2(:)
The Cray pointers to not point to overlapping memory areas.

7.6.2.3 Aliasing Introduced With Fortran 95 Pointers

Compile the following example with -xalias=ftnpointer

parameter (n=20)
integer, pointer :: a(:)
integer, target :: t(n)
interface
  subroutine sub(a,b,n)
    integer, pointer :: a(:)
    integer, pointer :: b(:)
  end subroutine
end interface

a => t
a = (/ (i, i=1,n) /)
call sub(a,a,n)
....
end
subroutine sub(a,b,n)
 integer, pointer :: a(:)
 real, pointer :: b(:)
 integer i, mold

 forall (i=2:n)
   a(i) = transfer(b(i-1), mold)
The compiler must assume that a and b can overlap.

Note that in this example the compiler must assume that a and b may overlap, even though they point to data of different data types. This is illegal in standard Fortran. The compiler gives a warning if it can detect this situation.

7.6.2.4 Aliasing By Overindexing

Compile the following example with -xalias=overindex

integer a,z
common // a(100),z
z = 1
call sub(a)
print*, z
subroutine sub(x)
  integer x(10)
  x(101) = 2
The compiler may assume that the call to sub may write to z
The program prints 2, and not 1, when compiled with -xalias=overindex

Overindexing appears in many legacy Fortran 77 programs and should be avoided. In many cases the result will be unpredictable. To insure correctness, programs should be compiled and tested with the -C (runtime array bounds checking) option to flag any array subscripting problems.

In general, the overindex flag should only be used with legacy Fortran 77 programs. -xalias=overindex does not apply to array syntax expressions, array sections, WHERE, and FORALL statements.

Fortran 95 programs should always conform to the subscripting rules in the Fortran standard to insure correctness of the generated code. For example, the following example uses ambiguous subscripting in an array syntax expression that will always produce an incorrect result due to the overindexing of the array:

This example of array syntax overindexing DOES NOT GIVE CORRECT RESULTS!

   parameter (n=10)
   integer a(n),b(n)
   common /qq/a,b
   integer c(n)
   integer m, k
   a = (/ (i,i=1,n) /)
   b = a
   c(1) = 1
   c(2:n) = (/ (i,i=1,n-1) /)

   m = n
   k = n + n
C
C   the reference to a is actually a reference into b
C   so this should really be  b(2:n) = b(1:n-1)
C
   a(m+2:k) = b(1:n-1)

C  or doing it in reverse
   a(k:m+2:-1) = b(n-1:1:-1)

Intuitively the user might expect array b to now look like array c, but the result is unpredictable

The xalias=overindex flag will not help in this situation since the overindex flag does not extend to array syntax expressions. The example compiles, but will not give the correct results. Rewriting this example by replacing the array syntax with the equivalent DO loop will work when compiled with -xalias=overindex. But this kind of programming practice should be avoided entirely.

7.6.2.5 Aliasing By Actual Arguments

The compiler looks ahead to see how local variables are used and then makes assumptions about variables that will not change over a subprogram call. In the following example, pointers used in the subprogram defeat the compiler’s optimization strategy and the results are unpredictable. To make this work properly you need to compile with the -xalias=actual flag:

 program foo
      integer i
      call take_loc(i)
      i = 1
      print * , i
      call use_loc()
      print * , i
   end

   subroutine take_loc(i)
      integer i
      common /loc_comm/ loc_i
      loc_i = loc(i)
   end subroutine take_loc

   subroutine use_loc()
      integer vi1
      pointer (pi,vi)
      common /loc_comm/ loc_i
      pi = loc_i
      vi1 = 3
   end subroutine use_loc

take_loc takes the address of i and saves it away. use_loc uses it. This is a violation of the Fortran standard.

Compiling with the -xalias=actual flag informs the compiler that all arguments to subprograms should be considered global within the compilation unit, causing the compiler to be more cautious with its assumptions about variables appearing as actual arguments.

Programming practices like this that violate the Fortran standard should be avoided.

7.6.2.6 `-xalias` Defaults

Specifying -xalias without a list assumes that your program does not violate the Fortran aliasing rules. It is equivalent to asserting no% for all the aliasing keywords.

The compiler default, when compiling without specifying -xalias, is:

-xalias=no%dummy,craypointer,no%actual,no%overindex,no%ftnpointer

If your program uses Cray pointers but conforms to the Fortran aliasing rules whereby the pointer references cannot result in aliasing, even in ambiguous situations, compiling with -xalias may result in generating better optimized code.

7.6.3 Obscure Optimizations

Legacy codes may contain source-code restructurings of ordinary computational DO loops intended to cause older vectorizing compilers to generate optimal code for a particular architecture. In most cases, these restructurings are no longer needed and may degrade the portability of a program. Two common restructurings are strip-mining and loop unrolling.

7.6.3.1 Strip-Mining

Fixed-length vector registers on some architectures led programmers to manually “strip-mine” the array computations in a loop into segments:

  REAL TX(0:63)
  ...
  DO IOUTER = 1,NX,64
     DO IINNER = 0,63
        TX(IINNER) = AX(IOUTER+IINNER) * BX(IOUTER+IINNER)/2.
        QX(IOUTER+IINNER) = TX(IINNER)**2
     END DO
  END DO

Strip-mining is no longer appropriate with modern compilers; the loop can be written much less obscurely as:

  DO IX = 1,N
    TX = AX(I)*BX(I)/2.
    QX(I) = TX**2
  END DO

7.6.3.2 Loop Unrolling

Unrolling loops by hand was a typical source-code optimization technique before compilers were available that could perform this restructuring automatically. A loop written as:

  DO       K = 1, N-5, 6
     DO    J = 1, N
        DO I = 1,N
           A(I,J) = A(I,J) + B(I,K  ) * C(K  ,J)
 *                         + B(I,K+1) * C(K+1,J)
 *                         + B(I,K+2) * C(K+2,J)
 *                         + B(I,K+3) * C(K+3,J)
 *                         + B(I,K+4) * C(K+4,J)
 *                         + B(I,K+5) * C(K+5,J)
        END DO
     END DO
  END DO
  DO       KK = K,N
     DO    J =1,N
        DO I =1,N
           A(I,J) = A(I,J) + B(I,KK) * C(KK,J)
        END DO
     END DO
  END DO

should be rewritten the way it was originally intended:

  DO       K = 1,N
     DO    J = 1,N
        DO I = 1,N
           A(I,J) = A(I,J) + B(I,K) * C(K,J)
        END DO
     END DO
  END DO

7.7 Time and Date Functions

Library functions that return the time of day or elapsed CPU time vary from system to system.

The time functions supported in the Fortran library are listed in the following table:

Table 7–3 Fortran Time Functions


Name	Function	Man Page
`time`	Returns the number of seconds elapsed since January, 1, 1970	`time`(3F)
`date`	Returns date as a character string	`date`(3F)
`fdate`	Returns the current time and date as a character string	`fdate`(3F)
`idate`	Returns the current month, day, and year in an integer array	`idate`(3F)
`itime`	Returns the current hour, minute, and second in an integer array	`itime`(3F)
`ctime`	Converts the time returned by the `time` function to a character string	`ctime`(3F)
`ltime`	Converts the time returned by the `time` function to the local time	`ltime`(3F)
`gmtime`	Converts the time returned by the `time` function to Greenwich time	`gmtime`(3F)
`etime`	Single processor: Returns elapsed user and system time for program execution Multiple processors: Returns the wall clock time	`etime`(3F)
`dtime`	Returns the elapsed user and system time since last call to `dtime`	`dtime`(3F)
`date_and_time`	Returns date and time in character and numeric form	`date_and_time`(3F)

For details, see Fortran Library Reference Manual or the individual man pages for these functions. Here is a simple example of the use of these time functions (TestTim.f):

      subroutine startclock
      common / myclock / mytime
      integer mytime, time
      mytime = time()
      return
      end
      function wallclock()
      integer wallclock
      common / myclock / mytime
      integer mytime, time, newtime
      newtime = time()
      wallclock = newtime - mytime
      mytime = newtime
      return
      end
      integer wallclock, elapsed
      character*24 greeting
      real dtime, timediff, timearray(2)
c      print a heading
      call fdate( greeting )
      print*,  "      Hello, Time Now Is: ",  greeting
      print*,      "See how long ’sleep 4’ takes, in seconds"
      call startclock
      call system( ’sleep 4’ )
      elapsed = wallclock()
      print*, "Elapsed time for sleep 4 was: ", elapsed," seconds"
c      now test the cpu time for some trivial computing
      timediff = dtime( timearray )
      q = 0.01
      do 30 i = 1, 100000
            q = atan( q )
30      continue
      timediff = dtime( timearray )
      print*, "atan(q) 100000 times took: ", timediff ," seconds"
      end

Running this program produces the following results:

demo% TimeTest
       Hello, Time Now Is: Thu Feb  8 15:33:36 2001
 See how long ’sleep 4’ takes, in seconds
 Elapsed time for sleep 4 was:  4  seconds
 atan(q) 100000 times took:  0.01  seconds
demo%

The routines listed in the following table provide compatibility with VMS Fortran system routines idate and time. To use these routines, you must include the -lV77 option on the f95 command line, in which case you also get these VMS versions instead of the standard f95 versions.

Table 7–4 Summary: Nonstandard VMS Fortran System Routines


Name	Definition	Calling Sequence	Argument Type
`idate`	Date as day, month, year	`call idate( d, m, y )`	`integer`
`time`	Current time as `hhmmss`	`call time( t )`	*`character8`**

Note –

The date(3F) routine and the VMS version of idate(3F) cannot be Year 2000 safe because they return 2-digit values for the year. Programs that compute time duration by subtracting dates returned by these routines will compute erroneous results after December 31, 1999. The Fortran 95 routine date_and_time(3F) should be used instead. See the Fortran Library Reference Manual for details.

7.8 Troubleshooting

Here are a few suggestions for what to try when programs ported to Fortran 95 do not run as expected.

7.8.1 Results Are Close, but Not Close Enough

Try the following:

Pay attention to the size and the engineering units. Numbers very close to zero can appear to be different, but the difference is not significant, especially if this number is the difference between two large numbers. For example, 1.9999999e-30 is very near -9.9992112e-33, even though they differ in sign.

VAX math is not as accurate as IEEE math, and even different IEEE processors may differ. This is especially true if the mathematics involves many trigonometric functions. These functions are much more complicated than one might think, and the standard defines only the basic arithmetic functions. There can be subtle differences, even between IEEE machines. Review Chapter Chapter 6, Floating-Point Arithmetic regarding floating-point issues.
Try running with a call nonstandard_arithmetic(). Doing so can also improve performance considerably, and make your Sun workstation behave more like a VAX system. If you have access to a VAX or some other system, run it there also. It is quite common for many numerical applications to produce slightly different results on each floating-point implementation.
Check for NaN, +Inf, and other signs of probable errors. See Chapter Chapter 6, Floating-Point Arithmetic, or the man page ieee_handler(3m) for instructions on how to trap the various exceptions. On most machines, these exceptions simply abort the run.
Two numbers can differ by 6 x 10²⁹ and still have the same floating-point form. Here is an example of different numbers, with the same representation:

      real*4 x,y
      x=99999990e+29
      y=99999996e+29
      write (*,10) x, x
 10   format(’99,999,990 x 10^29 = ’, e14.8, ’ = ’, z8)
      write(*,20) y, y
 20   format(’99,999,996 x 10^29 = ’, e14.8, ’ = ’, z8)
      end

The output is:

99,999,990 x 10^29 = 0.99999993E+37 = 7CF0BDC1
99,999,996 x 10^29 = 0.99999993E+37 = 7CF0BDC1

In this example, the difference is 6 x 10²⁹. The reason for this indistinguishable, wide gap is that in IEEE single-precision arithmetic, you are guaranteed only six decimal digits for any one decimal-to-binary conversion. You may be able to convert seven or eight digits correctly, but it depends on the number.

7.8.2 Program Fails Without Warning

If the program fails without warning and runs different lengths of time between failures, then:

Compile with minimal optimization (–O1). If the program then works, compile only selective routines with higher optimization levels.
Understand that optimizers must make assumptions about the program. Nonstandard coding or constructs can cause problems. Almost no optimizer handles all programs at all levels of optimization. (See 7.6.2 Aliasing and the -xalias Option)