Oracle® Developer Studio 12.6: Fortran User's Guide


Updated: July 2017
 
 

3.4 Options Reference

This section describes all of the f95 compiler command–line option flags, including various risks, restrictions, caveats, interactions, examples, and other details.

Unless indicated otherwise, each option is valid on both SPARC and x64/x86 platforms. Option flags valid only on SPARC platforms are marked (SPARC). Option flags valid only on x64/x86 platforms are marked (x86).

Option flags marked (Obsolete) are obsolete and should not be used. In many cases they have been superseded by other options or flags that should be used instead.

3.4.1 –aligncommon[={1|2|4|8|16}]

Specify the alignment of data in common blocks and standard numeric sequence types.

The value indicates the maximum alignment (in bytes) for data elements within common blocks and standard numeric sequence types.


Note -  A standard numeric sequence type is a derived type containing a SEQUENCE statement and only default component data types (INTEGER, REAL, DOUBLE PRECISION, COMPLEX without KIND= or * size). Any other type, such as REAL*8, will make the type non-standard.

For example, -aligncommon=4 would align data elements with natural alignments of 4 bytes or more on 4-byte boundaries.

This option does not affect data with natural alignment smaller than the specified size.

Without -aligncommon, the compiler aligns elements in common blocks and numeric sequence types on (at most) 4-byte boundaries.

Specifying -aligncommon without a value defaults to 1: all common block and numeric sequence type elements align on byte boundaries (no padding between elements).

-aligncommon=16 reverts to -aligncommon=8 when compiling with –m32.
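The alignment rule described above can be sketched with a small model. This is an illustration only, not the compiler's layout algorithm: it assumes each element's natural alignment equals its size in bytes, and the element names and sizes in the sample block are hypothetical.

```python
def layout(elements, max_align):
    """Return (name, offset) pairs for elements given as (name, size_in_bytes).

    Assumption for this sketch: natural alignment == element size.
    """
    offset = 0
    result = []
    for name, size in elements:
        # An element is never aligned more strictly than its natural alignment,
        # and never more strictly than the -aligncommon value.
        align = min(size, max_align)
        # Round the running offset up to the next multiple of `align`.
        offset = (offset + align - 1) // align * align
        result.append((name, offset))
        offset += size
    return result

# Hypothetical common block: a 4-byte, an 8-byte, a 2-byte, and an 8-byte element.
block = [("i", 4), ("d", 8), ("s", 2), ("e", 8)]
for max_align in (1, 4, 8):
    print(max_align, layout(block, max_align))
```

In this model, a value of 1 packs the elements with no padding, while larger values insert padding only in front of elements whose natural alignment is at least that large.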

Do not use -aligncommon=1 with -xmemalign as these declarations will conflict and could cause a segmentation fault on some platforms and configurations.

Using -aligncommon=1 on SPARC platforms might result in a bus error due to misalignment, which requires choosing an appropriate -xmemalign setting. Depending on the application, -xmemalign=1s, -xmemalign=4i, or -xmemalign=8i should give optimal performance while avoiding such faults.

See also -xmemalign.

3.4.2 –ansi

Identify many nonstandard extensions.

Warning messages are issued for any uses of non–standard Fortran extensions in the source code.

3.4.3 –arg=local

Preserve actual arguments over ENTRY statements.

When you compile a subprogram with alternate entry points with this option, f95 uses copy/restore to preserve the association of dummy and actual arguments.

This option is provided for compatibility with legacy Fortran 77 programs. Code that relies on this option is non-standard.

3.4.4 –autopar

Enable automatic loop parallelization.

Finds and parallelizes appropriate loops for running in parallel on multiple processors. Analyzes loops for inter-iteration data dependencies and restructures them where appropriate. If the optimization level is not specified as -O3 or higher, it is automatically raised to -O3.

Also specify the -stackvar option when using any of the parallelization options, including -autopar. The -stackvar option may provide better performance when using -autopar because it may allow the optimizer to detect additional opportunities for parallelization. See the description of the -stackvar option for information on how to set the sizes for the main thread stack and for the slave thread stacks.

Avoid -autopar if the program already contains explicit calls to the libthread threads library. See note in -mt[={yes|no}].

The -autopar option is not appropriate on a single–processor system, and the compiled code will generally run slower.

Use the OMP_NUM_THREADS environment variable to specify the number of threads to use when running a program automatically parallelized by the -xautopar compiler option. If OMP_NUM_THREADS is not set, the default number of threads used is a multiple of the number of cores per socket (that is, cores per processor chip), which is less than or equal to the total number of cores or 32, whichever is less. Set OMP_NUM_THREADS to 1 to run with just one thread. For best performance, the number of threads used should not exceed the number of hardware threads (or virtual processors) available on the machine. On Oracle Solaris systems, this number can be determined by using the psrinfo(1M) command. On Oracle Linux systems, this number can be determined by inspecting the file /proc/cpuinfo. See the Oracle Developer Studio 12.6: OpenMP API User’s Guide for more information.
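As a rough illustration of the advice above, the following Python sketch builds an environment in which OMP_NUM_THREADS never exceeds the number of logical processors reported by the operating system. The helper name capped_thread_env and the ./a.out program in the comment are hypothetical, and os.cpu_count() is only an approximation of what psrinfo or /proc/cpuinfo would report.

```python
import os

def capped_thread_env(requested, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set,
    capped at the number of logical CPUs the OS reports."""
    env = dict(os.environ if base_env is None else base_env)
    available = os.cpu_count() or 1          # cpu_count() can return None
    env["OMP_NUM_THREADS"] = str(max(1, min(requested, available)))
    return env

env = capped_thread_env(8)
print("OMP_NUM_THREADS =", env["OMP_NUM_THREADS"])
# A hypothetical launch of the parallelized program would then be:
#   subprocess.run(["./a.out"], env=env)
```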

In addition to OMP_NUM_THREADS, other environment variables that apply to OpenMP programs may be used with a program automatically parallelized by the -xautopar compiler option. See the Oracle Developer Studio 12.6: OpenMP API User’s Guide for descriptions of the environment variables.

If you use -autopar and compile and link in one step, the multithreading library and the thread–safe Fortran runtime library will automatically be linked. If you use -autopar and compile and link in separate steps, then you must also link with -autopar to ensure that the appropriate libraries are linked.

Use the -reduction option in conjunction with -autopar to recognize reduction operations in loops.

Use the -loopinfo option to show which loops were and were not parallelized.

For explicit, user-controlled parallelization, use OpenMP directives and the -xopenmp option.

3.4.5 –B{static|dynamic}

Prefer dynamic or require static library linking.

No space is allowed between -B and dynamic or static. The default, without -B specified, is -Bdynamic.

  • –Bdynamic: Prefer dynamic linking (try for shared libraries).

  • –Bstatic: Require static linking (no shared libraries).

Also note:

  • If you specify static, but the linker finds only a dynamic library, then the library is not linked and the linker issues a warning that the “library was not found.”

  • If you specify dynamic, but the linker finds only a static version, then that library is linked, with no warning.

You can toggle -Bstatic and -Bdynamic on the command line. That is, you can link some libraries statically and some dynamically by specifying -Bstatic and -Bdynamic any number of times on the command line, as follows:

f95 prog.f -Bdynamic -lwells -Bstatic -lsurface
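The toggling behavior can be modeled as a left-to-right scan of the command line. The following Python sketch is an illustration of that reading order, not the linker's implementation; the function name linking_modes is hypothetical.

```python
def linking_modes(args):
    """Map each -l library to the -B mode in effect where it appears."""
    mode = "dynamic"              # default when no -B option has been seen
    modes = {}
    for arg in args:
        if arg == "-Bstatic":
            mode = "static"
        elif arg == "-Bdynamic":
            mode = "dynamic"
        elif arg.startswith("-l"):
            modes[arg[2:]] = mode  # library name without the -l prefix
    return modes

command = "f95 prog.f -Bdynamic -lwells -Bstatic -lsurface".split()
print(linking_modes(command))
```

For the command shown above, the sketch reports wells under dynamic linking and surface under static linking.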

These are loader and linker options. Compiling and linking in separate steps with -Bx on the compile command will require it in the link step as well.

You cannot specify both -Bdynamic and -dn on the command line because -dn disables linking of dynamic libraries.

Mixing static Fortran runtime system libraries with dynamic Fortran runtime system libraries is not recommended and can result in linker errors or silent data corruption. Always link with the latest shared dynamic Fortran runtime system libraries.

3.4.6 –C

Check array references for out of range subscripts and conformance at runtime.

Subscripting arrays beyond their declared sizes may result in unexpected results, including segmentation faults. The -C option checks for possible array subscript violations in the source code and during execution. -C also adds runtime checks for array conformance in array syntax expressions.

Specifying -C may make the executable file larger.

If the -C option is used, array subscript violations are treated as an error. If an array subscript range violation is detected in the source code during compilation, it is treated as a compilation error.

If an array subscript violation can only be determined at runtime, the compiler generates range–checking code into the executable program. This may cause an increase in execution time. As a result, it is appropriate to enable full array subscript checking while developing and debugging a program, and then recompile the final production executable without subscript checking.

3.4.7 –c

Compile only; produce object .o files, but suppress linking.

Compile a .o file for each source file. If only a single source file is being compiled, the -o option can be used to specify the name of the .o file written.

3.4.8 –copyargs

Allow assignment to constant arguments.

Allow a subprogram to change a dummy argument that is a constant. This option is provided only to allow legacy code to compile and execute without a runtime error.

  • Without -copyargs, if you pass a constant argument to a subroutine, and then within the subroutine try to change that constant, the run aborts.

  • With -copyargs, if you pass a constant argument to a subroutine, and then within the subroutine change that constant, the run does not necessarily abort.

Code that aborts unless compiled with -copyargs is, of course, not Fortran standard compliant. Also, such code is often unpredictable.

3.4.9 –Dname[=def]

Define symbol name for the preprocessor.

This option only applies to .F, .F90, .F95, and .F03 source files.

–Dname=def Define name to have value def

–Dname Define name to be 1

On the command line, this option defines name as if the directive

#define name def

had appeared in the source file. If =def is not specified, name is defined with the value 1. The macro symbol name is passed on to the preprocessor fpp (or cpp; see the -xpp option) for expansion.
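The mapping from -D options to macro definitions can be sketched as follows. This Python model only illustrates the rule stated above (a missing =def yields the value 1); the function name macro_definitions and the sample macro names are hypothetical, and the real preprocessor does considerably more.

```python
def macro_definitions(args):
    """Collect -Dname and -Dname=def options into a {name: value} mapping."""
    defs = {}
    for arg in args:
        if arg.startswith("-D") and len(arg) > 2:
            name, sep, value = arg[2:].partition("=")
            # Without "=def", the macro is defined with the value 1.
            defs[name] = value if sep else "1"
    return defs

print(macro_definitions(["-DDEBUG", "-DLEVEL=3", "prog.F"]))
```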

The predefined macro symbols have two leading underscores. The Fortran syntax may not support the actual values of these macros—they should appear only in fpp or cpp preprocessor directives. (Note the two leading underscores.)

  • The compiler version is predefined (in hex) in __SUNPRO_F90, and __SUNPRO_F95. For example __SUNPRO_F95 is 0x880 for version 8.8 of the Fortran compiler in the Oracle Developer Studio 12.6 release.

  • The following macros are predefined on appropriate systems:

    On Linux: __gnu__linux__, __linux, __linux__, __unix__, _LP64, __LP64__

    On Oracle Solaris: __SVR4__, __SunOS, __SunOS_RELEASE, __sun__, __svr4__, __SVR4, __SunOS_5_10, __SunOS_5_11, __sun

    On SPARC: __sparc__, __sparc_v9__, __sparc, __sparcv8, __sparcv9

    On x86: __amd64, __x86_64__, __i386__, __i386

  • The following are predefined with no underscores, but they might be deleted in a future release: sparc, unix, sun

  • On 64-bit x86 systems, the macros __amd64 and __x86_64 are defined.

Compile a .F, .F90, .F95, or .F03 source file with the -v verbose option to see the preprocessor definitions assumed by the compiler.

You can use these values in such preprocessor conditionals as the following:

#ifdef __sparc

f95 uses the fpp(1) preprocessor by default. Like the C preprocessor cpp(1), fpp expands source code macros and enables conditional compilation of code. Unlike cpp, fpp understands Fortran syntax, and is preferred as a Fortran preprocessor. Use the -xpp=cpp flag to force the compiler to specifically use cpp rather than fpp.

3.4.10 –dalign

Align COMMON blocks and standard numerical sequence types, and generate faster multi-word load/stores.

This flag changes the data layout in COMMON blocks, numeric sequence types, and EQUIVALENCE classes, and enables the compiler to generate faster multi-word load/stores for that data.

-dalign is a macro equivalent to:

-xmemalign=8s -aligncommon=16

Note that -aligncommon=16 is reverted to -aligncommon=8 when compiled with -m32.

The data layout effect is that of the -f flag: double- and quad-precision data in COMMON blocks and EQUIVALENCE classes are laid out in memory along their "natural" alignment, which is on 8-byte boundaries (or 16-byte boundaries for quad-precision when compiling for 64-bit platforms with -m64). The default alignment in COMMON blocks and standard-conforming numeric sequence derived types is on 4-byte boundaries.

Using -dalign along with -xtypemap=real:64,double:64,integer:64 also causes 64-bit integer variables to be double-word aligned on SPARC processors.


Note -  -dalign may result in nonstandard alignment of data, which could cause problems with variables in EQUIVALENCE or COMMON and may render the program non-portable if -dalign is required.

If you compile one subprogram with -dalign, compile all subprograms of the program with -dalign. This option is included in the -fast option.

Note that because -dalign invokes -aligncommon, standard numeric sequence types are also affected by this option. See –aligncommon[={1|2|4|8|16}]

3.4.11 –dbl_align_all[={yes|no}]

Force alignment of data on 8–byte boundaries.

The value is either yes or no. If yes, all variables will be aligned on 8–byte boundaries. Default is -dbl_align_all=no.

Double precision and quad-precision data alignments are not affected by this option.

This flag does not alter the layout of data in COMMON blocks or user-defined structures.

Use with -dalign to enable added efficiency with multi-word load/stores.

3.4.12 –depend[={yes|no}]

Analyze loops for inter-iteration data dependencies and perform loop restructuring. Loop restructuring includes loop interchange, loop fusion, and scalar replacement.

If you do not specify -depend, the default is -depend=yes. If you specify -depend but do not specify an argument, the compiler assumes -depend=yes.

To turn off dependence analysis, compile with -depend=no.

-xdepend is a synonym for -depend.

3.4.13 –dryrun

Show commands built by the f95 command-line driver, but do not compile.

Useful when debugging, this option displays the commands and sub-options the compiler will invoke to perform the compilation.

3.4.14 –d{y|n}

Allow or disallow dynamic libraries for the entire executable.

  • –dy: Yes, allow dynamic/shared libraries.

  • –dn: No, do not allow dynamic/shared libraries.

The default, if not specified, is -dy.

Unlike -Bx, this option applies to the whole executable and need appear only once on the command line.

–dy|–dn are loader and linker options. If you compile and link in separate steps with these options, then you need the same option in the link step.

3.4.15 –e

Accept extended length input source line.

Extended source lines can be up to 250 characters long. The compiler pads on the right with trailing blanks to column 250. If you use continuation lines while compiling with -e, do not split character constants across lines; otherwise, unnecessary blanks may be inserted in the constants.

3.4.16 –erroff[={%all|%none|taglist}]

Suppress warning messages listed by tag name.

Suppress the display of warning messages specified in the comma–separated list of tag names taglist. If %all, suppress all warnings, which is equivalent to the -w option. If %none, no warnings are suppressed. -erroff without an argument is equivalent to -erroff=%all.

Example:

f95 -erroff=WDECL_LOCAL_NOTUSED ink.f

Use the -errtags option to see the tag names associated with warning messages.

3.4.17 –errtags[={yes|no}]

Display the message tag with each warning message.

With -errtags=yes, the compiler’s internal error tag name will appear along with warning messages. -errtags alone is equivalent to -errtags=yes.

The default is not to display the tag (-errtags=no).

demo% f95 -errtags ink.f
ink.f:
 MAIN:
"ink.f", line 11: Warning: local variable "i" never used (WDECL_LOCAL_NOTUSED)  

3.4.18 –errwarn[={%all|%none|taglist}]

Treat warning messages as errors.

The taglist specifies a list of comma-separated tag names of warning messages that should be treated as errors. If %all, treat all warnings as errors. If %none, no warnings are treated as errors.

See also -errtags.

3.4.19 –ext_names=e

Create external names with or without trailing underscores.

e must be either plain, underscores, or fsecond-underscore. The default is underscores.

–ext_names=plain: Do not add trailing underscore.

–ext_names=underscores: Add trailing underscore.

–ext_names=fsecond-underscore: Append two underscores to external names that contain an underscore, and a single underscore to those that do not.

An external name is a name of a subroutine, function, block data subprogram, or labeled common. This option affects both the name of the routine’s entry point and the name used in calls to it. Use this flag to allow Fortran routines to call (and be called by) other programming language routines.

fsecond-underscore is provided for compatibility with gfortran.
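The three naming rules can be summarized in a short Python sketch. It models only the underscore suffixes described above; any other transformation the compiler may apply to external names (such as case handling) is outside this illustration, and the function name external_name and the sample routine names are hypothetical.

```python
def external_name(name, mode="underscores"):
    """Return the external symbol name for a routine under a given -ext_names mode."""
    if mode == "plain":
        return name                       # no trailing underscore
    if mode == "underscores":
        return name + "_"                 # the default
    if mode == "fsecond-underscore":
        # Two underscores if the name already contains one, otherwise one.
        return name + ("__" if "_" in name else "_")
    raise ValueError("unknown -ext_names value: " + mode)

for mode in ("plain", "underscores", "fsecond-underscore"):
    print(mode, external_name("solve", mode), external_name("my_sub", mode))
```

A C routine intended to be called from Fortran would need to be named to match whichever rule is in effect.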

3.4.20 –F

Invoke the source file preprocessor, but do not compile.

Apply the fpp preprocessor to the .F, .F90, .F95, and .F03 source files listed on the command line and write each processed result to a file with the same name but with the filename extension changed to .f (or .f95 or .f03). The files are not compiled.

Example:

f95 -F source.F

writes the processed source file to source.f

fpp is the default preprocessor for Fortran. The C preprocessor, cpp, can be selected instead by specifying -xpp=cpp.

3.4.21 –f

Align double- and quad-precision data in COMMON blocks.

-f is a legacy option flag equivalent to -aligncommon=16. Use of -aligncommon is preferred.

The default alignment of data in COMMON blocks is on 4-byte boundaries. -f changes the data layout of double- and quad-precision data in COMMON blocks and EQUIVALENCE classes to be placed in memory along their “natural” alignment, which is on 8-byte boundaries (or on 16-byte boundaries for quad-precision when compiling for 64-bit environments with -m64).


Note -  -f may result in nonstandard alignment of data, which could cause problems with variables in EQUIVALENCE or COMMON and may render the program non-portable if -f is required.

Compiling any part of a program with -f requires compiling all subprograms of that program with -f.

By itself, this option does not enable the compiler to generate faster multi-word fetch/store instructions on double and quad precision data. The -dalign option does this and invokes -f as well. Use of -dalign is preferred over the older -f. See –dalign. Because -dalign is part of the -fast option, so is -f.

3.4.22 –f77[=list]

Select FORTRAN 77 compatibility mode.

This option flag enables porting legacy FORTRAN 77 source programs, including those with language extensions accepted by the Sun WorkShop f77 compiler, to the f95 Fortran compiler. (There is no longer a separate FORTRAN 77 compiler.)

list is a comma-separated list selected from the following possible keywords:

Keyword       Meaning
%all          Enable all the Fortran 77 compatibility features.
%none         Disable all the Fortran 77 compatibility features.
backslash     Accept backslash as an escape sequence in character strings.
input         Allow input formats accepted by f77.
intrinsics    Limit recognition of intrinsics to only Fortran 77 intrinsics.
logical       Accept Fortran 77 usage of logical variables, such as:
                • assigning integer values to logical variables
                • allowing arithmetic expressions in logical conditional statements, with .NE.0 representing .TRUE.
                • allowing relational operators .EQ. and .NE. with logical operands
misc          Allow miscellaneous f77 extensions to Fortran 77.
output        Generate f77-style formatted output, including list-directed and NAMELIST output.
subscript     Allow non-integer expressions as array subscripts.
tab           Enable f77-style TAB-formatting, including unlimited source line length. No blank padding will be added to source lines shorter than 72 characters.

All keywords can be prefixed by no% to disable the feature, as in:

-f77=%all,no%backslash

The default, when -f77 is not specified, is -f77=%none. Using -f77 without a list is equivalent to specifying -f77=%all.
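The keyword list syntax can be illustrated with a small Python model. It assumes the keywords are applied left to right, which is a reading of the -f77=%all,no%backslash example rather than documented behavior; the function name f77_features is hypothetical.

```python
FEATURES = ("backslash", "input", "intrinsics", "logical",
            "misc", "output", "subscript", "tab")

def f77_features(option):
    """Return the set of enabled features for an option such as
    '-f77=%all,no%backslash' (assumed left-to-right processing)."""
    if option == "-f77":                  # no list: same as -f77=%all
        return set(FEATURES)
    enabled = set()                       # default is -f77=%none
    for word in option.split("=", 1)[1].split(","):
        if word == "%all":
            enabled = set(FEATURES)
        elif word == "%none":
            enabled = set()
        elif word.startswith("no%"):
            enabled.discard(word[3:])     # the no% prefix disables a feature
        elif word in FEATURES:
            enabled.add(word)
        else:
            raise ValueError("unknown -f77 keyword: " + word)
    return enabled

print(sorted(f77_features("-f77=%all,no%backslash")))
```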

Exception Trapping and -f77:

Specifying -f77 does not change the Fortran trapping mode, which is -ftrap=common. f95 differs from the Fortran 77 compiler’s behavior regarding arithmetic exception trapping. The Fortran 77 compiler allowed execution to continue after an arithmetic exception occurred. Compiling with -f77 also causes the program to call ieee_retrospective on program exit to report on any arithmetic exceptions that might have occurred. Specify -ftrap=%none following the -f77 option flag on the command line to mimic the original Fortran 77 behavior.

See Mixing Languages for complete information on f77 compatibility and Fortran 77 to Fortran 95 migration.

See also the -xalias flag for handling non-standard programming syndromes that may cause incorrect results.

3.4.23 –fast

Select options that optimize execution performance.


Note -  This option is defined as a particular selection of other options that is subject to change from one release to another, and between compilers. Also, some of the options selected by -fast might not be available on all platforms. Compile with the -dryrun flag to see the expansion of -fast.

-fast provides high performance for certain benchmark applications. However, the particular choice of options may or may not be appropriate for your application. Use -fast as a good starting point for compiling your application for best performance. But additional tuning may still be required. If your program behaves improperly when compiled with -fast, look closely at the individual options that make up -fast and invoke only those appropriate to your program that preserve correct behavior.

Note also that a program compiled with -fast may show good performance and accurate results with some data sets, but not with others. Avoid compiling with -fast those programs that depend on particular properties of floating-point arithmetic.

Because some of the options selected by -fast have linking implications, if you compile and link in separate steps be sure to link with -fast also.

–fast selects the following options:

  • The -xtarget=native hardware target.

    If the program is intended to run on a different target than the compilation machine, follow the -fast with a code–generator option. For example: f95 -fast -xtarget=ultraT2 ...

  • The -O5 optimization level option.

  • The -depend option analyzes loops for data dependencies and possible restructuring. (This option is always enabled when compiling at optimization levels -xO3 and greater.)

  • The -libmil option for system–supplied inline expansion templates.

    For C functions that depend on exception handling, follow -fast by -nolibmil (as in -fast -nolibmil). With -libmil, exceptions cannot be detected with errno or matherr(3m).

  • The -fsimple=2 option for aggressive floating–point optimizations.

    –fsimple=2 is unsuitable if strict IEEE 754 standards compliance is required. See –fsimple[={1|2|0}].

  • The -dalign option to generate double loads and stores for double and quad data in common blocks. Using this option can generate nonstandard Fortran data alignment in common blocks.

  • The -xlibmopt option selects optimized math library routines.

  • -pad=local inserts padding between local variables, where appropriate, to improve cache usage. (SPARC)

  • -xvector=lib transforms certain math library calls within DO loops to single calls to a vectorized library equivalent routine with vector arguments. (SPARC)

  • -fma=fused enables automatic generation of floating-point fused multiply-add instructions.

  • –fns selects non-standard floating-point arithmetic exception handling and gradual underflow. See –fns[={yes|no}].

  • -fround=nearest is selected because -xvector and -xlibmopt require it. (Oracle Solaris)

  • Trapping on common floating-point exceptions, -ftrap=common, is enabled with f95.

  • -nofstore cancels forcing expressions to have the precision of the result. (x86)

  • -xregs=frameptr on x86 allows the compiler to use the frame-pointer register as a general purpose register. See the description of -xregs=frameptr for details, especially if compiling mixed C, C++, and Fortran source code. Specify -xregs=no%frameptr after -fast and the frame pointer register will not be used as a general purpose register. (x86)

It is possible to add or subtract from this list by following the -fast option with other options, as in:

f95 -fast -fsimple=1 -xlibmopt=%none ...

which overrides the -fsimple=2 option and disables the -xlibmopt selected by -fast.
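The override behavior can be pictured as "the last setting of an option wins." The Python sketch below models that idea under two assumptions: the expansion list is a partial, hand-copied subset of the options listed above (the real expansion varies by release and platform and should be checked with -dryrun), and an option is identified by the text before its = sign. The function name effective_options is hypothetical.

```python
# Partial, illustrative expansion only; not the compiler's actual list.
FAST_EXPANSION = ["-xtarget=native", "-O5", "-depend", "-libmil",
                  "-fsimple=2", "-dalign", "-xlibmopt", "-fma=fused", "-fns"]

def effective_options(args):
    """Resolve a command line so that a later option replaces an earlier one."""
    resolved = {}
    for arg in args:
        expanded = FAST_EXPANSION if arg == "-fast" else [arg]
        for opt in expanded:
            key, _, value = opt.partition("=")
            resolved[key] = value         # later occurrence overwrites earlier
    return resolved

opts = effective_options(["-fast", "-fsimple=1", "-xlibmopt=%none"])
print(opts["-fsimple"], opts["-xlibmopt"])
```

In this model, placing -fsimple=1 before -fast would have no lasting effect, because the -fsimple=2 inside the expansion would replace it.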

Because -fast invokes -dalign, -fns, -fsimple=2, programs compiled with -fast can result in nonstandard floating-point arithmetic, nonstandard alignment of data, and nonstandard ordering of expression evaluation. These selections might not be appropriate for most programs.

Note that the set of options selected by the -fast flag can change with each compiler release. Invoking the compiler with -dryrun displays the -fast expansion:

<sparc>% f95 -dryrun -fast |& grep ###
          ###     command line files and options (expanded):
          
          ### -dryrun -xO5 -xchip=T5     
              -xcache=16/32/4/8:128/32/8/8:8192/64/16/128 -xarch=sparc4
              -xdepend=yes -xmemalign=8s -xpad=local -xvector=lib,no%simd     
              -aligncommon=dalign -fma=fused -fsimple=2 -fns=yes     
              -ftrap=division,invalid,overflow -xlibmil -xlibmopt

3.4.24 -features=a

Enables/disables the following Fortran language feature.

[no%]mergestrings

(SPARC) Causes the compiler to put string literals and other suitable const or read-only data into a special section of the binary where the linker removes duplicate strings.

The default is –features=no%mergestrings, and duplicate strings are not removed.

3.4.25 –fixed

Specify fixed–format Fortran 95 source input files.

All source files on the command–line will be interpreted as fixed format regardless of filename extension. Normally, f95 interprets only .f files as fixed format, .f95 as free format.

3.4.26 –flags

Synonym for -help.

3.4.27 –fma[={none|fused}]

Enables automatic generation of floating-point fused multiply-add instructions. -fma=none disables generation of these instructions. -fma=fused allows the compiler to attempt to find opportunities to improve the performance of the code by using floating-point fused multiply-add instructions.

The default is -fma=none.

The minimum architecture requirement is -xarch=sparcfmaf on SPARC and -xarch=avx2 on x86 to generate fused multiply-add instructions. The compiler marks the binary program if fused multiply-add instructions are generated to prevent execution of the program on platforms that do not support fused multiply-add instructions. When the minimum architecture is not used, then -fma=fused has no effect.

Fused multiply-add instructions eliminate the intermediate rounding step between the multiply and add. Consequently, programs may produce different results when compiled with -fma=fused, although precision will tend to increase rather than decrease.
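The effect of removing the intermediate rounding step can be reproduced numerically. The Python sketch below does not use a hardware fused multiply-add; it emulates one by computing a*b+c exactly with rational arithmetic and rounding once, and compares that with ordinary double-precision evaluation. It assumes Python floats are IEEE 754 doubles, which holds on common platforms, and the operand values are chosen purely for illustration.

```python
from fractions import Fraction

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Unfused: the product a*b is rounded to double before c is added.
unfused = a * b + c

# Emulated fused: a*b+c is computed exactly, then rounded a single time.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print("unfused:", unfused)
print("fused:  ", fused)
```

Here the exact product is 1 - 2**-60, which rounds to 1.0 in double precision, so the unfused result loses the small term entirely while the single-rounding result keeps it.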

3.4.28 –fno-semantic-interposition, –fsemantic-interposition

–fno-semantic-interposition allows the compiler to attempt to interprocedurally optimize functions with global linker scoping.

–fsemantic-interposition prevents the interprocedural optimization of functions with global linker scoping. Interprocedural optimizations consist of inlining functions, cloning functions, and interprocedural propagation, etc. This option is particularly useful when building libraries where it is allowed to interpose on external global functions that are also used internally within the library. While this option is useful, for example, to rewrite memory allocation functions for debugging, it is expensive in terms of code quality.

In Oracle Developer Studio 12.6, –fno-semantic-interposition is the default.

3.4.28.1 See Also

-xldscope, -xipo, -xinline=list

3.4.29 –fnonstd

Initialize floating–point hardware to non–standard preferences.

This option is a macro for the combination of the following option flags:

–fns -ftrap=common

Specifying -fnonstd is approximately equivalent to the following two calls at the beginning of a Fortran main program.

i=ieee_handler("set", "common", SIGFPE_ABORT)
call nonstandard_arithmetic()

The nonstandard_arithmetic() routine replaces the obsolete abrupt_underflow() routine of earlier releases.

To be effective, the main program must be compiled with this option.

Using this option initializes the floating-point hardware to:

  • Abort (trap) on floating-point exceptions.

  • Flush underflow results to zero if it will improve speed, rather than produce a subnormal number as the IEEE standard requires.

See -fns for more information about gradual underflow and subnormal numbers.

The -fnonstd option allows hardware traps to be enabled for floating–point overflow, division by zero, and invalid operation exceptions. These are converted into SIGFPE signals, and if the program has no SIGFPE handler, it terminates with a dump of memory.

For more information, see the ieee_handler(3M) man page and the Oracle Developer Studio 12.6: Numerical Computation Guide.

3.4.30 –fns[={yes|no}]

Select nonstandard floating–point mode.

The default is the standard floating–point mode (–fns=no).

Optional use of =yes or =no provides a way of toggling the -fns flag following some other macro flag that includes it, such as -fast. -fns without a value is the same as -fns=yes.

This option flag enables nonstandard floating-point mode when the program begins execution. On SPARC platforms, specifying nonstandard floating-point mode disables “gradual underflow”, causing tiny results to be flushed to zero rather than producing subnormal numbers. It also causes subnormal operands to be silently replaced by zero. On those SPARC systems that do not support gradual underflow and subnormal numbers in hardware, use of this option can significantly improve the performance of some programs.

Where x does not cause total underflow, x is a subnormal number if and only if |x| is in one of the ranges indicated:

Table 10  Subnormal REAL and DOUBLE

Data Type           Range
REAL                0.0 < |x| < 1.17549435e-38
DOUBLE PRECISION    0.0 < |x| < 2.2250738585072014e-308

See the Oracle Developer Studio 12.6: Numerical Computation Guide for details on subnormal numbers. (Some arithmeticians use the term denormalized number for subnormal number.)
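The DOUBLE PRECISION range can be checked with a short Python sketch, which also models what flushing to zero does to a subnormal value. This is a software illustration only: it assumes Python floats are IEEE 754 doubles, and the function names is_subnormal and flush_to_zero are hypothetical, not part of any Oracle library.

```python
import math
import sys

# Smallest positive normal double, as reported by the Python runtime.
DOUBLE_MIN_NORMAL = sys.float_info.min

def is_subnormal(x):
    """True if x is nonzero and smaller in magnitude than the smallest normal double."""
    return 0.0 < abs(x) < DOUBLE_MIN_NORMAL

def flush_to_zero(x):
    """Model of nonstandard mode: replace a subnormal value by a zero of the same sign."""
    return math.copysign(0.0, x) if is_subnormal(x) else x

for value in (1.5, 1e-300, 1e-310, 5e-324):
    print(value, is_subnormal(value), flush_to_zero(value))
```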

The standard initialization of floating–point preferences is the default:

  • IEEE 754 floating–point arithmetic is nonstop (do not abort on exception).

  • Underflows are gradual.

On x86 platforms, this option is enabled only for Pentium III and Pentium 4 processors (SSE or SSE2 instruction sets).

On x86, -fns selects SSE flush-to-zero mode and where available, denormals-are-zero mode. This flag causes subnormal results to be flushed to zero. Where available, this flag also causes subnormal operands to be treated as zero. This flag has no effect on traditional x87 floating-point operations not utilizing the SSE or SSE2 instruction set.

To be effective, the main program must be compiled with this option.

3.4.31 -fopenmp

Same as -xopenmp=parallel.

3.4.32 –fpover[={yes|no}]

Detect floating-point overflow in formatted input.

With -fpover=yes specified, the I/O library will detect runtime floating-point overflows in formatted input and return an error condition (1031). The default is no such overflow detection (-fpover=no). -fpover without a value is equivalent to -fpover=yes. Combine with -ftrap to get full diagnostic information.

3.4.33 –fpp

Force preprocessing of input with fpp.

Pass all the input source files listed on the f95 command line through the fpp preprocessor, regardless of file extension. (Normally, only files with .F, .F90, or .F95 extension are automatically preprocessed by fpp.) See also –xpp={fpp|cpp}.

3.4.34 –fprecision={single|double|extended}

(x86) Initialize non-default floating-point rounding precision mode.

Sets the floating-point precision mode to either single, double, or extended on x86 platforms.

With a value of single or double, this flag causes the rounding precision mode to be set to single or double precision respectively at program initiation. With extended, or by default when the -fprecision flag is not specified, the rounding precision mode is initialized to extended precision.

This option is effective only on x86 systems and only if used when compiling the main program, but is ignored if compiling for 64–bit (-m64) or SSE2–enabled (-xarch=sse2) processors. It is also ignored on SPARC systems.

3.4.35 –free

Specify free–format source input files.

All source files on the command line will be interpreted as f95 free format regardless of filename extension. Normally, f95 interprets .f files as fixed format and .f95 files as free format.

3.4.36 –fround={nearest|tozero|negative|positive}

Set the IEEE rounding mode in effect at startup.

The default is -fround=nearest.

To be effective, compile the main program with this option.

This option sets the IEEE 754 rounding mode that:

  • Can be used by the compiler in evaluating constant expressions.

  • Is established at runtime during the program initialization.

When the value is tozero, negative, or positive, the option sets the rounding direction to round-to-zero, round-to-negative-infinity, or round-to-positive-infinity, respectively, when the program begins execution. When -fround is not specified, -fround=nearest is used as the default and the rounding direction is round-to-nearest. The meanings are the same as those for the ieee_flags function.
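One way to check which rounding mode is in effect at startup is to query it through the Fortran 2003 IEEE_ARITHMETIC intrinsic module. This is a sketch, not part of the compiler documentation; the strings printed simply reuse the -fround value names.

```fortran
program fround_demo
  use, intrinsic :: ieee_arithmetic
  implicit none
  type(ieee_round_type) :: mode

  ! Query the rounding mode established during program initialization.
  call ieee_get_rounding_mode(mode)

  if (mode == ieee_nearest) then
     print *, 'rounding mode: nearest'
  else if (mode == ieee_to_zero) then
     print *, 'rounding mode: tozero'
  else if (mode == ieee_down) then
     print *, 'rounding mode: negative'
  else if (mode == ieee_up) then
     print *, 'rounding mode: positive'
  else
     print *, 'rounding mode: other'
  end if
end program fround_demo
```

Because the option takes effect only when the main program is compiled with it, this check belongs in the main program unit.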

3.4.37 –fserialio

A linking option that specifies that the program does not perform I/O in more than one thread at a time. It allows Fortran I/O statements to be executed without performing synchronization to avoid race conditions. This option should be specified only when creating an executable program. It should not be specified when creating a shared object library nor should it be specified if the program includes code compiled with versions of Sun f77 before the Sun Forte 7 release.

3.4.38 –fsimple[={1|2|0}]

Select floating–point optimization preferences.

Allow the optimizer to make simplifying assumptions concerning floating–point arithmetic.

For consistent results, compile all units of a program with the same -fsimple option.

The defaults are:

  • Without the -fsimple flag, the compiler defaults to -fsimple=0

  • With -fsimple without a value, the compiler uses -fsimple=1

The different floating–point simplification levels are:

–fsimple=0

Permit no simplifying assumptions. Preserve strict IEEE 754 conformance.

–fsimple=1

Allow conservative simplifications. The resulting code does not strictly conform to IEEE 754.

With -fsimple=1, the optimizer can assume the following:

  • IEEE 754 default rounding/trapping modes do not change after process initialization.

  • Computations producing no visible result other than potential floating point exceptions may be deleted.

  • Computations with Infinity or NaNs (“Not a Number”) as operands need not propagate NaNs to their results; for example, x*0 may be replaced by 0.

  • Computations do not depend on sign of zero.

With -fsimple=1, the optimizer is not allowed to optimize completely without regard to roundoff or exceptions. In particular, a floating–point computation cannot be replaced by one that produces different results with rounding modes held constant at run time.

–fsimple=2

In addition to -fsimple=1, permit aggressive floating-point optimizations. This can cause some programs to produce different numeric results due to changes in the way expressions are evaluated. In particular, the Fortran standard rule requiring compilers to honor explicit parentheses around subexpressions to control expression evaluation order may be broken with -fsimple=2. This could result in numerical rounding differences with programs that depend on this rule.

For example, with -fsimple=2, the compiler may evaluate C-(A-B) as (C-A)+B, breaking the standard’s rule about explicit parentheses, if the resulting code is better optimized. The compiler might also replace repeated computations of x/y with x*z, where z=1/y is computed once and saved in a temporary, to eliminate the costly divide operations.

Programs that depend on particular properties of floating-point arithmetic should not be compiled with -fsimple=2.

-fsimple=2 allows floating-point transformations that may introduce floating-point exceptions.

–fast selects -fsimple=2.
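To see why the parentheses rule matters, the sketch below evaluates both forms of the expression explicitly. The operand values are chosen for illustration: in single precision, 1.0E8 - 1.0 is not representable and rounds back to 1.0E8, so the two forms can differ. The operands are read at run time so that the compiler does not fold the expressions at compile time.

```fortran
program fsimple_demo
  implicit none
  real :: a, b, c
  character(len=32) :: text

  ! Read the operands at run time to avoid compile-time evaluation.
  text = '1.0E8 1.0 1.0E8'
  read (text, *) a, b, c

  ! As written in the source: the parenthesized difference is formed first.
  print *, 'C-(A-B) = ', c - (a - b)

  ! The reassociated form that -fsimple=2 permits the compiler to use.
  print *, '(C-A)+B = ', (c - a) + b
end program fsimple_demo
```

When each operation is rounded to single precision, the first expression yields 0 and the second yields 1. On x86 with x87 arithmetic, intermediate results may be held in extended precision, so the values observed can vary.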

3.4.39 –fstore

(x86) Force precision of floating-point expressions.

For assignment statements, this option forces all floating-point expressions to the precision of the destination variable. This is the default. However, the -fast option includes -nofstore to disable this option. Follow -fast with -fstore to turn this option back on.

3.4.40 –ftrap=t

Set floating–point trapping mode in effect at startup.

t is a comma–separated list that consists of one or more of the following:

%all, %none, common, [no%]invalid, [no%]overflow, [no%]underflow, [no%]division, [no%]inexact.

-ftrap=common is a macro for -ftrap=invalid,overflow,division.

The f95 default is -ftrap=common. This differs from the C and C++ compiler defaults, -ftrap=none.

Sets the IEEE 754 trapping mode in effect at startup but does not install a SIGFPE handler. You can use ieee_handler(3M) or fex_set_handling(3M) to simultaneously enable traps and install a SIGFPE handler. If you specify more than one value, the list is processed sequentially from left to right. The common exceptions, by definition, are invalid, division by zero, and overflow.

Example: -ftrap=%all,no%inexact means set all traps, except inexact.

The meanings for -ftrap=t are the same as for ieee_flags(), except that:

  • %all turns on all the trapping modes, and will cause trapping of spurious and expected exceptions. Use common instead.

  • %none turns off all trapping modes.

  • A no% prefix turns off that specific trapping mode.

To be effective, compile the main program with this option.
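A small test program such as the following sketch can be used to observe the effect of the trapping mode. The operands are read at run time so that the division is performed during execution rather than at compile time.

```fortran
program ftrap_demo
  implicit none
  real :: x, y
  character(len=16) :: text

  ! Read the operands at run time to avoid compile-time evaluation.
  text = '1.0 0.0'
  read (text, *) x, y

  ! Division by zero raises the IEEE division exception.
  print *, 'x / y = ', x / y
end program ftrap_demo
```

With the default -ftrap=common, the division trap is enabled and the program would be expected to terminate with a floating-point exception signal. With -ftrap=%none, the trap is disabled and the division would be expected to produce an IEEE infinity instead.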

3.4.41 –fvisibility

The –fvisibility=v option is equivalent to the –xldscope option as follows:

-fvisibility Option    Equivalent -xldscope Option
default                global
internal               hidden
protected              symbolic
hidden                 hidden

3.4.42 –G

Build a dynamic shared library instead of an executable file.

Direct the linker to build a shared dynamic library. Without -G, the linker builds an executable file. With -G, it builds a dynamic library. Use -o with -G to specify the name of the file to be written.

3.4.43 -g

See -g[n].

3.4.44 –g[n]

Compile for debugging and performance analysis.

Produce additional symbol table information for debugging with dbx(1) debugging utility and for performance analysis with the Performance Analyzer.

Although some debugging is possible without specifying -g, the full capabilities of dbx are available only for compilation units compiled with -g.

Some capabilities of other options specified along with -g may be limited. See the dbx documentation for details.

To use the full capabilities of the Performance Analyzer, compile with -g. While some performance analysis features do not require -g, you must compile with -g to view annotated source, some function level information, and compiler commentary messages. (See the analyzer(1) man page and the Oracle Developer Studio 12.6: Performance Analyzer.)

The commentary messages generated with -g describe the optimizations and transformations the compiler made while compiling your program. The messages, interleaved with the source code, can be displayed by the er_src(1) command.

Note that commentary messages only appear if the compiler actually performed any optimizations. You are more likely to see commentary messages when you request high optimization levels, such as with -xO4, or -fast.

-g is implemented as a macro that expands to various other, more primitive, options. See -xdebuginfo for the details of the expansions.

-g

Produce standard debugging information.

-gnone

Do not produce any debugging information. This is the default.

-g1

Produce file and line number as well as simple parameter information that is considered crucial during post-mortem debugging.

-g2

Same as -g.

-g3

Produce additional debugging information, which currently consists only of macro definition information. This added information can result in an increase in the size of the debug information in the resulting .o and executable when compared to using only -g.

3.4.45 -gz[=cmp-type]

Equivalent to specifying -xcompress=debug -xcompress_format=cmp-type.

-gz with no sub-option is equivalent to -gz=zlib.

3.4.46 –hname

Specify the name of the generated dynamic shared library.

This option is passed on to the linker. For details, see the Oracle Solaris 11.3 Linkers and Libraries Guide.

The -hname option records name in the shared dynamic library being created as the internal name of the library. A space between -h and name is optional (except when the library name is elp, in which case the space is required). In general, name must be the same as what follows the -o. Use of this option is meaningless without also specifying the -G or -shared options.

Without the -hname option, no internal name is recorded in the library file.

If the library has an internal name, whenever an executable program referencing the library is run the runtime linker will search for a library with the same internal name in any path the linker is searching. With an internal name specified, searching for the library at runtime linking is more flexible. This option can also be used to specify versions of shared libraries.

If there is no internal name of a shared library, then the linker uses a specific path for the shared library file instead.

3.4.47 –help

Display a summary list of compiler options.

See also –xhelp=flags.

3.4.48 –Ipath

Add path to the INCLUDE file search path.

Insert the directory path path at the start of the INCLUDE file search path. A space is allowed between -I and path. Invalid directories are ignored with no warning message.

The include file search path is the list of directories searched for INCLUDE files—file names appearing on preprocessor #include directives, or Fortran INCLUDE statements.

The search path is also used to find MODULE files.

Example: Search for INCLUDE files in /usr/app/include:

demo% f95 -I/usr/app/include growth.F

Multiple -Ipath options may appear on the command line. Each adds to the top of the search path list (first path searched).

The search order for relative paths on INCLUDE or #include is:

  1. The directory that contains the source file

  2. The directories that are named in the -I options

  3. The directories in the compiler’s internal default list

  4. /usr/include/

To invoke the preprocessor, you must be compiling source files with a .F, .F90, .F95, or .F03 suffix.

3.4.49 -i8

(There is no -i8 option.)

Use -xtypemap=integer:64 to specify 8-byte INTEGER with this compiler.
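The effect of the type mapping can be checked with a short program like the sketch below, which reports the size and range of default INTEGER using standard intrinsics. The program and variable names are illustrative.

```fortran
program typemap_demo
  implicit none
  integer :: n

  n = 0
  ! BIT_SIZE and HUGE describe the default INTEGER kind currently in effect.
  print *, 'bits in default INTEGER: ', bit_size(n)
  print *, 'largest default INTEGER: ', huge(n)
end program typemap_demo
```

Compiled normally, the program would be expected to report a 32-bit INTEGER; compiled with -xtypemap=integer:64, it would be expected to report 64 bits.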

3.4.50 –inline=[%auto][[,][no%]f1,…[no%]fn]

Enable or disable inlining of specified routines.

Request the optimizer to inline the user–written routines appearing in a comma-separated list of function and subroutine names. Prefixing a routine name with no% disables inlining of that routine.

Inlining is an optimization technique whereby the compiler effectively replaces a subprogram reference such as a CALL or function call with the actual subprogram code itself. Inlining often provides the optimizer more opportunities to produce efficient code.

Specify %auto to enable automatic inlining at optimization levels -O4 or -O5. Automatic inlining at these optimization levels is normally turned off when explicit inlining is specified with -inline.

Specifying -xinline= without naming any functions or %auto indicates that none of the routines in the source files are to be inlined.

Example: Inline the routines xbar, zbar, vpoint:

demo% f95 -O3 -inline=xbar,zbar,vpoint *.f

Following are the restrictions; no warnings are issued:

  • Optimization must be -O3 or greater.

  • The source for the routine must be in the file being compiled, unless -xipo or -xcrossfile is also specified.

  • The compiler determines if actual inlining is profitable and safe.

The appearance of -inline with -O4 disables the automatic inlining that the compiler would normally perform, unless %auto is also specified. With -O4, the compiler normally tries to inline all appropriate user-written subroutines and functions. Adding -inline with -O4 may degrade performance by restricting the optimizer’s inlining to only those routines in the list. In this case, use the %auto sub-option to enable automatic inlining at -O4 and -O5.

demo% f95 -O4 -inline=%auto,no%zpoint *.f

In the example above, the user has enabled -O4’s automatic inlining while disabling any possible inlining of the routine zpoint() that the compiler might attempt.

3.4.51 –iorounding[={compatible|processor-defined}]

Set floating-point rounding mode for formatted input/output.

Sets the ROUND= specifier globally for all formatted input/output operations.

With -iorounding=compatible, the value resulting from data conversion is the closer of the two nearest representations, or the value away from zero if the value is halfway between them.

With -iorounding=processor-defined, the rounding mode is the processor’s default mode. This is the default when -iorounding is not specified.
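Since the option sets the ROUND= specifier globally, its effect can be previewed by writing the specifier explicitly on individual statements, as in this sketch. The value 0.25 is chosen because it is exactly representable and lies halfway between 0.2 and 0.3 when written with one decimal digit.

```fortran
program ioround_demo
  implicit none
  real :: x

  ! Exactly representable; halfway between 0.2 and 0.3 at one decimal place.
  x = 0.25

  ! Halfway cases round away from zero with COMPATIBLE.
  write (*, '(F4.1)', round='compatible') x

  ! The processor's own default rounding for formatted output.
  write (*, '(F4.1)', round='processor_defined') x
end program ioround_demo
```

The first statement should print 0.3. The result of the second depends on the processor's default mode and may be 0.2 or 0.3.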

3.4.52 –keepmod[={yes|no}]

If a module file exists and its content is not changed by the latest compilation, it will not be replaced even though the compilation is supposed to create a new module file with the same name. Since the content of the module file is unchanged by the compilation, the only effect is that the time stamp of the existing module file will be preserved.

The default, when -keepmod is not specified, is -keepmod=yes. Note that this default is different from previous releases of Oracle Developer Studio Fortran.

This option is best used together with the dependencies generated by the -xM compilation option. By retaining the time stamp of a module file when its content is unchanged, this option prevents cascading compilation for the source files depending on this module file. This is very helpful in an incremental build and can significantly reduce the build time.

When this option is used with user-specified dependencies and the user has an explicit build rule on how to create the module with a dependency on the corresponding source file, the option can cause the source file to be recompiled multiple times even though the source file is modified only once because of the outdated time stamp of the module file.

3.4.53 –keeptmp

Retains the temporary files that are created during compilation.

3.4.54 –Kpic

(Obsolete) Synonym for -pic.

3.4.55 –KPIC

(Obsolete) Synonym for -PIC.

3.4.56 –Lpath

Add path to list of directory paths to search for libraries.

Adds path to the front of the list of object–library search directories. A space between -L and path is optional. This option is passed to the linker. See also –lx.

While building the executable file, ld(1) searches path for archive libraries (.a files) and shared libraries (.so files). ld searches path before searching the default directories. For the relative order between LD_LIBRARY_PATH and -Lpath, see ld(1).


Note -  Specifying /usr/lib or /usr/ccs/lib with -L path may prevent linking the unbundled libm. These directories are searched by default.

Example: Use -Lpath to specify library search directories:

demo% f95 -L./dir1 -L./dir2 any.f

3.4.57 –lx

Add library libx.a to linker’s list of search libraries.

Pass -lx to the linker to specify additional libraries for ld to search for unresolved references. ld links with object library libx. If shared library libx.so is available (and -Bstatic or -dn is not specified), ld uses it; otherwise, ld uses static library libx.a. If it uses a shared library, the name is built into a.out. No space is allowed between the -l and x character strings.

Example: Link with the library libVZY:

demo% f95 any.f -lVZY

Use -lx again to link with more libraries.

Example: Link with the libraries liby and libz:

demo% f95 any.f -ly -lz

3.4.58 –libmil

Inline selected libm library routines for optimization.

There are inline templates for some of the libm library routines. This option selects those inline templates that produce the fastest executable for the floating–point options and platform currently being used.

For more information, see the man pages libm_single(3F) and libm_double(3F).

3.4.59 -library=sunperf

Link with the Oracle Developer Studio supplied performance libraries. See the Oracle Developer Studio 12.6: Performance Library User’s Guide.

3.4.60 –loopinfo

Show loop parallelization results.

Show which loops were and were not parallelized with the –autopar option.

–loopinfo displays a list of messages on standard error:

demo% f95 -c -fast -autopar -loopinfo shalow.f
...
"shalow.f", line 172: PARALLELIZED, and serial version generated
"shalow.f", line 173: not parallelized, not profitable
"shalow.f", line 181: PARALLELIZED, fused
"shalow.f", line 182: not parallelized, not profitable
...
...etc

3.4.61 –Mpath

Specify MODULE directory, archive, or file.

Look in path for Fortran modules referenced in the current compilation. This path is searched in addition to the current directory.

path can specify a directory, .a archive file of precompiled module files, or a .mod precompiled module file. The compiler determines the type of the file by examining its contents.

An archive .a file must be explicitly specified on a -M option flag to be searched for modules. The compiler will not search archive files by default.

Only .mod files with the same names as the MODULE names appearing on USE statements will be searched. For example, the statement USE ME causes the compiler to look only for the module file me.mod.

When searching for modules, the compiler gives higher priority to the directory where the module files are being written. This is controlled by the -moddir compiler option, or the MODDIR environment variable. When neither is specified, the default write-directory is the current directory. When both are specified, the write-directory is the path specified by the -moddir flag.

This means that if only the -M flag appears, the current directory will be searched for modules first before any object listed on the -M flag. To emulate the behavior of previous releases, use:

-moddir=empty-dir -Mdir -M

where empty-dir is the path to an empty directory.

Directories named in -Ipath will be searched for module files if the files are not found in any of the other locations that are searched.

A space between the -M and the path is allowed. For example, -M /home/siri/PK15/Modules.

On Oracle Solaris, if the path identifies a regular file that is not an archive or a module file, the compiler passes the option to the linker, ld, which will treat it as a linker mapfile. This feature is provided as a convenience similar to the C and C++ compilers.

See Module Files for more information about modules in Fortran.

3.4.62 –m32 | –m64

Specifies the data type model for the compiled binary object.

Use -m32 to create 32-bit executables and shared libraries. Use -m64 to create 64-bit executables and shared libraries.

Object files or libraries compiled with -m32 cannot be linked with object files or libraries compiled with -m64.

When compiling applications with large amounts of static data on x64 platforms using -m64, -xmodel=medium may also be required. Be aware that some Oracle Linux platforms do not support the medium model.

Modules compiled with -m32 must also be linked with -m32, and modules compiled with -m64 must also be linked with -m64.

On Oracle Solaris systems, -m32 is the default. On Oracle Linux systems, –m64 is the default.

3.4.63 –moddir=path

Specify where the compiler will write compiled .mod MODULE files.

The compiler will write the .mod MODULE information files it compiles in the directory specified by path. The directory path can also be specified with the MODDIR environment variable. If both are specified, this option flag takes precedence.

The compiler uses the current directory as the default for writing .mod files.

See Module Files for more information about modules in Fortran.
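As an illustration of how module files are written and found, consider the sketch below. The module name me and the directory name mods used in the comments are hypothetical, and the two program units are shown together only for brevity; in practice they would usually be in separate source files.

```fortran
! If this module were in its own source file and compiled with
! -moddir=mods, the compiler would write me.mod into the directory mods
! (a hypothetical directory name).
module me
  implicit none
  integer, parameter :: answer = 42
end module me

! If this program were in a separate source file, compiling it with
! -Mmods would let the compiler find me.mod for the USE statement.
program use_me
  use me
  implicit none
  print *, 'answer = ', answer
end program use_me
```

The name of the .mod file follows the MODULE name on the USE statement, not the name of the source file that contains the module.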

3.4.64 -mt[={yes|no}]

Use this option to compile and link multithreaded code.

This option passes -D_REENTRANT to the preprocessor.

-mt=yes is the default behavior of the compiler. -mt is equivalent to -mt=yes. If this behavior is not desired use the option -mt=no.

The -xopenmp option (for using the OpenMP shared-memory parallelization API) includes -mt=yes automatically.

Use this option consistently. If you compile and link one translation unit with –mt, you must compile and link all units of the program with –mt.

To determine which system support libraries will be linked by default, compile with the –dryrun option.

3.4.65 –native

(Obsolete) Optimize performance for the host system.

This option is a synonym for -xtarget=native, which is preferred. The -fast option sets -xtarget=native.

3.4.66 –noautopar

Disables automatic parallelization invoked by -autopar earlier on the command line.

3.4.67 –nodepend

Cancel any -depend appearing earlier on the command line. -depend=no is the preferred usage over -nodepend.

3.4.68 -nofstore

(x86) Cancel -fstore on command line.

The compiler default is -fstore. -fast includes -nofstore.

3.4.69 –nolib

Disable linking with system libraries.

Do not automatically link with any system or language library; that is, do not pass any default -lx options on to ld. The normal behavior is to link system libraries into the executables automatically, without the user specifying them on the command line.

The -nolib option makes it easier to link one of these libraries statically. The system and language libraries are required for final execution. It is your responsibility to link them in manually. This option provides you with complete control.

Link libm statically and libc dynamically with f95:

demo% f95 -nolib any.f95 -Bstatic -lm -Bdynamic -lc

The order of the -lx options is important. Follow the order shown in the example.

3.4.70 –nolibmil

Cancel -libmil on command line.

Use this option after the -fast option to disable inlining of libm math routines:

demo% f95 -fast -nolibmil …

3.4.71 –noreduction

Disable -reduction on command line.

This option disables -reduction.

3.4.72 –norunpath

Do not build a runtime shared library search path into the executable.

The compiler normally builds into an executable a path that tells the runtime linker where to find the shared libraries it will need. The path is installation dependent. The -norunpath option prevents that path from being built in to the executable.

This option is helpful when libraries have been installed in some nonstandard location, and you do not wish to make the loader search down those paths when the executable is run at another site. Compare with -Rpaths.

3.4.73 –O[n]

Specify optimization level.

n can be 1, 2, 3, 4, or 5. No space is allowed between -O and n.

If -O[n] is not specified, only a very basic level of optimization, limited to local common subexpression elimination and dead code analysis, is performed. A program’s performance may be significantly better when it is compiled with an optimization level than when it is compiled without one. Use of -O (which sets -O3) or -fast (which sets -O5) is recommended for most programs.

Each -On level includes the optimizations performed at the levels below it. Generally, the higher the optimization level a program is compiled with, the better its runtime performance. However, higher optimization levels may result in increased compilation time and larger executable files.

Debugging with -g does not suppress -On, but -On limits -g in certain ways; see the dbx documentation.

The -O3 and -O4 options reduce the utility of debugging such that you cannot display variables from dbx, but you can still use the dbx where command to get a symbolic traceback.

If the optimizer runs out of memory, it tries again at a lower level of optimization, then resumes compilation of subsequent routines at the original level.

–O

This is equivalent to -O3.

–O1

Provides a minimum of statement–level optimizations.

Use if higher levels result in excessive compilation time, or exceed available swap space.

–O2

Enables basic block level optimizations.

This level usually gives the smallest code size. (See also -xspace.)

–O3 is preferred over -O2 unless -O3 results in unreasonably long compilation time, exceeds swap space, or generates excessively large executable files.

–O3

Performs automatic inlining of functions whose body is smaller than the overhead of calling the function. To control which functions are inlined, see -xinline=list.

Adds loop unrolling and global optimizations at the function level. Adds -depend automatically.

Usually -O3 generates larger executable files.

–O4

Adds automatic inlining of routines contained in the same file.

Usually -O4 generates larger executable files due to inlining.

The -g option suppresses the -O4 automatic inlining described above. -xcrossfile increases the scope of inlining with -O4.

–O5

Attempt aggressive optimizations.

Suitable only for that small fraction of a program that uses the largest fraction of compute time. -O5’s optimization algorithms take more compilation time, and may also degrade performance when applied to too large a fraction of the source program.

Optimization at this level is more likely to improve performance if done with profile feedback. See -xprofile=p.

3.4.74 –o filename

Specify the name of the executable file to be written.

There must be a blank between -o and filename. Without this option, the default is to write the executable file to a.out. When used with -c, -o specifies the target .o object file; with -G or -shared, it specifies the target .so library file.

3.4.75 –onetrip

Enable one trip DO loops.

Compile DO loops so that they are executed at least once. DO loops in standard Fortran are not performed at all if the upper limit is smaller than the lower limit, unlike some legacy implementations of Fortran.
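The difference can be seen with a loop whose upper limit is smaller than its lower limit, as in this sketch. The variable names are illustrative.

```fortran
program onetrip_demo
  implicit none
  integer :: i, trips

  trips = 0
  ! The upper limit (1) is smaller than the lower limit (5).
  do i = 5, 1
     trips = trips + 1
  end do

  print *, 'loop body executed ', trips, ' time(s)'
end program onetrip_demo
```

Under standard Fortran semantics the loop body is skipped and the count is 0. Compiled with -onetrip, the body would be expected to execute once.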

3.4.76 –openmp

Synonym for -xopenmp.

3.4.77 –p

(Obsolete) Compile for profiling with the prof profiler.

Prepare object files for profiling; see prof(1). If you compile and link in separate steps, and also compile with the -p option, then be sure to link with the -p option. -p with prof is provided mostly for compatibility with older systems. -pg profiling with gprof is possibly a better alternative.

3.4.78 –pad[=p]

Insert padding for efficient use of cache.

This option inserts padding between arrays or character variables if they are static local and not initialized, or if they are in common blocks. The extra padding positions the data to make better use of cache. In either case, the arrays or character variables cannot be equivalenced.

p, if present, must be either %none, or one or both of local and common:

local     Add padding between adjacent local variables.
common    Add padding between variables in common blocks.
%none     Do not add padding. (Compiler default.)

If both local and common are specified, they can appear in any order.

Defaults for -pad:

  • The compiler does no padding by default.

  • Specifying -pad, but without a value is equivalent to -pad=local,common.

The -pad[=p] option applies to items that satisfy the following criteria:

  • The items are arrays or character variables

  • The items are static local or in common blocks

For a definition of local or static variables, see –stackvar.

The program must conform to the following restrictions:

  • Neither the arrays nor the character strings are equivalenced

  • If -pad=common is specified for compiling a file that references a common block, it must be specified when compiling all files that reference that common block. The option changes the spacing of variables within the common block. If one program unit is compiled with the option and another is not, references to what should be the same location within the common block might reference different locations.

  • If -pad=common is specified, the declarations of common block variables in different program units must be the same except for the names of the variables. The amount of padding inserted between variables in a common block depends on the declarations of those variables. If the variables differ in size or rank in different program units, even within the same file, the locations of the variables might not be the same.

  • If -pad=common is specified, EQUIVALENCE declarations involving common block variables are flagged with a warning message and the block is not padded.

  • Avoid overindexing arrays in common blocks with -pad=common specified. The altered positioning of adjacent data in a padded common block will cause overindexing to fail in unpredictable ways.

It is the programmer’s responsibility to make sure that common blocks are compiled consistently when -pad is used. Common blocks appearing in different program units that are compiled inconsistently with -pad=common will cause errors.
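The sketch below shows a common block declared consistently in two program units, which is the arrangement the restrictions above call for. The block name work and the variable names are illustrative; the declarations differ only in the names of the variables, not in their sizes or ranks.

```fortran
program pad_demo
  implicit none
  real :: a(100), b(100)
  common /work/ a, b

  a = 1.0
  b = 2.0
  call show
end program pad_demo

subroutine show
  implicit none
  ! Same sizes and ranks as in the main program; only the names differ.
  real :: p(100), q(100)
  common /work/ p, q

  print *, p(1), q(1)
end subroutine show
```

If the two units were in separate files, both files would need to be compiled with -pad=common (or both without it) so that the padding between the two arrays is the same in each.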

3.4.79 –pg

Compile for profiling with the gprof profiler. (-xpg is a synonym for -pg.)

Compile self–profiling code in the manner of -p, but invoke a runtime recording mechanism that keeps more extensive statistics and produces a gmon.out file when the program terminates normally. Generate an execution profile by running gprof. See the gprof(1) man page for details.

Library options must be after the source and .o files (–pg libraries are static).


Note -  There is no advantage in compiling with -xprofile if you specify -pg. These two features do not prepare or use data provided by the other.

Profiles generated by using prof(1) or gprof(1) on 64-bit Oracle Solaris platforms or just gprof on 32-bit Oracle Solaris platforms include approximate user CPU times. These times are derived from PC sample data (see pcsample(2)) for routines in the main executable and routines in shared libraries specified as linker arguments when the executable is linked. Other shared libraries (libraries opened after process startup using dlopen(3C)) are not profiled.

On 32-bit Oracle Solaris systems, profiles generated using prof(1) are limited to routines in the executable. 32 bit shared libraries can be profiled by linking the executable with -pg and using gprof(1).

The Oracle Solaris 10 software does not include system libraries compiled with -p. As a result, profiles collected on Oracle Solaris 10 platforms do not include call counts for system library routines.

The compiler options -p, -pg, or -xpg should not be used to compile multi-threaded programs, because the runtime support for these options is not thread-safe. If a program that uses multiple threads is compiled with these options, invalid results or a segmentation fault could occur at runtime.

Binaries compiled with -xpg for gprof profiling should not be used with binopt(1) as they are incompatible and can result in internal errors.

If you compile and link in separate steps, and you compile with -pg, then be sure to link with -pg.

On x86 systems, -pg is incompatible with -xregs=frameptr, and these two options should not be used together. Note also that -xregs=frameptr is included in -fast.

3.4.80 –pic

Compile position–independent code for shared library.

On SPARC, –pic is equivalent to -xcode=pic13. See -xcode[=v] for more information on position-independent code.

On x86, produces position-independent code. Use this option to compile source files when building a shared library. Each reference to a global datum is generated as a dereference of a pointer in the global offset table. Each function call is generated in pc-relative addressing mode through a procedure linkage table.

3.4.81 –PIC

Compile position–independent code with 32-bit addresses.

On SPARC, –PIC is equivalent to -xcode=pic32. See -xcode[=v] for more information about position-independent code.

On x86, -PIC is equivalent to -pic.

3.4.82 –preserve_argvalues[=simple|none|complete]

(x86) Saves copies of register-based function arguments in the stack.

When none is specified or if the -preserve_argvalues option is not specified on the command line, the compiler behaves as usual.

When simple is specified, up to six integer arguments are saved.

When complete is specified, the values of all function arguments in the stack trace are visible to the user in the proper order.

The values are not updated during the function lifetime on assignments to formal parameters.

3.4.83 –Qoption pr ls

Pass the sub-option list ls to the compilation phase pr.

There must be blanks separating Qoption, pr, and ls. The Q can be uppercase or lowercase. The list is a comma–delimited list of sub-options, with no blanks within the list. Each sub-option must be appropriate for that program phase, and can begin with a minus sign.

This option is provided primarily for debugging the internals of the compiler by support staff. Use the LD_OPTIONS environment variable to pass options to the linker.

3.4.84 –qp

Synonym for -p.

3.4.85 –R ls

Build dynamic library search paths into the executable file.

With this option, the linker, ld(1), stores a list of dynamic library search paths into the executable file.

ls is a colon–separated list of directories for library search paths. The blank between -R and ls is optional.

Multiple instances of this option are concatenated together, with each list separated by a colon.

The list is used at runtime by the runtime linker, ld.so. At runtime, dynamic libraries in the listed paths are scanned to satisfy any unresolved references.

Use this option to let users run shippable executables without a special path option to find needed dynamic libraries.

Building an executable file using -Rpaths adds directory paths to a default path that is always searched last.

For more information, see the Oracle Solaris 11.3 Linkers and Libraries Guide.

3.4.86 –r8const

Promote single-precision constants to REAL*8 constants.

All single-precision REAL constants are promoted to REAL*8. Double-precision (REAL*8) constants are not changed. This option only applies to constants. To promote both constants and variables, see –xtypemap=spec.

Use this option flag carefully. It could cause interface problems when a subroutine or function expecting a REAL*4 argument is called with a REAL*4 constant that gets promoted to REAL*8. It could also cause problems with programs reading unformatted data files written by an unformatted write with REAL*4 constants on the I/O list.
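The following is a minimal sketch of the two effects described above; the program and subroutine names are hypothetical. The assignment shows a single-precision constant whose value changes if it is promoted, and the call shows a REAL*4 constant passed to an external subroutine whose dummy argument is REAL*4.

```fortran
! Hypothetical example; r8const_demo and show_value are illustrative names.
program r8const_demo
  implicit none
  real*8 :: d

  ! 0.1 is a single-precision constant. Without -r8const it is rounded
  ! to single precision before being converted to REAL*8 on assignment.
  d = 0.1
  print *, 'd = ', d

  ! By default this constant is REAL*4 and matches the dummy argument.
  ! With -r8const it would be promoted to REAL*8 and no longer match.
  call show_value(1.5)
end program r8const_demo

subroutine show_value(x)
  implicit none
  real*4 :: x
  print *, 'x = ', x
end subroutine show_value
```

Because show_value is called through an implicit interface, a mismatch introduced by -r8const would not necessarily be diagnosed at compile time, which is why the option should be used carefully.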

3.4.87 –recl=a[,b]

Set default output record length.

Set the default record length (in characters) for either or both preconnected units output (standard output) and error (standard error). This option must be specified using one of the following forms:

  • –recl=out:N

  • –recl=error:N

  • –recl=out:N1,error:N2

  • –recl=error:N1,out:N2

  • –recl=all:N

where N, N1, and N2 are all positive integers in the range from 72 to 2147483646. out refers to standard output, error to standard error, and all sets the default record length for both. The default is –recl=all:80. This option is only effective if the program being compiled has a Fortran main program.

3.4.88 –reduction

Recognize reduction operations in loops.

Analyze loops for reduction operations during automatic parallelization. There is potential for roundoff error with the reduction.

A reduction operation accumulates the elements of an array into a single scalar value. For example, summing the elements of a vector is a typical reduction operation. Although these operations violate the criteria for parallelizability, the compiler can recognize them and parallelize them as special cases when -reduction is specified.

This option is usable only with the automatic parallelization option -autopar. It is ignored otherwise. Explicitly parallelized loops are not analyzed for reduction operations.
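As an illustration, the sketch below (a hypothetical program, not taken from the compiler documentation) contains a typical sum reduction. Every iteration of the second loop reads and updates the same scalar, which is the pattern the compiler looks for when -reduction is specified together with the automatic parallelization option.

```fortran
! Hypothetical example of a sum reduction loop.
program reduction_demo
  implicit none
  integer, parameter :: n = 100000
  real*8 :: a(n), total
  integer :: i

  ! Independent iterations: each one writes a different element.
  do i = 1, n
     a(i) = 1.0d0 / dble(i)
  end do

  ! Reduction: every iteration updates the scalar total, so the
  ! iterations are not independent of each other.
  total = 0.0d0
  do i = 1, n
     total = total + a(i)
  end do

  print *, 'sum = ', total
end program reduction_demo
```

If the loop is parallelized, partial sums are combined in a different order than in the serial loop, so the last digits of the result may differ. This is the roundoff effect mentioned above.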

3.4.89 –S

Compile and only generate assembly code.

Compile the named programs and leave the assembly–language output in corresponding files suffixed with .s. No .o file is created.

3.4.90 –s

Strip the symbol table out of the executable file.

This option makes the executable file smaller and more difficult to reverse engineer. However, this option inhibits debugging with dbx or other tools, and overrides -g.

3.4.91 –shared

Produces a shared object rather than a dynamically-linked executable. This option is passed to ld (as -G), and cannot be used with the -dn option.

When you use the -shared option, the compiler passes default -l options to ld, which are the same options that would be passed if you created an executable.

If you are creating a shared object by specifying the -shared option along with other compiler options that are specified at both compile time and link time, make sure that those options are also specified when you link with the resulting shared object.

When you create a shared object, all the object files that are compiled for 64-bit SPARC architectures must also be compiled with an explicit -xcode value as documented under the description of -xcode. For more information, see –G.

3.4.92 –silent

(Obsolete) Suppress compiler messages.

Normally, the f95 compiler does not issue messages, other than error diagnostics, during compilation. This option flag is provided for compatibility with the legacy f77 compiler, and its use is redundant except with the -f77 compatibility flag.

3.4.93 –stackvar

Allocate local variables on the stack whenever possible.

This option makes writing recursive and re-entrant code easier and provides the optimizer more freedom when parallelizing loops.

Use of -stackvar is recommended with any of the parallelization options.

Local variables are variables that are not dummy arguments, COMMON variables, variables inherited from an outer scope, or module variables made accessible by a USE statement.

With -stackvar in effect, local variables are allocated on the stack unless they have the attributes SAVE or STATIC. Note that explicitly initialized variables are implicitly declared with the SAVE attribute. A structure variable that is not explicitly initialized but some of whose components are initialized is, by default, not implicitly declared SAVE. Also, variables equivalenced with variables that have the SAVE or STATIC attribute are implicitly SAVE or STATIC.

A statically allocated variable is implicitly initialized to zero unless the program explicitly specifies an initial value for it. Variables allocated on the stack are not implicitly initialized except that components of structure variables can be initialized by default.
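The rules above can be sketched with a small hypothetical example. The names are illustrative; the comments state which variables would remain static and which become candidates for stack allocation when -stackvar is in effect.

```fortran
! Hypothetical example; stackvar_demo and count_calls are illustrative names.
program stackvar_demo
  implicit none
  call count_calls()
  call count_calls()
end program stackvar_demo

subroutine count_calls()
  implicit none
  ! Explicitly initialized, therefore implicitly SAVE: stays static
  ! and keeps its value between calls even with -stackvar.
  integer :: ncalls = 0
  ! Plain local array: a candidate for stack allocation with -stackvar.
  ! It is not implicitly initialized, so assign it before use.
  real*8 :: work(100)

  ncalls = ncalls + 1
  work = dble(ncalls)
  print *, 'call ', ncalls, ' sum = ', sum(work)
end subroutine count_calls
```

Code that relies on an uninitialized local variable being zero, or on a local keeping its value between calls without SAVE, can behave differently once the variable is moved to the stack.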

Putting large arrays onto the stack with -stackvar can overflow the stack, causing segmentation faults. Increasing the stack size may be required.

The initial thread executing the program has a main stack, while each slave thread of a multithreaded program has its own thread stack.

For slave threads, the default thread stack size is 4 Megabytes on 32–bit systems and 8 Megabytes on 64–bit systems. The limit command (with no parameters) shows the current main stack size. If you get a segmentation fault using -stackvar, try increasing the main and thread stack sizes.

Example: Show the current main stack size:

demo% limit
cputime         unlimited
filesize        unlimited
datasize        523256 kbytes
stacksize       8192 kbytes      <–––
coredumpsize    unlimited
descriptors     64
memorysize      unlimited
demo%

Example: Set the main stack size to 64 Megabytes:

demo% limit stacksize 65536

You can set the size of the thread stack used by each slave thread by setting the STACKSIZE or OMP_STACKSIZE environment variable. See the Oracle Developer Studio 12.6: OpenMP API User’s Guide for more information about these environment variables.

Compile with -xcheck=stkovf to enable runtime checking for stack overflow situations. See the -xcheck option for more information.

3.4.94 –stop_status[={yes|no}]

Permit STOP statement to return an integer status value.

The default is -stop_status=no.

With -stop_status=yes, a STOP statement may contain an integer constant. That value will be passed to the environment as the program terminates:

STOP 123

The value must be in the range 0 to 255. Larger values are truncated and a run–time message issued. Note that

STOP 'stop string'

is still accepted and returns a status value of 0 to the environment, although a compiler warning message will be issued.

The environment status variable is $status for the C shell csh, and $? for the Bourne and Korn shells, sh and ksh.

3.4.95 –temp=dir

Define directory for temporary files.

Set directory for temporary files used by the compiler to be dir. No space is allowed within this option string. Without this option, the files are placed in the /tmp directory.

This option takes precedence over the value of the TMPDIR environment variable.

3.4.96 –time

Time each compilation phase.

The time spent and resources used in each compiler pass are displayed.

3.4.97 –traceback[={%none|common|signals_list}]

Issue a stack trace if a severe error occurs in execution.

The -traceback option causes the executable to issue a stack trace to stderr, dump core, and exit if certain signals are generated by the program. If multiple threads generate a signal, a stack trace will only be produced for the first one.

To use traceback, add the -traceback option to the compiler command line when linking. The option is also accepted at compile-time but is ignored unless an executable binary is generated. Using -traceback with the -G or -shared options to create a shared library is an error.

Table 11  -traceback Options
Option
Meaning
common
specifies that a stack trace should be issued if any of a set of common signals occurs: sigill, sigfpe, sigbus, sigsegv, or sigabrt.
signals_list
specifies a comma-separated list of names of signals that should generate a stack trace, in lower case. The following signals (those that cause the generation of a core file) can be caught: sigquit, sigill, sigtrap, sigabrt, sigemt, sigfpe, sigbus, sigsegv, sigsys, sigxcpu, sigxfsz.
Any of these can be preceded with no% to disable catching the signal.
For example: -traceback=sigsegv,sigfpe will produce a stack trace and core dump if either sigsegv or sigfpe occurs.
%none or none
disables traceback

If the option is not specified, the default is -traceback=%none.

-traceback alone, without a value, implies -traceback=common.

Note: If the core dump is not wanted, users may set the coredumpsize limit to zero using:

% limit coredumpsize 0

The -traceback option has no effect on runtime performance.

3.4.98 –U

Recognize upper and lower case in source files.

Do not treat uppercase letters as equivalent to lowercase. The default is to treat uppercase as lowercase except within character–string constants. With this option, the compiler treats Delta, DELTA, and delta as different symbols. Calls to intrinsic functions are not affected by this option.

Portability and mixing Fortran with other languages may require use of –U.

3.4.99 –Uname

Undefine preprocessor macro name.

This option applies only to source files that invoke the fpp or cpp preprocessor. It removes any initial definition of the preprocessor macro name created by -Dname on the same command line, including those implicitly placed there by the command-line driver, regardless of the order the options appear. It has no effect on any macro definitions in source files. Multiple -Uname flags can appear on the command line. There must be no space between -U and the macro name.

3.4.100 –u

Report undeclared variables.

Make the default type for all variables be undeclared rather than using Fortran implicit typing, as if IMPLICIT NONE appeared in each compilation unit. This option warns of undeclared variables, and does not override any IMPLICIT statements or explicit type statements.
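A short hypothetical example of what this option is meant to catch: the program below is accepted under default implicit typing because the undeclared name total is implicitly REAL. Compiled with -u, the compiler would be expected to report total as undeclared.

```fortran
! Hypothetical example; u_demo is an illustrative name.
program u_demo
  integer :: count
  count = 3
  ! total is never declared. Under default implicit typing it is REAL;
  ! with -u it would be reported as an undeclared variable.
  total = count * 2
  print *, total
end program u_demo
```

Declaring total explicitly, for example as an INTEGER, removes the diagnostic and makes the intended type clear.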

3.4.101 –unroll=n

Enable unrolling of DO loops where possible.

n is a positive integer. The choices are:

  • n=1 inhibits all loop unrolling.

  • n>1 suggests to the optimizer that it attempt to unroll loops n times.

Loop unrolling generally improves performance, but will increase the size of the executable file. See also The UNROLL Directive.
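To show what unrolling means, the sketch below writes out by hand a loop unrolled four times, next to the original loop. This is only a conceptual illustration with hypothetical names; the code the optimizer actually generates for -unroll=4 may differ.

```fortran
! Hypothetical example: a loop and a hand-unrolled equivalent.
program unroll_demo
  implicit none
  integer, parameter :: n = 10
  real*8 :: a(n), b(n), c(n)
  integer :: i, m

  do i = 1, n
     b(i) = dble(i)
  end do

  ! Original loop: one element per iteration.
  do i = 1, n
     a(i) = 2.0d0 * b(i)
  end do

  ! Unrolled by four: four elements per iteration, fewer loop tests.
  m = n - mod(n, 4)          ! largest multiple of 4 not exceeding n
  do i = 1, m, 4
     c(i)   = 2.0d0 * b(i)
     c(i+1) = 2.0d0 * b(i+1)
     c(i+2) = 2.0d0 * b(i+2)
     c(i+3) = 2.0d0 * b(i+3)
  end do
  ! Remainder loop for the iterations left over when n is not a multiple of 4.
  do i = m + 1, n
     c(i) = 2.0d0 * b(i)
  end do

  print *, 'results match: ', all(a == c)
end program unroll_demo
```

The unrolled form repeats the loop body, which is why unrolling tends to increase the size of the executable file.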

3.4.102 –use=list

Specify implicit USE modules.

list is a comma-separated list of module names or module file names.

Compiling with -use=module_name has the effect of adding a USE module_name statement to each subprogram or module being compiled. Compiling with -use=module_file_name has the effect of adding a USE module_name for each of the modules contained in the specified file.

See Module Files for more information about modules in Fortran.
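As a sketch of the effect, consider the hypothetical module and program below. The program contains an explicit USE statement; compiling the program unit with -use=phys_constants would be expected to have the same effect if that statement were omitted. It is assumed here that the module itself is compiled beforehand without the flag, since the option adds the USE statement to every subprogram or module being compiled.

```fortran
! Hypothetical example; phys_constants and use_demo are illustrative names.
module phys_constants
  implicit none
  real*8, parameter :: pi = 3.141592653589793d0
end module phys_constants

program use_demo
  ! With -use=phys_constants, this statement would be added implicitly.
  use phys_constants
  implicit none
  print *, 'area of unit circle = ', pi
end program use_demo
```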

3.4.103 –V

Show name and version of each compiler pass.

This option prints the name and version of each pass as the compiler executes.

3.4.104 –v

Verbose mode: show details of each compiler pass.

Like -V, shows the name of each pass as the compiler executes, and details the options, macro flag expansions, and environment variables used by the driver.

3.4.105 –vax=keywords

Specify choice of legacy VAX VMS Fortran extensions enabled.

The keywords specifier must be one of the following sub-options or a comma-delimited list of a selection of these.

blank_zero
Interpret blanks in formatted input as zeros on internal files.
debug
Interpret lines starting with the character ’D’ to be regular Fortran statements rather than comments, as in VMS Fortran.
rsize
Interpret unformatted record size to be in words rather than bytes.
struct_align
Lay out components of a VAX structure in memory as in VMS Fortran, without padding. Note: this can cause data misalignments, and should be used with -xmemalign to avoid such errors.
%all
Enable all these VAX VMS features.
%none
Disable all these VAX VMS features.

Sub-options can be individually selected or turned off by preceding with no%.

Example:

-vax=debug,rsize,no%blank_zero

The default is -vax=%none. Specifying -vax without any sub-options is equivalent to -vax=%all.

3.4.106 –vpara

Show parallelization warning messages.

Issues warnings about potential parallel programming problems in OpenMP programs.

Use with the -xopenmp option.

The compiler issues warnings when it detects the following situations:

  • Loops are parallelized using OpenMP directives when there are data dependencies between different loop iterations.

  • OpenMP data-sharing attribute clauses are problematic. For example, declaring a variable "shared" whose accesses in an OpenMP parallel region may cause a data race, or declaring a variable "private" whose value in a parallel region is used after the parallel region.

No warnings appear if all parallelization directives are processed without problems.

3.4.107 -Wc,arg

Passes the argument arg to a specified component c.

Each argument must be separated from the preceding one only by a comma. All -W arguments are passed after the rest of the command-line arguments. To include a comma as part of an argument, use the escape character \ (backslash) immediately before the comma.

For example, -Wa,-o,objfile passes -o and objfile to the assembler in that order. Also, -Wl,-I,name causes the linking phase to override the default name of the dynamic linker, /usr/lib/ld.so.1.

The order in which the arguments are passed to a tool with respect to the other specified command line options might change in subsequent compiler releases.

The possible values for c are listed in the following table.

Table 12  -W Flags
Flag
Meaning
a
Assembler: (fbe); (gas)
c
Fortran code generator: (cg) (SPARC)
d
f95 driver
l
Link editor (ld)
m
mcs
O (Capital o)
Interprocedural optimizer
o (Lowercase o)
Postoptimizer
p
Preprocessor (fpp or cpp)
0 (Zero)
Compiler (f90comp)
2
Optimizer: (iropt)
3
Static error checking: (previse)

Note: You cannot use -Wd to pass f95 options to the Fortran compiler.

3.4.108 –w[n]

Show or suppress warning messages.

This option shows or suppresses most warning messages. However, if one option overrides all or part of an option earlier on the command line, you do get a warning.

n may be 0, 1, 2, 3, or 4.

  • -w0 shows just error messages. This is equivalent to -w.

  • -w1 shows errors and warnings. This is the default without -w.

  • -w2 shows errors, warnings, and cautions.

  • -w3 shows errors, warnings, cautions, and notes.

  • -w4 shows errors, warnings, cautions, notes, and comments.

3.4.109 -Xlinker arg

Pass arg to linker ld(1).

3.4.110 –xaddr32[={yes|no}]

(x86/x64 only) The -xaddr32=yes compilation flag restricts the resulting executable or shared object to a 32-bit address space.

An executable that is compiled in this manner results in the creation of a process that is restricted to a 32-bit address space. When -xaddr32=no is specified, a usual 64-bit binary is produced. If the -xaddr32 option is not specified, -xaddr32=no is assumed. If only -xaddr32 is specified, -xaddr32=yes is assumed.

This option is only applicable to -m64 compilations and only on Oracle Solaris platforms supporting the SF1_SUNW_ADDR32 software capability. Because the Oracle Linux kernel does not support address space limitation, this option is not available on Oracle Linux. The -xaddr32 option is ignored on Oracle Linux.

When linking, if a single object file was compiled with -xaddr32=yes the whole output file is assumed to be compiled with -xaddr32=yes. A shared object that is restricted to a 32-bit address space must be loaded by a process that executes within a restricted 32-bit mode address space. For more information refer to the SF1_SUNW_ADDR32 software capabilities definition, described in the Oracle Solaris 11.3 Linkers and Libraries Guide.

3.4.111 –xalias[=keywords]

Specify degree of aliasing to be assumed by the compiler.

Some non-standard programming techniques can introduce situations that interfere with the compiler’s optimization strategies. The use of overindexing, pointers, and passing global or non-unique variables as subprogram arguments can introduce ambiguous aliasing situations that could result in code that does not work as expected.

Use the -xalias flag to inform the compiler about the degree to which the program deviates from the aliasing requirements of the Fortran standard.

The flag may appear with or without a list of keywords. The keywords list is comma-separated, and each keyword indicates an aliasing situation present in the program.

Each keyword may be prefixed by no% to indicate an aliasing type that is not present.

The aliasing keywords are:

Table 13  -xalias Option Keywords
keyword
meaning
dummy
Dummy (formal) subprogram parameters can alias each other and global variables.
no%dummy
(Default). Dummy parameters are used as the Fortran standard requires and do not alias each other or global variables.
craypointer
(Default). Cray pointers can point at any global variable or a local variable whose address is taken by the LOC() function. Also, two Cray pointers might point at the same data. This is a safe assumption that could inhibit some optimizations.
no%craypointer
Cray pointers point only at unique memory addresses, such as obtained from malloc(). Also, no two Cray pointers point at the same data. This assumption enables the compiler to optimize Cray pointer references.
actual
The compiler treats actual subprogram arguments as if they were global variables. Passing an argument to a subprogram might result in aliasing through Cray pointers.
no%actual
(Default) Passing an argument does not result in further aliasing.
overindex
  • A reference to an element in a COMMON block might refer to any element in a COMMON block or equivalence group.

  • Passing any element of a COMMON block or equivalence group as an actual argument to a subprogram gives access to any element of that COMMON block or equivalence group to the called subprogram.

  • Variables of a sequence derived type are treated as if they were COMMON blocks, and elements of such a variable might alias other elements of that variable.

  • Individual array bounds may be violated, but except as noted above, the referenced array element is assumed to stay within the array. Array syntax, WHERE, and FORALL statements are not considered for overindexing. If overindexing occurs in these constructs, they should be rewritten as DO loops.

no%overindex
(Default) Array bounds are not violated. Array references do not reference other variables.
ftnpointer
Calls to external functions might cause Fortran pointers to point at target variables of any type, kind, or rank.
no%ftnpointer
(Default) Fortran pointers follow the rules of the standard.

Specifying -xalias without a list gives the best performance for most programs that do not violate Fortran aliasing rules, and corresponds to:

no%dummy,no%craypointer,no%actual,no%overindex,no%ftnpointer

To be effective, -xalias should be used when compiling with optimization levels -xO3 and higher.

The compiler default, with no -xalias flag specified, assumes that the program conforms to the Fortran standard except for Cray pointers:

no%dummy,craypointer,no%actual,no%overindex,no%ftnpointer
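As an illustration of the dummy keyword, the hypothetical sketch below passes the same array for two dummy arguments, one of which is modified. This usage does not conform to the Fortran standard. With the default no%dummy the compiler is free to assume that a and b do not overlap; -xalias=dummy tells it that they might.

```fortran
! Hypothetical example; alias_demo and scale_add are illustrative names.
program alias_demo
  implicit none
  real*8 :: x(4)
  x = 1.0d0
  ! Non-standard: x is associated with both dummy arguments, and the
  ! subroutine modifies one of them, so the dummies alias each other.
  call scale_add(x, x, 4)
  print *, x
end program alias_demo

subroutine scale_add(a, b, n)
  implicit none
  integer :: n, i
  real*8 :: a(n), b(n)
  do i = 1, n
     ! Each store to a(i) also changes b(i) when a and b are the same array.
     a(i) = a(i) + 2.0d0 * b(i)
  end do
end subroutine scale_add
```

Where possible, rewriting such calls so that distinct arguments refer to distinct storage avoids the need for -xalias=dummy and leaves the optimizer more freedom.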

3.4.112 –xannotate[={yes|no}]

Create binaries that can later be used by the optimization and observability tools binopt(1), code-analyzer(1), discover(1), collect(1), and uncover(1).

The default on Oracle Solaris is -xannotate=yes. The default on Oracle Linux is -xannotate=no. Specifying -xannotate without a value is equivalent to -xannotate=yes.

For optimal use of the optimization and observability tools, -xannotate=yes must be in effect at both compile and link time. Compile and link with -xannotate=no to produce slightly smaller binaries and libraries when optimization and observability tools will not be used.

3.4.113 –xarch=isa

Specify instruction set architecture (ISA).

The following table lists the -xarch keywords common to both SPARC and x86 platforms.

Table 14  -xarch keywords common to both SPARC and x86 platforms
Flag
Meaning
generic
Uses the instruction set common to most processors. This is the default.
native
Compile for good performance on this system. The compiler chooses the appropriate setting for the current system processor it is running on.

Note that although -xarch can be used alone, it is part of the expansion of the –xtarget option and may be used to override the -xarch value that is set by a specific -xtarget option. For example:

% f95 -xtarget=T3 -xarch=sparc4 ...

overrides the -xarch set by -xtarget=T3.

This option limits the code generated by the compiler to the instructions of the specified instruction set architecture by allowing only the specified set of instructions. This option does not guarantee use of any target–specific instructions.

If this option is used with optimization, the appropriate choice can provide good performance of the executable on the specified architecture. An inappropriate choice results in a binary program that is not executable on the intended target platform.

Note the following:

  • Object binary files (.o) compiled with generic, sparc, sparcvis2, sparcvis3, sparcfmaf, sparcima can be linked and can execute together, but can only run on a processor supporting all the instruction sets linked.

  • For any particular choice, the generated executable might not run or run much more slowly on legacy architectures. Also, because quad-precision (REAL*16 and long double) floating-point instructions are not implemented in any of these instruction set architectures, the compiler does not use these instructions in the code it generates.

The default when -xarch is not specified is generic.

Table 15 gives details for each of the -xarch keywords on SPARC platforms.

Table 15  -xarch Values for SPARC Platforms
-xarch=
Meaning (SPARC)
sparc
Compile for the SPARC–V9 ISA. Compile for the V9 ISA, but without the Visual Instruction Set (VIS), and without other implementation-specific ISA extensions. This option enables the compiler to generate code for good performance on the V9 ISA.
sparc4
Compile for the SPARC4 version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, which includes VIS 1.0, the UltraSPARC-III extensions, which includes VIS2.0, the fused floating-point multiply-add instructions, VIS 3.0, and SPARC4 instructions.
sparc4b
Compile for the SPARC4B version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, which includes VIS 1.0, the UltraSPARC-III extensions, which includes VIS2.0, the SPARC64 VI extensions for floating-point multiply-add, the SPARC64 VII extensions for integer multiply-add, and the PAUSE and CBCOND instructions from the SPARC T4 extensions.
sparc4c
Compile for the SPARC4C version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, which includes VIS 1.0, the UltraSPARC-III extensions, which includes VIS2.0, the SPARC64 VI extensions for floating-point multiply-add, the SPARC64 VII extensions for integer multiply-add, a subset of the SPARC T3 extensions, called the VIS3B subset of VIS 3.0, and the PAUSE and CBCOND instructions from the SPARC T4 extensions.
sparc5
Compile for the SPARC5 version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, which includes VIS 1.0, the UltraSPARC-III extensions, which includes VIS2.0, the fused floating-point multiply-add instructions, VIS 3.0, SPARC4, and SPARC5 instructions.
sparcvis
Compile for the SPARC–V9 ISA with UltraSPARC extensions. Compile for SPARC-V9 plus the Visual Instruction Set (VIS) version 1.0, and with UltraSPARC extensions. This option enables the compiler to generate code for good performance on the UltraSPARC architecture.
sparcvis2
Compile for the SPARC-V9 ISA with UltraSPARC-III extensions. Enables the compiler to generate object code for the UltraSPARC architecture, plus the Visual Instruction Set (VIS) version 2.0, and with UltraSPARC III extensions.
sparcvis3
Compile for the SPARC VIS version 3 of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, including the Visual Instruction Set (VIS) version 1.0, the UltraSPARC-III extensions, including the Visual Instruction Set (VIS) version 2.0, the fused multiply-add instructions, and the Visual Instruction Set (VIS) version 3.0.
sparcfmaf
Compile for the sparcfmaf version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, including the Visual Instruction Set (VIS) version 1.0, the UltraSPARC-III extensions, including the Visual Instruction Set (VIS) version 2.0, and the SPARC64 VI extensions for floating-point multiply-add.
Note that you must use -xarch=sparcfmaf in conjunction with -fma=fused and some optimization level to get the compiler to attempt to find opportunities to use the multiply-add instructions automatically.
sparcace
Compile for the sparcace version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, including the Visual Instruction Set (VIS) version 1.0, the UltraSPARC-III extensions, including the Visual Instruction Set (VIS) version 2.0, the SPARC64 VI extensions for floating-point multiply-add, the SPARC64 VII extensions for integer multiply-add, and the SPARC64 X extensions for ACE floating-point.
sparcaceplus
Compile for the sparcaceplus version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, including the Visual Instruction Set (VIS) version 1.0, the UltraSPARC-III extensions, including the Visual Instruction Set (VIS) version 2.0, the SPARC64 VI extensions for floating-point multiply-add, the SPARC64 VII extensions for integer multiply-add, the SPARC64 X extensions for SPARCACE floating-point, and the SPARC64 X+ extensions for SPARCACE floating-point.
sparcace2
Compile for the sparcace2 version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, including the Visual Instruction Set (VIS) version 1.0, the UltraSPARC-III extensions, including the Visual Instruction Set (VIS) version 2.0, the SPARC64 VI extensions for floating-point multiply-add, the SPARC64 VII extensions for integer multiply-add, the SPARC64 X extensions for SPARCACE floating-point, the SPARC64 X+ extensions for SPARCACE floating-point, and the SPARC64 XII extensions for SPARCACE floating-point.
sparcima
Compile for the sparcima version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, including the Visual Instruction Set (VIS) version 1.0, the UltraSPARC-III extensions, including the Visual Instruction Set (VIS) version 2.0, the SPARC64 VI extensions for floating-point multiply-add, and the SPARC64 VII extensions for integer multiply-add.
v9
Equivalent to -m64 -xarch=sparc. Legacy makefiles and scripts that use -xarch=v9 to obtain the 64-bit data type model need only use -m64.
v9a
Equivalent to -m64 -xarch=sparcvis and is provided for compatibility with earlier releases.
v9b
Equivalent to -m64 -xarch=sparcvis2 and is provided for compatibility with earlier releases.

Table 16 details each of the -xarch keywords on x86 platforms. The default on x86 is generic if -xarch is not specified.

Table 16  -xarch Values for x86 Platforms
-xarch=
Meaning (x86)
386 (Obsolete)
You should not use this option. Use –xarch=generic instead.
For a complete list of obsolete options, see Obsolete Option Flags.
pentium_pro (Obsolete)
You should not use this option. Use –xarch=generic instead.
For a complete list of obsolete options, see Obsolete Option Flags.
pentium_proa
Adds the AMD extensions (3DNow!, 3DNow! extensions, and MMX extensions) to the 32-bit Pentium Pro architecture.
sse (Obsolete)
You should not use this option. Use –xarch=generic instead.
For a complete list of obsolete options, see Obsolete Option Flags.
ssea
Adds the AMD extensions (3DNow!, 3DNow! extensions, and MMX extensions) to the 32-bit SSE architecture.
sse2
Adds the SSE2 instruction set to the pentium_pro. (See Note below.)
sse2a
Adds the AMD extensions (3DNow!, 3DNow! extensions, and MMX extensions) to the 32-bit SSE2 architecture.
sse3
Adds the SSE3 instruction set to the SSE2 instruction set.
sse3a
Adds AMD extended instructions, including 3DNow!, to the SSE3 instruction set.
ssse3
Adds the SSSE3 instructions to the SSE3 instruction set.
sse4_1
Adds the SSE4.1 instructions to the SSSE3 instruction set.
sse4_2
Adds the SSE4.2 instructions to the SSE4.1 instruction set.
amdsse4a
Adds the SSE4a instructions to the AMD instruction set.
aes
Adds the Intel Advanced Encryption Standard instruction set. Note that the compiler does not generate AES instructions automatically when -xarch=aes is specified unless the source code includes .il inline code, _asm statements, or assembler code that use AES instructions, or references to AES intrinsic functions.
avx
Uses Intel Advanced Vector Extensions instruction set.
avx_i
Uses Intel Advanced Vector Extensions instruction set with the RDRND, FSGSBASE and F16C instruction sets.
avx2
Uses Intel Advanced Vector Extensions 2 instruction set.
avx2_i
Supplements the pentium_pro, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 instruction sets with Broadwell instructions (ADOX, ADCX, MULX, RDSEED, PREFETCHWT1).
avx512
Uses the instruction sets AVX512F, AVX512CDI, AVX512VLI, AVX512BW and AVX512DQ.

3.4.113.1 Special Cautions for x86/x64 Platforms:

There are some important considerations when compiling for Oracle Solaris x86 platforms.

  • If any part of a program is compiled or linked on an x86 platform with -m64, then all parts of the program must be compiled with -m64 as well. For details on the various Intel instruction set architectures (SSE, SSE2, SSE3, SSSE3, and so on) refer to the Intel-64 and IA-32 Intel Architecture Software Developer's Manual.

  • Programs compiled with -xarch set to sse, sse2, sse2a, or sse3 and beyond must be run on platforms supporting these features and extensions.

  • With this release, the default instruction set and the meaning of -xarch=generic have changed to sse2. Compiling without specifying a target platform option now results in an sse2 binary that is incompatible with Pentium III or earlier systems.

  • If you compile and link in separate steps, always link using the compiler and with the same -xarch setting to ensure that the correct startup routine is linked.

  • Arithmetic results on x86 may differ from results on SPARC due to the x86 80-bit floating-point registers. To minimize these differences, use the -fstore option or compile with -xarch=sse2 if the hardware supports SSE2.

  • Running programs compiled with these -xarch options on platforms that are not enabled with the appropriate features or instruction set extensions could result in segmentation faults or incorrect results occurring without any explicit warning messages.

  • This warning extends also to programs that employ .il inline assembly language functions or __asm() assembler code that utilize SSE, SSE2, SSE2a, and SSE3 (and beyond) instructions and extensions.

3.4.114 –xassume_control[=keywords]

Set parameters to control ASSUME pragmas.

Use this flag to control the way the compiler handles ASSUME pragmas in the source code.

The ASSUME pragmas provide a way for the programmer to assert special information that the compiler can use for better optimization. These assertions may be qualified with a probability value. Those with a probability of 0 or 1 are marked as certain; otherwise they are considered non-certain.

You can also assert, with a probability or certainty, the trip count of an upcoming DO loop, or that an upcoming branch will be taken.

See The ASSUME Directives for a description of the ASSUME pragmas recognized by the f95 compiler.

The keywords on the -xassume_control option can be a single sub-option keyword or a comma-separated list of keywords. The keyword sub-options recognized are:

optimize
The assertions made on ASSUME pragmas affect optimization of the program.
check
The compiler generates code to check the correctness of all assertions marked as certain, and emits a runtime message if the assertion is violated; the program continues if fatal is not also specified.
fatal
When used with check, the program will terminate when an assertion marked certain is violated.
retrospective[:d]
The d parameter is an optional tolerance value, and must be a real positive constant less than 1. The default is ".1". retrospective compiles code to count the truth or falsity of all assertions. Those outside the tolerance value d are listed on output at program termination.
%none
All ASSUME pragmas are ignored.

The compiler default is

-xassume_control=optimize

This means that the compiler recognizes ASSUME pragmas and they will affect optimization, but no checking is done.

If specified without parameters, -xassume_control implies

-xassume_control=check,fatal

In this case the compiler accepts and checks all certain ASSUME pragmas, but they do not affect optimization. Assertions that are invalid cause the program to terminate.

3.4.115 –xautopar

Synonym for -autopar.

3.4.116 –xcache=c

Define cache properties for the optimizer.

This option specifies the cache properties that the optimizer can use. It does not guarantee that any particular cache property is used.

Although this option can be used alone, it is part of the expansion of the –xtarget option; it is provided to allow overriding an -xcache value implied by a specific -xtarget option.

Table 17  –xcache Values
Value
Meaning
generic
Define the cache properties for good performance on most processors without any major performance degradation. This is the default.
native
Define the cache properties for good performance on this host platform.
s1/l1/a1[/t1]
Define level 1 cache properties.
s1/l1/a1[/t1]:s2/l2/a2[/t2]
Define levels 1 and 2 cache properties.
s1/l1/a1[/t1]:s2/l2/a2[/t2]:s3/l3/a3[/t3]
Define levels 1, 2, and 3 cache properties

The si/li/ai/ti fields are defined as follows:

si

The size of the data cache at level i, in kilobytes

li

The line size of the data cache at level i, in bytes

ai

The associativity of the data cache at level i

ti

The number of hardware threads sharing the cache at level i (optional)

Example: -xcache=16/32/4:1024/32/1 specifies the following:

A Level 1 cache has: 16K bytes, 32 byte line size, 4–way associativity.

A Level 2 cache has: 1024K bytes, 32 byte line size, direct mapping associativity.
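The field layout of an explicit -xcache value can be illustrated with a short script. This is only a sketch of how the s/l/a[/t] fields decompose, not part of the compiler; the names CacheLevel and parse_xcache are hypothetical helpers introduced for this illustration, and the keyword values generic and native are deliberately not handled.

```python
from typing import NamedTuple, Optional


class CacheLevel(NamedTuple):
    size_kb: int             # si: data cache size in kilobytes
    line_bytes: int          # li: line size in bytes
    associativity: int       # ai: associativity (1 means direct mapped)
    threads: Optional[int]   # ti: hardware threads sharing the cache, if given


def parse_xcache(value: str) -> list[CacheLevel]:
    """Split an explicit -xcache value into one CacheLevel per cache level."""
    levels = []
    for level in value.split(":"):          # levels are separated by ':'
        fields = [int(f) for f in level.split("/")]
        if len(fields) not in (3, 4):
            raise ValueError(f"expected s/l/a[/t], got {level!r}")
        size_kb, line_bytes, assoc = fields[:3]
        threads = fields[3] if len(fields) == 4 else None
        levels.append(CacheLevel(size_kb, line_bytes, assoc, threads))
    if len(levels) > 3:
        raise ValueError("at most three cache levels can be described")
    return levels


# The example value from the text: a 16 KB 4-way L1 and a 1024 KB direct-mapped L2.
for number, level in enumerate(parse_xcache("16/32/4:1024/32/1"), start=1):
    print(f"Level {number}: {level}")
```

An associativity of 1 in the second level corresponds to the direct mapping described above.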

3.4.117 –xcheck[=keyword[,keyword]]

Generate special runtime checks and initializations.

The keyword must be one of the following:

keyword
Feature
stkovf[action]
Generate code to detect stack overflow errors at runtime, optionally specifying an action to be taken when a stack overflow error is detected.
A stack overflow error occurs when a thread's stack pointer is set beyond the thread's allocated stack bounds. The error may not be detected if the new top of stack address is writable.
A stack overflow error is detected if a memory access violation occurs as a direct result of the error, raising an associated signal (usually SIGSEGV). The signal thus raised is said to be associated with the error.
If -xcheck=stkovf[action] is specified, the compiler generates code to detect stack overflow errors in cases involving stack frames larger than the system page size. The code includes a library call to force a memory access violation instead of setting the stack pointer to an invalid but potentially mapped address (see _stack_grow(3C)).
The optional action, if specified, must be either :detect or :diagnose.
If action is :detect, a detected stack overflow error is handled by executing the signal handler normally associated with the error.
On Oracle Solaris SPARC, –xcheck=stkovf:detect is enabled by default. This prevents silent corruption of the stack due to stack overflow. It can be disabled by specifying –xcheck=no%stkovf.
If action is :diagnose, a detected stack overflow error is handled by catching the associated signal and calling stack_violation(3C) to diagnose the error. This is the default behavior if no action is specified.
If a memory access violation is diagnosed as a stack overflow error, the following message is printed to stderr:
ERROR: stack overflow detected: pc=<inst_addr>, sp=<sp_addr>
where <inst_addr> is the address of the instruction where the error was detected, and <sp_addr> is the value of the stack pointer at the time that the error was detected. After checking for stack overflow and printing the above message if appropriate, control passes to the signal handler normally associated with the error.
-xcheck=stkovf:detect adds a stack bounds check on entry to routines with stack frames larger than system page size (see _stack_grow(3C)). The relative cost of the additional bounds check should be negligible in most applications.
-xcheck=stkovf:diagnose adds a system call to thread creation (see sigaltstack(2)). The relative cost of the additional system call depends on how frequently the application creates and destroys new threads.
-xcheck=stkovf is supported only on Oracle Solaris. The C runtime library on Oracle Linux does not support stack overflow detection.
no%stkovf
Disable runtime checking for stack overflow.
init_local
Perform special initialization of local variables.
The compiler initializes local variables to a value that is likely to cause an arithmetic exception if it is used by the program before it is assigned. Memory allocated by the ALLOCATE statement will also be initialized in this manner.
Module variables, STATIC and SAVE local variables, and variables in COMMON blocks are not initialized.
Exercise caution when using -xcheck=init_local with a large amount of local data, such as arrays with more than 10,000 elements. This can cause the compiler's internal representation of the program to become very large when that local data is initialized, which can result in significantly longer compilation times, especially when combined with optimization levels greater than -O2.
no%init_local
Disable local variable initialization. This is the default.
%all
Turn on all these runtime checking features.
%none
Disable all these runtime checking features.

Stack overflows, especially in multithreaded applications with large arrays allocated on the stack, can cause silent data corruption in neighboring thread stacks. Compile all routines with -xcheck=stkovf if stack overflow is suspected. But note that compiling with this flag does not guarantee that all stack overflow situations will be detected since they could occur in routines not compiled with this flag.

3.4.117.1 Defaults

If you do not specify -xcheck, the compiler defaults to -xcheck=noreturn. If you specify -xcheck without any arguments, the compiler defaults to -xcheck=%all, unless you are on an Oracle Solaris system for SPARC, in which case the compiler defaults to -xcheck=stkovf:detect for both cases.

The -xcheck option does not accumulate on the command line. The compiler sets the flag in accordance with the last occurrence of the option.

3.4.118 –xchip=c

Specify target processor for the optimizer.

This option specifies timing properties by specifying the target processor.

Although this option can be used alone, it is part of the expansion of the -xtarget option; it is provided to allow overriding a -xchip value implied by a specific -xtarget option.

Some effects of -xchip=c are:

  • Instruction scheduling

  • The way branches are compiled

  • Choice between semantically equivalent alternatives

The following tables list the valid -xchip processor name values:

Table 18  Common –xchip SPARC Processor Names
-xchip=
Optimize for:
generic
most SPARC processors. (This is the default.)
native
this host platform.
sparc64vi (Obsolete)
SPARC64 VI processor
sparc64vii (Obsolete)
SPARC64 VII processor
sparc64viiplus
SPARC64 VII+ processor
sparc64x
SPARC64 X processor
sparc64xplus
SPARC64 X+ processor
sparc64xii
SPARC64 XII processor.
ultraT1 (Obsolete)
UltraSPARC T1 processor
ultraT2 (Obsolete)
UltraSPARC T2 processor
ultraT2plus (Obsolete)
UltraSPARC T2+ processor
T3 (Obsolete)
SPARC T3 processor
T4
SPARC T4 processor
T5
Uses the timing properties of the SPARC T5 processor.
T7
Uses the timing properties of the SPARC T7 processor.
M5
Uses the timing properties of the SPARC M5 processor.
M6
Uses the timing properties of the SPARC M6 processor.
M7
Uses the timing properties of the SPARC M7 processor.
Table 19  x86 –xchip flags
Flag
Meaning
generic
Use timing properties for good performance on most x86 architectures. This is the default value. It directs the compiler to use the best timing properties for good performance on most processors without major performance degradation on any of them.
native
Set the parameters for the best performance on the host environment.
core2
Optimize for the Intel Core2 processor.
nehalem
Optimize for the Intel Nehalem processor.
opteron
Optimize for the AMD Opteron processor.
penryn
Optimize for the Intel Penryn processor.
pentium
(Obsolete) Uses timing properties of the x86 Pentium architecture.
pentium_pro
(Obsolete) Uses timing properties of the x86 Pentium Pro architecture.
pentium3
(Obsolete) Uses timing properties of the x86 Pentium 3 architecture.
pentium4
(Obsolete) Uses timing properties of the x86 Pentium 4 architecture.
amdfam10 (Obsolete)
Optimize for the AMD AMDFAM10 processor.
sandybridge
Intel Sandy Bridge processor.
ivybridge
Intel Ivy Bridge processor.
haswell
Intel Haswell processor.
westmere
Intel Westmere processor.
broadwell
Intel Broadwell processor.
skylake
Intel Skylake processor.

3.4.119 -xcode[=v]

(SPARC) Specify code address space.


Note -  Build shared objects by specifying -xcode=pic13 or -xcode=pic32. While you can build workable shared objects with -m64 -xcode=abs64, they will be inefficient. Shared objects built with -m64 -xcode=abs32 or -m64 -xcode=abs44 will not work.

The following table lists the values for v.

Table 20  The -xcode Flags
Value
Meaning
abs32
This is the default on 32-bit architectures. Generates 32-bit absolute addresses. Code + data + BSS size is limited to 2**32 bytes.
abs44
This is the default on 64-bit architectures. Generates 44-bit absolute addresses. Code + data + BSS size is limited to 2**44 bytes. Available only on 64–bit architectures.
abs64
Generates 64-bit absolute addresses. Available only on 64-bit architectures.
pic13
Generates position-independent code for use in shared libraries (small model). Equivalent to -Kpic. Permits references to at most 2**11 unique external symbols on 32-bit architectures, 2**10 on 64-bit architectures.
pic32
Generates position-independent code for use in shared libraries (large model). Equivalent to -KPIC. Permits references to at most 2**30 unique external symbols on 32-bit architectures, 2**29 on 64-bit architectures.

The default is -xcode=abs32 for 32-bit architectures. The default for 64-bit architectures is -xcode=abs44.

When building shared dynamic libraries, the default -xcode values of abs44 and abs32 will not work with 64-bit architectures. Specify -xcode=pic13 or -xcode=pic32 instead. Two nominal performance costs with -xcode=pic13 and -xcode=pic32 on SPARC are:

  • A routine compiled with either -xcode=pic13 or -xcode=pic32 executes a few extra instructions upon entry to set a register to point at a table (_GLOBAL_OFFSET_TABLE_) used for accessing a shared library’s global or static variables.

  • Each access to a global or static variable involves an extra indirect memory reference through _GLOBAL_OFFSET_TABLE_. If the compilation includes -xcode=pic32, there are two additional instructions per global and static memory reference.

When considering these costs, remember that the use of -xcode=pic13 and -xcode=pic32 can significantly reduce system memory requirements due to the effect of library code sharing. Every page of code in a shared library compiled with -xcode=pic13 or -xcode=pic32 can be shared by every process that uses the library. If a page of code in a shared library contains even a single non-pic (that is, absolute) memory reference, the page becomes nonsharable, and a copy of the page must be created each time a program using the library is executed.

The easiest way to tell whether a .o file has been compiled with -xcode=pic13 or -xcode=pic32 is by using the nm command:

% nm file.o | grep _GLOBAL_OFFSET_TABLE_
U _GLOBAL_OFFSET_TABLE_

A .o file containing position-independent code contains an unresolved external reference to _GLOBAL_OFFSET_TABLE_, as indicated by the letter U.

To determine whether to use -xcode=pic13 or -xcode=pic32, check the size of the Global Offset Table (GOT) by using elfdump -c and looking for the section header sh_name: .got. The sh_size value is the size of the GOT. If the GOT is less than 8,192 bytes, specify -xcode=pic13. Otherwise specify -xcode=pic32. See the elfdump(1) man page for more information.

Follow these guidelines to determine how you should use -xcode:

  • If you are building an executable, do not use -xcode=pic13 or -xcode=pic32.

  • If you are building an archive library only for linking into executables, do not use -xcode=pic13 or -xcode=pic32.

  • If you are building a shared library, start with -xcode=pic13 and, once the GOT size exceeds 8,192 bytes, use -xcode=pic32.

  • If you are building an archive library for linking into shared libraries, use -xcode=pic32.
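The 8,192-byte threshold in these guidelines lines up with the pic13 symbol limits listed in the table: 2**11 symbols on 32-bit and 2**10 symbols on 64-bit architectures. The following sketch shows that arithmetic and the resulting choice. It assumes one GOT entry per symbol and a pointer-sized entry (4 bytes on 32-bit, 8 bytes on 64-bit); the function names are hypothetical, and the sh_size value is expected to be read by hand from the elfdump output.

```python
GOT_LIMIT_BYTES = 8192  # threshold quoted in the guidelines above


def got_bytes(max_symbols: int, entry_bytes: int) -> int:
    """GOT size implied by a symbol count, assuming one entry per symbol."""
    return max_symbols * entry_bytes


def choose_xcode(got_size_bytes: int) -> str:
    """Pick a -xcode value for a shared library from its GOT size in bytes."""
    if got_size_bytes < 0:
        raise ValueError("GOT size cannot be negative")
    if got_size_bytes < GOT_LIMIT_BYTES:
        return "-xcode=pic13"
    return "-xcode=pic32"


# Both pic13 symbol limits correspond to the same 8,192-byte table size.
print(got_bytes(2**11, 4))   # 32-bit: 2**11 symbols, assumed 4-byte entries
print(got_bytes(2**10, 8))   # 64-bit: 2**10 symbols, assumed 8-byte entries

# sh_size values as they might be read from the .got section header.
print(choose_xcode(0x1F00))  # below the limit
print(choose_xcode(0x2000))  # at the limit, so the large model is needed
```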

3.4.120 –xcommonchk[={yes|no}]

Enable runtime checking of common block inconsistencies.

This option provides a debug check for common block inconsistencies in programs using TASK COMMON and parallelization.

The default is -xcommonchk=no; runtime checking for common block inconsistencies is disabled because it will degrade performance. Use -xcommonchk=yes only during program development and debugging, and not for production-quality programs.

Compiling with -xcommonchk=yes enables runtime checking. If a common block declared in one source program unit as a regular common block appears somewhere else on a TASK COMMON directive, the program will stop with an error message indicating the first such inconsistency. -xcommonchk without a value is equivalent to -xcommonchk=yes.

3.4.121 -xcompress={[no%]debug}

Compress debug sections using the format specified by the -xcompress_format option if supported by the underlying Operating System. A sub-option is required. The option is ignored with a warning when Operating System support is unavailable.

3.4.122 -xcompress_format=cmp-type

When -xcompress=debug is in effect, this option specifies how the debug section is to be compressed.

The following values for cmp-type are recognized.

none

No compression of the debug section is done.

zlib

Compress the debug section using ZLIB compression.

zlib-gnu

Compress the section using ZLIB compression, using the GNU section compression format.

On Oracle Solaris, when compilation involves linking, the debug sections are compressed using the ld option –z compress-sections=cmp-type. For more information, see the ld(1) man page.

On Oracle Solaris, when compiling to an object file (.o), the debug sections are compressed using elfcompress -t cmp-type. For more information, see the elfcompress(1) man page.

On Linux, the debug sections of each .o file are compressed using objcopy --compress-debug-sections. For more information, see the objcopy(1g) man page.

The option is ignored with a warning when Operating System support is unavailable.

3.4.123 –xdebugformat=dwarf

-xdebugformat=dwarf generates debugging information using the dwarf standard format. dwarf is the default format, and this option is obsolete.

This is a transitional interface so expect it to change in incompatible ways from release to release, even in a minor release. The details of any specific fields or values in dwarf are also evolving.

Use the dwarfdump(1) command to determine the format of the debugging information in a compiled object or executable file.

3.4.124 -xdebuginfo=a[,a...]

Control how much debugging and observability information is emitted.

The term tagtype refers to tagged types: structs, unions, enums, and classes.

The following list contains the possible values for sub-options a. The prefix no% applied to a sub-option disables that sub-option. The default is -xdebuginfo=%none. Specifying -xdebuginfo without a sub-option is forbidden.

%none

No debugging information is generated. This is the default.

[no%]line

Emit line number and file information.

[no%]param

Emit location list info for parameters. Emit full type information for scalar values (for example, int, char *) and type names but not full definitions of tagtypes.

[no%]variable

Emit location list information for lexically global and local variables, including file and function statics but excluding class statics and externs. Emit full type information for scalar values such as int and char * and type names but not full definitions of tagtypes.

[no%]decl

Emit information for function and variable declarations, member functions, and static data members in class declarations.

[no%]tagtype

Emit full type definitions of tagtypes referenced from param and variable datasets, as well as template definitions.

[no%]macro

Emit macro information.

[no%]codetag

Emit DWARF codetags. This is information regarding bitfields, structure copy, and spills used by RTC and discover.

[no%]hwcprof

Generate information critical to hardware counter profiling. This information includes ldst_map, a mapping from ld/st instructions to the symbol table entry being referenced, and branch_target table of branch-target addresses used to verify that backtracking did not cross a branch-target. See -xhwcprof for more information.


Note -  ldst_map requires the presence of tagtype information. The driver will issue an error if this requirement is not met.

The -g options are macros that expand to combinations of -xdebuginfo and other options as follows:

-g = -g2

-gnone =
        -xdebuginfo=%none
        -xglobalize=no
        -xpatchpadding=fix
        -xkeep_unref=no%funcs,no%vars

-g1 =
        -xdebuginfo=line,param,codetag
        -xglobalize=no
        -xpatchpadding=fix
        -xkeep_unref=no%funcs,no%vars

-g2 =
        -xdebuginfo=line,param,decl,variable,tagtype,codetag
        -xglobalize=yes
        -xpatchpadding=fix
        -xkeep_unref=funcs,vars

-g3 =
        -xdebuginfo=line,param,decl,variable,tagtype,codetag,macro
        -xglobalize=yes
        -xpatchpadding=fix
        -xkeep_unref=funcs,vars

3.4.125 –xdepend

Synonym for -depend.

3.4.126 –xF

Allow function-level reordering by the Performance Analyzer.

Allow the reordering of functions (subprograms) in the core image using the compiler, the Performance Analyzer, and the linker. If you compile with the -xF option and then run the analyzer, you can generate a map file that optimizes the ordering of the functions in memory depending on how they are used together. A subsequent link to build the executable file can be directed to use that map by using the linker -Mmapfile option. It places each function from the executable file into a separate section. (The f95 -Mpath option also passes a regular file to the linker; see the description of the f95 -Mpath option.)

Reordering the subprograms in memory is useful only when the application text page fault time is consuming a large percentage of the application time. Otherwise, reordering may not improve the overall performance of the application.

3.4.127 –xfilebyteorder=options

Support file sharing between little-endian and big-endian platforms.

The flag identifies the byte-order and byte-alignment of data on unformatted I/O files. options must specify any combination of the following, but at least one specification must be present:

littlemax_align:spec

bigmax_align:spec

native:spec

max_align declares the maximum byte alignment for the target platform. Permitted values are 1, 2, 4, 8, and 16. The alignment applies to Fortran VAX structures and Fortran derived types that use platform-dependent alignments for compatibility with C language structures.

little specifies a "little-endian" file on platforms where the maximum byte alignment is max_align. For example, little4 specifies a 32-bit x86 file, while little16 describes a 64-bit x86 file.

big specifies a "big-endian" file with a maximum alignment of max_align. For example, big8 describes a 32-bit SPARC file, while big16 describes a 64-bit SPARC file.

native specifies a "native" file with the same byte order and alignment used by the compiling processor platform. The following are assumed to be "native":

Platform
"native" corresponds to:
32-bit SPARC
big8
64-bit SPARC
big16
32-bit x86
little4
64-bit x86
little16
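The difference between a "big-endian" and a "little-endian" file comes down to the order in which the bytes of each data item are stored. As a rough illustration that is independent of the compiler, the following sketch uses the Python struct module to lay out the same 32-bit integer both ways and shows what happens when the bytes are read back with the wrong byte order. The variable names are chosen for this example only.

```python
import struct
import sys

value = 1

# The same 32-bit integer written with each byte order.
big = struct.pack(">i", value)      # big-endian, as on SPARC
little = struct.pack("<i", value)   # little-endian, as on x86

print("big-endian bytes:   ", big.hex())
print("little-endian bytes:", little.hex())
print("this host is", sys.byteorder, "endian")

# Reading big-endian data as little-endian gives a different number,
# which is why the byte order of a shared file has to be declared.
misread = struct.unpack("<i", big)[0]
correct = struct.unpack(">i", big)[0]
print("misread:", misread, "correct:", correct)
```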

spec must be a comma-separated list of the following:

%all

unit

filename

%all refers to all files and logical units except those opened as "SCRATCH", or named explicitly elsewhere in the -xfilebyteorder flag. %all can only appear once.

unit refers to a specific Fortran unit number opened by the program.

filename refers to a specific Fortran file name opened by the program.

3.4.127.1 Examples:

-xfilebyteorder=little4:1,2,afile.in,big8:9,bfile.out,12
-xfilebyteorder=little8:%all,big16:20
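To make the grouping in these examples explicit, the following sketch expands an option value into a table that maps each unit number, file name, or %all to its declared byte order and maximum alignment. It is a reading aid under stated assumptions, not the compiler's parser: the function name parse_filebyteorder is hypothetical, and file names are assumed to contain neither commas nor colons.

```python
import re


def parse_filebyteorder(value: str) -> dict:
    """Map each unit, file name, or %all to (byte order, max alignment)."""
    table = {}
    current = None  # specification that applies to the items being read
    for token in value.split(","):
        if ":" in token:
            # A token such as "little4:1" starts a new specification.
            head, token = token.split(":", 1)
            match = re.fullmatch(r"(little|big)(1|2|4|8|16)|native", head)
            if match is None:
                raise ValueError(f"unrecognized specification {head!r}")
            if head == "native":
                current = ("native", None)
            else:
                current = (match.group(1), int(match.group(2)))
        if current is None:
            raise ValueError("a little, big, or native specification is required")
        if token in table:
            raise ValueError(f"{token!r} can be declared only once")
        table[token] = current
    return table


first = parse_filebyteorder("little4:1,2,afile.in,big8:9,bfile.out,12")
for name, spec in first.items():
    print(name, spec)

print(parse_filebyteorder("little8:%all,big16:20"))
```

In the first example, units 1 and 2 and afile.in are little4 files, while units 9 and 12 and bfile.out are big8 files.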

3.4.127.2 Notes:

This option does not apply to files opened with STATUS="SCRATCH". I/O operations done on these files are always with the byte-order and byte-alignment of the native processor.

The default, when -xfilebyteorder does not appear on the command line, is -xfilebyteorder=native:%all.

A file name or unit number can be declared only once in this option.

When -xfilebyteorder does appear on the command line, it must appear with at least one of the little, big, or native specifications.

Files not explicitly declared by this flag are assumed to be native files. For example, compiling with -xfilebyteorder=little4:zork.out declares zork.out to be a little-endian 32-bit x86 file with a 4-byte maximum data alignment. All other files in the program are native files.

When the byte-order specified for a file is the same as the native processor but a different alignment is specified, the appropriate padding will be used even though no byte swapping is done. For example, this would be the case when compiling with -m64 for 64-bit x86 platforms and -xfilebyteorder=little4:filename is specified.

The declared types in data records shared between big-endian and little-endian platforms must have the same sizes. For example, a file produced by a SPARC executable compiled with -xtypemap=integer:64,real:64,double:128 cannot be read by an x86 executable compiled with -xtypemap=integer:64,real:64,double:64 since the default double precision data types will have different sizes. (However, note that starting with the release of Sun Studio 12 Update 1, double:128 is accepted on x64 processors.)

If an option that changes the alignment of the components within a VAX structure (such as -vax=struct_align) or a derived type (such as -aligncommon or -dalign) is used on one platform, the same alignment option has to be used on other platforms sharing the same unformatted data file whose content is affected by the alignment option.

An I/O operation with an entire UNION/MAP data object on a file specified as non-native will result in a runtime I/O error. You can only execute I/O operations using the individual members of the MAP (and not an entire VAX record containing the UNION/MAP) on non-native files.

3.4.128 -xglobalize[={yes|no}]

Control globalization of function-level or file-level static variables.

Globalization is a technique needed by fix and continue functionality in the debugger whereby function-level or file-level static symbols are promoted to globals while a prefix is added to the name to keep identically named symbols distinct.

The default is -xglobalize=no. Specifying -xglobalize is equivalent to specifying -xglobalize=yes.

3.4.128.1 Interactions

See -xpatchpadding.

3.4.129 –xhasc[={yes|no}]

Treat Hollerith constant as a character string in an actual argument list.

With -xhasc=yes, the compiler treats Hollerith constants as character strings when they appear as an actual argument on a subroutine or function call. This is the default, and complies with the Fortran standard. (The actual call list generated by the compiler contains hidden string lengths for each character string.)

With -xhasc=no, Hollerith constants are treated as typeless values in subprogram calls, and only their addresses are put on the actual argument list. (No string length is generated on the actual call list passed to the subprogram.)

Compile routines with -xhasc=no if they call a subprogram with a Hollerith constant and the called subprogram expects that argument as INTEGER (or anything other than CHARACTER).

Example:

demo% cat hasc.f
                call z(4habcd, 'abcdefg')
                end
                subroutine z(i, s)
                integer i
                character *(*) s
                print *, "string length = ", len(s)
                return
                end
demo% f95 -o has0 hasc.f
demo% has0
 string length =   4    <-- should be 7
demo% f95 -o has1 -xhasc=no hasc.f
demo% has1
 string length =   7  <-- now correct length for s

Passing 4habcd to z is handled correctly by compiling with -xhasc=no.

This flag is provided to aid porting legacy Fortran 77 programs.

3.4.130 –xhelp=flags

List the compiler option flags. Equivalent to -help.

3.4.131 –xhwcprof[={enable | disable}]

Enable compiler support for dataspace profiling.

With -xhwcprof enabled, the compiler generates information that helps tools associate profiled load and store instructions with the data-types and structure members (in conjunction with symbolic information produced with -g) to which they refer. It associates profile data with the data space of the target, rather than the instruction space, and provides insight into behavior that is not easily obtained from only instruction profiling.

While you can compile a specified set of object files with -xhwcprof, this option is most useful when applied to all object files in the application. This will provide coverage to identify and correlate all memory references distributed in the application's object files.

If you are compiling and linking in separate steps, use -xhwcprof at link time as well.

An instance of -xhwcprof=enable or -xhwcprof=disable overrides all previous instances of -xhwcprof in the same command line.

-xhwcprof is disabled by default. Specifying -xhwcprof without any arguments is equivalent to -xhwcprof=enable.

-xhwcprof requires that optimization be turned on and that the debug data format be set to dwarf (-xdebugformat=dwarf), which is the default with current Oracle Developer Studio compilers.

-xhwcprof uses -xdebuginfo to automatically enable the minimum amount of debugging information it needs, so -g is not required.

The combination of -xhwcprof and -g increases compiler temporary file storage requirements by more than the sum of the increases due to -xhwcprof and -g specified alone.

-xhwcprof is implemented as a macro that expands to various other, more primitive, options as follows:

-xhwcprof
        -xdebuginfo=hwcprof,tagtype,line
-xhwcprof=enable
        -xdebuginfo=hwcprof,tagtype,line
-xhwcprof=disable
        -xdebuginfo=no%hwcprof,no%tagtype,no%line

The following command compiles example.f and specifies support for hardware counter profiling and symbolic analysis of data types and structure members using DWARF symbols:

f95 -c -O -xhwcprof -g example.f

For more information on hardware counter-based profiling, see the Oracle Developer Studio 12.6: Performance Analyzer.

3.4.132 –xinline=list

Synonym for -inline.

3.4.133 –xinline_param=a[,a[,a]...]

Use this option to manually change the heuristics used by the compiler for deciding when to inline a function call.

This option only has an effect at -O3 or higher. The following sub-options have an effect only at -O4 or higher when automatic inlining is on.

In the following sub-options n must be a positive integer; a can be one of the following:

Table 21  -xinline_param Sub-options
Sub-option
Meaning
default
Set the values of all the sub-options to their default values.
max_inst_hard[:n]
Automatic inlining only considers functions smaller than n pseudo instructions (counted in compiler's internal representation) as possible inline candidates.
Under no circumstances will a function larger than this be considered for inlining.
max_inst_soft[:n]
Set inlined function's size limit to n pseudo instructions (counted in compiler's internal representation).
Functions of greater size than this may sometimes be inlined.
When interacting with max_inst_hard, the value of max_inst_soft should be equal to or smaller than the value of max_inst_hard, that is, max_inst_soft <= max_inst_hard.
In general, the compiler's automatic inliner only inlines calls whose called function's size is smaller than the value of max_inst_soft. In some cases a function may be inlined when its size is larger than the value of max_inst_soft but smaller than that of max_inst_hard. An example of this would be if the parameters passed into a function were constants.
When deciding whether to change the value of max_inst_hard or max_inst_soft for inlining one specific call site to a function, use -xinline_report=2 to report detailed inlining messages and follow the suggestions in those messages.
max_function_inst[:n]
Allow functions to increase due to automatic inlining by up to n pseudo instructions (counted in compiler's internal representation).
max_growth[:n]
The automatic inliner is allowed to increase the size of the program by up to n% where the size is measured in pseudo instructions.
min_counter[:n]
The minimum call site frequency counter as measured by profiling feedback (-xprofile) in order to consider a function for automatic inlining.
This option is valid only when the application is compiled with profiling feedback (-xprofile=use).
level[:n]
Use this sub-option to control the degree of automatic inlining that is applied. The compiler will inline more functions with higher settings for -xinline_param=level.
n must be one of 1, 2, or 3.
The default value of n is 2 when this option is not specified, or when the option is specified without :n.
The levels of automatic inlining are:
level:1 basic inlining
level:2 medium inlining (default)
level:3 aggressive inlining
The level decides the specified values for the combination of the following inlining parameters:
max_growth + max_function_inst + max_inst + max_inst_call
When level = 1, all the parameters are half the values of the default.
When level = 2, all the parameters are the default value.
When level = 3, all the parameters are double the values of the default.
max_recursive_depth[:n]
When a function calls itself either directly or indirectly, it is said to be making a recursive call.
This sub-option allows a recursive call to be automatically inlined up to n levels.
max_recursive_inst[:n]
Specifies the maximum number of pseudo instructions (counted in compiler's internal representation) the caller of a recursive function can grow to by performing automatic recursive inlining.
When interactions between max_recursive_inst and max_recursive_depth occur, recursive function calls will be inlined until either the max_recursive_depth number of recursive calls, or until the size of the function being inlined into exceeds max_recursive_inst. The settings of these two parameters control the degree of inlining of small recursive functions.
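
The relationship between max_inst_soft and max_inst_hard described in the table above can be pictured with a small Python model. This is only a sketch of the documented rule, not the compiler's actual heuristic: the function name inline_candidate and its three return labels are hypothetical, the sizes used below are illustrative, and the treatment of a size exactly equal to a limit is an assumption because the text does not specify it.

```python
def inline_candidate(size, max_inst_soft, max_inst_hard):
    """Classify a called function by its size in pseudo instructions."""
    if max_inst_soft > max_inst_hard:
        raise ValueError("max_inst_soft should not exceed max_inst_hard")
    if size >= max_inst_hard:
        return "never"      # at or above the hard limit: not considered
    if size >= max_inst_soft:
        return "sometimes"  # between the limits: only in special cases,
                            # for example when the arguments are constants
    return "usually"        # below the soft limit: an ordinary candidate


# Illustrative limits only; these are not the compiler's defaults.
for size in (10, 40, 60):
    print(size, inline_candidate(size, max_inst_soft=30, max_inst_hard=50))
```

Raising max_inst_soft widens the "usually" band, while raising max_inst_hard widens the range in which inlining remains possible at all.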

If -xinline_param=default is specified, the compiler will set all the values of the sub-options to the default values.

If the option is not specified, the default is -xinline_param=default.

The list of values and options accumulates from left to right. For a specification of -xinline_param=max_inst_hard:30,..,max_inst_hard:50, the value max_inst_hard:50 is passed to the compiler.

If multiple -xinline_param options are specified on the command line, the list of sub-options likewise accumulates from left to right. For example, the effect of

 -xinline_param=max_inst_hard:50,min_counter:70 ...
   -xinline_param=max_growth:100,max_inst_hard:100

will be the same as that of

-xinline_param=max_inst_hard:100,min_counter:70,max_growth:100
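
The left-to-right accumulation in the example above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the compiler's implementation: the function name merge_inline_params is hypothetical, and the handling of the default sub-option (discarding all earlier overrides) is an assumption about how "set all sub-options to their default values" interacts with accumulation.

```python
def merge_inline_params(*option_values):
    """Merge the values of one or more -xinline_param options, left to right."""
    merged = {}
    for value in option_values:          # options in command-line order
        for item in value.split(","):    # sub-options in list order
            name, _, n = item.partition(":")
            if name == "default":
                merged.clear()           # assumption: drop earlier overrides
            else:
                merged[name] = n         # a later setting replaces an earlier one
    return merged


two_options = merge_inline_params(
    "max_inst_hard:50,min_counter:70",
    "max_growth:100,max_inst_hard:100",
)
one_option = merge_inline_params(
    "max_inst_hard:100,min_counter:70,max_growth:100"
)
print(two_options == one_option)
```

Both calls yield the same settings, matching the equivalence stated in the text.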

3.4.134 –xinline_report[=n]

This option generates a report written to standard output on the inlining of functions by the compiler. The type of report depends on the value of n, which must be 0, 1, or 2.

0

No report is generated.

1

A summary report of default values of inlining parameters is generated.

2

A detailed report of inlining messages is generated, showing which callsites are inlined and which are not, with a short reason for not inlining a callsite. In some cases, this report will include suggested values for -xinline_param that can be used to inline a callsite that is not inlined.

When -xinline_report is not specified, the default value for n is 0. When -xinline_report is specified without =n, the default value is 1.

When -xlinkopt is present, the inlining messages about the callsites that are not inlined might not be accurate.

The report is limited to inlining performed by the compiler that is subject to the heuristics controllable by the -xinline_param option. Callsites inlined by the compiler for other reasons may not be reported.

3.4.135 –xinstrument=[no%]datarace

Specify this option to compile and instrument your program for analysis by the Thread Analyzer.

(For information on the Thread Analyzer, see tha(1).)

By compiling with this option you can then use the Performance Analyzer to run the instrumented program with collect -r races to create a data-race-detection experiment. You can run the instrumented code standalone but it runs more slowly.

Specify -xinstrument=no%datarace to turn off this feature. This is the default.

-xinstrument must be specified with an argument.

If you compile and link in separate steps, you must specify -xinstrument=datarace in both the compilation and linking steps.

This option defines the preprocessor token __THA_NOTIFY. You can specify #ifdef __THA_NOTIFY to guard calls to libtha(3) routines.

This option also sets -g.

–xinstrument cannot be used together with –xlinkopt.

3.4.136 –xipo[={0|1|2}]

Perform interprocedural optimizations.

Performs whole-program optimizations by invoking an interprocedural analysis pass. -xipo performs optimizations across all object files in the link step, and is not limited to just the source files on the compile command.

-xipo is particularly useful when compiling and linking large multi-file applications. Object files compiled with this flag have analysis information compiled within them that enables interprocedural analysis across source and pre-compiled program files. However, analysis and optimization are limited to the object files compiled with -xipo, and do not extend to object files in libraries.

-xipo=0 disables, and -xipo=1 enables, interprocedural analysis. -xipo=2 adds interprocedural aliasing analysis and memory allocation and layout optimizations to improve cache performance. The default is -xipo=0, and if -xipo is specified without a value, -xipo=1 is used.

When compiling with -xipo=2, there should be no calls from functions or subroutines compiled without -xipo=2 (for example, from libraries) to functions or subroutines compiled with -xipo=2.

As an example, if you interpose on the function malloc() and compile your own version of malloc() with -xipo=2, all the functions that reference malloc() in any library linked with your code would also have to be compiled with -xipo=2. Since this might not be possible for system libraries, your version of malloc should not be compiled with -xipo=2.

When compiling and linking are performed in separate steps, -xipo must be specified in both steps to be effective.

Example using -xipo in a single compile/link step:

demo% f95 -xipo -xO4 -o prog  part1.f part2.f part3.f

The optimizer performs crossfile inlining across all three source files. This is done in the final link step, so the compilation of the source files need not all take place in a single compilation and could be over a number of separate compilations, each specifying -xipo.

Example using -xipo in separate compile/link steps:

demo% f95 -xipo -xO4 -c part1.f part2.f
demo% f95 -xipo -xO4 -c part3.f
demo% f95 -xipo -xO4 -o prog  part1.o part2.o part3.o

The object files created in the compile steps have additional analysis information compiled within them to permit crossfile optimizations to take place at the link step.

A restriction is that libraries, even if compiled with -xipo, do not participate in crossfile interprocedural analysis, as shown in this example:

demo% f95 -xipo -xO4 one.f two.f three.f
demo% ar -r mylib.a one.o two.o three.o
...
demo% f95 -xipo -xO4 -o myprog main.f four.f mylib.a

Here interprocedural optimizations will be performed between one.f, two.f and three.f, and between main.f and four.f, but not between main.f or four.f and the routines in mylib.a. (The first compilation may generate warnings about undefined symbols, but the interprocedural optimizations will be performed because it is a compile and link step.)

Other important information about -xipo:

  • -xipo requires at least optimization level -xO4.

  • Building executables compiled with -xipo using a parallel make tool can cause problems if object files used in the build are common to the link steps running in parallel. Each link step should have its own copy of the object file being optimized prior to linking.

  • Objects compiled without -xipo can be linked freely with objects compiled with -xipo.

  • The -xipo option generates significantly larger object files due to the additional information needed to perform optimizations across files. However, this additional information does not become part of the final executable binary file. Any increase in the size of the executable program will be due to the additional optimizations performed.

  • If you have .o files compiled with the –xipo option from different compiler versions, mixing these files can result in failure with an error message about "IR version mismatch". When using the –xipo option, all the files should be compiled with the same version of the compiler.

  • In this release, crossfile subprogram inlining is the only interprocedural optimization performed by -xipo.

  • .s assembly language files do not participate in interprocedural analysis.

  • The -xipo flag is ignored if compiling with -S.

When Not To Compile With -xipo:

Working with the set of object files in the link step, the compiler tries to perform whole-program analysis and optimizations. For any function or subroutine foo() defined in this set of object files, the compiler makes the following two assumptions:

  1. At runtime, foo() will not be called explicitly by another routine defined outside this set of object files, and

  2. Calls to foo() from any routine in the set of object files will not be interposed upon by a different version of foo() defined outside this set of object files.

If assumption (1) is not true for the given application, do not compile with -xipo=2. If assumption (2) is not true, do not compile with either -xipo=1 or -xipo=2.

As an example, consider interposing on the function malloc() with your own source version and compiling with -xipo=2. Then all the functions in any library that reference malloc() that are linked with your code would have to also be compiled with -xipo=2 and their object files would need to participate in the link step. Since this might not be possible for system libraries, your version of malloc() should not be compiled with -xipo=2.

As another example, suppose that you build a shared library with two external routines, foo() and bar(), defined in two different source files, and bar() calls foo() inside its body. If there is a possibility that the call to foo() could be interposed at runtime, do not compile the source file for either foo() or bar() with -xipo=1 or -xipo=2. Otherwise, foo() could be inlined into bar(), which could produce incorrect results.

3.4.137 –xipo_archive[={none|readonly|writeback}]

(SPARC) Allow crossfile optimization to include archive (.a) libraries.

The value must be one of the following:

none
No processing of archive files is performed. The compiler does not apply cross-module inlining or other cross-module optimizations to object files compiled using -xipo and extracted from an archive library at link time. To do that, both -xipo and either -xipo_archive=readonly or -xipo_archive=writeback must be specified at link time.
readonly
The compiler optimizes object files passed to the linker with object files compiled with -xipo that reside in the archive library (.a) before producing an executable.
The option -xipo_archive=readonly enables cross-module inlining and interprocedural data flow analysis of object files in an archive library specified at link time. However, it does not enable cross-module optimization of the archive library's code except for code that has been inserted into other modules by cross module inlining.
To apply cross-module optimization to code within an archive library, -xipo_archive=writeback is required. Note that doing so modifies the contents of the archive library from which the code was extracted.
writeback
The compiler optimizes object files passed to the linker with object files compiled with -xipo that reside in the archive library (.a) before producing an executable. Any object files contained in the library that were optimized during the compilation are replaced with their optimized versions.
For parallel links that use a common set of archive libraries, each link should create its own copy of archive libraries to be optimized before linking.

If you do not specify a setting for -xipo_archive, the compiler assumes -xipo_archive=none.

3.4.138 -xipo_build=[yes|no]

Building with -xipo but without -xipo_build involves two passes through the compiler: once when producing the object files, and again at link time when performing the crossfile optimization. Setting -xipo_build reduces compile time by skipping optimization during the initial pass and optimizing only at link time. The object files do not need to be optimized, because with -xipo optimization is performed at link time. If unoptimized object files built with -xipo_build are linked without -xipo, the application fails to link with an unresolved symbol error.

3.4.138.1 -xipo_build Examples

The following example performs a fast build of .o files, followed by crossfile optimization at link time:

% f95 -O -xipo -xipo_build -o code1.o -c code1.f
% f95 -O -xipo -xipo_build -o code2.o -c code2.f
% f95 -O -xipo -o a.out code1.o code2.o

The -xipo_build option turns off -O when creating the .o files, so that they build quickly. Full -O optimization is performed at link time as part of -xipo crossfile optimization.

The following example links without using -xipo.

% f95 -O -o a.out code1.o code2.o

If either code1.o or code2.o was generated with -xipo_build, the result is a link-time failure indicating that the symbol __unoptimized_object_file is unresolved.

When building .o files separately, the default behavior is -xipo_build=no. However, when the executable or library is built in a single pass from source files, -xipo_build is implicitly enabled. For example:

% f95 -fast -xipo a.f b.f c.f

implicitly enables -xipo_build=yes for the first passes that generate a.o, b.o, and c.o. Include the option -xipo_build=no to disable this behavior.

3.4.139 –xivdep[=p]

Disable or set interpretation of !DIR$ IVDEP directives.

The IVDEP directive tells the compiler to ignore some or all loop-carried dependences on array references that it finds in a loop, allowing it to perform various loop optimizations such as microvectorization, distribution, software pipelining, among others, that would not be otherwise possible. It is employed in situations where the user knows either that the dependences do not matter or that they never occur in practice.

The interpretation of !DIR$ IVDEP directives depends on the value of the -xivdep option. The values for p are interpreted as follows:

loop
Ignore assumed loop-carried vector dependences.
loop_any
Ignore all loop-carried vector dependences.
back
Ignore assumed backward loop-carried vector dependences.
back_any
Ignore all backward loop-carried vector dependences.
none
Do not ignore any dependences (disables IVDEP directives).

These interpretations are provided for compatibility with other vendors' interpretations of the IVDEP directive.

When -xivdep is not specified, and when it is specified without an argument, the default is -xivdep=loop, which means that !DIR$ IVDEP directives are enabled by default.

For more information, see IVDEP Directive.

3.4.140 -xjobs{=n|auto}

Compile with multiple processes. If this flag is not specified, the default behavior is -xjobs=auto.

Specify the -xjobs option to set how many processes the compiler creates to complete its work. This option can reduce the build time on a multi-CPU machine. Currently, -xjobs works only with the -xipo option. When you specify -xjobs=n, the interprocedural optimizer uses n as the maximum number of code generator instances it can invoke to compile different files.

Generally, a safe value for n is 1.5 multiplied by the number of available processors. Using a value that is many times the number of available processors can degrade performance because of context switching overheads among spawned jobs. Also, using a very high number can exhaust the limits of system resources such as swap space.
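
The rule of thumb above (1.5 times the number of available processors) can be computed as follows. This is a minimal sketch: the function name suggested_xjobs is hypothetical, rounding down to a whole number is an assumption, and os.cpu_count() reports logical CPUs, which may differ from the processors actually available to a build.

```python
import os


def suggested_xjobs(processors):
    """Return 1.5 times the processor count, rounded down, but at least 1."""
    return max(1, (processors * 3) // 2)


if __name__ == "__main__":
    # os.cpu_count() may return None when the count cannot be determined.
    count = os.cpu_count() or 1
    print(f"-xjobs={suggested_xjobs(count)}")
```

On an 8-processor machine this suggests -xjobs=12. When in doubt, -xjobs=auto leaves the choice to the compiler.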

When -xjobs=auto is specified, the compiler will automatically choose the appropriate number of parallel jobs.

You must always specify -xjobs with a value. Otherwise, an error diagnostic is issued and compilation aborts.

If -xjobs is not specified, the default behavior is -xjobs=auto. This can be overridden by adding -xjobs=n to the command line. Multiple instances of -xjobs on the command line override each other until the right-most instance is reached.

3.4.140.1 -xjobs Examples

The following example links with up to three parallel processes for -xipo:

% f95 -xipo -xO4 -xjobs=3 t1.o t2.o t3.o

The following example links serially with a single process for -xipo:

% f95 -xipo -xO4 -xjobs=1 t1.o t2.o t3.o

The following example links in parallel, with the compiler choosing the number of jobs for -xipo:

% f95 -xipo -xO4 t1.o t2.o t3.o

Note that this is exactly the same behavior as when explicitly specifying -xjobs=auto:

% f95 -xipo -xO4 -xjobs=auto t1.o t2.o t3.o

3.4.141 -xkeep_unref[={[no%]funcs,[no%]vars}]

Keep definitions of unreferenced functions and variables. The no% prefix allows the compiler to potentially remove the definitions.

The default is no%funcs,no%vars. Specifying -xkeep_unref is equivalent to specifying -xkeep_unref=funcs,vars, meaning that -xkeep_unref keeps everything.

3.4.142 –xkeepframe[=[%all,%none,name,no%name]]

Prohibit stack related optimizations for the named functions (name).

%all

Prohibit stack related optimizations for all the code.

%none

Allow stack related optimizations for all the code.

This option is cumulative and can appear multiple times on the command line. For example, -xkeepframe=%all -xkeepframe=no%func1 indicates that the stack frame should be kept for all functions except func1. Also, -xkeepframe overrides -xregs=frameptr. For example, -xkeepframe=%all -xregs=frameptr indicates that the stack frame should be kept for all functions, and the optimizations for -xregs=frameptr are ignored.

If not specified on the command line, the compiler assumes -xkeepframe=%none as the default. If specified without a value, the compiler assumes -xkeepframe=%all.
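
The way repeated -xkeepframe settings combine can be sketched with a small Python model. The function name keeps_frame is hypothetical, and the model assumes that settings are applied strictly in command-line order with later settings overriding earlier ones for the same function; the text confirms this only for the %all followed by no%name case.

```python
def keeps_frame(function, option_values):
    """Return True if stack-related optimizations are prohibited for function."""
    keep = False                      # default is -xkeepframe=%none
    for value in option_values:       # values in command-line order
        for item in value.split(","):
            if item == "%all":
                keep = True
            elif item == "%none":
                keep = False
            elif item == function:
                keep = True
            elif item == "no%" + function:
                keep = False
    return keep


# Models: -xkeepframe=%all -xkeepframe=no%func1
settings = ["%all", "no%func1"]
print(keeps_frame("func1", settings), keeps_frame("func2", settings))
```

With these settings the frame is kept for func2 but not for func1, as in the example in the text.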

3.4.143 –xknown_lib=library_list

Recognize calls to a known library.

When specified, the compiler treats references to certain known libraries as intrinsics, ignoring any user-supplied versions. This enables the compiler to perform optimizations over calls to library routines based on its special knowledge of that library.

The library_list is a comma-delimited list of keywords, currently limited to blas, blas1, blas2, blas3, and intrinsics. The compiler recognizes calls to the BLAS1, BLAS2, and BLAS3 library routines listed below and is free to optimize appropriately for the Oracle Developer Studio Performance Library implementation. The compiler ignores user-supplied versions of these library routines and either uses the BLAS routines in the Oracle Developer Studio Performance Library or inlines the routines.

The -library=sunperf option is needed to link with the Oracle Developer Studio Performance Library.

-xknown_lib=
Feature
blas1
The compiler recognizes calls to the following BLAS1 library routines:
caxpy ccopy cdotc cdotu crotg cscal csrot csscal cswap dasum daxpy dcopy ddot drot drotg drotm drotmg dscal dsdot dswap dnrm2 dzasum dznrm2 icamax idamax isamax izamax sasum saxpy scasum scnrm2 scopy sdot sdsdot snrm2 srot srotg srotm srotmg sscal sswap zaxpy zcopy zdotc zdotu zdrot zdscal zrotg zscal zswap
blas2
The compiler recognizes calls to the following BLAS2 library routines:
cgemv cgerc cgeru ctrmv ctrsv dgemv dger dsymv dsyr dsyr2 dtrmv dtrsv sgemv sger ssymv ssyr ssyr2 strmv strsv zgemv zgerc zgeru ztrmv ztrsv
blas3
The compiler recognizes calls to the following BLAS3 library routines:
cgemm csymm csyr2k csyrk ctrmm ctrsm dgemm dsymm dsyr2k dsyrk dtrmm dtrsm sgemm ssymm ssyr2k ssyrk strmm strsm zgemm zsymm zsyr2k zsyrk ztrmm ztrsm
blas
Selects all the BLAS routines. Equivalent to -xknown_lib=blas1,blas2,blas3
intrinsics
The compiler ignores any explicit EXTERNAL declarations for Fortran intrinsics, thereby ignoring any user-supplied intrinsic routines. (See the Fortran Library Reference for lists of intrinsic function names.)

3.4.144 –xl

(Obsolete) This legacy f77 option is no longer supported. For the equivalent options in the current Fortran compiler, use: -f77=%all,no%backslash -vax=%all,no%debug

3.4.145 –xlang=f77

(Obsolete) This option is obsolete and has no effect because Fortran 77 object files are no longer supported. It might be removed in a future release.

3.4.146 –xld

(Obsolete) This (f77) option is no longer supported. For the equivalent options in the current Fortran compiler, use: -f77=%all,no%backslash -vax=%all,no%debug

3.4.147 –xlibmil

Synonym for -libmil.

3.4.148 –xlibmopt[={%none,archive,shared}]

Controls whether the compiler uses a library of optimized math routines or the standard system math routines. The possible argument values are:

%none

Do not link with the optimized math library. (This is the default when no -xlibmopt option is specified.)

archive

Link with the optimized math library in static archive form. (This is the default when -xlibmopt is specified with no argument.)

shared

(Oracle Solaris) Link with the optimized math library in shared object form.

The rightmost instance of this option on the command line overrides all the previous instances. The order of this option relative to other libraries specified on the command line is not significant.

The optimized math library includes selected math routines normally found in libm. The optimized routines typically run faster than their libm counterparts. The results may be slightly different from those produced by the libm routines, although in most cases they differ only in the least significant bit. When the static archive form of the optimized library is used, the compiler selects routines that are optimized for the instruction set indicated by the -xarch value specified when linking. When the shared object form is used, the most appropriate routines are selected at runtime based on the instruction set supported by the system being used.


Note -  The shared object form is available only on Oracle Solaris.

The routines in the optimized math library depend on the default round-to-nearest floating point rounding mode. If you use the optimized math library, you must ensure that round-to-nearest mode is in effect when any of these routines is called.

3.4.149 –xlinkopt[={1|2|0}]

(Oracle Solaris) Perform link-time optimizations on relocatable object files.

The post-optimizer performs a number of advanced performance optimizations on the binary object code at link-time. The optional value sets the level of optimizations performed, and must be 0, 1, or 2.

0
The post-optimizer is disabled. (This is the default.)
1
Perform optimizations based on control flow analysis, including instruction cache coloring and branch optimizations, at link time.
2
Perform additional data flow analysis, including dead-code elimination and address computation simplification, at link time.

Specifying the -xlinkopt flag without a value implies -xlinkopt=1.

These optimizations are performed at link time by analyzing the object binary code. The object files are not rewritten, but the resulting executable code may differ from the original object code.

This option is most effective when used to compile the whole program, and with profile feedback.

When compiling in separate steps, -xlinkopt must appear on both compile and link steps.

demo% f95 -c -xlinkopt a.f95 b.f95
demo% f95 -o myprog -xlinkopt=2 a.o b.o

Note that the level parameter is only used when the compiler is linking. In the example above, the post-optimization level used is 2 even though the object binaries were compiled with an implied level of 1.

The link-time post-optimizer cannot be used with the incremental linker, ild. The -xlinkopt flag will set the default linker to be ld. Enabling the incremental linker explicitly with the -xildon flag will disable the -xlinkopt option if both are specified together.

For the -xlinkopt option to be useful, at least some, but not necessarily all, of the routines in the program must be compiled with this option. The optimizer can still perform some limited optimizations on object binaries not compiled with -xlinkopt.

The -xlinkopt option optimizes code coming from static libraries that appear on the compiler command line, but does not optimize code coming from shared (dynamic) libraries that appear on the command line. You can also use -xlinkopt when building shared libraries (compiling with -G ).

The -xlinkopt option requires profile feedback (-xprofile) in order to optimize the program. Profiling reveals the most- and least-used parts of the code, which enables the optimizer to focus its effort accordingly. Link-time optimization is particularly important with large applications where optimal placement of code can substantially reduce instruction cache misses. Additionally, -xlinkopt is most effective when used to compile the whole program. Use this option as follows:

demo% f95 -o progt -xO5 -xprofile=collect:prog file.f95
demo% progt
demo% f95 -o prog -xO5 -xprofile=use:prog -xlinkopt file.f95

For details on using profile feedback, see the -xprofile option.

Note that compiling with this option will increase link time slightly. Object file sizes also increase, but the size of the executable remains the same. Compiling with the -xlinkopt and -g flags increases the size of the executable by including debugging information.

–xlinkopt cannot be used together with –xinstrument.

3.4.150 –xloopinfo

Synonym for -loopinfo.

3.4.151 –xM

Generate make dependencies.

This option produces make dependencies for the compiled source file on standard output. The option covers all make dependencies for the source file, both header files and Fortran modules.

For module dependencies, this option uses an object-based module dependency scheme to eliminate the need for explicit build rules to create the module files.

This option cannot be used with -c, -S, or any other compilation options that produce different compilation outputs.

The generated dependency output does not contain any build rules, only dependencies for the files. The user will need to specify the build rules for all the files needed for the build. However, for the module files, no explicit build rules are needed, as the module files are created at the same time as the associated object files. Therefore, the module files only need to have a generic build rule:

%.mod:
        @ echo $@ is already up to date.

The module file build rule is only needed to prevent the 'make' process from stripping all the dependencies related to module files if there are no build rules for them. Other than that, the build rule does not do anything, as shown in the example above.

When used with the -keepmod option, the dependencies generated by -xM prevent compilation cascades caused by unnecessarily updated module files. They also prevent the same source files from being recompiled repeatedly, which can otherwise happen because -keepmod does not update module files that are unchanged.

This option works in conjunction with the -M, -I, and -moddir options to determine the appropriate directories for the module files needed in the build. Pre-compiled module files, for example those shipped by third parties, should be located at a directory pointed to by the -M option so the correct dependencies can be generated.

3.4.152 –xmaxopt[=n]

Enable optimization pragma and set maximum optimization level.

n has the value 1 through 5 and corresponds to the optimization levels of -O1 through -O5. If not specified, the compiler uses 5.

This option enables the !$PRAGMA SUN OPT=n directive when it appears in the source input. Without this option, the compiler treats these lines as comments. See The OPT Directive.

If this pragma appears with an optimization level greater than the maximum level on the -xmaxopt flag, the compiler uses the level set by -xmaxopt.
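
The interaction between the OPT=n pragma and -xmaxopt described above amounts to taking the smaller of the two levels. The following Python sketch models only that rule: the function name effective_opt_level is hypothetical, and passing None stands for compiling without -xmaxopt, in which case the pragma line is treated as a comment.

```python
def effective_opt_level(pragma_level, xmaxopt=None):
    """Return the level applied for !$PRAGMA SUN OPT=n, or None if ignored."""
    if xmaxopt is None:
        return None                     # without -xmaxopt the pragma is a comment
    return min(pragma_level, xmaxopt)   # the pragma cannot exceed -xmaxopt


print(effective_opt_level(5, xmaxopt=3))   # pragma asks for 5, capped at 3
print(effective_opt_level(2, xmaxopt=4))   # pragma asks for 2, left at 2
print(effective_opt_level(4))              # no -xmaxopt, pragma ignored
```

Specifying -xmaxopt without a value corresponds to xmaxopt=5 in this model.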

3.4.153 –xmemalign[=<a><b>]

(SPARC) Specify maximum assumed memory alignment and behavior of misaligned data accesses.

For memory accesses where the alignment is determinable at compile time, the compiler will generate the appropriate load/store instruction sequence for that data alignment.

For memory accesses where the alignment cannot be determined at compile time, the compiler must assume an alignment to generate the needed load/store sequence.

The -xmemalign flag allows the user to specify the maximum memory alignment of data to be assumed by the compiler for those indeterminate situations. It also specifies the error behavior at runtime when a misaligned memory access does take place.

The value specified consists of two parts: a numeric alignment value, <a>, and an alphabetic behavior flag, <b>.

Allowed values for alignment, <a>, are:

1

Assume at most 1-byte alignment.

2

Assume at most 2-byte alignment.

4

Assume at most 4-byte alignment.

8

Assume at most 8-byte alignment.

16

Assume at most 16-byte alignment.

Allowed values for error behavior on accessing misaligned data, <b>, are:

i

Interpret access and continue execution

s

Raise signal SIGBUS

f

On 64–bit platforms, raise signal SIGBUS only for alignments less than or equal to 4, otherwise interpret access and continue execution. On other platforms f is equivalent to i.

The defaults when compiling without -xmemalign specified are:

  • 8i for 32–bit platforms

  • 8s for 64–bit platforms with C and C++

  • 8f for 64–bit platforms with Fortran

The default for -xmemalign appearing without a value is 1i for all platforms.
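
The two-part structure of the -xmemalign value can be illustrated by a small Python parser. This is a sketch for illustration only: the names parse_xmemalign, VALID_ALIGNMENTS, and VALID_BEHAVIORS are hypothetical, and the model assumes that any listed alignment may be paired with any listed behavior flag, which the text does not explicitly state.

```python
import re

VALID_ALIGNMENTS = {1, 2, 4, 8, 16}   # allowed values for <a>
VALID_BEHAVIORS = {"i", "s", "f"}     # allowed values for <b>


def parse_xmemalign(value="1i"):
    """Split an -xmemalign value such as '8s' into (alignment, behavior).

    The default argument models -xmemalign given without a value (1i).
    """
    match = re.fullmatch(r"(\d+)([a-z])", value)
    if match is None:
        raise ValueError(f"malformed -xmemalign value: {value!r}")
    alignment, behavior = int(match.group(1)), match.group(2)
    if alignment not in VALID_ALIGNMENTS or behavior not in VALID_BEHAVIORS:
        raise ValueError(f"unsupported -xmemalign value: {value!r}")
    return alignment, behavior


print(parse_xmemalign("8s"))
print(parse_xmemalign())
```

The first call yields an alignment of 8 with SIGBUS on misaligned access, and the second yields the 1i value assumed when the option appears without a value.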

Note that -xmemalign itself does not force any particular data alignment to take place. Use -dalign or -aligncommon to force data alignment.

You must also specify -xmemalign whenever you link to an object file that was compiled with a b value of either i or f.

The -dalign option is a macro for: -xmemalign=8s -aligncommon=16

Do not use -aligncommon=1 with -xmemalign as these declarations will conflict and could cause a segmentation fault on some platforms and configurations.

See –aligncommon[={1|2|4|8|16}] for details.

3.4.154 –xmodel=[small | kernel | medium]

(x86) Specify the data address model for shared objects on Oracle Solaris x64 platforms.

The -xmodel option enables the compiler to create 64-bit shared objects for the Oracle Solaris x64 platforms and should only be specified for the compilation of such objects.

This option is invalid when specified with -m32.

small

This option generates code for the small model in which the virtual address of code executed is known at link time and all symbols are known to be located in the virtual addresses in the range from 0 to 2^31 - 2^24 - 1.

kernel

Generates code for the kernel model in which all symbols are defined to be in the range from 2^64 - 2^31 to 2^64 - 2^24.

medium

Generates code for the medium model in which no assumptions are made about the range of symbolic references to data sections. Size and address of the text section have the same limits as the small code model. Applications with large amounts of static data might require -xmodel=medium when compiling with -m64.

If you do not specify -xmodel, the compiler assumes -xmodel=small. Specifying -xmodel without an argument is an error.

It is not necessary to compile all routines with this option as long as you can ensure that objects being accessed are within range.
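
To make the address ranges of the small and kernel models concrete, the following Python sketch evaluates the bounds, reading them as powers of two (2^31, 2^24, and 2^64). The constant and function names are hypothetical, and treating both ends of each range as inclusive is an assumption.

```python
SMALL_MAX = 2**31 - 2**24 - 1    # upper bound of the small model
KERNEL_MIN = 2**64 - 2**31       # lower bound of the kernel model
KERNEL_MAX = 2**64 - 2**24       # upper bound of the kernel model


def fits_small_model(address):
    """True if address lies in the small-model range (bounds inclusive)."""
    return 0 <= address <= SMALL_MAX


def fits_kernel_model(address):
    """True if address lies in the kernel-model range (bounds inclusive)."""
    return KERNEL_MIN <= address <= KERNEL_MAX


print(hex(SMALL_MAX), hex(KERNEL_MIN), hex(KERNEL_MAX))
# prints: 0x7effffff 0xffffffff80000000 0xffffffffff000000
```

The small model therefore covers just under 2 GB at the bottom of the address space, which is why programs with large amounts of static data may need -xmodel=medium.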

3.4.155 –xnolib

Synonym for -nolib.

3.4.156 –xnolibmil

Synonym for -nolibmil.

3.4.157 –xnolibmopt

(Obsolete) Use -xlibmopt=%none instead. See –fast, –xlibmopt[={%none,archive,shared}].

3.4.158 –xOn

Synonym for -On.

3.4.159 –xopenmp[={parallel|noopt|none}]

Enable explicit parallelization with OpenMP directives.

The flag accepts the following sub-option keywords:

parallel

Enables recognition of OpenMP directives. The optimization level under -xopenmp=parallel is -xO3. The compiler raises the optimization level to -xO3 if necessary and issues a warning.

This flag also defines the preprocessor macro _OPENMP. The _OPENMP macro is defined to have the decimal value yyyymm where yyyy and mm are the year and month designations of the version of the OpenMP API that the implementation supports. Refer to the Oracle Developer Studio 12.6: OpenMP API User’s Guide for the value of the _OPENMP macro for a particular release.

noopt

Enables recognition of OpenMP directives. The compiler does not raise the optimization level if it is lower than -xO3. If you explicitly set the optimization lower than -xO3, as in f95 -xO2 -xopenmp=noopt, the compiler issues an error. If you do not specify an optimization level with -xopenmp=noopt, the OpenMP directives are recognized, the program is parallelized accordingly, but no optimization is done. This sub-option also defines the preprocessor macro _OPENMP.

none

Does not enable the recognition of OpenMP directives, makes no change to the optimization level of your program, and does not define any preprocessor macros. This is the default when -xopenmp is not specified.

If you specify -xopenmp, but do not specify a sub-option keyword, the compiler assumes -xopenmp=parallel. If you do not specify -xopenmp at all, the compiler assumes -xopenmp=none.
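The yyyymm encoding of the _OPENMP macro can be unpacked with integer division. The sketch below is illustrative only: the function name is hypothetical, and 201511 is used purely as a sample value, not as a statement of what any particular compiler release defines.

```python
def decode_openmp_macro(value):
    """Split a yyyymm-style _OPENMP value into (year, month).

    Hypothetical helper; raises ValueError if the month part is not 1..12.
    """
    year, month = divmod(value, 100)
    if not 1 <= month <= 12:
        raise ValueError("not a yyyymm value: %r" % (value,))
    return year, month


if __name__ == "__main__":
    # Sample value only; check the OpenMP API User's Guide for the real one.
    print(decode_openmp_macro(201511))  # (2015, 11)
```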

Sub-options parallel and noopt will invoke -stackvar automatically.

If you are debugging an OpenMP program with dbx, compile with -g -xopenmp=noopt so you can set breakpoints within parallel regions and display the contents of variables.

The default for -xopenmp might change in a future release. You can avoid warning messages by explicitly specifying an appropriate optimization level.

Use the OMP_NUM_THREADS environment variable to specify the number of threads to use when running an OpenMP program. If OMP_NUM_THREADS is not set, the default number of threads used is a multiple of the number of cores per socket (that is, cores per processor chip), which is less than or equal to the total number of cores or 32, whichever is less. You can specify a different number of threads by setting the OMP_NUM_THREADS environment variable, or by calling the omp_set_num_threads() OpenMP runtime routine, or by using the num_threads clause on the parallel region directive. For best performance, the number of threads used to execute a parallel region should not exceed the number of hardware threads (or virtual processors) available on the machine. On Oracle Solaris systems, this number can be determined by using the psrinfo(1M) command. On Oracle Linux systems, this number can be determined by inspecting the file /proc/cpuinfo. See the Oracle Developer Studio 12.6: OpenMP API User’s Guide for more information.

Nested parallelism is disabled by default. To enable nested parallelism, you must set the OMP_NESTED environment variable to TRUE. See the Oracle Developer Studio 12.6: OpenMP API User’s Guide.

If you compile and link in separate steps, specify -xopenmp in both the compilation step and the link step. When used with the link step, the -xopenmp option will link with the OpenMP runtime support library, libmtsk.so.

For up-to-date functionality and performance, make sure that the latest patch of the OpenMP runtime library, libmtsk.so, is installed on the system.

For more information about the OpenMP Fortran 95, C, and C++ application program interface (API) for building multithreaded applications, see the Oracle Developer Studio 12.6: OpenMP API User’s Guide.

3.4.160 –xpad

Synonym for -pad.

3.4.161 –xpagesize=size

Set the preferred page size for the stack and the heap.

On SPARC platforms, the size value must be one of the following:

8K 64K 512K 4M 32M 256M 2G 16G or default

On x86 platforms, the size value must be one of the following:

4K 2M 4M or default

For example: -xpagesize=4M

Not all of these page sizes are supported on all platforms; support depends on the architecture and the Oracle Solaris environment. The page size specified must be a valid page size for the Oracle Solaris operating environment on the target platform. If it is not, the request will be silently ignored at run-time.

Use the pagesize(1) Oracle Solaris command to determine the number of bytes in a page. The operating system offers no guarantee that the page size request will be honored. However, appropriate segment alignment can be used to increase the likelihood of obtaining the requested page size. See the -xsegment_align option on how to set the segment alignment. You can use pmap(1) or meminfo(2) to determine page size of the target platform.

If you specify -xpagesize=default, the flag is ignored; -xpagesize specified without a size value is equivalent to -xpagesize=default.

This option is a macro for the combination -xpagesize_heap=size -xpagesize_stack=size. These two options accept the same arguments as -xpagesize. You can set them both with the same value by specifying -xpagesize=size or you can specify them individually with different values.
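The size keywords and the macro expansion described above can be sketched as follows. This is a minimal illustration, assuming the usual binary meaning of the K, M, and G suffixes; the function names and the platform labels "sparc" and "x86" are hypothetical and do not correspond to any compiler interface.

```python
# Size keywords listed for -xpagesize on each platform.
ALLOWED = {
    "sparc": ("8K", "64K", "512K", "4M", "32M", "256M", "2G", "16G"),
    "x86": ("4K", "2M", "4M"),
}
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30}  # assumed binary multipliers


def page_size_bytes(size, platform):
    """Return the byte count for a size keyword, or None for 'default'."""
    if size == "default":
        return None
    if size not in ALLOWED[platform]:
        raise ValueError("%s is not a listed page size for %s" % (size, platform))
    return int(size[:-1]) * UNITS[size[-1]]


def expand_xpagesize(size):
    """Expand -xpagesize=size into the two options it stands for."""
    return ["-xpagesize_heap=" + size, "-xpagesize_stack=" + size]


if __name__ == "__main__":
    print(page_size_bytes("4M", "x86"))  # 4194304
    print(expand_xpagesize("4M"))
```

A listed keyword is still only a request: as noted above, the operating system may not honor it on a given machine.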

Compiling with this flag has the same effect as setting the LD_PRELOAD environment variable to mpss.so.1 with the equivalent options, or running the Oracle Solaris command ppgsz(1) with the equivalent options, before starting the program. See the Oracle Solaris man pages for details.

The libhugetlbfs library is required for –xpagesize to work on Oracle Linux. See the Oracle Linux libhugetlbfs(7) man page for more information.

3.4.162 –xpagesize_heap=size

Set the preferred page size for the heap.

The size value is the same as described for -xpagesize.

See -xpagesize for details.

3.4.163 –xpagesize_stack=size

(SPARC) Set the preferred page size for the stack.

The size value is the same as described for -xpagesize.

See -xpagesize for details.

3.4.164 -xpatchpadding[={fix|patch|size}]

Reserve an area of memory before the start of each function. If fix is specified, the compiler will reserve the amount of space required by fix and continue. This is the first default. If either patch or no value is specified, the compiler will reserve a platform-specific default value. A value of -xpatchpadding=0 will reserve 0 bytes of space. The maximum value for size on x86 is 127 bytes and on SPARC is 2048 bytes.

3.4.165 –xpec[={yes|no}]

Generate a PEC (Portable Executable Code) binary.

This option puts the program intermediate representations in the object file and the binary. This binary may be used later for tuning and troubleshooting.

A binary built with -xpec is usually 5 to 10 times larger than if it is built without. The default is -xpec=no.

Without an argument, -xpec is equivalent to -xpec=yes.

3.4.166 –xpg

Synonym for -pg.

3.4.167 –xpp={fpp|cpp}

Select source file preprocessor.

The default is -xpp=fpp.

The compilers use fpp(1) to preprocess .F, .F95, or .F03 source files. This preprocessor is appropriate for Fortran. Previous versions used the standard C preprocessor cpp. To select cpp, specify -xpp=cpp.

3.4.168 –xprefetch[=a[,a]]

Enable prefetch instructions on those architectures that support prefetch.

See The PREFETCH Directives for a description of the Fortran PREFETCH directives.

a must be one of the following:

auto

Enable automatic generation of prefetch instructions

no%auto

Disable automatic generation of prefetch instructions

explicit

Enable explicit prefetch macros

no%explicit

Disable explicit prefetch macros

latx:factor

(SPARC) Adjust the compiler’s assumed prefetch-to-load and prefetch-to-store latencies by the specified factor. The factor must be a positive floating-point or integer number.

If you are running computationally intensive codes on large SPARC multiprocessors, you might find it advantageous to use -xprefetch=latx:factor. This option instructs the code generator to adjust the default latency time between a prefetch and its associated load or store by the specified factor.

The prefetch latency is the hardware delay between the execution of a prefetch instruction and the time the data being prefetched is available in the cache. The compiler assumes a prefetch latency value when determining how far apart to place a prefetch instruction and the load or store instruction that uses the prefetched data.


Note -  The assumed latency between a prefetch and a load may not be the same as the assumed latency between a prefetch and a store.

The compiler tunes the prefetch mechanism for optimal performance across a wide range of machines and applications. This tuning may not always be optimal. For memory-intensive applications, especially applications intended to run on large multiprocessors, you may be able to obtain better performance by increasing the prefetch latency values. To increase the values, use a factor that is greater than 1. A value between 0.5 and 2.0 will most likely provide the maximum performance.

For applications with datasets that reside entirely within the external cache, you may be able to obtain better performance by decreasing the prefetch latency values. To decrease the values, use a factor that is less than 1.

To use the -xprefetch=latx:factor option, start with a factor value near 1.0 and run performance tests against the application. Then increase or decrease the factor, as appropriate, and run the performance tests again. Continue adjusting the factor and running the performance tests until you achieve optimum performance. When you increase or decrease the factor in small steps, you will see no performance difference for a few steps, then a sudden difference, then it will level off again.

yes

-xprefetch=yes is the same as -xprefetch=auto,explicit

no

-xprefetch=no is the same as -xprefetch=no%auto,no%explicit

With -xprefetch, -xprefetch=auto, and -xprefetch=yes, the compiler is free to insert prefetch instructions into the code it generates. This may result in a performance improvement on architectures that support prefetch.

3.4.168.1 Defaults:

If -xprefetch is not specified, -xprefetch=auto,explicit is assumed.

If only -xprefetch is specified, -xprefetch=auto,explicit is assumed.

If automatic prefetching is enabled, such as with -xprefetch or -xprefetch=yes, but a latency factor is not specified, then -xprefetch=latx:1.0 is assumed.

3.4.168.2 Interactions:

With -xprefetch=explicit, the compiler will recognize the directives:

!$PRAGMA SUN_PREFETCH_READ_ONCE (name)
!$PRAGMA SUN_PREFETCH_READ_MANY (name)
!$PRAGMA SUN_PREFETCH_WRITE_ONCE (name)
!$PRAGMA SUN_PREFETCH_WRITE_MANY (name)

The -xchip setting affects the determination of the assumed latencies and therefore the result of a latx:factor setting.

The latx:factor sub-option is valid only when automatic prefetching (auto) is enabled on SPARC processors.

3.4.168.3 Warnings:

Explicit prefetching should only be used under special circumstances that are supported by measurements.

Because the compiler tunes the prefetch mechanism for optimal performance across a wide range of machines and applications, you should only use -xprefetch=latx:factor when the performance tests indicate there is a clear benefit. The assumed prefetch latencies may change from release to release. Therefore, retesting the effect of the latency factor on performance whenever switching to a different release is highly recommended.

3.4.169 –xprefetch_auto_type=indirect_array_access

Generate indirect prefetches for data arrays accessed indirectly.

Generates indirect prefetches for the loops indicated by the option -xprefetch_level={1|2|3} in the same fashion as the prefetches for direct memory accesses are generated. The prefix no% can be added to negate the declaration.

The default is -xprefetch_auto_type=no%indirect_array_access.

Requires -xprefetch=auto and an optimization level -xO3 or higher.

Options such as -xdepend can affect the aggressiveness of computing the indirect prefetch candidates and therefore the aggressiveness of the automatic indirect prefetch insertion due to better memory alias disambiguation information.

3.4.170 –xprefetch_level={1|2|3}

Control the automatic generation of prefetch instructions.

This option is only effective when compiling with:

  • -xprefetch=auto,

  • with optimization level 3 or greater,

  • on a platform that supports prefetch.

The default for -xprefetch=auto without specifying -xprefetch_level is level 2.

Prefetch level 2 generates more opportunities for prefetch instructions than level 1. Prefetch level 3 generates more prefetch instructions than level 2.

Prefetch levels 2 and 3 may not be effective on older SPARC or x86 platforms.

3.4.171 –xprofile=p

Collects data for a profile or uses a profile to optimize.

p must be collect[:profdir], use[:profdir], or tcov[:profdir].

This option causes execution frequency data to be collected and saved during execution so that the data can be used in subsequent runs to improve performance. Profile collection is safe for multithreaded applications. That is, profiling a program that does its own multitasking (-mt) produces accurate results. This option is only valid when you specify an optimization level of -xO2 or greater. If compilation and linking are performed in separate steps, the same -xprofile option must appear on the link step as well as the compile step.

collect[:profdir]

Collects and saves execution frequency for later use by the optimizer with -xprofile=use. The compiler generates code to measure statement execution-frequency.

-xMerge, -ztext, and -xprofile=collect should not be used together. While -xMerge forces statically initialized data into read-only storage, -ztext prohibits position-dependent symbol relocations in read-only storage, and -xprofile=collect generates statically initialized, position-dependent symbol relocations in writable storage.

The profile directory name profdir, if specified, is the pathname of the directory where profile data are to be stored when a program or shared library containing the profiled object code is executed. If the profdir pathname is not absolute, it is interpreted relative to the current working directory when the program is compiled with the option -xprofile=use:profdir.

If no profile directory name is specified with -xprofile=collect:profdir or -xprofile=tcov:profdir, profile data are stored at run time in a directory named program.profile where program is the basename of the profiled process's main program. In this case, the environment variables SUN_PROFDATA and SUN_PROFDATA_DIR can be used to control where the profile data are stored at run time. If set, the profile data are written to the directory given by $SUN_PROFDATA_DIR/$SUN_PROFDATA. If a profile directory name is specified at compilation time, SUN_PROFDATA_DIR and SUN_PROFDATA have no effect at run time. These environment variables similarly control the path and names of the profile data files written by tcov, as described in the tcov(1) man page.

If these environment variables are not set, the profile data is written to the directory profdir.profile in the current directory, where profdir is the name of the executable or the name specified in the -xprofile=collect:profdir flag. -xprofile does not append .profile to profdir if profdir already ends in .profile. If you run the program several times, the execution frequency data accumulates in the profdir.profile directory; that is, output from prior executions is not lost.

If you are compiling and linking in separate steps, make sure that any object files compiled with -xprofile=collect are also linked with -xprofile=collect.

use[:profdir]

Uses execution frequency data collected from code compiled with -xprofile=collect[:profdir] or -xprofile=tcov[:profdir] to optimize for the work performed when the profiled code was executed. profdir is the pathname of a directory containing profile data collected by running a program that was compiled with -xprofile=collect[:profdir] or -xprofile=tcov[:profdir].

To generate data that can be used by both tcov and -xprofile=use[:profdir], a profile directory must be specified at compilation time, using the option -xprofile=tcov[:profdir]. The same profile directory must be specified in both -xprofile=tcov:profdir and -xprofile=use:profdir. To minimize confusion, specify profdir as an absolute pathname.

The profdir pathname is optional. If profdir is not specified, the name of the executable binary is used. a.out is used if -o is not specified. The compiler looks for profdir.profile/feedback, or a.out.profile/feedback when profdir is not specified. For example:

demo% f95 -xprofile=collect -o myexe prog.f95 		 
demo% f95 -xprofile=use:myexe -xO5 -o myexe prog.f95

The program is optimized by using the execution frequency data previously generated and saved in the feedback files written by a previous execution of the program compiled with -xprofile=collect.

Except for the -xprofile option, the source files and other compiler options must be exactly the same as those used for the compilation that created the compiled program that generated the feedback file. The same version of the compiler must be used for both the collect build and the use build as well.

If compiled with -xprofile=collect:profdir, the same profile directory name profdir must be used in the optimizing compilation: -xprofile=use:profdir.

See also -xprofile_ircache for speeding up compilation between collect and use phases.

tcov[:profdir]

Instrument object files for basic block coverage analysis using tcov(1).

If the optional profdir argument is specified, the compiler will create a profile directory at the specified location. The data stored in the profile directory can be used either by tcov(1) or by the compiler with -xprofile=use:profdir. If the optional profdir pathname is omitted, a profile directory will be created when the profiled program is executed. The data stored in the profile directory can only be used by tcov(1). The location of the profile directory can be controlled using environment variables SUN_PROFDATA and SUN_PROFDATA_DIR.

If the location specified by profdir is not an absolute pathname, it is interpreted at compilation time relative to the current working directory at the time of compilation. If profdir is specified for any object file, the same location must be specified for all object files in the same program. The directory whose location is specified by profdir must be accessible from all machines where the profiled program is to be executed. The profile directory should not be deleted until its contents are no longer needed, because data stored there by the compiler cannot be restored except by recompilation.

Example [1]: if object files for one or more programs are compiled with -xprofile=tcov:/test/profdata, a directory named /test/profdata.profile will be created by the compiler and used to store data describing the profiled object files. The same directory will also be used at execution time to store execution data associated with the profiled object files.

Example [2]: if a program named myprog is compiled with -xprofile=tcov and executed in the directory /home/joe, the directory /home/joe/myprog.profile will be created at run time and used to store runtime profile data.
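The naming rules for the profile directory can be summarized in a few lines. The sketch below only mirrors the rules stated in this section (append .profile unless the name already ends in it, and look for a feedback file inside that directory). The function names are hypothetical, and the SUN_PROFDATA and SUN_PROFDATA_DIR overrides are deliberately not modeled.

```python
import posixpath


def profile_dir(profdir):
    """Append .profile unless the name already ends in .profile."""
    return profdir if profdir.endswith(".profile") else profdir + ".profile"


def runtime_profile_dir(program, cwd):
    """Directory used when no profdir was given at compile time.

    Built from the basename of the program and the directory it runs in.
    """
    return posixpath.join(cwd, profile_dir(posixpath.basename(program)))


def feedback_path(profdir="a.out"):
    """Feedback file consulted by -xprofile=use (a.out if no name is given)."""
    return posixpath.join(profile_dir(profdir), "feedback")


if __name__ == "__main__":
    print(profile_dir("/test/profdata"))               # /test/profdata.profile
    print(runtime_profile_dir("myprog", "/home/joe"))  # /home/joe/myprog.profile
    print(feedback_path("myexe"))                      # myexe.profile/feedback
```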

3.4.172 –xprofile_ircache[=path]

(SPARC) Save and reuse compilation data between collect and use profile phases.

Use with -xprofile=collect|use to improve compilation time during the use phase by reusing compilation data saved from the collect phase.

If specified, path will override the location where the cached files are saved. By default, these files will be saved in the same directory as the object file. Specifying a path is useful when the collect and use phases happen in two different places.

A typical sequence of commands might be:

demo% f95 -xO5 -xprofile=collect -xprofile_ircache t1.f95 t2.f95
demo% a.out     collects feedback data
demo% f95 -xO5 -xprofile=use -xprofile_ircache t1.f95 t2.f95

With large programs, compilation time in the use phase can improve significantly by saving the intermediate data in this manner. But this will be at the expense of disk space, which could increase considerably.

3.4.173 –xprofile_pathmap=collect_prefix:use_prefix

(SPARC) Set path mapping for profile data files.

Use the -xprofile_pathmap option with the -xprofile=use option.

Use -xprofile_pathmap when the compiler is unable to find profile data for an object file that is compiled with -xprofile=use, and:

  • You are compiling with -xprofile=use into a directory that is not the directory used when previously compiling with -xprofile=collect.

  • Your object files share a common basename in the profile but are distinguished from each other by their location in different directories.

The collect-prefix is the prefix of the UNIX pathname of a directory tree in which object files were compiled using -xprofile=collect.

The use-prefix is the prefix of the UNIX pathname of a directory tree in which object files are to be compiled using -xprofile=use.

If you specify multiple instances of -xprofile_pathmap, the compiler processes them in the order of their occurrence. Each use-prefix specified by an instance of -xprofile_pathmap is compared with the object file pathname until either a matching use-prefix is identified or the last specified use-prefix is found not to match the object file pathname.
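The first-match rule for multiple -xprofile_pathmap instances can be sketched as below. This is an assumption-laden illustration, not the compiler's implementation: it assumes that a prefix matches by simple string comparison and that a match is applied by replacing the use-prefix with the corresponding collect-prefix. The function name and the sample paths are hypothetical.

```python
def map_to_collect_path(object_path, pathmaps):
    """Map an object file path from the use tree back to the collect tree.

    pathmaps is a list of (collect_prefix, use_prefix) pairs in command-line
    order. The first pair whose use_prefix matches wins; None means no
    instance matched. Plain string-prefix matching is an assumption.
    """
    for collect_prefix, use_prefix in pathmaps:
        if object_path.startswith(use_prefix):
            return collect_prefix + object_path[len(use_prefix):]
    return None


if __name__ == "__main__":
    maps = [("/build/collect", "/build/use"), ("/src", "/work")]
    print(map_to_collect_path("/build/use/lib/a.o", maps))  # /build/collect/lib/a.o
    print(map_to_collect_path("/other/c.o", maps))          # None
```

Because the instances are tried in order, a more specific use-prefix should appear before a more general one that would also match.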

3.4.174 –xrecursive

Allow routines without the RECURSIVE attribute to call themselves recursively.

Normally, only subprograms defined with the RECURSIVE attribute can call themselves recursively.

Compiling with -xrecursive enables subprograms to call themselves, even if they are not defined with the RECURSIVE attribute. But, unlike subroutines defined RECURSIVE, use of this flag does not cause local variables to be allocated on the stack by default. For local variables to have separate values in each recursive invocation of the subprogram, compile also with -stackvar to put local variables on the stack.

Indirect recursion (routine A calls routine B which then calls routine A) can give inconsistent results at optimization levels greater than -xO2. Compiling with the -xrecursive flag guarantees correctness with indirect recursion, even at higher optimization levels.

Compiling with -xrecursive can cause performance degradations.

3.4.175 –xreduction

Synonym for -reduction.

3.4.176 –xregs=r

Specifies the usage of registers for the generated code.

r is a comma-separated list that consists of one or more of the following sub-options: appl, float, frameptr.

Prefixing a sub-option with no% disables that sub-option.

Note that -xregs sub-options are restricted to specific hardware platforms.

Example: -xregs=appl,no%float

Table 22  The -xregs Sub-options
Value
Meaning
appl
(SPARC) Allow the compiler to generate code using the application registers as scratch registers. The application registers are:
g2, g3, g4 (on 32–bit platforms)
g2, g3 (on 64–bit platforms)
It is strongly recommended that all system software and libraries be compiled using -xregs=no%appl. System software (including shared libraries) must preserve these registers’ values for the application. Their use is intended to be controlled by the compilation system and must be consistent throughout the application.
In the SPARC ABI, these registers are described as application registers. Using these registers can improve performance because fewer load and store instructions are needed. However, such use can conflict with some old library programs written in assembly code.
float
(SPARC) Allow the compiler to generate code by using the floating-point registers as scratch registers for integer values. Use of floating-point values may use these registers regardless of this option. If you want your code to be free of all references to floating point registers, you need to use -xregs=no%float and also make sure your code does not in any way use floating point types.
frameptr
(x86) Allow the compiler to use the frame-pointer register (%ebp on IA32, %rbp on AMD64) as a general-purpose register.
The default is -xregs=no%frameptr.
With -xregs=frameptr the compiler is free to use the frame-pointer register to improve program performance. However, some features of the debugger and performance measurement tools may be limited as a result. Stack tracing, debuggers, and performance analyzers cannot report on functions compiled with -xregs=frameptr.
Mixed C, Fortran, and C++ code should not be compiled with -xregs=frameptr if a C++ function, called directly or indirectly from a C or Fortran function, can throw an exception. If compiling such mixed source code with -fast, add -xregs=no%frameptr after the -fast option on the command line.
With more available registers on 64-bit platforms, compiling with -xregs=frameptr has a better chance of improving 32-bit code performance than 64-bit code.
The compiler ignores -xregs=frameptr and issues a warning if you also specify -pg. Also, -xkeepframe overrides -xregs=frameptr.

The SPARC default is -xregs=appl,float.

The x86 default is -xregs=no%frameptr. -xregs=frameptr is included in the expansion of -fast.

It is strongly recommended that you compile code intended for shared libraries that will link with applications, with -xregs=no%appl,float. At the very least, the shared library should explicitly document how it uses the application registers so that applications linking with those libraries are aware of these register assignments.

For example, an application using the registers in some global sense (such as using a register to point to some critical data structure) would need to know exactly how a library with code compiled without -xregs=no%appl is using the application registers in order to safely link with that library.

On x86 systems, -pg is incompatible with -xregs=frameptr, and these two options should not be used together. Note also that -xregs=frameptr is included in -fast.

3.4.177 -xs[={yes|no}]

(Oracle Solaris) Link debug information from object files into executable.

-xs is the same as -xs=yes.

The default is -xs=yes.

This option controls the trade-off of executable size versus the need to retain object files in order to debug. For dwarf, use -xs=no to keep the executable small but depend on the object files. This option has almost no effect on dbx performance or the runtime performance of the program.

When the compile command forces linking (that is, -c is not specified) there will be no object file(s) and the debug information must be placed in the executable. In this case, -xs=no (implicit or explicit) will be ignored.

The feature is implemented by having the compiler adjust the section flags and/or section names in the object file that it emits, which then tells the linker what to do for that object file's debug information. It is therefore a compiler option, not a linker option. It is possible to have an executable with some object files compiled -xs=yes and others compiled -xs=no.

Oracle Linux compilers accept but ignore -xs. They do not accept -xs={yes|no}.

3.4.178 –xsafe=mem

(SPARC) Allow the compiler to assume that no memory protection violations occur.

Using this option allows the compiler to assume no memory–based traps occur. It grants permission to use the speculative load instruction on the SPARC V9 platforms.

This option takes effect only when used with optimization level –xO5 and one of the following –xarch values: sparc, sparcvis, sparcvis2, or sparcvis3 for both –m32 and –m64.


Caution  -  Because non-faulting loads do not cause a trap when a fault such as address misalignment or segmentation violation occurs, you should use this option only for programs in which such faults cannot occur. Because few programs incur memory-based traps, you can safely use this option for most programs. Do not use this option with programs that explicitly depend on memory-based traps to handle exceptional conditions.


3.4.179 –xsecure_code_analysis{=[yes|no]}

Enable or disable compiler secure code analysis to find and display possible memory safety violations at compile time. Secure code analysis runs in parallel with the compilation process and may result in increased compile time.

If –xsecure_code_analysis is not specified or if it is specified without a yes|no argument, the default is –xsecure_code_analysis=yes.

Use –xsecure_code_analysis=no to disable secure code analysis.

3.4.180 -xsegment_align=n

(Oracle Solaris) This option causes the driver to include a special mapfile on the link line. The mapfile aligns the text, data, and bss segments to the value specified by n. When using very large pages, it is important that the heap and stack segments are aligned on an appropriate boundary. If these segments are not aligned, small pages will be used up to the next boundary, which could cause a performance degradation. The mapfile ensures that the segments are aligned on an appropriate boundary.

The n value must be one of the following:

SPARC: The following values are valid: 8K, 64K, 512K, 2M, 4M, 32M, 256M, 1G, and none.

x86: The following values are valid: 4K, 8K, 64K, 512K, 2M, 4M, 32M, 256M, 1G, and none.

The default for both SPARC and x86 is none.

Recommended usage is as follows:

SPARC 32-bit compilation: -xsegment_align=64K
SPARC 64-bit compilation: -xsegment_align=4M

x86 32-bit compilation: -xsegment_align=8K
x86 64-bit compilation: -xsegment_align=4M

The driver will include the appropriate mapfile. For example, if the user specifies -xsegment_align=4M, the driver adds -M install-directory/lib/compilers/mapfiles/map.4M.align to the link line, where install-directory is the installation directory. The aforementioned segments will then be aligned on a 4M boundary.
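The way the driver turns an alignment value into link arguments can be sketched as follows. Only the 4M case is documented above; extending the same map.N.align naming to the other values is an assumption, as are the function name and the sample installation directory.

```python
def segment_align_link_args(n, install_directory):
    """Return the extra link arguments for -xsegment_align=n.

    The mapfile naming pattern is generalized from the documented 4M example,
    which is an assumption. 'none' adds nothing to the link line.
    """
    if n == "none":
        return []
    mapfile = "%s/lib/compilers/mapfiles/map.%s.align" % (install_directory, n)
    return ["-M", mapfile]


if __name__ == "__main__":
    # /opt/studio is a placeholder installation directory for illustration.
    print(segment_align_link_args("4M", "/opt/studio"))
```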

3.4.181 –xspace

Do not perform optimizations that increase the code size.

Example: Do not unroll or parallelize loops if it increases code size.

3.4.182 –xtarget=t

Specifies the target system for the instruction set and optimization.

t must be one of: native, generic, or platform–name.

Each specific value for -xtarget expands into a specific set of values for the -xarch, -xchip, and -xcache options. Use the -xdryrun option to determine the expansion of -xtarget=native on a running system.

For example, –xtarget=T3 is equivalent to –xchip=T3 –xcache=8/16/4:6144/64/24 –xarch=sparcvis3.

 

The data type model, either 32-bit or 64-bit, is indicated by the -m32|-m64 option. To specify the 64-bit data type model, specify the –m64 option as follows:

-xtarget=<value> ... -m64

To specify the 32-bit data type model, use the –m32 option as follows:

-xtarget=<value> ... -m32

See also the –m32|–m64 option for a discussion of the default data type model.


Note -  The expansion of -xtarget for a specific host system might not expand to the same -xarch, -xchip, or -xcache settings as -xtarget=native when compiling on that system.
 
demo% f95 -dryrun -xtarget=T3 |& grep ###
###     command line files and options (expanded):
### -dryrun -xchip=T3 -xcache=8/16/4:6144/64/24 -xarch=sparcvis3
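The -xcache strings that appear in these expansions are colon-separated cache levels, each a slash-separated group of numbers. The sketch below splits such a string into its levels, assuming the field order given in the -xcache section of this guide (size in kilobytes, line size in bytes, associativity, and an optional count of hardware threads sharing the cache). The function name and dictionary keys are hypothetical.

```python
def parse_xcache(spec):
    """Split an -xcache value such as '8/16/4:6144/64/24' into cache levels.

    Field meanings are assumed from the -xcache option description:
    size (KB) / line size (bytes) / associativity [/ sharing threads].
    """
    levels = []
    for level, part in enumerate(spec.split(":"), start=1):
        fields = [int(field) for field in part.split("/")]
        if len(fields) not in (3, 4):
            raise ValueError("expected 3 or 4 fields in %r" % part)
        levels.append({
            "level": level,
            "size_kb": fields[0],
            "line_bytes": fields[1],
            "associativity": fields[2],
            "threads": fields[3] if len(fields) == 4 else None,
        })
    return levels


if __name__ == "__main__":
    for cache in parse_xcache("8/16/4:6144/64/24"):
        print(cache)
```

Read this way, the T3 expansion describes an 8 KB level-1 cache and a 6144 KB level-2 cache.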


Table 23  -xtarget Values for All Systems
Value          Meaning
native         Equivalent to -m32 -xarch=native -xchip=native -xcache=native to give best performance on the host 32-bit system.
generic        Equivalent to -m32 -xarch=generic -xchip=generic -xcache=generic to give best performance on most 32-bit systems.
system-name    Gets the best performance for the specified platform. Select a system name from the following lists that represents the actual system you are targeting.

The performance of some programs may benefit by providing the compiler with an accurate description of the target computer hardware. When program performance is critical, the proper specification of the target hardware could be very important, especially when running on the newer SPARC processors. However, for most programs and older SPARC processors, the performance gain is negligible and a generic specification is sufficient.

3.4.182.1 SPARC Platforms

The following table gives a list of the commonly used system platform names accepted by the compiler.

Table 24  Expansions of Commonly Used -xtarget System Platforms
-xtarget=platform-name  -xarch        -xchip          -xcache
sparc64vi (Obsolete)    sparcfmaf     sparc64vi       128/64/2:5120/64/10
sparc64vii (Obsolete)   sparcima      sparc64vii      64/64/2:5120/256/10
sparc64viiplus          sparcima      sparc64viiplus  64/64/2:11264/256/11
sparc64x                sparcace      sparc64x        64/128/4/2:24576/128/24/32
sparc64xplus            sparcaceplus  sparc64xplus    64/128/4/2:24576/128/24/32
sparc64xii              sparcace2     sparc64xii      64/128/4/8:512/128/8/96:32768/128/16/96
ultraT1 (Obsolete)      sparc         ultraT1         8/16/4/4:3072/64/12/32
ultraT2 (Obsolete)      sparcvis2     ultraT2         8/16/4:4096/64/16
ultraT2plus (Obsolete)  sparcvis2     ultraT2plus     8/16/4:4096/64/16
ultraT3                 sparcvis3     ultraT3         8/16/4:6144/64/24
T3 (Obsolete)           sparcvis3     T3              8/16/4:6144/64/24
T4                      sparc4        T4              16/32/4:128/32/8:4096/64/16
T5                      sparc4        T5              16/32/4/8:128/32/8/8:8192/64/16/128
T7                      sparc5        T7              16/32/4/8:256/64/8/16:8192/64/8/32
M5                      sparc4        M5              16/32/4/8:128/32/8/8:49152/64/12/48
M6                      sparc4        M6              16/32/4/8:128/32/8/8:49152/64/12/96
M7                      sparc5        M7              16/32/4/8:256/64/8/16:8192/64/8/32
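
The -xcache strings in these expansions can be hard to read at a glance. The following Python sketch is not part of the compiler; it splits such a string into its cache levels. It assumes the s/l/a[/t] field layout described for the -xcache option (size in kilobytes, line size in bytes, associativity, and an optional count of hardware threads sharing the cache). The function name parse_xcache and the dictionary keys are hypothetical names chosen for illustration.

```python
def parse_xcache(spec):
    """Split an -xcache value such as '8/16/4:6144/64/24' into per-level fields.

    Assumed layout per level: size_kb/line_bytes/associativity[/threads].
    """
    levels = []
    # Levels are separated by ':'; fields within a level by '/'.
    for number, level in enumerate(spec.split(":"), start=1):
        fields = [int(field) for field in level.split("/")]
        if len(fields) not in (3, 4):
            raise ValueError(
                f"cache level {number}: expected s/l/a[/t], got {level!r}"
            )
        levels.append({
            "level": number,
            "size_kb": fields[0],
            "line_bytes": fields[1],
            "associativity": fields[2],
            # The fourth field is optional in the table entries.
            "threads": fields[3] if len(fields) == 4 else None,
        })
    return levels


if __name__ == "__main__":
    # The T3 expansion from the table above.
    for entry in parse_xcache("8/16/4:6144/64/24"):
        print(entry)
```

Symbolic values such as generic or native are not numeric, so this sketch rejects them with a ValueError rather than attempting to interpret them.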

3.4.182.2 x86 Platforms

The valid -xtarget platform names for x86 systems and their expansions are shown in the following table.

Table 25  -xtarget Values on x86 Platforms
-xtarget=               -xarch    -xchip       -xcache
pentium (Obsolete)      generic   generic      generic
pentium_pro (Obsolete)  generic   generic      generic
pentium3 (Obsolete)     generic   generic      generic
pentium4                sse2      pentium4     8/64/4:256/128/8
opteron                 sse2a     opteron      64/64/2:1024/64/16
woodcrest               ssse3     core2        32/64/8:4096/64/16
barcelona (Obsolete)    amdsse4a  amdfam10     64/64/2:512/64/16
penryn                  sse4_1    penryn       2/64/8:6144/64/24
nehalem                 sse4_2    nehalem      32/64/8:256/64/8:8192/64/16
westmere                aes       westmere     32/64/8:256/64/8:30720/64/24
sandybridge             avx       sandybridge  32/64/8/2:256/64/8/2:20480/64/20/16
ivybridge               avx_i     ivybridge    32/64/8/2:256/64/8/2:20480/64/20/16
haswell                 avx2      haswell      32/64/8/2:256/64/8/2:20480/64/20/16
broadwell               avx2_i    broadwell    32/64/8/2:256/64/8/2:20480/64/20/16
skylake                 avx512    skylake      32/64/8:256/64/8:20480/64/20

Compiling for the 64-bit Oracle Solaris OS on a 64-bit enabled x86 platform is indicated by the -m64 flag. Specifying -xtarget=opteron, for example, is neither necessary nor sufficient to get a 64-bit compilation. If -xtarget is specified, the -m64 option must appear after the -xtarget flag, as in:

-xtarget=opteron -m64

otherwise the compilation will be 32-bit x86.
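
The ordering rule can be pictured with a small Python toy model. This is only an illustration of the rule as stated here, not the compiler driver's actual logic: it assumes that options are read left to right, that an -xtarget value resets the data model to 32-bit, and that the starting model is 32-bit. The function name data_model is hypothetical.

```python
def data_model(options):
    """Return 32 or 64 for a list of option strings (toy model only)."""
    model = 32  # assumed starting point for this illustration
    for option in options:
        if option == "-m64":
            model = 64
        elif option == "-m32" or option.startswith("-xtarget="):
            # Assumption: an -xtarget expansion carries a 32-bit setting,
            # so a later -m64 is needed to override it.
            model = 32
    return model


if __name__ == "__main__":
    print(data_model(["-xtarget=opteron", "-m64"]))  # -m64 after -xtarget
    print(data_model(["-m64", "-xtarget=opteron"]))  # -m64 before -xtarget
```

In this model, only the first ordering yields a 64-bit result, which matches the requirement that -m64 follow -xtarget.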

3.4.183 -xtemp=path

Equivalent to -temppath.

3.4.184 -xthroughput[={yes|no}]

The -xthroughput option tells the compiler that the application will be run in situations where many processes are simultaneously running on the system.

If -xthroughput=yes, the compiler will favor optimizations that slightly reduce performance for a single process while improving the amount of work achieved by all the processes on the system. As an example, the compiler might choose to be less aggressive in prefetching data. Such a choice would reduce the memory bandwidth consumed by the process, and as such the process may run slower, but it would also leave more memory bandwidth to be shared among other processes.

The default is -xthroughput=no.

3.4.185 -xtime

Synonym for -time.

3.4.186 -xtypemap=spec

Specify default data mappings.

This option provides a flexible way to specify the byte sizes for default data types. This option applies to both default-size variables and constants.

The specification string spec may contain any or all of the following in a comma-delimited list:

real:size,double:size,integer:size

The allowable combinations on each platform are:

  • real:32

  • real:64

  • double:64

  • double:128

  • integer:16

  • integer:32

  • integer:64

For example:

  • -xtypemap=real:64,double:64,integer:64

maps default REAL, DOUBLE, and INTEGER all to 8 bytes.

This option applies to all variables declared with default specifications (without explicit byte sizes). With the mapping in the example above, a declaration such as REAL XYZ results in a 64-bit XYZ, and all single-precision REAL constants are promoted to REAL*8.

Note that INTEGER and LOGICAL are treated the same, and COMPLEX is mapped as two REALs. Also, DOUBLE COMPLEX will be treated the way DOUBLE is mapped.
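
As a rough check on these rules, the following Python sketch computes the byte sizes that a given spec would imply for the default types. It is not a compiler interface. It assumes the usual defaults of real:32, double:64, integer:32 when a type is not named in the spec, and it reads "treated the way DOUBLE is mapped" as two DOUBLE-sized parts. The function name typemap_bytes is hypothetical.

```python
# Assumed defaults when a type is not named in the spec.
DEFAULT_BITS = {"real": 32, "double": 64, "integer": 32}
# Sizes listed as allowable in this section.
ALLOWED_BITS = {"real": {32, 64}, "double": {64, 128}, "integer": {16, 32, 64}}


def typemap_bytes(spec):
    """Return the byte size implied for each default type by an -xtypemap spec."""
    bits = dict(DEFAULT_BITS)
    for item in spec.split(","):
        name, _, size = item.partition(":")
        if (name not in ALLOWED_BITS or not size.isdigit()
                or int(size) not in ALLOWED_BITS[name]):
            raise ValueError(f"unsupported mapping: {item!r}")
        bits[name] = int(size)
    real, double, integer = (bits[k] // 8 for k in ("real", "double", "integer"))
    return {
        "REAL": real,
        "DOUBLE PRECISION": double,
        "INTEGER": integer,
        "LOGICAL": integer,            # treated the same as INTEGER
        "COMPLEX": 2 * real,           # mapped as two REALs
        "DOUBLE COMPLEX": 2 * double,  # interpretation: two DOUBLEs
    }


if __name__ == "__main__":
    print(typemap_bytes("real:64,double:64,integer:64"))
```

For the example spec real:64,double:64,integer:64, the sketch reports 8 bytes for REAL, DOUBLE PRECISION, INTEGER, and LOGICAL, and 16 bytes for both complex types.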

3.4.187 -xunboundsym={yes|no}

Specify whether the program contains references to dynamically bound symbols.

-xunboundsym=yes means the program contains references to dynamically bound symbols.

-xunboundsym=no means the program does not contain references to dynamically bound symbols.

The default is -xunboundsym=no.

3.4.188 -xunroll=n

Synonym for -unroll=n.

3.4.189 -xvector[=a]

Enables automatic generation of calls to the vector library functions or the generation of SIMD (Single Instruction Multiple Data) instructions on processors that support SIMD. You must use the default rounding mode by specifying -fround=nearest when you use this option.

The -xvector option requires optimization level -xO3 or greater. The option is silently ignored if the optimization level is lower than -xO3.

The possible values for a are listed in the following table. The no% prefix disables the associated sub-option.

Table 26  -xvector Sub-options
Value
Meaning
[no%]lib
(Oracle Solaris) Enables the compiler to transform math library calls within loops into single calls to the equivalent vector math routines when such transformations are possible. This could result in a performance improvement for loops with large loop counts. Use no%lib to disable this option.
[no%]simd
(SPARC) For -xarch=sparcace, -xarch=sparcaceplus, and -xarch=sparcace2, directs the compiler to use floating point and integral SIMD instructions to improve the performance of certain loops. Unlike the other SPARC -xarch values, under -xarch=sparcace, -xarch=sparcaceplus, and -xarch=sparcace2, -xvector=simd is in effect unless -xvector=%none or -xvector=no%simd has been specified. In addition, -xO4 or greater is required for -xvector=simd; otherwise -xvector=simd is ignored.
For all other -xarch values, directs the compiler to use the Visual Instruction Set (VIS1, VIS2, VIS3, etc.) SIMD instructions to improve the performance of certain loops. With an explicit -xvector=simd option, the compiler performs loop transformations that enable the generation of special vectorized SIMD instructions to reduce the number of loop iterations. In addition to the optimization level requirement noted above, the -xvector=simd option is effective only with -xarch=sparcvis3 and above.
[no%]simd
(x86) Directs the compiler to use the native x86 SSE SIMD instructions to improve performance of certain loops. Streaming extensions are used on x86 by default at optimization level 3 and above where beneficial. Use no%simd to disable this option.
The compiler will use SIMD only if streaming extensions exist in the target architecture; that is, if the target ISA is at least SSE2. For example, you can specify -xtarget=woodcrest, -xarch=generic, -xarch=sse2, -xarch=sse3, or -fast on a modern platform to use it. If the target ISA has no streaming extensions, the sub-option will have no effect.
%none
Disable this option completely.
yes
This option is deprecated; specify -xvector=lib instead.
no
This option is deprecated; specify -xvector=%none instead.

On x86, the default is -xvector=simd. On SPARC, the default is -xvector=simd under -xarch=sparcace, -xarch=sparcaceplus, and -xarch=sparcace2, and -xvector=%none on other SPARC -xarch values. If you specify -xvector without a sub-option, the compiler assumes -xvector=simd,lib on Oracle Solaris x86, -xvector=lib on Oracle Solaris SPARC, and -xvector=simd on Oracle Linux platforms.

The compiler includes the libmvec libraries in the load step.

If you compile and link with separate commands, be sure to use -xvector in the linking f95 command as well.

3.4.190 -ztext

Generate only pure libraries with no relocations.

The general purpose of -ztext is to verify that a generated library is pure text; that is, that its instructions are all position-independent code. Therefore, it is generally used with both -G and -pic.

With -ztext, if ld finds an incomplete relocation in the text segment, then it does not build the library. If it finds one in the data segment, then it generally builds the library anyway; the data segment is writable.

Without -ztext, ld builds the library, relocations or not.

A typical use is to make a library from both source files and object files, where you do not know if the object files were made with -pic.

Example: Make library from both source and object files:

demo% f95 -G -pic -ztext -o MyLib -hMyLib a.f b.f x.o y.o

An alternate use is to ask if the code is position-independent already: compile without -pic, but ask if it is pure text.

Example: Ask if it is pure text already, even without -pic:

demo% f95 -G -ztext -o MyLib -hMyLib a.f b.f x.o y.o

The options -ztext and -xprofile=collect should not be used together. While -ztext prohibits position-dependent symbol relocations in read-only storage, -xprofile=collect generates statically initialized, position-dependent symbol relocations in writable storage.

If you compile with -ztext and ld does not build the library, then you can recompile without -ztext, and ld will build the library. The failure to build with -ztext means that one or more components of the library cannot be shared; however, some of the other components might still be shareable. This raises questions of performance that are best left to you, the programmer.