Sun Studio 12: Fortran User's Guide

3.4.163 -xprefetch[=a[,a]]

Enable prefetch instructions on those architectures that support prefetch.

See 2.3.1.8 The PREFETCH Directives for a description of the Fortran PREFETCH directives.

a must be one of the following:

auto

Enable automatic generation of prefetch instructions

no%auto

Disable automatic generation of prefetch instructions

explicit

Enable explicit prefetch macros (SPARC only)

no%explicit

Disable explicit prefetch macros (SPARC only)

latx:factor

(SPARC) Adjust the compiler’s assumed prefetch-to-load and prefetch-to-store latencies by the specified factor. The factor must be a positive floating-point or integer number.

If you are running computationally intensive codes on large SPARC multiprocessors, you might find it advantageous to use -xprefetch=latx:factor. This option instructs the code generator to adjust the default latency time between a prefetch and its associated load or store by the specified factor.

The prefetch latency is the hardware delay between the execution of a prefetch instruction and the time the data being prefetched is available in the cache. The compiler assumes a prefetch latency value when determining how far apart to place a prefetch instruction and the load or store instruction that uses the prefetched data.


Note –

The assumed latency between a prefetch and a load may not be the same as the assumed latency between a prefetch and a store.


The compiler tunes the prefetch mechanism for good performance across a wide range of machines and applications, so the default tuning may not be optimal for a particular program. For memory-intensive applications, especially applications intended to run on large multiprocessors, you may be able to obtain better performance by increasing the prefetch latency values. To increase the values, use a factor that is greater than 1. A factor between 0.5 and 2.0 will most likely provide the maximum performance.

For applications with datasets that reside entirely within the external cache, you may be able to obtain better performance by decreasing the prefetch latency values. To decrease the values, use a factor that is less than 1.

To use the -xprefetch=latx:factor option, start with a factor value near 1.0 and run performance tests against the application. Then increase or decrease the factor, as appropriate, and run the performance tests again. Continue adjusting the factor and rerunning the tests until performance stops improving. When you change the factor in small steps, you will typically see no performance difference for a few steps, then a sudden difference, after which performance levels off again.
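The tuning procedure above can be sketched as a small shell script. This is only an illustration under stated assumptions: the driver name f95, the source file app.f90, the -xO3 optimization level, and the helper names latx_flag, run, and sweep are all hypothetical choices, not part of the compiler. By default the script only prints the commands it would run (DRY_RUN=1).

```sh
#!/bin/sh
# Sketch of a latx sweep; every name below is an assumption for illustration.
FC=${FC:-f95}                # compiler driver (assumed)
SRC=${SRC:-app.f90}          # source file (hypothetical)
OPT_FLAGS=${OPT_FLAGS:--xO3} # other flags you normally build with (assumed)
DRY_RUN=${DRY_RUN:-1}        # 1 = print commands only, 0 = execute them

# Print the -xprefetch value for one latency factor.
# latx is combined with auto because latx is valid only when auto is enabled.
latx_flag() {
    printf '%s\n' "-xprefetch=auto,latx:$1"
}

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Build one binary per factor and run it; stop at the first failure.
sweep() {
    for factor in "$@"; do
        # OPT_FLAGS is left unquoted on purpose so it can hold several flags.
        run "$FC" $OPT_FLAGS "$(latx_flag "$factor")" \
            -o "app_latx_$factor" "$SRC" || return 1
        run "./app_latx_$factor" || return 1
    done
}

# Sweep the factors given on the command line, if any.
if [ "$#" -gt 0 ]; then
    sweep "$@"
fi
```

Saved under a hypothetical name such as sweep.sh, the script could be invoked as "sh sweep.sh 0.5 0.75 1.0 1.5 2.0" to list one compile command and one run command per factor. With DRY_RUN=0 the commands are executed; timing each run is left to whatever measurement harness you already use.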

yes

-xprefetch=yes is the same as -xprefetch=auto,explicit

no

-xprefetch=no is the same as -xprefetch=no%auto,no%explicit

With -xprefetch, -xprefetch=auto, and -xprefetch=yes, the compiler is free to insert prefetch instructions into the code it generates. This may result in a performance improvement on architectures that support prefetch.

3.4.163.1 Defaults:

If -xprefetch is not specified, -xprefetch=no%auto,explicit is assumed.

If only -xprefetch is specified, -xprefetch=auto,explicit is assumed.

The default of no%auto is assumed unless explicitly overridden with the use of -xprefetch without any arguments or with an argument of auto or yes. For example, -xprefetch=explicit is the same as -xprefetch=explicit,no%auto.

The default of explicit is assumed unless explicitly overridden with an argument of no%explicit or an argument of no. For example, -xprefetch=auto is the same as -xprefetch=auto,explicit.

If automatic prefetching is enabled, such as with -xprefetch or -xprefetch=yes, but a latency factor is not specified, then -xprefetch=latx:1.0 is assumed.
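The default rules above can be summarized by a small shell function that reports the effective settings for a given option string. This is a sketch of the rules as this section states them, not the compiler's own logic. The function name resolve_xprefetch and its output format are hypothetical, and two behaviors are assumptions: later sub-options override earlier ones, and a latency factor is reported as n/a whenever auto is off (how the compiler itself treats that case is not modelled).

```sh
#!/bin/sh
# resolve_xprefetch OPTION
# OPTION is "" (flag absent), "-xprefetch", or "-xprefetch=a[,a]...".
# The body runs in a subshell so IFS and variables do not leak out.
resolve_xprefetch() (
    opt=$1
    auto=no        # default when the flag is absent: no%auto
    explicit=yes   # default when the flag is absent: explicit
    latx=

    case $opt in
        '')
            ;;                      # flag absent: keep no%auto,explicit
        -xprefetch)
            auto=yes ;;             # bare flag: auto,explicit
        -xprefetch=*)
            IFS=,                   # split the value list on commas
            for a in ${opt#-xprefetch=}; do
                case $a in
                    auto)        auto=yes ;;
                    no%auto)     auto=no ;;
                    explicit)    explicit=yes ;;
                    no%explicit) explicit=no ;;
                    yes)         auto=yes; explicit=yes ;;
                    no)          auto=no;  explicit=no ;;
                    latx:*)      latx=${a#latx:} ;;
                    *)  echo "unknown sub-option: $a" >&2
                        exit 1 ;;
                esac
            done
            ;;
        *)
            echo "not an -xprefetch option: $opt" >&2
            exit 1 ;;
    esac

    # A latency factor only applies with auto; it defaults to 1.0.
    if [ "$auto" = yes ]; then
        latx=${latx:-1.0}
    else
        latx=n/a
    fi
    echo "auto=$auto explicit=$explicit latx=$latx"
)
```

For instance, resolve_xprefetch -xprefetch=explicit and resolve_xprefetch -xprefetch=explicit,no%auto print the same line, which mirrors the equivalence stated above.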

3.4.163.2 Interactions:

With -xprefetch=explicit, the compiler will recognize the directives:

$PRAGMA SUN_PREFETCH_READ_ONCE (name)

$PRAGMA SUN_PREFETCH_READ_MANY (name)

$PRAGMA SUN_PREFETCH_WRITE_ONCE (name)

$PRAGMA SUN_PREFETCH_WRITE_MANY (name)

The -xchip setting affects the determination of the assumed latencies and therefore the result of a latx:factor setting.

The latx:factor suboption is valid only when automatic prefetching (auto) is enabled on SPARC processors.

3.4.163.3 Warnings:

Explicit prefetching should only be used under special circumstances that are supported by measurements.

Because the compiler tunes the prefetch mechanism for good performance across a wide range of machines and applications, you should use -xprefetch=latx:factor only when performance tests show a clear benefit. The assumed prefetch latencies may change from release to release, so retest the effect of the latency factor on performance whenever you switch to a different release.