Oracle Solaris Studio 12.2: C User's Guide

3.9 Memory Barrier Intrinsics

The compiler provides the header file mbarrier.h, which defines various memory barrier intrinsics for SPARC and x86 processors. These intrinsics may be of use for developers writing multithreaded code using their own synchronization primitives. Users are advised to refer to the documentation of their processors to determine when and whether these intrinsics are necessary for their particular situation.

Memory ordering intrinsics supported by mbarrier.h:

All the barrier intrinsics except __compiler_barrier() generate memory ordering instructions. On x86 these are mfence, sfence, or lfence instructions; on SPARC platforms they are membar instructions.
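For readers who want to experiment without mbarrier.h, the role of a hardware memory barrier can be sketched with the standard C11 function atomic_thread_fence() from <stdatomic.h>. This is a portable analogue, not the Solaris Studio intrinsics: the compiler chooses whatever ordering instruction, if any, the target processor needs. The names mailbox, mailbox_full, mailbox_put(), and mailbox_get() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-shot mailbox shared by one producer and one consumer. */
static int mailbox;
static atomic_bool mailbox_full;

/* Producer side: write the data, then publish the flag.
   The release fence keeps the data store ordered before the flag store. */
void mailbox_put(int value)
{
    /* Single-shot: the mailbox must still be empty. */
    assert(!atomic_load_explicit(&mailbox_full, memory_order_relaxed));
    mailbox = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&mailbox_full, true, memory_order_relaxed);
}

/* Consumer side: check the flag, then read the data.
   The acquire fence keeps the data load ordered after the flag load.
   Returns false if nothing has been published yet. */
bool mailbox_get(int *out)
{
    if (!atomic_load_explicit(&mailbox_full, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = mailbox;
    return true;
}
```

A producer thread would call mailbox_put() once, and a consumer thread would poll mailbox_get() until it returns true. Whether a given fence becomes an actual instruction depends on the processor's memory model, which is why the processor documentation remains the authority on when a barrier is required.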

The __compiler_barrier() intrinsic generates no instructions and instead informs the compiler that all previous memory operations must be completed before any future memory operations are initiated. The practical result of this is that all non-local variables and local variables with the static storage class specifier will be stored back to memory before the barrier, and reloaded after the barrier, and the compiler will not mix memory operations from before the barrier with those after. All other barriers implicitly include the behaviour of the __compiler_barrier() intrinsic.
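The compiler-only behaviour described above can be approximated in standard C11 with atomic_signal_fence() from <stdatomic.h>, which likewise emits no machine instruction and, in practice, stops the compiler from moving memory operations across it. The sketch below assumes that analogy and is not the implementation of __compiler_barrier(); the names payload, ready, publish_payload(), and read_payload() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical non-local state, for illustration only. */
static int payload;
static int ready;

/* Store the payload, then the flag, with a compiler-only fence between.
   The fence generates no instruction; it only keeps the compiler from
   reordering the two stores relative to each other. */
void publish_payload(int value)
{
    assert(value != 0);   /* 0 is reserved to mean "nothing published" */
    payload = value;
    atomic_signal_fence(memory_order_seq_cst);
    ready = 1;
}

/* Return the payload if the flag is set, otherwise 0.
   The fence keeps the compiler from reading payload before ready. */
int read_payload(void)
{
    if (!ready)
        return 0;
    atomic_signal_fence(memory_order_seq_cst);
    return payload;
}
```

Because no instruction is generated, a fence of this kind constrains only the compiler. Code that shares these variables between threads on a weakly ordered processor would still need a hardware barrier, and the shared variables themselves would need to be atomic.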

For example, in the following code the presence of the __compiler_barrier() intrinsic stops the compiler from merging the two loops:

#include "mbarrier.h"

int thread_start[16];

void start_work()
{
   /* Start all threads */
   for (int i=0; i<8; i++)
   {
      thread_start[i]=1;
   }
   /* Keep the compiler from merging the loop above with the loop below */
   __compiler_barrier();
   /* Wait for all threads to complete */
   for (int i=0; i<8; i++)
   {
      while (thread_start[i]==1){}
   }
}