Oracle Solaris Studio 12.3: Thread Analyzer User's Guide
Some multithreaded applications intentionally allow data races in order to get better performance. A benign data race is an intentional data race whose existence does not affect the correctness of the program. The following examples demonstrate benign data races.
Note - In addition to benign data races, a large class of applications allow data races because they rely on lock-free and wait-free algorithms, which are difficult to design correctly. The Thread Analyzer can help determine the locations of data races in these applications.
The threads in prime_omp.c check whether an integer is a prime number by executing the function is_prime().
    16  int is_prime(int v)
    17  {
    18      int i;
    19      int bound = floor(sqrt(v)) + 1;
    20
    21      for (i = 2; i < bound; i++) {
    22          /* no need to check against known composites */
    23          if (!pflag[i])
    24              continue;
    25          if (v % i == 0) {
    26              pflag[v] = 0;
    27              return 0;
    28          }
    29      }
    30      return (v > 1);
    31  }
The Thread Analyzer reports a data race between the write to pflag[ ] on line 26 and the read of pflag[ ] on line 23. However, this data race is benign because it does not affect the correctness of the final result. At line 23, a thread checks whether pflag[i], for a given value of i, is equal to zero. If pflag[i] is equal to zero, i is a known composite number (in other words, i is known to be non-prime). Consequently, there is no need to check whether v is divisible by i; it is enough to check whether v is divisible by some prime number. Therefore, if pflag[i] is equal to zero, the thread continues to the next value of i. If pflag[i] is not equal to zero and v is divisible by i, the thread assigns zero to pflag[v] to indicate that v is not a prime number.
It does not matter, from a correctness point of view, if multiple threads check the same pflag[ ] element and write to it concurrently. The initial value of a pflag[ ] element is one. When the threads update that element, they assign it the value zero. That is, the threads store zero in the same bit in the same byte of memory for that element. On current architectures, it is safe to assume that those stores are atomic. This means that, when that element is read by a thread, the value read is either one or zero. If a thread checks a given pflag[ ] element (line 23) before it has been assigned the value zero, it then executes lines 25–28. If, in the meantime, another thread assigns zero to that same pflag[ ] element (line 26), the final result is not changed. Essentially, this means that the first thread executed lines 25–28 unnecessarily, but the final result is the same.
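The atomicity assumption described above can also be stated in the code itself. The following is a minimal sketch, not the tutorial's source: it assumes a compiler that provides C11 `<stdatomic.h>`, and the names `LIMIT` and `init_pflag()` are hypothetical. It also replaces the `floor(sqrt(v)) + 1` bound with the equivalent test `i * i <= v` so that the sketch does not need the math library. Whether the Thread Analyzer still reports these accesses is not something this sketch claims.

```c
#include <assert.h>
#include <stdatomic.h>

#define LIMIT 100   /* hypothetical upper bound for this sketch */

/* pflag[v] == 1 means "v is not yet known to be composite". */
static atomic_int pflag[LIMIT];

/* Hypothetical helper: set every flag to its initial value, one. */
void init_pflag(void)
{
    for (int i = 0; i < LIMIT; i++)
        atomic_store_explicit(&pflag[i], 1, memory_order_relaxed);
}

int is_prime(int v)
{
    assert(v >= 0 && v < LIMIT);

    /* i * i <= v is equivalent to i < floor(sqrt(v)) + 1 for these values. */
    for (int i = 2; i * i <= v; i++) {
        /* Relaxed load: the value read is either one or zero. */
        if (!atomic_load_explicit(&pflag[i], memory_order_relaxed))
            continue;   /* i is a known composite, skip it */
        if (v % i == 0) {
            /* Relaxed store: every writer stores the same value, zero. */
            atomic_store_explicit(&pflag[v], 0, memory_order_relaxed);
            return 0;
        }
    }
    return (v > 1);
}
```

Relaxed ordering is enough here because the result does not depend on when another thread's store becomes visible: a stale read of one only causes a redundant divisibility check.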
A group of threads call check_bad_array() concurrently to check whether any element of array data_array is “bad”. Each thread checks a different section of the array. If a thread finds that an element is bad, it sets the value of a global shared variable is_bad to true.
     20  volatile int is_bad = 0;
     ...
    100  /*
    101   * Each thread checks its assigned portion of data_array, and sets
    102   * the global flag is_bad to 1 once it finds a bad data element.
    103   */
    104  void check_bad_array(volatile data_t *data_array, unsigned int thread_id)
    105  {
    106      int i;
    107      for (i=my_start(thread_id); i<my_end(thread_id); i++) {
    108          if (is_bad)
    109              return;
    110          else {
    111              if (is_bad_element(data_array[i])) {
    112                  is_bad = 1;
    113                  return;
    114              }
    115          }
    116      }
    117  }
There is a data race between the read of is_bad on line 108 and the write to is_bad on line 112. However, the data race does not affect the correctness of the final result.
The initial value of is_bad is zero. When the threads update is_bad, they assign it the value one. That is, the threads store one in the same bit in the same byte of memory for is_bad. On current architectures, it is safe to assume that those stores are atomic. Therefore, when is_bad is read by a thread, the value read will either be zero or one. If a thread checks is_bad (line 108) before it has been assigned the value one, then it continues executing the for loop. If, in the meantime, another thread has assigned the value one to is_bad (line 112), that does not change the final result. It just means that the thread executed the for loop longer than necessary.
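The same early-exit flag can be written so that the atomicity of the loads and stores is guaranteed by the language rather than assumed of the hardware. The sketch below is illustrative only: it assumes C11 `<stdatomic.h>`, uses plain `int` data in place of `data_t`, takes an explicit index range in place of `my_start()` and `my_end()`, and invents a trivial `is_bad_element()` that treats negative values as bad. The names `check_bad_range()` and `bad_found()` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Shared flag: zero until some thread finds a bad element, then one. */
static atomic_int is_bad;

/* Hypothetical test for this sketch: negative values count as bad. */
static int is_bad_element(int x)
{
    return x < 0;
}

/* Scan data[start, end); stop early once any caller has set is_bad. */
void check_bad_range(const int *data, size_t start, size_t end)
{
    assert(data != NULL && start <= end);

    for (size_t i = start; i < end; i++) {
        /* Relaxed load: reads zero or one, never a partial value. */
        if (atomic_load_explicit(&is_bad, memory_order_relaxed))
            return;
        if (is_bad_element(data[i])) {
            /* Every writer stores the same value, one. */
            atomic_store_explicit(&is_bad, 1, memory_order_relaxed);
            return;
        }
    }
}

/* Read the flag, for example after all worker threads have been joined. */
int bad_found(void)
{
    return atomic_load_explicit(&is_bad, memory_order_relaxed);
}
```

In a threaded program, each thread would call `check_bad_range()` on its own section of the array. A thread that reads a stale zero only scans a little longer than necessary, which matches the reasoning above.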
A singleton ensures that only one object of a certain type exists throughout the program. Double-checked locking is a common, efficient way to initialize a singleton in multithreaded applications. The following code illustrates such an implementation.
    100  class Singleton {
    101      public:
    102      static Singleton* instance();
    103      ...
    104      private:
    105      static Singleton* ptr_instance;
    106  };
    107
    108  Singleton* Singleton::ptr_instance = 0;
    ...
    200  Singleton* Singleton::instance() {
    201      Singleton *tmp;
    202      if (ptr_instance == 0) {
    203          Lock();
    204          if (ptr_instance == 0) {
    205              tmp = new Singleton;
    206
    207              /* Make sure that all writes used to construct new
    208                 Singleton have been completed. */
    209              memory_barrier();
    210
    211              /* Update ptr_instance to point to new Singleton. */
    212              ptr_instance = tmp;
    213
    214          }
    215          Unlock();
    216      }
    217      return ptr_instance;
    218  }
The read of ptr_instance on line 202 is intentionally not protected by a lock. Once the Singleton has been instantiated, every later call can take this unlocked fast path, which makes the check cheaper in a multithreaded environment. Notice that there is a data race on variable ptr_instance between the read on line 202 and the write on line 212, but the program works correctly. However, writing a correct program that allows data races requires extra care. For example, in the above double-checked-locking code, the call to memory_barrier() at line 209 ensures that no thread sees ptr_instance as non-null until all writes that construct the Singleton have been completed.
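For comparison, the ordering that memory_barrier() provides can be expressed with explicit acquire and release operations. The following C sketch is an assumption-laden illustration, not the guide's implementation: it assumes C11 `<stdatomic.h>` and POSIX mutexes, and the names `singleton_t`, `singleton_instance()`, `instance_lock`, and the `value` payload are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical singleton type with a small payload. */
typedef struct {
    int value;
} singleton_t;

/* Static storage, so the pointer starts out as NULL. */
static _Atomic(singleton_t *) ptr_instance;
static pthread_mutex_t instance_lock = PTHREAD_MUTEX_INITIALIZER;

singleton_t *singleton_instance(void)
{
    /* Fast path: the acquire load pairs with the release store below,
       so a non-null pointer implies a fully initialized object. */
    singleton_t *p = atomic_load_explicit(&ptr_instance, memory_order_acquire);

    if (p == NULL) {
        pthread_mutex_lock(&instance_lock);

        /* Second check, under the lock: another thread may have
           created the instance while this one was waiting. */
        p = atomic_load_explicit(&ptr_instance, memory_order_relaxed);
        if (p == NULL) {
            p = malloc(sizeof *p);
            if (p != NULL) {
                p->value = 42;   /* construct the object first */
                /* The release store publishes the pointer only after
                   the writes that initialize *p. */
                atomic_store_explicit(&ptr_instance, p, memory_order_release);
            }
        }
        pthread_mutex_unlock(&instance_lock);
    }
    return p;   /* NULL only if allocation failed */
}
```

Here the release store takes the role of the barrier followed by the assignment on line 212, and the acquire load takes the role of the unlocked read on line 202.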