Oracle® Solaris Studio 12.4: Thread Analyzer User's Guide

Updated: December 2014
 
 

Fix the Bug, Not the Data Race

Thread Analyzer can help find data races in a program, but it cannot automatically find the bugs behind them or suggest ways to fix the data races it reports. A data race might have been introduced by a bug, and in that case it is important to find and fix the bug. Merely making the data race disappear is not the right approach and could make further debugging even more difficult.

Fixing Bugs in prime_omp.c

This section describes how to fix the bug in prime_omp.c. See Source Code for prime_omp.c for a complete file listing.

Move lines 50 and 51 into a single critical section to remove the data race on elements of the array primes[ ].

47      #pragma omp parallel for
48      for (i = 2; i < N; i++) {
49          if ( is_prime(i) ) {
                #pragma omp critical
                {
50                  primes[total] = i;
51                  total++;
                }
52          }
53      }
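For readers who want to try this fix outside the full listing, the following is a minimal self-contained sketch of the single critical section. The function name collect_primes and the simple trial-division is_prime are assumptions made for illustration and are not taken from prime_omp.c. If the file is built without the compiler's OpenMP option, the pragmas are ignored and the loop simply runs serially.

```c
#define N 10000

static int primes[N];
static int total = 0;

/* Simple trial division; a stand-in for the is_prime() in prime_omp.c. */
static int is_prime(int v)
{
    int d;

    if (v < 2)
        return 0;
    for (d = 2; d * d <= v; d++) {
        if (v % d == 0)
            return 0;
    }
    return 1;
}

/* Fills primes[] and returns how many primes were found in [2, N). */
int collect_primes(void)
{
    int i;

    total = 0;
    #pragma omp parallel for
    for (i = 2; i < N; i++) {
        if (is_prime(i)) {
            /* One critical section covers both the store and the increment,
               so no other thread can reuse the same value of total. */
            #pragma omp critical
            {
                primes[total] = i;
                total++;
            }
        }
    }
    return total;
}
```

With N set to 10000, collect_primes() should return 1229. The order of the entries in primes[ ] depends on thread scheduling, so only the set of values is stable from run to run.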

Alternatively, you could put lines 50 and 51 into two separate critical sections, as follows. This change removes the data race but fails to correct the program:

47      #pragma omp parallel for
48      for (i = 2; i < N; i++) {
49          if ( is_prime(i) ) {
                 #pragma omp critical
                 {
50                  primes[total] = i;
                 }
                 #pragma omp critical
                 {
51                  total++;
                 }
52          }
53     }

The critical sections around lines 50 and 51 get rid of the data race because the threads use mutually exclusive locks to control their accesses to the primes[ ] array. However, the program is still incorrect. Because the store at line 50 and the increment at line 51 are protected separately, another thread can execute its own line 50 between them. Two threads might then update the same element of primes[ ] using the same value of total, and some elements of primes[ ] might not be assigned a value at all.
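It can help to replay one such interleaving by hand. The sketch below is ordinary sequential C that imitates two threads, here called A and B, each of which has found a prime and runs the two critical sections in turn. The function name replay_interleaving, the sentinel UNASSIGNED, and the chosen values 3 and 5 are illustrative assumptions, not part of prime_omp.c.

```c
#define UNASSIGNED (-1)

static int primes[4] = { UNASSIGNED, UNASSIGNED, UNASSIGNED, UNASSIGNED };
static int total = 0;

/* Replays, one step at a time, an interleaving that two separate critical
   sections permit. Each step is atomic, but the pair of steps is not. */
void replay_interleaving(void)
{
    primes[total] = 3;   /* thread A, first critical section:  primes[0] = 3 */
    primes[total] = 5;   /* thread B, first critical section:  primes[0] = 5 */
    total++;             /* thread A, second critical section: total = 1 */
    total++;             /* thread B, second critical section: total = 2 */
}
```

After the call, total is 2 and primes[0] holds 5. The value 3 has been overwritten, and primes[1] was never assigned even though total claims that two primes were stored.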

The second data race, between a read from pflag[ ] at line 23 and a write to pflag[ ] at line 26, is a benign race because it does not lead to incorrect results. It is not essential to fix benign data races.
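One way to build some intuition for why this race is harmless is to compare the two extreme views a thread could have of pflag[ ]. The sketch below assumes that pflag[ ] behaves as a hint only: an entry changes from 1 to 0 at most once, and only for a composite number. The names is_prime_with and flags_do_not_change_result are hypothetical, and the check covers only the fully stale and fully up-to-date cases rather than every possible mixture.

```c
#define N 1000

/* Same test as is_prime() in the listings, but the flags come from a
   caller-supplied array that is never written. A flag of 0 marks a known
   composite; 1 means "not known to be composite". */
int is_prime_with(const int *flags, int v)
{
    int i;

    for (i = 2; i * i <= v; i++) {
        if (!flags[i])
            continue;            /* skip a divisor already known composite */
        if (v % i == 0)
            return 0;
    }
    return (v > 1);
}

/* Returns 1 if stale flags (all 1) and up-to-date flags give the same
   answer for every v in [2, N), and 0 otherwise. */
int flags_do_not_change_result(void)
{
    static int stale[N], fresh[N];
    int v;

    for (v = 0; v < N; v++)
        stale[v] = fresh[v] = 1;
    for (v = 2; v < N; v++)
        fresh[v] = is_prime_with(stale, v);   /* 0 for every composite */

    for (v = 2; v < N; v++) {
        if (is_prime_with(stale, v) != is_prime_with(fresh, v))
            return 0;
    }
    return 1;
}
```

A stale flag only costs an extra division, because any number divisible by a skipped composite is also divisible by one of that composite's smaller prime factors, which is never skipped.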

Fixing Bugs in prime_pthr.c

This section describes how to fix the bug in prime_pthr.c. See Source Code for prime_pthr.c for a complete file listing.

Use a single mutex to remove the data race on primes[ ] at line 44, as well as the data race on total at line 45. The line numbers in this section refer to the original prime_pthr.c listing.

The data race between the write to i on line 60 and the read from the same memory location (named *arg) on line 40, as well as the data race on pflag[ ] on line 27, reveals a problem in how different threads share the variable i. The initial thread in prime_pthr.c creates the child threads in a loop in lines 60-62 and dispatches them to run the function work(). The loop index i is passed to work() by address. Because all threads access the same memory location for i, the value that each thread sees is not unique to that thread, but changes as the initial thread increments the loop index. When different threads end up using the same value of i, the data races occur. One way to fix the problem is to pass i to work() by value instead of by address.
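The pass-by-value idea can be sketched in isolation as follows. This is a minimal illustration, not the tutorial's code: the names record_index, run_by_value, and seen are hypothetical, and the casts go through intptr_t from <stdint.h> on the assumption that the platform provides it, which avoids pointer-size warnings on 64-bit builds.

```c
#include <pthread.h>
#include <stdint.h>

#define THREADS 4

static int seen[THREADS];   /* seen[k] counts threads that received index k */

/* The index travels inside the pointer value itself, so each thread gets
   its own copy and later changes to the loop variable cannot affect it. */
static void *record_index(void *arg)
{
    int id = (int)(intptr_t)arg;

    seen[id]++;             /* each thread touches a different element */
    return NULL;
}

/* Starts THREADS threads and returns 1 if every index 0..THREADS-1 was
   received exactly once, and 0 otherwise. */
int run_by_value(void)
{
    pthread_t tids[THREADS];
    int i, created = 0, ok;

    for (i = 0; i < THREADS; i++)
        seen[i] = 0;
    for (i = 0; i < THREADS; i++) {
        if (pthread_create(&tids[i], NULL, record_index,
                           (void *)(intptr_t)i) != 0)
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    ok = (created == THREADS);
    for (i = 0; i < THREADS; i++)
        ok = ok && (seen[i] == 1);
    return ok;
}
```

Had the address of i been passed instead, several threads could read the same index after the creating thread had already incremented it, and some elements of seen[ ] might be counted twice while others stayed at zero.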

The following code is the corrected version of prime_pthr.c:

/*
 * Copyright (c) 2006, 2010, Oracle and/or its affiliates. All Rights Reserved.
 * @(#)prime_pthr_fixed.c 1.3 (Oracle) 10/03/26
 */

#include <stdio.h>
#include <math.h>
#include <pthread.h>

#define THREADS 4
#define N 10000

int primes[N];
int pflag[N];
int total = 0;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

int is_prime(int v)
{
    int i;
    int bound = floor(sqrt(v)) + 1;

    for (i = 2; i < bound; i++) {
        /* no need to check against known composites */
        if (!pflag[i])
            continue;
        if (v % i == 0) {
            pflag[v] = 0;
            return 0;
        }
    }
    return (v > 1);
}

void *work(void *arg)
{
    int start;
    int end;
    int i;

    start = (N/THREADS) * ((int)arg) ;
    end = start + N/THREADS;
    for (i = start; i < end; i++) {
        if ( is_prime(i) ) {
            pthread_mutex_lock(&mutex);
            primes[total] = i;
            total++;
            pthread_mutex_unlock(&mutex);
        }
    }
    return NULL;
}

int main(int argn, char **argv)
{
    int i;
    pthread_t tids[THREADS-1];

    for (i = 0; i < N; i++) {
        pflag[i] = 1;
    }

    for (i = 0; i < THREADS-1; i++) {
        pthread_create(&tids[i], NULL, work, (void *)i);
    }

    i = THREADS-1;
    work((void *)i);

    for (i = 0; i < THREADS-1; i++) {
        pthread_join(tids[i], NULL);
    }

    printf("Number of prime numbers between 2 and %d: %d\n",
           N, total);

    return 0;
}