Oracle Solaris Studio 12.2:线程分析器用户指南

第 3 章 死锁教程

本教程介绍如何使用线程分析器在多线程程序中检测潜在的死锁和实际的死锁。

本教程涵盖以下主题:

3.1 关于死锁

术语死锁描述的是两个或多个线程由于相互等待而永远被阻塞的情况。导致死锁的原因有很多,例如程序逻辑错误以及不恰当使用同步(如锁和屏障)。本教程重点介绍由于不恰当使用互斥锁而导致的死锁。此类死锁通常发生在多线程应用程序中。

以下三个条件成立时,包含两个或多个线程的进程可能会进入死锁状态:

以下是一个死锁情况的简单示例:

死锁可分为两种类型:潜在死锁或实际死锁,二者的区别如下:

3.2 获取死锁教程源文件

本教程中使用的 din_philo.c 源文件位于目录 /opt/solstudio12.2/prod/examples/tha/din_philo(Oracle Solaris 系统)或 /opt/oracle/solstudio12.2/prod/examples/tha/din_philo(Linux 系统)下。该示例目录包含 MakefileDEMO 说明文件,但是本教程并不遵循这些说明,也不使用 Makefile。本教程将逐步指导您执行命令。

为了学习本教程,可以将 din_philo.c 文件从示例目录复制到其他目录,也可以创建自己的文件并从下面列出的代码内容中复制代码。

模拟哲学家就餐问题的 din_philo.c 样例程序是一个使用 POSIX 线程的 C 程序。该程序可以同时展示潜在死锁和实际死锁。

3.2.1 din_philo.c 的源代码内容

din_philo.c 的源代码如下所示:

  1  /*
  2   * Copyright (c) 2006, 2010, Oracle and/or its affiliates. All Rights Reserved.
  3   * @(#)din_philo.c 1.4 (Oracle) 10/03/26
  4   */
  5
  6  #include <pthread.h>
  7  #include <stdio.h>
  8  #include <unistd.h>
  9  #include <stdlib.h>
 10  #include <errno.h>
 11  #include <assert.h>
 12
 13  #define PHILOS 5
 14  #define DELAY 5000
 15  #define FOOD 100
 16
 17  void *philosopher (void *id);
 18  void grab_chopstick (int,
 19                       int,
 20                       char *);
 21  void down_chopsticks (int,
 22                        int);
 23  int food_on_table ();
 24
 25  pthread_mutex_t chopstick[PHILOS];
 26  pthread_t philo[PHILOS];
 27  pthread_mutex_t food_lock;
 28  int sleep_seconds = 0;
 29
 30
 31  int
 32  main (int argn,
 33        char **argv)
 34  {
 35      int i;
 36
 37      if (argn == 2)
 38          sleep_seconds = atoi (argv[1]);
 39
 40      pthread_mutex_init (&food_lock, NULL);
 41      for (i = 0; i < PHILOS; i++)
 42          pthread_mutex_init (&chopstick[i], NULL);
 43      for (i = 0; i < PHILOS; i++)
 44          pthread_create (&philo[i], NULL, philosopher, (void *)i);
 45      for (i = 0; i < PHILOS; i++)
 46          pthread_join (philo[i], NULL);
 47      return 0;
 48  }
 49
 50  void *
 51  philosopher (void *num)
 52  {
 53      int id;
 54      int i, left_chopstick, right_chopstick, f;
 55
 56      id = (int)num;
 57      printf ("Philosopher %d is done thinking and now ready to eat.\n", id);
 58      right_chopstick = id;
 59      left_chopstick = id + 1;
 60
 61      /* Wrap around the chopsticks. */
 62      if (left_chopstick == PHILOS)
 63          left_chopstick = 0;
 64
 65      while (f = food_on_table ()) {
 66
 67          /* Thanks to philosophers #1 who would like to take a nap
 68           * before picking up the chopsticks, the other philosophers
 69           * may be able to eat their dishes and not deadlock.
 70           */
 71          if (id == 1)
 72              sleep (sleep_seconds);
 73
 74          grab_chopstick (id, right_chopstick, "right ");
 75          grab_chopstick (id, left_chopstick, "left");
 76
 77          printf ("Philosopher %d: eating.\n", id);
 78          usleep (DELAY * (FOOD - f + 1));
 79          down_chopsticks (left_chopstick, right_chopstick);
 80      }
 81
 82      printf ("Philosopher %d is done eating.\n", id);
 83      return (NULL);
 84  }
 85
 86  int
 87  food_on_table ()
 88  {
 89      static int food = FOOD;
 90      int myfood;
 91
 92      pthread_mutex_lock (&food_lock);
 93      if (food > 0) {
 94          food--;
 95      }
 96      myfood = food;
 97      pthread_mutex_unlock (&food_lock);
 98      return myfood;
 99  }
100
101  void
102  grab_chopstick (int phil,
103                  int c,
104                  char *hand)
105  {
106      pthread_mutex_lock (&chopstick[c]);
107      printf ("Philosopher %d: got %s chopstick %d\n", phil, hand, c);
108  }
109
110  void
111  down_chopsticks (int c1,
112                   int c2)
113  {
114      pthread_mutex_unlock (&chopstick[c1]);
115      pthread_mutex_unlock (&chopstick[c2]);
116  }

3.3 哲学家就餐方案

哲学家就餐方案是一个结构如下的经典方案。编号从 0 到 4 的五位哲学家坐在圆桌旁思考。随着时间的推移,每个人开始饥饿并决定进餐。餐桌上有一盘面条,但是每位哲学家只有一根筷子可用。为了吃到面条,他们必须共用筷子。每位哲学家左边(他们面向餐桌而坐)的筷子的编号与该哲学家的编号相同。

图 3–1 哲学家就餐

显示哲学家和他们的筷子处于环状结构的一个图。

每位哲学家首先拿到与自己编号相同的筷子。他拿到分配给自己的筷子后,他将去拿分配给邻座的筷子。拿到两根筷子后,他就可以进餐了。吃完后,他将筷子放回桌上的原来位置,一边一根。该过程一直重复,直到把面条吃完。

3.3.1 哲学家如何发生死锁

每位哲学家都拿到自己的筷子并等待使用其邻座的筷子时,就会发生实际的死锁。

在这种情况下,没人可以吃到东西,于是哲学家们陷入了死锁之中。多次运行该程序,您将看到该程序有时会挂起,而有时又可以运行至完成。该程序挂起的情况如以下运行过程样例所示:


prompt% cc din_philo.c
prompt% a.out
Philosopher 0 is done thinking and now ready to eat.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 2: got right  chopstick 2
Philosopher 2: got left chopstick 3
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 4 is done thinking and now ready to eat.
Philosopher 4: got right  chopstick 4
Philosopher 2: eating.
Philosopher 3 is done thinking and now ready to eat.
Philosopher 1 is done thinking and now ready to eat.
Philosopher 0: got right  chopstick 0
Philosopher 3: got right  chopstick 3
Philosopher 2: got right  chopstick 2
Philosopher 1: got right  chopstick 1
(hang)

Execution terminated by pressing CTRL-C

3.3.2 为 1 号哲学家引入一段休眠时间

一种避免死锁的方法是让 1 号哲学家在拿起他的筷子之前等待一段时间。就代码而言,可让他在拿起自己的筷子之前休眠指定的一段时间 (sleep_seconds)。如果他休眠的时间足够长,该程序就可能会完成而不发生任何实际死锁。可以将他休眠的秒数指定为可执行程序的参数。如果不指定参数,该哲学家就不会休眠。

以下伪代码显示了每位哲学家的逻辑:

   while (there is still food on the table)
      {
        if (sleep argument is specified and I am philosopher #1)
           {
             sleep specified amount of time
           }

        grab right fork
        grab left fork
        eat some food
        put down left fork
        put down right fork 
      }

下面列出了该程序的一次运行,在这次运行中 1 号哲学家在拿起他的筷子之前等待 30 秒。该程序运行至完成,所有五位哲学家都吃到了面条。


% a.out 30
Philosopher 0 is done thinking and now ready to eat.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 4 is done thinking and now ready to eat.
Philosopher 4: got right  chopstick 4
Philosopher 3 is done thinking and now ready to eat.
Philosopher 3: got right  chopstick 3
Philosopher 0: eating.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 2: got right  chopstick 2
Philosopher 1 is done thinking and now ready to eat.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
...
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0 is done eating.
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 4 is done eating.
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 3 is done eating.
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 2 is done eating.
Philosopher 1: got right  chopstick 1
Philosopher 1: got left chopstick 2
Philosopher 1: eating.
Philosopher 1 is done eating.
%

Execution terminated normally

尝试多次运行该程序,并指定不同的休眠参数。如果 1 号哲学家在拿起他的筷子之前仅等待很短的一段时间,那么发生什么情况呢?如果他等待的时间更长一些,又会怎么样?尝试为可执行程序 a.out 指定不同的休眠参数。在使用或不使用休眠参数的情况下多次重新运行该程序。该程序有时会挂起,而有时又可以运行至完成。程序是否挂起取决于线程的调度情况以及线程进行锁请求的时间。

3.4 如何使用线程分析器查找死锁

可以使用线程分析器检查程序中的潜在死锁和实际死锁。线程分析器沿用与 Oracle Solaris Studio 性能分析器相同的“收集-分析”模型。

使用线程分析器的过程涉及三个步骤:

3.4.1 编译源代码

编译代码并务必指定 -g。不要指定高优化级别,因为在高优化级别时报告的信息(如行号和调用栈)可能是错误的。应使用 -g -xopenmp=noopt 编译 OpenMP 程序,并仅使用 -g -mt 编译 POSIX 线程程序。

有关这些选项的更多信息,请参见 cc(1)、CC(1) 或 f95(1) 手册页。

对于本教程,请使用以下命令编译代码:


% cc -g -o din_philo din_philo.c

3.4.2 创建死锁检测实验

使用线程分析器的带有 -r deadlock 选项的 collect 命令。此选项将在程序的执行期间创建死锁检测实验。

对于本教程,请使用以下命令创建名为 din_philo.1.er 的死锁检测实验:


% collect -r deadlock -o din_philo.1.er din_philo

可以通过创建多个死锁检测实验来提高检测到死锁的可能性。对于各个实验,应使用不同的线程数和不同的输入数据。例如,在 din_philo.c 代码中,可以更改以下行中的值:

    13  #define PHILOS 5
    14  #define DELAY 5000
    15  #define FOOD 100

然后可以像以前一样进行编译,并收集其他实验。

有关更多信息,请参见 collect(1) 和 collector(1) 手册页。

3.4.3 检查死锁检测实验

可以使用线程分析器、性能分析器或 er_print 实用程序检查死锁检测实验。线程分析器和性能分析器都提供 GUI 界面;线程分析器显示的是一组简化的缺省标签,但在其他方面与性能分析器完全相同。

3.4.3.1 使用线程分析器查看死锁检测实验

要启动线程分析器并打开 din_philo.1.er 实验,请键入以下命令:


% tha din_philo.1.er

线程分析器包含菜单栏、工具栏以及包含多个标签的拆分窗格(不同标签对应不同的显示)。

打开收集的死锁检测实验时,缺省情况下会在左侧窗格中显示以下标签:

线程分析器显示屏的右侧窗格中会显示以下标签:

3.4.3.2 使用 er_print 查看死锁检测实验

er_print 实用程序提供命令行界面。可以在交互式会话中使用 er_print 实用程序并在该会话期间指定子命令。也可以使用命令行选项以非交互方式指定子命令。

使用 er_print 实用程序检查死锁时,以下子命令非常有用:

有关更多信息,请参阅 collect(1)、tha(1)、analyzer(1) 和 er_print(1) 手册页。

3.5 了解死锁实验结果

本节说明如何使用线程分析器检查哲学家就餐程序中的死锁。

3.5.1 检查出现死锁的运行

下面列出了导致实际死锁的哲学家就餐程序的一次运行。


% cc -g -o din_philo din_philo.c
% collect -r deadlock -o din_philo.1.er din_philo
Creating experiment database din_philo.1.er ...
Philosopher 1 is done thinking and now ready to eat.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 3 is done thinking and now ready to eat.
Philosopher 0 is done thinking and now ready to eat.
Philosopher 1: got right  chopstick 1
Philosopher 3: got right  chopstick 3
Philosopher 0: got right  chopstick 0
Philosopher 1: got left chopstick 2
Philosopher 3: got left chopstick 4
Philosopher 4 is done thinking and now ready to eat.
Philosopher 1: eating.
Philosopher 3: eating.
Philosopher 3: got right  chopstick 3
Philosopher 4: got right  chopstick 4
Philosopher 2: got right  chopstick 2
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 1: got right  chopstick 1
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 4: got right  chopstick 4
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 3: got right  chopstick 3
Philosopher 1: got left chopstick 2
Philosopher 1: eating.
Philosopher 2: got right  chopstick 2
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 1: got right  chopstick 1
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
...
Philosopher 4: got right  chopstick 4
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 2: got right  chopstick 2
Philosopher 3: got right  chopstick 3
(hang)

Execution terminated by pressing CTRL-C

键入以下命令以使用 er_print 实用程序检查实验:


% er_print din_philo.1.er
(er_print) deadlocks
Deadlock #1, Potential deadlock
  Thread #2
    Lock being held:        0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #3
    Lock being held:        0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #4
    Lock being held:        0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #5
    Lock being held:        0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #6
    Lock being held:        0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"

Deadlock #2, Actual deadlock
  Thread #2
    Lock being held:        0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #3
    Lock being held:        0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #4
    Lock being held:        0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #5
    Lock being held:        0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #6
    Lock being held:        0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested:   0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"

Deadlocks List Summary: Experiment: din_philo.1.er Total Deadlocks: 2
(er_print)

以下屏幕抓图显示了线程分析器中显示的死锁信息。

图 3–2 din_philo.c 中检测到的死锁

线程分析器窗口的屏幕抓图,显示了实际的死锁。

线程分析器报告了 din_philo.c 的两个死锁,一个是潜在死锁,另一个是实际死锁。通过进一步检查,可以发现两个死锁是相同的。

该死锁涉及的循环链如下:

线程 2:在地址 0x215a0 持有锁,在地址 0x215b8 请求锁

线程 3:在地址 0x215b8 持有锁,在地址 0x215d0 请求锁

线程 4:在地址 0x215d0 持有锁,在地址 0x215e8 请求锁

线程 5:在地址 0x215e8 持有锁,在地址 0x21600 请求锁

线程 6:在地址 0x21600 持有锁,在地址 0x215a0 请求锁

选择循环链中的第一个线程(线程 2),然后单击 "Dual Source"(双源)标签,可以在源代码中看到,线程 2 在地址 0x215a0 持有锁,在源代码中也可以看到,该线程在地址 0x215b8 请求锁。以下屏幕抓图显示了对应于线程 2 的 "Dual Source"(双源)标签。每个源代码行的左侧会显示缺省的度量:"Exclusive Deadlocks"(互斥死锁)度量。此度量显示了该行上报告的(死锁涉及的)锁持有或锁请求操作的总次数。

图 3–3 din_philo.c 中的潜在死锁

线程分析器 "Dual Source"(双源)标签的屏幕抓图,显示了潜在的死锁。

3.5.2 检查存在潜在死锁但仍可完成的运行

如果提供足够大的休眠参数,哲学家就餐程序就可避免实际死锁并正常终止。但是,正常终止并不意味着程序可以完全避免死锁。它仅仅表示在给定的一次运行中,持有和请求的锁未形成死锁链。如果在其他运行中出现了时间变化,则仍可能会发生实际死锁。下面列出了由于引入 40 秒休眠时间而正常终止的哲学家就餐程序的一次运行。但是,er_print 实用程序和线程分析器会报告潜在死锁。


% cc -g -o din_philo_pt din_philo.c
% collect -r deadlock -o din_philo_pt.1.er din_philo_pt 40
Creating experiment database tha.2.er ...
Philosopher 0 is done thinking and now ready to eat.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 1 is done thinking and now ready to eat.
Philosopher 3 is done thinking and now ready to eat.
Philosopher 2: got right  chopstick 2
Philosopher 3: got right  chopstick 3
Philosopher 0: got right  chopstick 0
Philosopher 4 is done thinking and now ready to eat.
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
...
Philosopher 4: got right  chopstick 4
Philosopher 3: got right  chopstick 3
Philosopher 2: got right  chopstick 2
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 4 is done eating.
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 3 is done eating.
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 0 is done eating.
Philosopher 2 is done eating.
Philosopher 1: got right  chopstick 1
Philosopher 1: got left chopstick 2
Philosopher 1: eating.
Philosopher 1 is done eating.
%

Execution terminated normally

键入下列以粗体显示的命令以使用 er_print 实用程序检查实验:


% er_print din_philo_pt.1.er
(er_print) deadlocks
Deadlock #1, Potential deadlock
  Thread #2
    Lock being held:      0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #3
    Lock being held:      0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #4
    Lock being held:      0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #5
    Lock being held:      0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #6
    Lock being held:      0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"

Deadlocks List Summary: Experiment: din_philo_pt.1.er Total Deadlocks: 1
(er_print)
Deadlock #1, Potential deadlock
  Thread #2
    Lock being held:      0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #3
    Lock being held:      0x215b8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #4
    Lock being held:      0x215d0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #5
    Lock being held:      0x215e8, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
  Thread #6
    Lock being held:      0x21600, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"
    Lock being requested: 0x215a0, at: grab_chopstick + 0x0000002C, line 106 in "din_philo.c"

Deadlocks List Summary: Experiment: din_philo_pt.1.er Total Deadlocks: 1
(er_print)

以下屏幕抓图显示了线程分析器界面中的潜在死锁信息。

图 3–4 din_philo.c 中的潜在死锁

线程分析器窗口的屏幕抓图,显示了潜在的死锁。

3.6 修复死锁和了解误报

一种消除潜在死锁和实际死锁的方法是使用一种令牌机制,该机制要求哲学家必须收到令牌后才能尝试进餐。可用令牌的数量必须少于餐桌前哲学家的人数。当哲学家收到令牌后,他可以根据餐桌规则尝试进餐。每位哲学家在进餐后归还令牌,然后重复该过程。以下伪代码显示了使用令牌机制时每位哲学家的逻辑:

   while (there is still food on the table)
      {
        get token
        grab right fork
        grab left fork
        eat some food
        put down left fork
        put down right fork
        return token 
      }

以下几节将详细介绍令牌机制的两种不同实现。

3.6.1 使用令牌控制哲学家

下面列出了使用令牌机制的哲学家就餐程序的修正版本。该解决方案结合了四个令牌,比就餐人数少一个,因此最多有四位哲学家可同时尝试进餐。该版本的程序称为 din_philo_fix1.c


提示 –

/opt/solstudio12.2/prod/examples/tha/din_philo/opt/oracle/solstudio12.2/prod/examples/tha/din_philo 目录复制 din_philo_fix1.c


     1  /*
     2   * Copyright (c) 2006, 2010, Oracle and/or its affiliates. All Rights Reserved.
     3   * @(#)din_philo_fix1.c 1.3 (Oracle) 10/03/26
     4   */
     5
     6  #include <pthread.h>
     7  #include <stdio.h>
     8  #include <unistd.h>
     9  #include <stdlib.h>
    10  #include <errno.h>
    11  #include <assert.h>
    12
    13  #define PHILOS 5
    14  #define DELAY 5000
    15  #define FOOD 100
    16
    17  void *philosopher (void *id);
    18  void grab_chopstick (int,
    19                       int,
    20                       char *);
    21  void down_chopsticks (int,
    22                        int);
    23  int food_on_table ();
    24  void get_token ();
    25  void return_token ();
    26
    27  pthread_mutex_t chopstick[PHILOS];
    28  pthread_t philo[PHILOS];
    29  pthread_mutex_t food_lock;
    30  pthread_mutex_t num_can_eat_lock;
    31  int sleep_seconds = 0;
    32  uint32_t num_can_eat = PHILOS - 1;
    33
    34
    35  int
    36  main (int argn,
    37        char **argv)
    38  {
    39      int i;
    40
    41      pthread_mutex_init (&food_lock, NULL);
    42      pthread_mutex_init (&num_can_eat_lock, NULL);
    43      for (i = 0; i < PHILOS; i++)
    44          pthread_mutex_init (&chopstick[i], NULL);
    45      for (i = 0; i < PHILOS; i++)
    46          pthread_create (&philo[i], NULL, philosopher, (void *)i);
    47      for (i = 0; i < PHILOS; i++)
    48          pthread_join (philo[i], NULL);
    49      return 0;
    50  }
    51
    52  void *
    53  philosopher (void *num)
    54  {
    55      int id;
    56      int i, left_chopstick, right_chopstick, f;
    57
    58      id = (int)num;
    59      printf ("Philosopher %d is done thinking and now ready to eat.\n", id);
    60      right_chopstick = id;
    61      left_chopstick = id + 1;
    62
    63      /* Wrap around the chopsticks. */
    64      if (left_chopstick == PHILOS)
    65          left_chopstick = 0;
    66
    67      while (f = food_on_table ()) {
    68          get_token ();
    69
    70          grab_chopstick (id, right_chopstick, "right ");
    71          grab_chopstick (id, left_chopstick, "left");
    72
    73          printf ("Philosopher %d: eating.\n", id);
    74          usleep (DELAY * (FOOD - f + 1));
    75          down_chopsticks (left_chopstick, right_chopstick);
    76
    77          return_token ();
    78      }
    79
    80      printf ("Philosopher %d is done eating.\n", id);
    81      return (NULL);
    82  }
    83
    84  int
    85  food_on_table ()
    86  {
    87      static int food = FOOD;
    88      int myfood;
    89
    90      pthread_mutex_lock (&food_lock);
    91      if (food > 0) {
    92          food--;
    93      }
    94      myfood = food;
    95      pthread_mutex_unlock (&food_lock);
    96      return myfood;
    97  }
    98
    99  void
   100  grab_chopstick (int phil,
   101                  int c,
   102                  char *hand)
   103  {
   104      pthread_mutex_lock (&chopstick[c]);
   105      printf ("Philosopher %d: got %s chopstick %d\n", phil, hand, c);
   106  }
   107
   108
   109
   110  void
   111  down_chopsticks (int c1,
   112                   int c2)
   113  {
   114      pthread_mutex_unlock (&chopstick[c1]);
   115      pthread_mutex_unlock (&chopstick[c2]);
   116  }
   117
   118
   119  void
   120  get_token ()
   121  {
   122      int successful = 0;
   123
   124      while (!successful) {
   125          pthread_mutex_lock (&num_can_eat_lock);
   126          if (num_can_eat > 0) {
   127              num_can_eat--;
   128              successful = 1;
   129          }
   130          else {
   131              successful = 0;
   132          }
   133          pthread_mutex_unlock (&num_can_eat_lock);
   134      }
   135  }
   136
   137  void
   138  return_token ()
   139  {
   140      pthread_mutex_lock (&num_can_eat_lock);
   141      num_can_eat++;
   142      pthread_mutex_unlock (&num_can_eat_lock);
   143  }

尝试编译这一修正版本的哲学家就餐程序,并多次运行该程序。令牌机制限制了尝试使用筷子的就餐人数,从而可避免实际死锁和潜在死锁。

要进行编译,请使用以下命令:


cc -g -o din_philo_fix1 din_philo_fix1.c

收集实验:


collect -r deadlock din_philo_fix1 -o din_philo_fix1.1.er

3.6.1.1 误报的报告

即使是使用令牌机制,线程分析器也会报告该实现存在潜在死锁(虽然并不存在死锁)。这是一个误报。请参考以下包含潜在死锁详细信息的屏幕抓图。

图 3–5 误报的潜在死锁

线程分析器窗口的屏幕抓图,显示了线程 2 中的死锁。

选择循环链中的第一个线程(线程 2),然后单击 "Dual Source"(双源)标签,可以看到线程 2 在地址 0x216a8 持有锁的对应源代码位置,以及该线程在 0x216c0 请求锁的对应源代码位置。下图显示了对应于线程 2 的 "Dual Source"(双源)标签。

图 3–6 误报的潜在死锁的源代码

线程分析器 "Dual Source"(双源)标签的屏幕抓图,显示了潜在的死锁。

din_philo_fix1.c 中的 get_token() 函数使用 while 循环来同步线程。线程在成功获得令牌之前不会离开 while 循环(当 num_can_eat 大于 0 时可成功获得令牌)。while 循环将同时就餐的人数限制为 4。但是,线程分析器无法识别由 while 循环实现的同步。它认为所有五位哲学家同时尝试抓取筷子并进餐,因此报告了一个潜在死锁。下一节将详细说明如何使用线程分析器可识别的同步来限制同时就餐的人数。

3.6.2 另一种令牌机制

下面列出了令牌机制的另一种实现。该实现仍然使用四个令牌,所以最多有四个就餐者可同时尝试进餐。但是,该实现使用 sem_wait()sem_post() 信号例程限制就餐的哲学家人数。该版本的源文件称为 din_philo_fix2.c


提示 –

/opt/sunstudio12.2/prod/examples/tha/din_philo/opt/oracle/solstudio12.2/prod/examples/tha/din_philo 目录复制 din_philo_fix2.c


下面列出了详细的 din_philo_fix2.c

     1	/* 
     2	 * Copyright (c) 2006, 2010, Oracle and/or its affiliates. All Rights Reserved.
     3	 * @(#)din_philo_fix2.c 1.3 (Oracle) 10/03/26
     4	 */
     5	
     6	#include <pthread.h>
     7	#include <stdio.h>
     8	#include <unistd.h>
     9	#include <stdlib.h>
    10	#include <errno.h>
    11	#include <assert.h>
    12	#include <semaphore.h>
    13	
    14	#define PHILOS 5
    15	#define DELAY 5000
    16	#define FOOD 100
    17	
    18	void *philosopher (void *id);
    19	void grab_chopstick (int,
    20	                     int,
    21	                     char *);
    22	void down_chopsticks (int,
    23	                      int);
    24	int food_on_table ();
    25	void get_token ();
    26	void return_token ();
    27	
    28	pthread_mutex_t chopstick[PHILOS];
    29	pthread_t philo[PHILOS];
    30	pthread_mutex_t food_lock;
    31	int sleep_seconds = 0;
    32	sem_t num_can_eat_sem;
    33	
    34	
    35	int
    36	main (int argn,
    37	      char **argv)
    38	{
    39	    int i;
    40	
    41	    pthread_mutex_init (&food_lock, NULL);
    42	    sem_init(&num_can_eat_sem, 0, PHILOS - 1);
    43	    for (i = 0; i < PHILOS; i++)
    44	        pthread_mutex_init (&chopstick[i], NULL);
    45	    for (i = 0; i < PHILOS; i++)
    46	        pthread_create (&philo[i], NULL, philosopher, (void *)i);
    47	    for (i = 0; i < PHILOS; i++)
    48	        pthread_join (philo[i], NULL);
    49	    return 0;
    50	}
    51	
    52	void *
    53	philosopher (void *num)
    54	{
    55	    int id;
    56	    int i, left_chopstick, right_chopstick, f;
    57	
    58	    id = (int)num;
    59	    printf ("Philosopher %d is done thinking and now ready to eat.\n", id);
    60	    right_chopstick = id;
    61	    left_chopstick = id + 1;
    62	
    63	    /* Wrap around the chopsticks. */
    64	    if (left_chopstick == PHILOS)
    65	        left_chopstick = 0;
    66	
    67	    while (f = food_on_table ()) {
    68	        get_token ();
    69	
    70	        grab_chopstick (id, right_chopstick, "right ");
    71	        grab_chopstick (id, left_chopstick, "left");
    72	
    73	        printf ("Philosopher %d: eating.\n", id);
    74	        usleep (DELAY * (FOOD - f + 1));
    75	        down_chopsticks (left_chopstick, right_chopstick);
    76	
    77	        return_token ();
    78	    }
    79	
    80	    printf ("Philosopher %d is done eating.\n", id);
    81	    return (NULL);
    82	}
    83	
    84	int
    85	food_on_table ()
    86	{
    87	    static int food = FOOD;
    88	    int myfood;
    89	
    90	    pthread_mutex_lock (&food_lock);
    91	    if (food > 0) {
    92	        food--;
    93	    }
    94	    myfood = food;
    95	    pthread_mutex_unlock (&food_lock);
    96	    return myfood;
    97	}
    98	
    99	void
   100	grab_chopstick (int phil,
   101	                int c,
   102	                char *hand)
   103	{
   104	    pthread_mutex_lock (&chopstick[c]);
   105	    printf ("Philosopher %d: got %s chopstick %d\n", phil, hand, c);
   106	}
   107	
   108	void
   109	down_chopsticks (int c1,
   110	                 int c2)
   111	{
   112	    pthread_mutex_unlock (&chopstick[c1]);
   113	    pthread_mutex_unlock (&chopstick[c2]);
   114	}
   115	
   116	
   117	void
   118	get_token ()
   119	{
   120	    sem_wait(&num_can_eat_sem);
   121	}
   122	
   123	void
   124	return_token ()
   125	{
   126	    sem_post(&num_can_eat_sem);
   127	}

这一新实现使用信号 num_can_eat_sem 来限制可同时就餐的哲学家人数。信号 num_can_eat_sem 初始化为 4,比哲学家人数少一个。尝试进餐前,哲学家调用 get_token(),后者又调用 sem_wait(&num_can_eat_sem)。调用 sem_wait() 会导致执行调用的哲学家进入等待状态,直到信号值变为正值,然后将信号值减 1。哲学家进餐完毕后,他将调用 return_token(),后者又调用 sem_post(&num_can_eat_sem)。调用 sem_post() 可将信号值加 1。线程分析器可识别对 sem_wait()sem_post() 的调用,并判定并非所有的哲学家都同时尝试进餐。


注 –

必须使用 -lrt 编译 din_philo_fix2.c 以链接相应的信号例程。


要编译 din_philo_fix2.c,请使用以下命令:


cc -g -lrt -o din_philo_fix2 din_philo_fix2.c

如果多次运行程序 din_philo_fix2 的这一新实现,将会发现它每次都正常终止,不会挂起。

在这一新的二进制代码上创建实验:


collect -r deadlock -o din_philo_fix2.1.er din_philo_fix2

您将发现线程分析器不报告 din_philo_fix2.1.er 实验中的任何实际死锁或潜在死锁,如下图所示:

图 3–7 未报告 din_philo_fix2.c 中的任何死锁

线程分析器窗口的屏幕抓图,显示没有任何死锁。

有关线程分析器可识别的线程和内存分配 API 的列表,请参见附录 A