Sun Studio 12:线程分析器用户指南

第 3 章 死锁教程

本教程说明如何使用线程分析器在多线程程序中检测潜在的以及实际的死锁。术语“死锁”描述两个或更多线程由于互相等待而被永远阻塞(挂起)的情况。导致死锁的原因可能有多种,如错误的程序逻辑、不恰当使用同步和屏障等。此教程将重点讨论由于不恰当地使用互斥锁而造成的死锁。此类死锁通常出现在多线程应用程序中。以下三个条件成立时,具有两个或多个线程的进程可能会进入死锁状态:

以下是一个死锁情况的简单示例:

线程 1 持有锁 A 并请求锁 B 

线程 2 持有锁 B 并请求锁 A 

死锁可以分为两种类型:潜在死锁或实际死锁,二者的区别如下:

3.1 哲人进餐源文件

模拟哲人进餐问题的样例程序是一个使用 POSIX 线程的 C 程序。源文件名为 din_philo.c。该程序可以同时展示潜在死锁和实际死锁。以下是代码列表,并附有解释:

/* din_philo.c */    
     1  #include <pthread.h>
     2  #include <stdio.h>
     3  #include <unistd.h>
     4  #include <stdlib.h>
     5  #include <errno.h>
     6  #include <assert.h>
     7
     8  #define PHILOS 5
     9  #define DELAY 5000
    10  #define FOOD 50
    11
    12	void *philosopher (void *id);
    13	void grab_chopstick (int,
    14	                     int,
    15	                     char *);
    16	void down_chopsticks (int,
    17	                      int);
    18	int food_on_table ();
    19	
    20	pthread_mutex_t chopstick[PHILOS];
    21	pthread_t philo[PHILOS];
    22	pthread_mutex_t food_lock;
    23	int sleep_seconds = 0;
    24	
    25	
    26	int
    27	main (int argn,
    28	      char **argv)
    29	{
    30	    int i;
    31	
    32	    if (argn == 2)
    33	        sleep_seconds = atoi (argv[1]);
    34	
    35	    pthread_mutex_init (&food_lock, NULL);
    36	    for (i = 0; i < PHILOS; i++)
    37	        pthread_mutex_init (&chopstick[i], NULL);
    38	    for (i = 0; i < PHILOS; i++)
    39	        pthread_create (&philo[i], NULL, philosopher, (void *)i);
    40	    for (i = 0; i < PHILOS; i++)
    41	        pthread_join (philo[i], NULL);
    42	    return 0;
    43	}
    44	
    45	void *
    46	philosopher (void *num)
    47	{
    48	    int id;
    49	    int i, left_chopstick, right_chopstick, f;
    50	
    51	    id = (int)num;
    52	    printf ("Philosopher %d is done thinking and now ready to eat.\n", id);
    53	    right_chopstick = id;
    54	    left_chopstick = id + 1;
    55	
    56	    /* Wrap around the chopsticks. */
    57	    if (left_chopstick == PHILOS)
    58	        left_chopstick = 0;
    59	
    60	    while (f = food_on_table ()) {
    61	
    62	        /* Thanks to philosophers #1 who would like to take a nap
    63	         * before picking up the chopsticks, the other philosophers
    64	         * may be able to eat their dishes and not deadlock.  
    65	         */
    66	        if (id == 1)
    67	            sleep (sleep_seconds);
    68	
    69	        grab_chopstick (id, right_chopstick, "right ");
    70	        grab_chopstick (id, left_chopstick, "left");
    71	
    72	        printf ("Philosopher %d: eating.\n", id);
    73	        usleep (DELAY * (FOOD - f + 1));
    74	        down_chopsticks (left_chopstick, right_chopstick);
    75	    }
    76	
    77	    printf ("Philosopher %d is done eating.\n", id);
    78	    return (NULL);
    79	}
    80	
    81	int
    82	food_on_table ()
    83	{
    84	    static int food = FOOD;
    85	    int myfood;
    86	
    87	    pthread_mutex_lock (&food_lock);
    88	    if (food > 0) {
    89	        food--;
    90	    }
    91	    myfood = food;
    92	    pthread_mutex_unlock (&food_lock);
    93	    return myfood;
    94	}
    95	
    96	void
    97	grab_chopstick (int phil,
    98	                int c,
    99	                char *hand)
   100	{
   101	    pthread_mutex_lock (&chopstick[c]);
   102	    printf ("Philosopher %d: got %s chopstick %d\n", phil, hand, c);
   103	}
   104	
   105	void
   106	down_chopsticks (int c1,
   107	                 int c2)
   108	{
   109	    pthread_mutex_unlock (&chopstick[c1]);
   110	    pthread_mutex_unlock (&chopstick[c2]);
   111	}

3.2 哲人进餐方案

哲人进餐方案是一个结构如下的经典方案:五个哲人,其编号为零到四,坐在圆桌旁思考。随着时间的推移,不同的个人感到饿了,决定去吃面条。桌子上有一个盛有面条的大浅盘,但是每个哲人只有一根筷子可以使用。为了吃面条,他们必须共用筷子。每个哲人左侧(哲人面向餐桌而坐)的筷子具有与该哲人相同的编号。

该图显示了圆桌旁的哲人和他们的筷子。

每个哲人首先去拿带有其编号的筷子。当他拿到指定给自己的筷子后,他将去拿指定给其邻桌的筷子。拿到两根筷子后,他就可以吃了。吃完后,他将筷子放回桌子上的原始位置,一侧放置一根。重复该过程,直到将面条吃完。

3.2.1 哲人怎样发生死锁

当每个哲人都持有他自己的筷子并等待邻座的筷子变为可用时,会发生实际死锁:

哲人 0 持有筷子 0,但等待筷子 1 

哲人 1 持有筷子 1,但等待筷子 2 

哲人 2 持有筷子 2,但等待筷子 3 

哲人 3 持有筷子 3,但等待筷子 4 

哲人 4 持有筷子 4,但等待筷子 0 

在这种情况下,没有人可以吃,这些哲人处于死锁状态。多次重新运行该程序,您将看到该程序可能有时挂起,或有时运行直到完成。

运行哲人进餐程序,并查看它完成还是发生死锁。它可能挂起,如以下样例运行所示:

prompt% cc din_phil.c -mt
prompt% a.out
Philosopher 0 is done thinking and now ready to eat.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 2: got right  chopstick 2
Philosopher 2: got left chopstick 3
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 4 is done thinking and now ready to eat.
Philosopher 4: got right  chopstick 4
Philosopher 2: eating.
Philosopher 3 is done thinking and now ready to eat.
Philosopher 1 is done thinking and now ready to eat.
Philosopher 0: got right  chopstick 0
Philosopher 3: got right  chopstick 3
Philosopher 2: got right  chopstick 2
Philosopher 1: got right  chopstick 1
(hang)

Execution terminated by pressing CTRL-C

3.2.2 为哲人 1 引入休眠时间

潜在死锁的一种可能解决方案是,哲人 1 在拿他的筷子之前先等待。就代码而言,可以让他在拿自己的筷子之前休眠指定的时间 (sleep_seconds)。如果他休眠的时间足够长,则程序可能完成而不会发生实际死锁。您可以将他休眠的秒数作为可执行程序的参数进行指定。如果您不指定参数,则哲人不休眠。

下列伪代码显示了每个哲人的逻辑:

   while (there is still food on the table)
      {
        if (sleep argument is specified and I am philosopher #1)
           {
             sleep specified amount of time
           }

        grab right fork
        grab left fork
        eat some food
        put down left fork
        put down right fork 
      }

以下列表说明,如果哲人 1 在拿自己的筷子前等待 30 秒,程序是如何运行的。程序运行直到完成,所有五个哲人都吃完了。

% a.out 30
Philosopher 0 is done thinking and now ready to eat.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 4 is done thinking and now ready to eat.
Philosopher 4: got right  chopstick 4
Philosopher 3 is done thinking and now ready to eat.
Philosopher 3: got right  chopstick 3
Philosopher 0: eating.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 2: got right  chopstick 2
Philosopher 1 is done thinking and now ready to eat.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
...
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0 is done eating.
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 4 is done eating.
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 3 is done eating.
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 2 is done eating.
Philosopher 1: got right  chopstick 1
Philosopher 1: got left chopstick 2
Philosopher 1: eating.
Philosopher 1 is done eating.
%

Execution terminated normally

尝试运行几次该程序,指定不同的休眠参数。如果哲人在拿自己的筷子前仅等待很短的时间,会发生什么情况?如果他等待更长的时间,又会怎么样?尝试为可执行的 a.out 指定不同的休眠参数。将该程序重新运行几次,使用或不使用休眠参数。有时程序会挂起,有时则可以完成运行。程序是否挂起取决于线程的调度,以及线程请求锁定的时间。

3.3 如何使用线程分析器查找死锁

可以使用线程分析器检查程序中的潜在死锁和实际死锁。线程分析器沿用与 Sun Studio 性能分析器相同的“收集分析”模型。使用线程分析器包括以下三个步骤:

3.3.1 对源代码进行编译

编译代码,并确保指定了 -g。不要指定高优化级别,因为在高优化级别中可能会错误地报告行号和调用栈等信息。使用 -g -xopenmp=noopt 编译 OpenMP 程序,仅使用 -g -mt 编译 POSIX 线程程序。

有关更多信息,请参见 cc.1CC.1f95.1

3.3.2 创建死锁检测实验

使用线程分析器的带有 -r deadlock 选项的 collect 命令。此选项将在执行程序的过程中创建死锁检测实验。

可通过创建多个死锁检测实验提高检测到死锁的可能性。为不同的实验使用不同的线程数和不同的输入数据。

有关更多信息,请参见 collect.1collector.1

3.3.3 检查实验结果

可以使用 tha 命令、analyzer 命令或 er_print 实用程序检查死锁检测实验。当 er_print 使用命令行界面时,线程分析器分析器均代表 GUI 界面。

有关详细信息,请参见 tha.1analyzer.1er_print.1

3.3.3.1 线程分析器界面

线程分析器包含菜单栏、工具栏和包含各种选项卡的拆分窗格(不同选项卡对应不同的显示)。在左窗格上,缺省情况下显示以下三个选项卡:

3.3.3.2 er_print 界面

与左窗格相对应,右窗格包含了“死锁详细信息”选项卡,该选项卡中显示了“死锁”选项卡中选择的死锁的详细信息。使用 er_print 检查死锁的最有用的子命令如下:

有关更多信息,请参见 er_print.1

3.4 了解实验结果

本部分将详细说明如何使用线程分析器来检查哲人进餐程序中的死锁。首先将执行导致实际死锁的运行,然后将检查正常终止但有潜在死锁的运行。

3.4.1 检查出现死锁的运行

以下列表显示了导致实际死锁的哲人进餐程序的一个运行。

prompt% cc din_philo.c -mt -g             

prompt% collect -r deadlock a.out
Creating experiment database tha.1.er ...
Philosopher 1 is done thinking and now ready to eat.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 3 is done thinking and now ready to eat.
Philosopher 0 is done thinking and now ready to eat.
Philosopher 1: got right  chopstick 1
Philosopher 3: got right  chopstick 3
Philosopher 0: got right  chopstick 0
Philosopher 1: got left chopstick 2
Philosopher 3: got left chopstick 4
Philosopher 4 is done thinking and now ready to eat.
Philosopher 1: eating.
Philosopher 3: eating.
Philosopher 3: got right  chopstick 3
Philosopher 4: got right  chopstick 4
Philosopher 2: got right  chopstick 2
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 1: got right  chopstick 1
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 4: got right  chopstick 4
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 3: got right  chopstick 3
Philosopher 1: got left chopstick 2
Philosopher 1: eating.
Philosopher 2: got right  chopstick 2
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 1: got right  chopstick 1
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 4: got right  chopstick 4
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 3: got right  chopstick 3
Philosopher 1: got left chopstick 2
Philosopher 1: eating.
Philosopher 2: got right  chopstick 2
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 1: got right  chopstick 1
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 4: got right  chopstick 4
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 2: got right  chopstick 2
Philosopher 3: got right  chopstick 3
(hang)

Execution terminated by pressing CTRL-C


% er_print tha.1.er 
(er_print) deadlocks

Deadlock #1, Potential deadlock
	Thread #2
		Lock being held:	0x215a8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215c0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #3
		Lock being held:	0x215c0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215d8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #4
		Lock being held:	0x215d8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215f0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #5
		Lock being held:	0x215f0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x21608, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #6
		Lock being held:	0x21608, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215a8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"

Deadlock #2, Actual deadlock
	Thread #2
		Lock being held:	0x215a8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215c0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #3
		Lock being held:	0x215c0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215d8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #4
		Lock being held:	0x215d8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215f0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #5
		Lock being held:	0x215f0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x21608, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #6
		Lock being held:	0x21608, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215a8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"

Deadlocks List Summary: Experiment: tha.1.er Total Deadlocks: 2
(er_print) 

下面的屏幕抓图给出了线程分析器中显示的死锁信息:

显示实际死锁的线程分析器窗口屏幕抓图。

线程分析器报告了 din_philo.c 的两个死锁,一个潜在死锁和一个实际死锁。通过进一步地检查,我们发现这两个死锁是相同的。死锁中涉及的循环链如下:

线程 2: 

在地址 0x215a8 占用锁,在地址 0x215c0 请求锁

线程 3: 

在地址 0x215c0 占用锁,在地址 0x215d8 请求锁

线程 4: 

在地址 0x215d8 占用锁,在地址 0x215f0 请求锁

线程 5: 

在地址 0x215f0 占用锁,在地址 0x21608 请求锁

线程 6: 

在地址 0x21608 占用锁 ,在地址 0x215a8 请求锁

选择链中的第一个线程(线程 #2),然后单击“双重数据源”选项卡,查看在源代码中的何处,线程 #2 占用锁(在地址 0x215a8),以及在何处请求锁(在地址 0x215c0)。以下屏幕抓图显示了 2 号线程的“双重数据源”选项卡。在每个源代码行的左侧显示了缺省度量(互斥死锁度量)。该度量提供(死锁中所涉及的)锁定占用或锁定请求操作在该代码行上报告的次数。

显示潜在死锁的线程分析器“双重数据源”选项卡屏幕抓图。

3.4.2 检查存在潜在死锁但仍可完成的运行

在提供足够大的休眠参数时,哲人进餐程序可避免实际死锁,并正常终止。但是,正常终止并不意味着程序可以完全避免死锁。它仅仅表示在一次特定的运行中,占用和请求的锁定不会构成死锁链。如果在其他运行中计时发生更改,则还会发生实际死锁。以下列表显示了正常终止的哲人进餐程序的一次运行。但是 er_print 实用程序和线程分析器会报告潜在死锁。

% cc din_philo.c -mt -g                


% collect -r deadlock a.out 40
Creating experiment database tha.2.er ...
Philosopher 0 is done thinking and now ready to eat.
Philosopher 2 is done thinking and now ready to eat.
Philosopher 1 is done thinking and now ready to eat.
Philosopher 3 is done thinking and now ready to eat.
Philosopher 2: got right  chopstick 2
Philosopher 3: got right  chopstick 3
Philosopher 0: got right  chopstick 0
Philosopher 4 is done thinking and now ready to eat.
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 0: got right  chopstick 0
Philosopher 4: got right  chopstick 4
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 3: got right  chopstick 3
Philosopher 2: got right  chopstick 2
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 4: got right  chopstick 4
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 3: got right  chopstick 3
Philosopher 2: got right  chopstick 2
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 0: got right  chopstick 0
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
...
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 4: got right  chopstick 4
Philosopher 3: got right  chopstick 3
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 0: got right  chopstick 0
Philosopher 3: got left chopstick 4
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 3: eating.
Philosopher 0: got right  chopstick 0
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 4: got right  chopstick 4
Philosopher 3: got right  chopstick 3
Philosopher 2: got right  chopstick 2
Philosopher 4: got left chopstick 0
Philosopher 4: eating.
Philosopher 4 is done eating.
Philosopher 3: got left chopstick 4
Philosopher 3: eating.
Philosopher 0: got right  chopstick 0
Philosopher 0: got left chopstick 1
Philosopher 0: eating.
Philosopher 3 is done eating.
Philosopher 2: got left chopstick 3
Philosopher 2: eating.
Philosopher 0 is done eating.
Philosopher 2 is done eating.
Philosopher 1: got right  chopstick 1
Philosopher 1: got left chopstick 2
Philosopher 1: eating.
Philosopher 1 is done eating.
%

Execution terminated normally


% er_print tha.2.er 
(er_print) deadlocks

Deadlock #1, Potential deadlock
	Thread #2
		Lock being held:	0x215a8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215c0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #3
		Lock being held:	0x215c0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215d8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #4
		Lock being held:	0x215d8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215f0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #5
		Lock being held:	0x215f0, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x21608, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
	Thread #6
		Lock being held:	0x21608, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"
		Lock being requested:	0x215a8, at: grab_chopstick + 0x0000002C, line 101 in "din_philo.c"

Deadlocks List Summary: Experiment: tha.2.er Total Deadlocks: 1
(er_print) 
 

下面的屏幕抓图显示了线程分析器界面的潜在死锁信息:

显示潜在死锁的线程分析器窗口屏幕抓图。

3.5 修复死锁并了解误报

除了哲人在开始进餐前等待的策略,我们还可以使用令牌机制,在该机制中,哲人必须在收到令牌后才能尝试进餐。可用令牌的数量必须少于餐桌前哲人的数量。当哲人收到令牌后,他可以按照餐桌规则尝试进餐。进餐后,每个哲人返还令牌并重复该过程。下列伪代码显示了使用令牌机制时每个哲人的逻辑:

   while (there is still food on the table)
      {
        get token
        grab right fork
        grab left fork
        eat some food
        put down left fork
        put down right fork
        return token 
      }

以下部分将详细说明令牌机制的两种不同实现方法。

3.5.1 使用令牌控制哲人

下面列出了使用令牌机制的哲人进餐程序的修正版本。该解决方案结合使用了 4 个令牌,比用餐者的数量少一,因此能够同时尝试进餐的哲人不会超过 4 个。该版本程序称为 din_philo_fix1.c

    1	#include <pthread.h>
     2	#include <stdio.h>
     3	#include <unistd.h>
     4	#include <stdlib.h>
     5	#include <errno.h>
     6	#include <assert.h>
     7	
     8	#define PHILOS 5
     9	#define DELAY 5000
    10	#define FOOD 50
    11	
    12	void *philosopher (void *id);
    13	void grab_chopstick (int,
    14	                     int,
    15	                     char *);
    16	void down_chopsticks (int,
    17	                      int);
    18	int food_on_table ();
    19	int get_token ();
    20	void return_token ();
    21	
    22	pthread_mutex_t chopstick[PHILOS];
    23	pthread_t philo[PHILOS];
    24	pthread_mutex_t food_lock;
    25	pthread_mutex_t num_can_eat_lock;
    26	int sleep_seconds = 0;
    27	uint32_t num_can_eat = PHILOS - 1;
    28	
    29	
    30	int
    31	main (int argn,
    32	      char **argv)
    33	{
    34	    int i;
    35	
    36	    pthread_mutex_init (&food_lock, NULL);
    37	    pthread_mutex_init (&num_can_eat_lock, NULL);
    38	    for (i = 0; i < PHILOS; i++)
    39	        pthread_mutex_init (&chopstick[i], NULL);
    40	    for (i = 0; i < PHILOS; i++)
    41	        pthread_create (&philo[i], NULL, philosopher, (void *)i);
    42	    for (i = 0; i < PHILOS; i++)
    43	        pthread_join (philo[i], NULL);
    44	    return 0;
    45	}
    46	
    47	void *
    48	philosopher (void *num)
    49	{
    50	    int id;
    51	    int i, left_chopstick, right_chopstick, f;
    52	
    53	    id = (int)num;
    54	    printf ("Philosopher %d is done thinking and now ready to eat.\n", id);
    55	    right_chopstick = id;
    56	    left_chopstick = id + 1;
    57	
    58	    /* Wrap around the chopsticks. */
    59	    if (left_chopstick == PHILOS)
    60	        left_chopstick = 0;
    61	
    62	    while (f = food_on_table ()) {
    63	        get_token ();
    64	
    65	        grab_chopstick (id, right_chopstick, "right ");
    66	        grab_chopstick (id, left_chopstick, "left");
    67	
    68	        printf ("Philosopher %d: eating.\n", id);
    69	        usleep (DELAY * (FOOD - f + 1));
    70	        down_chopsticks (left_chopstick, right_chopstick);
    71	
    72	        return_token ();
    73	    }
    74	
    75	    printf ("Philosopher %d is done eating.\n", id);
    76	    return (NULL);
    77	}
    78	
    79	int
    80	food_on_table ()
    81	{
    82	    static int food = FOOD;
    83	    int myfood;
    84	
    85	    pthread_mutex_lock (&food_lock);
    86	    if (food > 0) {
    87	        food--;
    88	    }
    89	    myfood = food;
    90	    pthread_mutex_unlock (&food_lock);
    91	    return myfood;
    92	}
    93	
    94	void
    95	grab_chopstick (int phil,
    96	                int c,
    97	                char *hand)
    98	{
    99	    pthread_mutex_lock (&chopstick[c]);
   100	    printf ("Philosopher %d: got %s chopstick %d\n", phil, hand, c);
   101	}
   102	
   103	void
   104	down_chopsticks (int c1,
   105	                 int c2)
   106	{
   107	    pthread_mutex_unlock (&chopstick[c1]);
   108	    pthread_mutex_unlock (&chopstick[c2]);
   109	}
   110	
   111	
   112	int
   113	get_token ()
   114	{
   115	    int successful = 0;
   116	
   117	    while (!successful) {
   118	        pthread_mutex_lock (&num_can_eat_lock);
   119	        if (num_can_eat > 0) {
   120	            num_can_eat--;
   121	            successful = 1;
   122	        }
   123	        else {
   124	            successful = 0;
   125	        }
   126	        pthread_mutex_unlock (&num_can_eat_lock);
   127	    }
   128	}
   129	
   130	void
   131	return_token ()
   132	{
   133	    pthread_mutex_lock (&num_can_eat_lock);
   134	    num_can_eat++;
   135	    pthread_mutex_unlock (&num_can_eat_lock);
   136	}

尝试编译并运行这一哲人进餐程序的修正版本,并多运行几次。令牌机制限制了尝试使用筷子的用餐者的数量,从而可避免实际死锁和潜在死锁。

3.5.1.1 误报的报告

尽管使用了令牌机制,线程分析器仍为此实现报告了一个潜在死锁(虽然并不存在死锁)。这是一个误报。考虑以下包含潜在死锁详细信息的屏幕抓图:

显示 2 号线程死锁的线程分析器窗口屏幕抓图。

选择该链中的第一个线程(线程 #2),然后单击“双重数据源”选项卡,查看线程 #2 在源代码中占用锁的位置(在地址 0x215a8)以及请求锁的位置(在地址 0x215c0)。以下屏幕抓图显示了线程 #2 的“双重数据源”选项卡。

显示潜在死锁的线程分析器“双重数据源”选项卡屏幕抓图。

din_philo_fix1.c 中的 get_token() 函数使用 while 循环同步线程。在成功获得一个令牌(当 num_can_eat 大于 0 时将发生这种情况)之前,线程不会离开 while 循环。while 循环将同时进餐的人数限制为 4。但是线程分析器无法识别 while 循环实现的同步。它假定所有 5 个哲人都尝试同时取筷子并进餐,因此报告了一个潜在死锁。下一部分将详细说明如何使用线程分析器可识别的同步限制同时进餐的人数。

3.5.2 另一种令牌机制

以下列表显示了令牌机制的另一种实现。该实现仍使用 4 个令牌,因此同时尝试进餐的人数不会超过 4 个。但是,该实现使用 sem_wait()sem_post() 信号例程来限制进餐的哲人数量。该版本的源文件称为 din_philo_fix2.c


注 –

必须使用 -lrt 编译 din_philo_fix2.c,以链接正确的信号例程。


以下列表详细说明了 din_philo_fix2.c

    1	#include <pthread.h>
     2	#include <stdio.h>
     3	#include <unistd.h>
     4	#include <stdlib.h>
     5	#include <errno.h>
     6	#include <assert.h>
     7	#include <semaphore.h>
     8	
     9	#define PHILOS 5
    10	#define DELAY 5000
    11	#define FOOD 50
    12	
    13	void *philosopher (void *id);
    14	void grab_chopstick (int,
    15	                     int,
    16	                     char *);
    17	void down_chopsticks (int,
    18	                      int);
    19	int food_on_table ();
    20	int get_token ();
    21	void return_token ();
    22	
    23	pthread_mutex_t chopstick[PHILOS];
    24	pthread_t philo[PHILOS];
    25	pthread_mutex_t food_lock;
    26	int sleep_seconds = 0;
    27	sem_t num_can_eat_sem;
    28	
    29	
    30	int
    31	main (int argn,
    32	      char **argv)
    33	{
    34	    int i;
    35	
    36	    pthread_mutex_init (&food_lock, NULL);
    37	    sem_init(&num_can_eat_sem, 0, PHILOS - 1);
    38	    for (i = 0; i < PHILOS; i++)
    39	        pthread_mutex_init (&chopstick[i], NULL);
    40	    for (i = 0; i < PHILOS; i++)
    41	        pthread_create (&philo[i], NULL, philosopher, (void *)i);
    42	    for (i = 0; i < PHILOS; i++)
    43	        pthread_join (philo[i], NULL);    
    44	    return 0;
    45	}
    46	
    47	void *
    48	philosopher (void *num)
    49	{
    50	    int id;
    51	    int i, left_chopstick, right_chopstick, f;
    52	
    53	    id = (int)num;
    54	    printf ("Philosopher %d is done thinking and now ready to eat.\n", id);
    55	    right_chopstick = id;
    56	    left_chopstick = id + 1;
    57	
    58	    /* Wrap around the chopsticks. */
    59	    if (left_chopstick == PHILOS)
    60	        left_chopstick = 0;
    61	
    62	    while (f = food_on_table ()) {
    63	        get_token ();
    64	
    65	        grab_chopstick (id, right_chopstick, "right ");
    66	        grab_chopstick (id, left_chopstick, "left");
    67	
    68	        printf ("Philosopher %d: eating.\n", id);
    69	        usleep (DELAY * (FOOD - f + 1));
    70	        down_chopsticks (left_chopstick, right_chopstick);
    71	
    72	        return_token ();
    73	    }
    74	
    75	    printf ("Philosopher %d is done eating.\n", id);
    76	    return (NULL);
    77	}
    78	
    79	int
    80	food_on_table ()
    81	{
    82	    static int food = FOOD;
    83	    int myfood;
    84	
    85	    pthread_mutex_lock (&food_lock);
    86	    if (food > 0) {
    87	        food--;
    88	    }
    89	    myfood = food;
    90	    pthread_mutex_unlock (&food_lock);
    91	    return myfood;
    92	}
    93	
    94	void
    95	grab_chopstick (int phil,
    96	                int c,
    97	                char *hand)
    98	{
    99	    pthread_mutex_lock (&chopstick[c]);
   100	    printf ("Philosopher %d: got %s chopstick %d\n", phil, hand, c);
   101	}
   102	
   103	void
   104	down_chopsticks (int c1,
   105	                 int c2)
   106	{
   107	    pthread_mutex_unlock (&chopstick[c1]);
   108	    pthread_mutex_unlock (&chopstick[c2]);
   109	}
   110	
   111	
   112	int
   113	get_token ()
   114	{
   115	    sem_wait(&num_can_eat_sem);
   116	}
   117	
   118	void
   119	return_token ()
   120	{
   121	    sem_post(&num_can_eat_sem);
   122	}

这一新实现使用信号 num_can_eat_sem 来限制同时进餐的哲人数量。信号 num_can_eat_sem 初始化为 4,比哲人的数量少一。在尝试进餐前,哲人将调用 get_token(),从而会调用 sem_wait(&num_can_eat_sem)。调用 sem_wait() 将导致发出调用的哲人进入等待状态,直到信号值为正,然后通过从值中减 1 来更改信号的值。当一位哲人进完餐后,他将调用 return_token(),从而会调用 sem_post(&num_can_eat_sem)。调用 sem_post() 可将信号值加 1。线程分析器可识别对 sem_wait()sem_post() 的调用,并确定并非所有哲人都会同时尝试进餐。

如果多次运行这一新的程序实现,您将发现该程序每次都会正常终止,不会挂起。此外,还将发现线程分析器不会报告任何实际死锁或潜在死锁,如下面的屏幕抓图所示:

显示无死锁的线程分析器窗口屏幕抓图。

有关线程分析器可识别的线程和内存分配 API 的列表,请参见附录 A,线程分析器用户 API