You might want to know on which CPUs processes and threads are
waiting to run. You can use the enqueue
probe along with the dequeue
probe to
answer this question. Type the following source code and save
it in a file named qtime.d
:
sched:::enqueue { a[args[0]->pr_lwpid, args[1]->pr_pid, args[2]->cpu_id] = timestamp; } sched:::dequeue /a[args[0]->pr_lwpid, args[1]->pr_pid, args[2]->cpu_id]/ { @[args[2]->cpu_id] = quantize(timestamp - a[args[0]->pr_lwpid, args[1]->pr_pid, args[2]->cpu_id]); a[args[0]->pr_lwpid, args[1]->pr_pid, args[2]->cpu_id] = 0; }
Running the previous script for several seconds results in output that is similar to the following:
#dtrace -s qtime.d
dtrace: script 'qtime.d' matched 16 probes^C
1 value ------------- Distribution ------------- count 8192 | 0 16384 | 1 32768 |@ 47 65536 |@@@@@@@ 365 131072 |@@@@@@@@@@@@ 572 262144 |@@@@@@@@@@@@ 570 524288 |@@@@@@@ 354 1048576 |@ 57 2097152 | 7 4194304 | 1 8388608 | 1 16777216 | 0 0 value ------------- Distribution ------------- count 8192 | 0 16384 | 6 32768 |@ 49 65536 |@@@@@ 261 131072 |@@@@@@@@@@@@@ 753 262144 |@@@@@@@@@@@@ 704 524288 |@@@@@@@@ 455 1048576 |@ 74 2097152 | 9 4194304 | 2 8388608 | 0
Rather than looking at wait times, you might want to examine
the length of the run queue over time. Using the
enqueue
and dequeue
probes, you can set up an associative array to track the queue
length. Type the following source code and save it in a file
named qlen.d
:
sched:::enqueue { this->len = qlen[args[2]->cpu_id]++; @[args[2]->cpu_id] = lquantize(this->len, 0, 100); } sched:::dequeue /qlen[args[2]->cpu_id]/ { qlen[args[2]->cpu_id]--; }
Running the previous script on a largely idle dual-core processor system for approximately 30 seconds results in output that is similar to the following:
#dtrace -s qlen.d
dtrace: script 'qlen.d' matched 16 probes^C
1 value ------------- Distribution ------------- count < 0 | 0 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 8124 1 |@@@@@@ 1558 2 |@ 160 3 | 51 4 | 24 5 | 13 6 | 11 7 | 9 8 | 6 9 | 0 0 value ------------- Distribution ------------- count < 0 | 0 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 8569 1 |@@@@@@@@@ 2429 2 |@ 292 3 | 25 4 | 8 5 | 5 6 | 4 7 | 4 8 | 1 9 | 0
The output is roughly what you would expect for an idle system: the majority of the time that a runnable thread is enqueued, the run queues were very short (three or fewer threads in length). However, given that the system was largely idle, the exceptional data points at the bottom of each table might be unexpected. For example, why were the run queues as long as 8 runnable threads? To explore this question further, you could write a D script that displays the contents of the run queue when the length of the run queue is long. This problem is complicated because D enablings cannot iterate over data structures, and therefore cannot simply iterate over the entire run queue. Even if D enablings could do so, you should avoid dependencies on the kernel's internal data structures.
For this type of script, you would enable the
enqueue
and dequeue
probes and then use both speculations and associative arrays.
Whenever a thread is enqueued, the script increments the
length of the queue and records the timestamp in an
associative array keyed by the thread. You cannot use a
thread-local variable in this case because a thread might be
enqueued by another thread. The script then checks to see if
the queue length exceeds the maximum, and if so, the script
starts a new speculation, and records the timestamp and the
new maximum. Then, when a thread is dequeued, the script
compares the enqueue
timestamp to the
timestamp of the longest length: if the thread was enqueued
before the timestamp of the longest length, the thread was in
the queue when the longest length was recorded. In this case,
the script speculatively traces the thread's information. When
the kernel dequeues the last thread that was enqueued at the
timestamp of the longest length, the script commits the
speculation data. Type the following source code and save it
in a file named whoqueue.d
:
#pragma D option quiet #pragma D option nspec=4 #pragma D option specsize=100k int maxlen; int spec[int]; sched:::enqueue { this->len = ++qlen[this->cpu = args[2]->cpu_id]; in[args[0]->pr_addr] = timestamp; } sched:::enqueue /this->len > maxlen && spec[this->cpu]/ { /* * There is already a speculation for this CPU. We just set a new * record, so we’ll discard the old one. */ discard(spec[this->cpu]); } sched:::enqueue /this->len > maxlen/ { /* * We have a winner. Set the new maximum length and set the timestamp * of the longest length. */ maxlen = this->len; longtime[this->cpu] = timestamp; /* * Now start a new speculation, and speculatively trace the length. */ this->spec = spec[this->cpu] = speculation(); speculate(this->spec); printf("Run queue of length %d:\n", this->len); } sched:::dequeue /(this->in = in[args[0]->pr_addr]) && this->in <= longtime[this->cpu = args[2]->cpu_id]/ { speculate(spec[this->cpu]); printf(" %d/%d (%s)\n", args[1]->pr_pid, args[0]->pr_lwpid, stringof(args[1]->pr_fname)); } sched:::dequeue /qlen[args[2]->cpu_id]/ { in[args[0]->pr_addr] = 0; this->len = --qlen[args[2]->cpu_id]; } sched:::dequeue /this->len == 0 && spec[this->cpu]/ { /* * We just processed the last thread that was enqueued at the time * of longest length; commit the speculation, which by now contains * each thread that was enqueued when the queue was longest. */ commit(spec[this->cpu]); spec[this->cpu] = 0; }
Running the previous script on the same system results in output that is similar to the following:
#dtrace -s whoqueue.d
Run queue of length 1: 2850/2850 (java) Run queue of length 2: 4034/4034 (kworker/0:1) 16/16 (sync_supers) Run queue of length 3: 10/10 (ksoftirqd/1) 1710/1710 (hald-addon-inpu) 25350/25350 (dtrace) Run queue of length 4: 2852/2852 (java) 2850/2850 (java) 1710/1710 (hald-addon-inpu) 2099/2099 (Xorg) Run queue of length 5: 3149/3149 (notification-da) 2417/2417 (gnome-settings-) 2437/2437 (gnome-panel) 2461/2461 (wnck-applet) 2432/2432 (metacity) Run queue of length 9: 3685/3685 (firefox) 3149/3149 (notification-da) 2417/2417 (gnome-settings-) 2437/2437 (gnome-panel) 2852/2852 (java) 2452/2452 (nautilus) 2461/2461 (wnck-applet) 2432/2432 (metacity) 2749/2749 (gnome-terminal)^C