Today we look at a mechanism the kernel uses to monitor processes that stay asleep for a long time: hung task.

A hung task case

Parsing the dump with the crash tool gives the following:


      KERNEL: examples/hungtask/vmlinux         
    DUMPFILE: examples/hungtask/dump-hungtask.bin
        CPUS: 2 [OFFLINE: 1]
        DATE: Thu Nov 23 10:31:23 CST 2023
      UPTIME: 00:03:44
LOAD AVERAGE: 2.60, 1.08, 0.41
       TASKS: 51
    NODENAME: (none)
     RELEASE: 4.19.298
     VERSION: #24 SMP PREEMPT Thu Nov 23 10:27:05 CST 2023
     MACHINE: aarch64  (unknown Mhz)
      MEMORY: 128 MB
       PANIC: "Kernel panic - not syncing: hung_task: blocked tasks"
         PID: 476
     COMMAND: "khungtaskd"
        TASK: ffff800004a61b00  [THREAD_INFO: ffff800004a61b00]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)
The PANIC line shows that the hung_task mechanism detected a task that had been blocked (for a long time).

PANIC: "Kernel panic - not syncing: hung_task: blocked tasks"

The COMMAND line shows that the panic occurred in the "khungtaskd" process.

COMMAND: "khungtaskd"

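The log command prints the kernel log, which contains the two excerpts below. The first excerpt gives the cause of the hung task: a process was blocked for more than 60 seconds.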


[   11.028762] test_work_func enter
[  121.875251] INFO: task kworker/1:1:44 blocked for more than 60 seconds.
[  121.875749]       Not tainted 4.19.298 #24
[  121.875886] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  121.876116] kworker/1:1     D    0    44      2 0x00000028
[  121.876995] Workqueue: events test_work_func
[  121.877180] Call trace:
[  121.877285]  __switch_to+0xe8/0x148
[  121.877413]  __schedule+0x1e8/0x5c8
[  121.877482]  schedule+0x38/0xa0
[  121.877526]  schedule_timeout+0x24c/0x388
[  121.877581]  __down+0x74/0xc8
[  121.877629]  down+0x48/0x60
[  121.877667]  test_work_func+0x3c/0x58
[  121.877748]  process_one_work+0x1b4/0x2e8
[  121.877869]  worker_thread+0x48/0x400
[  121.877979]  kthread+0x128/0x160
[  121.878080]  ret_from_fork+0x10/0x1c
[  121.878389] Kernel panic - not syncing: hung_task: blocked tasks

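We can use the dis command to view the source of test_work_func. It does call down on a semaphore, and because the semaphore is initialized to 0, the call blocks immediately. No other code ever calls up on that semaphore.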


crash> dis -ls test_work_func
FILE: ../drivers/input/keyboard/gpio_keys.c
LINE: 404

  399   
  400   static void test_work_func(struct work_struct *work)
  401   {
  402           int i;
  403   
* 404           pr_info("%s enter\n", __func__);
  405           sema_init(&test_sem, 0);
  406           down(&test_sem);
  407           pr_info("%s exit\n", __func__);
  408   }
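
To make the blocking behaviour concrete, here is a minimal user-space model of counting-semaphore semantics. It is only a sketch: model_sem, model_sema_init, model_down and model_up are hypothetical names, not kernel APIs, and the model only reports whether a caller would block instead of actually sleeping or keeping a waiter queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; these are not kernel definitions. */
enum model_state {
    MODEL_RUNNING,          /* caller acquired the semaphore and continues */
    MODEL_UNINTERRUPTIBLE   /* caller would sleep in D state until an up */
};

struct model_sem {
    unsigned int count;     /* number of down calls that can succeed now */
};

static void model_sema_init(struct model_sem *sem, unsigned int val)
{
    assert(sem != NULL);
    sem->count = val;
}

/* Model of down: take one unit if available, otherwise report blocking. */
static enum model_state model_down(struct model_sem *sem)
{
    assert(sem != NULL);
    if (sem->count > 0) {
        sem->count--;
        return MODEL_RUNNING;
    }
    return MODEL_UNINTERRUPTIBLE;
}

/* Model of up without a waiter queue: simply return one unit. */
static void model_up(struct model_sem *sem)
{
    assert(sem != NULL);
    sem->count++;
}
```

With an initial value of 0, the first model_down reports MODEL_UNINTERRUPTIBLE, which mirrors what happens to the kworker in the log. Only a model_up from somewhere else would let a later model_down succeed, and in the case above no such call exists.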

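The second excerpt shows the hung task detection path: detection is done by watchdog, the thread function of the khungtaskd kernel thread, which panics on its own initiative once the hung task condition is met.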


[  121.878679] CPU: 1 PID: 476 Comm: khungtaskd Not tainted 4.19.298 #24
[  121.878879] Hardware name: linux,dummy-virt (DT)
[  121.879060] Call trace:
[  121.879124]  dump_backtrace+0x0/0x158
[  121.879198]  show_stack+0x14/0x20
[  121.879260]  dump_stack+0x94/0xc4
[  121.879322]  panic+0x13c/0x2a0
[  121.879376]  watchdog+0x290/0x3a0
[  121.879431]  kthread+0x128/0x160
[  121.879487]  ret_from_fork+0x10/0x1c
[  121.879773] SMP: stopping secondary CPUs
[  121.880707] Kernel Offset: disabled
[  121.880863] CPU features: 0x00,24002004
[  121.880935] Memory Limit: none
[  121.881307] ---[ end Kernel panic - not syncing: hung_task: blocked tasks ]---

How hung task detection works

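The kernel creates a kernel thread (khungtaskd) that periodically checks processes in the D state (TASK_UNINTERRUPTIBLE). If a process has not been scheduled at all between two checks, the kernel considers it likely to be deadlocked and prints a warning log for developers to examine.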

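A process staying in the D state for a long time does not necessarily mean the kernel has a problem, but it deserves close attention, which is why a warning is printed. If the kernel is configured to panic on a hung task, it then triggers a panic.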

  • CONFIG_DEFAULT_HUNG_TASK_TIMEOUT sets the default check period (the hung task timeout)
  • CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE sets whether to panic
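
The "no scheduling between two checks" criterion can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel's code: struct model_task and model_check_hung are hypothetical names, time is counted in plain seconds, and the fields are simplified stand-ins for the per-task bookkeeping the detector would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-task data a detector reads. */
struct model_task {
    bool uninterruptible;            /* task is currently in D state */
    unsigned long switch_count;      /* total context switches so far */
    unsigned long last_switch_count; /* count seen at the previous scan */
    unsigned long last_switch_time;  /* time (s) when the count last changed */
};

/*
 * One scan of one task at time `now` (seconds).
 * Returns true if the task should be reported as hung.
 */
static bool model_check_hung(struct model_task *t, unsigned long now,
                             unsigned long timeout)
{
    assert(t != NULL);

    if (!t->uninterruptible)
        return false;               /* only D-state tasks are of interest */

    if (t->switch_count != t->last_switch_count) {
        /* The task was scheduled since the last scan: restart the clock. */
        t->last_switch_count = t->switch_count;
        t->last_switch_time = now;
        return false;
    }

    /* No context switch since the last scan: hung once timeout has elapsed. */
    return now - t->last_switch_time >= timeout;
}
```

In this model a blocked task is never reported on the first scan that sees it, because that scan only records the current switch count. A report needs a later scan that finds the count unchanged. That would be consistent with the log above, where the task blocks at about 11 s but the report appears at about 121 s with a 60-second timeout.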

/*
 * kthread which checks for tasks stuck in D state
 */
static int watchdog(void *dummy)
{
        unsigned long hung_last_checked = jiffies;

        set_user_nice(current, 0);

        for ( ; ; ) {
                unsigned long timeout = sysctl_hung_task_timeout_secs;
                unsigned long interval = sysctl_hung_task_check_interval_secs;
                long t;

                if (interval == 0)
                        interval = timeout;
                interval = min_t(unsigned long, interval, timeout);
                t = hung_timeout_jiffies(hung_last_checked, interval);
                if (t <= 0) {
                        if (!atomic_xchg(&reset_hung_task, 0) &&
                            !hung_detector_suspended)
                                check_hung_uninterruptible_tasks(timeout);
                        hung_last_checked = jiffies;
                        continue;
                }
                schedule_timeout_interruptible(t);
        }

        return 0;
}
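
The timing arithmetic in the loop above can be modelled in user space as follows. This is a sketch under assumptions: MODEL_HZ is an arbitrary tick rate chosen for illustration, both function names are hypothetical, and the body of model_timeout_jiffies reflects my reading of what hung_timeout_jiffies does (its definition is not shown in the listing), so treat it as an assumption as well.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_HZ 250UL  /* assumed ticks per second, for illustration only */

/*
 * Mirrors the interval selection in the loop: an interval of 0 falls back
 * to the timeout, and the interval is never longer than the timeout.
 */
static unsigned long model_effective_interval(unsigned long interval,
                                              unsigned long timeout)
{
    if (interval == 0)
        interval = timeout;
    return interval < timeout ? interval : timeout;
}

/*
 * Ticks left until the next check is due. A result <= 0 means a check is
 * due now. secs == 0 is modelled as "never", i.e. the detector is disabled.
 */
static long model_timeout_jiffies(unsigned long last_checked,
                                  unsigned long now, unsigned long secs)
{
    if (secs == 0)
        return LONG_MAX;
    assert(secs <= (unsigned long)LONG_MAX / MODEL_HZ); /* avoid overflow */
    /* Unsigned wrap-around followed by the cast yields a signed difference. */
    return (long)(last_checked - now + secs * MODEL_HZ);
}
```

For example, with a 60-second interval and the assumed tick rate, a check that has just run leaves 15000 ticks to sleep. Once that many ticks have passed the result drops to 0 or below, and the loop runs the scan without sleeping.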