CVE-2024-50082

In the Linux kernel, the following vulnerability has been resolved: blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait().
Configurations

Configuration 1 (hide)

OR cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.12:rc1:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.12:rc2:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.12:rc3:*:*:*:*:*:*

History

08 Nov 2024, 16:15

Type Values Removed Values Added
References
  • () https://git.kernel.org/stable/c/d04b72c9ef2b0689bfc1057d21c4aeed087c329f -

30 Oct 2024, 15:44

Type Values Removed Values Added
References () https://git.kernel.org/stable/c/04f283fc16c8d5db641b6bffd2d8310aa7eccebc - () https://git.kernel.org/stable/c/04f283fc16c8d5db641b6bffd2d8310aa7eccebc - Patch
References () https://git.kernel.org/stable/c/3bc6d0f8b70a9101456cf02ab99acb75254e1852 - () https://git.kernel.org/stable/c/3bc6d0f8b70a9101456cf02ab99acb75254e1852 - Patch
References () https://git.kernel.org/stable/c/455a469758e57a6fe070e3e342db12e4a629e0eb - () https://git.kernel.org/stable/c/455a469758e57a6fe070e3e342db12e4a629e0eb - Patch
References () https://git.kernel.org/stable/c/4c5b123ab289767afe940389dbb963c5c05e594e - () https://git.kernel.org/stable/c/4c5b123ab289767afe940389dbb963c5c05e594e - Patch
References () https://git.kernel.org/stable/c/b5e900a3612b69423a0e1b0ab67841a1fb4af80f - () https://git.kernel.org/stable/c/b5e900a3612b69423a0e1b0ab67841a1fb4af80f - Patch
References () https://git.kernel.org/stable/c/e972b08b91ef48488bae9789f03cfedb148667fb - () https://git.kernel.org/stable/c/e972b08b91ef48488bae9789f03cfedb148667fb - Patch
First Time Linux
Linux linux Kernel
CWE NVD-CWE-noinfo
CVSS v2 : unknown
v3 : unknown
v2 : unknown
v3 : 4.7
CPE cpe:2.3:o:linux:linux_kernel:6.12:rc2:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.12:rc3:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.12:rc1:*:*:*:*:*:*

29 Oct 2024, 14:34

Type Values Removed Values Added
Summary
  • (es) En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: blk-rq-qos: se corrige el fallo en la ejecución rq_qos_wait vs. rq_qos_wake_function Estamos viendo fallos de rq_qos_wake_function que se parecen a esto: ERROR: no se puede manejar el error de página para la dirección: ffffafe180a40084 #PF: acceso de escritura de supervisor en modo kernel #PF: error_code(0x0002) - página no presente PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 No contaminado 6.12.0-rc3-00013-geca631b8fe80 #11 Nombre del hardware: PC estándar QEMU (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 01/04/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Código: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 000000000001e7240 R09: 0000000000000011 R10: 00000000000000028 R11: 00000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Seguimiento de llamadas: try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... Entonces rq_qos_wake_function() llama a wake_up_process(data-&gt;task), que llama a try_to_wake_up(), que falla en raw_spin_lock_irqsave(&amp;p-&gt;pi_lock). p viene de data-&gt;task, y data viene de la entrada de la cola de espera, que se almacena en la pila del que espera en rq_qos_wait(). Al analizar el volcado de memoria con drgn, descubrí que el que espera ya se había despertado y se había movido a una ruta de código completamente no relacionada, destruyendo lo que antes era data-&gt;task. Mientras tanto, el waker estaba pasando la basura golpeada en data-&gt;task a wake_up_process(), lo que provocó el bloqueo. Lo que está sucediendo es que entre rq_qos_wake_function() eliminando la entrada de la cola de espera y llamando a wake_up_process(), rq_qos_wait() descubre que ya obtuvo un token y regresa. La ejecución se ve así: rq_qos_wait() rq_qos_wake_function() ============================================================ prepare_to_wait_exclusive() data-&gt;got_token = true; list_del_init(&amp;curr-&gt;entry); if (data.got_token) break; finish_wait(&amp;rqw-&gt;wait, &amp;data.wq); ^- retorna inmediatamente porque list_empty_careful(&amp;wq_entry-&gt;entry) es verdadero... retorna, ve a hacer otra cosa... wake_up_process(data-&gt;task) (¡YA NO ES VÁLIDO!)-^ Normalmente, se supone que finish_wait() se sincroniza con el activador. Pero, como se señaló anteriormente, retorna inmediatamente porque la entrada de la cola de espera ya se eliminó de la cola de espera. El error es que rq_qos_wake_function() está accediendo a la entrada de la cola de espera DESPUÉS de eliminarla. Tenga en cuenta que autoremove_wake_function() despierta al que espera y LUEGO elimina la entrada de la cola de espera, que es el orden correcto. Arréglelo intercambiando el orden. También debemos usar list_del_init_careful() para que coincida con list_empty_careful() en finish_wait().

29 Oct 2024, 01:15

Type Values Removed Values Added
New CVE

Information

Published : 2024-10-29 01:15

Updated : 2024-11-08 16:15


NVD link : CVE-2024-50082

Mitre link : CVE-2024-50082

CVE.ORG link : CVE-2024-50082


JSON object : View

Products Affected

linux

  • linux_kernel