CVE-2024-50095

In the Linux kernel, the following vulnerability has been resolved: RDMA/mad: Improve handling of timed out WRs of mad agent Current timeout handler of mad agent acquires/releases mad_agent_priv lock for every timed out WRs. This causes heavy locking contention when higher no. of WRs are to be handled inside timeout handler. This leads to softlockup with below trace in some use cases where rdma-cm path is used to establish connection between peer nodes Trace: ----- BUG: soft lockup - CPU#4 stuck for 26s! [kworker/u128:3:19767] CPU: 4 PID: 19767 Comm: kworker/u128:3 Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.13.1.el9_4.x86_64 #1 Hardware name: Dell Inc. PowerEdge R740/01YM03, BIOS 2.4.8 11/26/2019 Workqueue: ib_mad1 timeout_sends [ib_core] RIP: 0010:__do_softirq+0x78/0x2ac RSP: 0018:ffffb253449e4f98 EFLAGS: 00000246 RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 000000000000001f RDX: 000000000000001d RSI: 000000003d1879ab RDI: fff363b66fd3a86b RBP: ffffb253604cbcd8 R08: 0000009065635f3b R09: 0000000000000000 R10: 0000000000000040 R11: ffffb253449e4ff8 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040 FS: 0000000000000000(0000) GS:ffff8caa1fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd9ec9db900 CR3: 0000000891934006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? __irq_exit_rcu+0xa1/0xc0 ? watchdog_timer_fn+0x1b2/0x210 ? __pfx_watchdog_timer_fn+0x10/0x10 ? __hrtimer_run_queues+0x127/0x2c0 ? hrtimer_interrupt+0xfc/0x210 ? __sysvec_apic_timer_interrupt+0x5c/0x110 ? sysvec_apic_timer_interrupt+0x37/0x90 ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? __do_softirq+0x78/0x2ac ? __do_softirq+0x60/0x2ac __irq_exit_rcu+0xa1/0xc0 sysvec_call_function_single+0x72/0x90 </IRQ> <TASK> asm_sysvec_call_function_single+0x16/0x20 RIP: 0010:_raw_spin_unlock_irq+0x14/0x30 RSP: 0018:ffffb253604cbd88 EFLAGS: 00000247 RAX: 000000000001960d RBX: 0000000000000002 RCX: ffff8cad2a064800 RDX: 000000008020001b RSI: 0000000000000001 RDI: ffff8cad5d39f66c RBP: ffff8cad5d39f600 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8caa443e0c00 R11: ffffb253604cbcd8 R12: ffff8cacb8682538 R13: 0000000000000005 R14: ffffb253604cbd90 R15: ffff8cad5d39f66c cm_process_send_error+0x122/0x1d0 [ib_cm] timeout_sends+0x1dd/0x270 [ib_core] process_one_work+0x1e2/0x3b0 ? __pfx_worker_thread+0x10/0x10 worker_thread+0x50/0x3a0 ? __pfx_worker_thread+0x10/0x10 kthread+0xdd/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x29/0x50 </TASK> Simplified timeout handler by creating local list of timed out WRs and invoke send handler post creating the list. The new method acquires/ releases lock once to fetch the list and hence helps to reduce locking contetiong when processing higher no. of WRs
Configurations

Configuration 1 (hide)

OR cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*

History

12 Nov 2024, 20:26

Type Values Removed Values Added
CWE NVD-CWE-noinfo
CPE cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
CVSS v2 : unknown
v3 : unknown
v2 : unknown
v3 : 5.5
First Time Linux
Linux linux Kernel
References () https://git.kernel.org/stable/c/2a777679b8ccd09a9a65ea0716ef10365179caac - () https://git.kernel.org/stable/c/2a777679b8ccd09a9a65ea0716ef10365179caac - Patch
References () https://git.kernel.org/stable/c/3e799fa463508abe7a738ce5d0f62a8dfd05262a - () https://git.kernel.org/stable/c/3e799fa463508abe7a738ce5d0f62a8dfd05262a - Patch
References () https://git.kernel.org/stable/c/7022a517bf1ca37ef5a474365bcc5eafd345a13a - () https://git.kernel.org/stable/c/7022a517bf1ca37ef5a474365bcc5eafd345a13a - Patch
References () https://git.kernel.org/stable/c/713adaf0ecfc49405f6e5d9e409d984f628de818 - () https://git.kernel.org/stable/c/713adaf0ecfc49405f6e5d9e409d984f628de818 - Patch
References () https://git.kernel.org/stable/c/a195a42dd25ca4f12489687065d00be64939409f - () https://git.kernel.org/stable/c/a195a42dd25ca4f12489687065d00be64939409f - Patch
References () https://git.kernel.org/stable/c/e80eadb3604a92d2d086e956b8b2692b699d4d0a - () https://git.kernel.org/stable/c/e80eadb3604a92d2d086e956b8b2692b699d4d0a - Patch

06 Nov 2024, 18:17

Type Values Removed Values Added
Summary
  • (es) En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: RDMA/mad: Mejorar el manejo de los WR con tiempo de espera agotado del agente mad El controlador de tiempo de espera actual del agente mad adquiere/libera el bloqueo mad_agent_priv para cada WR con tiempo de espera agotado. Esto provoca una fuerte contención de bloqueo cuando se deben manejar una mayor cantidad de WR dentro del controlador de tiempo de espera. Esto conduce a un bloqueo suave con el siguiente seguimiento en algunos casos de uso donde se usa la ruta rdma-cm para establecer la conexión entre nodos pares Seguimiento: ----- ERROR: bloqueo suave: ¡CPU n.º 4 atascada durante 26 s! [kworker/u128:3:19767] CPU: 4 PID: 19767 Comm: kworker/u128:3 Kdump: cargado Contaminado: G OE ------- --- 5.14.0-427.13.1.el9_4.x86_64 #1 Nombre del hardware: Dell Inc. PowerEdge R740/01YM03, BIOS 2.4.8 26/11/2019 Cola de trabajo: ib_mad1 timeout_sends [ib_core] RIP: 0010:__do_softirq+0x78/0x2ac RSP: 0018:ffffb253449e4f98 EFLAGS: 00000246 RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 000000000000001f RDX: 000000000000001d RSI: 000000003d1879ab RDI: fff363b66fd3a86b RBP: ffffb253604cbcd8 R08: 0000009065635f3b R09: 0000000000000000 R10: 0000000000000040 R11: ffffb253449e4ff8 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040 FS: 000000000000000(0000) GS:ffff8caa1fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd9ec9db900 CR3: 0000000891934006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 00000000000000400 PKRU: 55555554 Seguimiento de llamadas: ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? __irq_exit_rcu+0xa1/0xc0 ? watchdog_timer_fn+0x1b2/0x210 ? __pfx_watchdog_timer_fn+0x10/0x10 ? __hrtimer_run_queues+0x127/0x2c0 ? hrtimer_interrupt+0xfc/0x210 ? __sysvec_apic_timer_interrupt+0x5c/0x110 ? sysvec_apic_timer_interrupt+0x37/0x90 ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? __do_softirq+0x78/0x2ac ? asm_sysvec_call_function_single+0x16/0x20 RIP: 0010:_raw_spin_unlock_irq+0x14/0x30 RSP: 0018:ffffb253604cbd88 EFLAGS: 00000247 RAX: 000000000001960d RBX: 0000000000000002 RCX: ffff8cad2a064800 RDX: 000000008020001b RSI: 00000000000000001 RDI: ffff8cad5d39f66c RBP: ffff8cad5d39f600 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8caa443e0c00 R11: ffffb253604cbcd8 R12: ffff8cacb8682538 R13: 0000000000000005 R14: ffffb253604cbd90 R15: ffff8cad5d39f66c cm_process_send_error+0x122/0x1d0 [ib_cm] timeout_sends+0x1dd/0x270 [ib_core] process_one_work+0x1e2/0x3b0 ? __pfx_worker_thread+0x10/0x10 worker_thread+0x50/0x3a0 ? __pfx_worker_thread+0x10/0x10 kthread+0xdd/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x29/0x50 Se simplificó el controlador de tiempo de espera al crear una lista local de los WR que han expirado y al invocar el controlador de envío después de crear la lista. El nuevo método adquiere/libera el bloqueo una vez para obtener la lista y, por lo tanto, ayuda a reducir la contención del bloqueo cuando se procesa una cantidad mayor de WR.

05 Nov 2024, 17:15

Type Values Removed Values Added
New CVE

Information

Published : 2024-11-05 17:15

Updated : 2024-11-12 20:26


NVD link : CVE-2024-50095

Mitre link : CVE-2024-50095

CVE.ORG link : CVE-2024-50095


JSON object : View

Products Affected

linux

  • linux_kernel