CVE-2024-43834

In the Linux kernel, the following vulnerability has been resolved: xdp: fix invalid wait context of page_pool_destroy() If the driver uses a page pool, it creates a page pool with page_pool_create(). The reference count of page pool is 1 as default. A page pool will be destroyed only when a reference count reaches 0. page_pool_destroy() is used to destroy page pool, it decreases a reference count. When a page pool is destroyed, ->disconnect() is called, which is mem_allocator_disconnect(). This function internally acquires mutex_lock(). If the driver uses XDP, it registers a memory model with xdp_rxq_info_reg_mem_model(). The xdp_rxq_info_reg_mem_model() internally increases a page pool reference count if a memory model is a page pool. Now the reference count is 2. To destroy a page pool, the driver should call both page_pool_destroy() and xdp_unreg_mem_model(). The xdp_unreg_mem_model() internally calls page_pool_destroy(). Only page_pool_destroy() decreases a reference count. If a driver calls page_pool_destroy() then xdp_unreg_mem_model(), we will face an invalid wait context warning. Because xdp_unreg_mem_model() calls page_pool_destroy() with rcu_read_lock(). The page_pool_destroy() internally acquires mutex_lock(). Splat looks like: ============================= [ BUG: Invalid wait context ] 6.10.0-rc6+ #4 Tainted: G W ----------------------------- ethtool/1806 is trying to lock: ffffffff90387b90 (mem_id_lock){+.+.}-{4:4}, at: mem_allocator_disconnect+0x73/0x150 other info that might help us debug this: context-{5:5} 3 locks held by ethtool/1806: stack backtrace: CPU: 0 PID: 1806 Comm: ethtool Tainted: G W 6.10.0-rc6+ #4 f916f41f172891c800f2fed Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 Call Trace: <TASK> dump_stack_lvl+0x7e/0xc0 __lock_acquire+0x1681/0x4de0 ? _printk+0x64/0xe0 ? __pfx_mark_lock.part.0+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 lock_acquire+0x1b3/0x580 ? mem_allocator_disconnect+0x73/0x150 ? __wake_up_klogd.part.0+0x16/0xc0 ? __pfx_lock_acquire+0x10/0x10 ? dump_stack_lvl+0x91/0xc0 __mutex_lock+0x15c/0x1690 ? mem_allocator_disconnect+0x73/0x150 ? __pfx_prb_read_valid+0x10/0x10 ? mem_allocator_disconnect+0x73/0x150 ? __pfx_llist_add_batch+0x10/0x10 ? console_unlock+0x193/0x1b0 ? lockdep_hardirqs_on+0xbe/0x140 ? __pfx___mutex_lock+0x10/0x10 ? tick_nohz_tick_stopped+0x16/0x90 ? __irq_work_queue_local+0x1e5/0x330 ? irq_work_queue+0x39/0x50 ? __wake_up_klogd.part.0+0x79/0xc0 ? mem_allocator_disconnect+0x73/0x150 mem_allocator_disconnect+0x73/0x150 ? __pfx_mem_allocator_disconnect+0x10/0x10 ? mark_held_locks+0xa5/0xf0 ? rcu_is_watching+0x11/0xb0 page_pool_release+0x36e/0x6d0 page_pool_destroy+0xd7/0x440 xdp_unreg_mem_model+0x1a7/0x2a0 ? __pfx_xdp_unreg_mem_model+0x10/0x10 ? kfree+0x125/0x370 ? bnxt_free_ring.isra.0+0x2eb/0x500 ? bnxt_free_mem+0x5ac/0x2500 xdp_rxq_info_unreg+0x4a/0xd0 bnxt_free_mem+0x1356/0x2500 bnxt_close_nic+0xf0/0x3b0 ? __pfx_bnxt_close_nic+0x10/0x10 ? ethnl_parse_bit+0x2c6/0x6d0 ? __pfx___nla_validate_parse+0x10/0x10 ? __pfx_ethnl_parse_bit+0x10/0x10 bnxt_set_features+0x2a8/0x3e0 __netdev_update_features+0x4dc/0x1370 ? ethnl_parse_bitset+0x4ff/0x750 ? __pfx_ethnl_parse_bitset+0x10/0x10 ? __pfx___netdev_update_features+0x10/0x10 ? mark_held_locks+0xa5/0xf0 ? _raw_spin_unlock_irqrestore+0x42/0x70 ? __pm_runtime_resume+0x7d/0x110 ethnl_set_features+0x32d/0xa20 To fix this problem, it uses rhashtable_lookup_fast() instead of rhashtable_lookup() with rcu_read_lock(). Using xa without rcu_read_lock() here is safe. xa is freed by __xdp_mem_allocator_rcu_free() and this is called by call_rcu() of mem_xa_remove(). The mem_xa_remove() is called by page_pool_destroy() if a reference count reaches 0. The xa is already protected by the reference count mechanism well in the control plane. So removing rcu_read_lock() for page_pool_destroy() is safe.

References

Link	Resource
https://git.kernel.org/stable/c/12144069209eec7f2090ce9afa15acdcc2c2a537	Patch
https://git.kernel.org/stable/c/3fc1be360b99baeea15cdee3cf94252cd3a72d26	Patch
https://git.kernel.org/stable/c/59a931c5b732ca5fc2ca727f5a72aeabaafa85ec	Patch
https://git.kernel.org/stable/c/6c390ef198aa69795427a5cb5fd7cb4bc7e6cd7a	Patch
https://git.kernel.org/stable/c/be9d08ff102df3ac4f66e826ea935cf3af63a4bd	Patch
https://git.kernel.org/stable/c/bf0ce5aa5f2525ed1b921ba36de96e458e77f482	Patch

Configurations

Configuration 1 (hide)

OR	cpe:2.3:o:linux:linux_kernel::::::::
	cpe:2.3:o:linux:linux_kernel::::::::
	cpe:2.3:o:linux:linux_kernel::::::::
	cpe:2.3:o:linux:linux_kernel::::::::
	cpe:2.3:o:linux:linux_kernel::::::::

History

30 Oct 2024, 21:44

Type	Values Removed	Values Added
CPE		cpe:2.3:o:linux:linux_kernel::::::::
CVSS	v2 : ~~unknown~~ v3 : ~~unknown~~	v2 : unknown v3 : 5.5
CWE		NVD-CWE-noinfo
First Time		Linux Linux linux Kernel
References	~~() https://git.kernel.org/stable/c/12144069209eec7f2090ce9afa15acdcc2c2a537 -~~	() https://git.kernel.org/stable/c/12144069209eec7f2090ce9afa15acdcc2c2a537 - Patch
References	~~() https://git.kernel.org/stable/c/3fc1be360b99baeea15cdee3cf94252cd3a72d26 -~~	() https://git.kernel.org/stable/c/3fc1be360b99baeea15cdee3cf94252cd3a72d26 - Patch
References	~~() https://git.kernel.org/stable/c/59a931c5b732ca5fc2ca727f5a72aeabaafa85ec -~~	() https://git.kernel.org/stable/c/59a931c5b732ca5fc2ca727f5a72aeabaafa85ec - Patch
References	~~() https://git.kernel.org/stable/c/6c390ef198aa69795427a5cb5fd7cb4bc7e6cd7a -~~	() https://git.kernel.org/stable/c/6c390ef198aa69795427a5cb5fd7cb4bc7e6cd7a - Patch
References	~~() https://git.kernel.org/stable/c/be9d08ff102df3ac4f66e826ea935cf3af63a4bd -~~	() https://git.kernel.org/stable/c/be9d08ff102df3ac4f66e826ea935cf3af63a4bd - Patch
References	~~() https://git.kernel.org/stable/c/bf0ce5aa5f2525ed1b921ba36de96e458e77f482 -~~	() https://git.kernel.org/stable/c/bf0ce5aa5f2525ed1b921ba36de96e458e77f482 - Patch

19 Aug 2024, 12:59

Type	Values Removed	Values Added
Summary		(es) En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: xdp: corrige el contexto de espera no válido de page_pool_destroy() Si el controlador utiliza un grupo de páginas, crea un grupo de páginas con page_pool_create(). El recuento de referencias del grupo de páginas es 1 de forma predeterminada. Un grupo de páginas se destruirá solo cuando el recuento de referencias llegue a 0. page_pool_destroy() se utiliza para destruir el grupo de páginas, disminuye el recuento de referencias. Cuando se destruye un grupo de páginas, se llama a ->disconnect(), que es mem_allocator_disconnect(). Esta función adquiere internamente mutex_lock(). Si el controlador usa XDP, registra un modelo de memoria con xdp_rxq_info_reg_mem_model(). xdp_rxq_info_reg_mem_model() aumenta internamente el recuento de referencias del grupo de páginas si un modelo de memoria es un grupo de páginas. Ahora el recuento de referencias es 2. Para destruir un grupo de páginas, el controlador debe llamar tanto a page_pool_destroy() como a xdp_unreg_mem_model(). xdp_unreg_mem_model() llama internamente a page_pool_destroy(). Solo page_pool_destroy() disminuye el recuento de referencias. Si un controlador llama a page_pool_destroy() y luego a xdp_unreg_mem_model(), nos enfrentaremos a una advertencia de contexto de espera no válido. Porque xdp_unreg_mem_model() llama a page_pool_destroy() con rcu_read_lock(). Page_pool_destroy() adquiere internamente mutex_lock(). Splat se ve así: ============================= [ERROR: Contexto de espera no válido] 6.10.0-rc6+ #4 Contaminado: GW ----------------------- ethtool/1806 está intentando bloquear: ffffffff90387b90 (mem_id_lock){+.+.}-{4 :4}, en: mem_allocator_disconnect+0x73/0x150 otra información que podría ayudarnos a depurar esto: contexto-{5:5} 3 bloqueos mantenidos por ethtool/1806: seguimiento de pila: CPU: 0 PID: 1806 Comm: ethtool Tainted: GW 6.10.0-rc6+ #4 f916f41f172891c800f2fed Nombre del hardware: Nombre del producto del sistema ASUS/PRIME Z690-P D4, BIOS 0603 01/11/2021 Seguimiento de llamadas: dump_stack_lvl+0x7e/0xc0 __lock_acquire+0x1681/0x4de0 ? _printk+0x64/0xe0 ? __pfx_mark_lock.part.0+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 lock_acquire+0x1b3/0x580 ? mem_allocator_disconnect+0x73/0x150? __wake_up_klogd.part.0+0x16/0xc0 ? __pfx_lock_acquire+0x10/0x10? dump_stack_lvl+0x91/0xc0 __mutex_lock+0x15c/0x1690 ? mem_allocator_disconnect+0x73/0x150? __pfx_prb_read_valid+0x10/0x10 ? mem_allocator_disconnect+0x73/0x150? __pfx_llist_add_batch+0x10/0x10? console_unlock+0x193/0x1b0? lockdep_hardirqs_on+0xbe/0x140? __pfx___mutex_lock+0x10/0x10 ? tick_nohz_tick_stopped+0x16/0x90? __irq_work_queue_local+0x1e5/0x330 ? irq_work_queue+0x39/0x50? __wake_up_klogd.part.0+0x79/0xc0 ? mem_allocator_disconnect+0x73/0x150 mem_allocator_disconnect+0x73/0x150? __pfx_mem_allocator_disconnect+0x10/0x10? mark_held_locks+0xa5/0xf0? rcu_is_watching+0x11/0xb0 page_pool_release+0x36e/0x6d0 page_pool_destroy+0xd7/0x440 xdp_unreg_mem_model+0x1a7/0x2a0 ? __pfx_xdp_unreg_mem_model+0x10/0x10 ? kgratis+0x125/0x370 ? bnxt_free_ring.isra.0+0x2eb/0x500 ? bnxt_free_mem+0x5ac/0x2500 xdp_rxq_info_unreg+0x4a/0xd0 bnxt_free_mem+0x1356/0x2500 bnxt_close_nic+0xf0/0x3b0 ? __pfx_bnxt_close_nic+0x10/0x10 ? ethnl_parse_bit+0x2c6/0x6d0? __pfx___nla_validate_parse+0x10/0x10 ? __pfx_ethnl_parse_bit+0x10/0x10 bnxt_set_features+0x2a8/0x3e0 __netdev_update_features+0x4dc/0x1370 ? ethnl_parse_bitset+0x4ff/0x750? __pfx_ethnl_parse_bitset+0x10/0x10? __pfx___netdev_update_features+0x10/0x10? mark_held_locks+0xa5/0xf0? _raw_spin_unlock_irqrestore+0x42/0x70? __pm_runtime_resume+0x7d/0x110 ethnl_set_features+0x32d/0xa20 Para solucionar este problema, utiliza rhashtable_lookup_fast() en lugar de rhashtable_lookup() con rcu_read_lock(). Usar xa sin rcu_read_lock() aquí es seguro. xa es liberado por __xdp_mem_allocator_rcu_free() y esto es llamado por call_rcu() de mem_xa_remove(). page_pool_destroy() llama a mem_xa_remove() si un recuento de referencias llega a 0. ----truncado-----

19 Aug 2024, 05:15

Type	Values Removed	Values Added
References		() https://git.kernel.org/stable/c/6c390ef198aa69795427a5cb5fd7cb4bc7e6cd7a - () https://git.kernel.org/stable/c/be9d08ff102df3ac4f66e826ea935cf3af63a4bd -

17 Aug 2024, 10:15

Type	Values Removed	Values Added
New CVE

5.5^/10

Attack Vector LOCAL

Attack Complexity LOW

Privileges Required LOW

User Interaction NONE

Confidentiality Impact NONE

Integrity Impact NONE

Availability Impact HIGH

Scope UNCHANGED