CVE-2024-38557

In the Linux kernel, the following vulnerability has been resolved:

net/mlx5: Reload only IB representors upon lag disable/enable

On lag disable, the bond IB device along with all of its representors are destroyed, and then the slaves' representors get reloaded.

In case the slave IB representor load fails, the eswitch error flow unloads all representors, including ethernet representors, where the netdevs get detached and removed from lag bond. Such flow is inaccurate as the lag driver is not responsible for loading/unloading ethernet representors. Furthermore, the flow described above begins by holding lag lock to prevent bond changes during disable flow. However, when reaching the ethernet representors detachment from lag, the lag lock is required again, triggering the following deadlock:

Call trace:
  __switch_to+0xf4/0x148
  __schedule+0x2c8/0x7d0
  schedule+0x50/0xe0
  schedule_preempt_disabled+0x18/0x28
  __mutex_lock.isra.13+0x2b8/0x570
  __mutex_lock_slowpath+0x1c/0x28
  mutex_lock+0x4c/0x68
  mlx5_lag_remove_netdev+0x3c/0x1a0 [mlx5_core]
  mlx5e_uplink_rep_disable+0x70/0xa0 [mlx5_core]
  mlx5e_detach_netdev+0x6c/0xb0 [mlx5_core]
  mlx5e_netdev_change_profile+0x44/0x138 [mlx5_core]
  mlx5e_netdev_attach_nic_profile+0x28/0x38 [mlx5_core]
  mlx5e_vport_rep_unload+0x184/0x1b8 [mlx5_core]
  mlx5_esw_offloads_rep_load+0xd8/0xe0 [mlx5_core]
  mlx5_eswitch_reload_reps+0x74/0xd0 [mlx5_core]
  mlx5_disable_lag+0x130/0x138 [mlx5_core]
  mlx5_lag_disable_change+0x6c/0x70 [mlx5_core] // hold ldev->lock
  mlx5_devlink_eswitch_mode_set+0xc0/0x410 [mlx5_core]
  devlink_nl_cmd_eswitch_set_doit+0xdc/0x180
  genl_family_rcv_msg_doit.isra.17+0xe8/0x138
  genl_rcv_msg+0xe4/0x220
  netlink_rcv_skb+0x44/0x108
  genl_rcv+0x40/0x58
  netlink_unicast+0x198/0x268
  netlink_sendmsg+0x1d4/0x418
  sock_sendmsg+0x54/0x60
  __sys_sendto+0xf4/0x120
  __arm64_sys_sendto+0x30/0x40
  el0_svc_common+0x8c/0x120
  do_el0_svc+0x30/0xa0
  el0_svc+0x20/0x30
  el0_sync_handler+0x90/0xb8
  el0_sync+0x160/0x180

Thus, upon lag enable/disable, load and unload only the IB representors of the slaves, preventing the deadlock mentioned above.

While at it, refactor the mlx5_esw_offloads_rep_load() function to have a static helper method for its internal logic, in symmetry with the representor unload design.
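The locking pattern described above can be illustrated with a small user-space sketch. This is not the mlx5 code: every `toy_*` name is hypothetical, a pthread mutex stands in for `ldev->lock`, and the IB representor load failure is assumed rather than simulated. The mutex is created as `PTHREAD_MUTEX_ERRORCHECK` so that a relock by the owning thread returns `EDEADLK` instead of blocking forever, as the kernel mutex in the trace does.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700  /* for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lag device; not the mlx5 structure. */
struct toy_ldev {
    pthread_mutex_t lock;   /* plays the role of ldev->lock */
    int netdevs_in_bond;
};

int toy_ldev_init(struct toy_ldev *ldev)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;
    /* Report a relock by the owner as EDEADLK rather than hanging. */
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!err)
        err = pthread_mutex_init(&ldev->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    ldev->netdevs_in_bond = 2;
    return err;
}

/* Takes the lag lock itself, as mlx5_lag_remove_netdev does in the trace. */
int toy_lag_remove_netdev(struct toy_ldev *ldev)
{
    int err = pthread_mutex_lock(&ldev->lock);

    if (err)
        return err;         /* EDEADLK when the caller already holds it */
    ldev->netdevs_in_bond--;
    pthread_mutex_unlock(&ldev->lock);
    return 0;
}

/* Error flow that unloads every representor, ethernet ones included. */
int toy_unload_all_reps(struct toy_ldev *ldev)
{
    return toy_lag_remove_netdev(ldev);
}

/* IB-only unload: never touches the netdevs, so never needs the lock. */
int toy_unload_ib_reps(struct toy_ldev *ldev)
{
    (void)ldev;
    return 0;
}

/* Lag disable under the lag lock; assumes the slave IB rep load failed. */
int toy_disable_lag(struct toy_ldev *ldev, bool ib_only)
{
    int err = pthread_mutex_lock(&ldev->lock);

    if (err)
        return err;
    err = ib_only ? toy_unload_ib_reps(ldev) : toy_unload_all_reps(ldev);
    pthread_mutex_unlock(&ldev->lock);
    return err;
}
```

Calling `toy_disable_lag(&ldev, false)` on an initialized `toy_ldev` reports the self-deadlock, while `toy_disable_lag(&ldev, true)` completes, mirroring the choice to restrict the lag flow to IB representors.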
Configurations

Configuration 1

OR cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*

History

29 Aug 2024, 02:23

Type | Values Removed | Values Added
CPE | | cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
CWE | | CWE-667
References | () https://git.kernel.org/stable/c/0f06228d4a2dcc1fca5b3ddb0eefa09c05b102c4 - | () https://git.kernel.org/stable/c/0f06228d4a2dcc1fca5b3ddb0eefa09c05b102c4 - Patch
References | () https://git.kernel.org/stable/c/0f320f28f54b1b269a755be2e3fb3695e0b80b07 - | () https://git.kernel.org/stable/c/0f320f28f54b1b269a755be2e3fb3695e0b80b07 - Patch
References | () https://git.kernel.org/stable/c/e93fc8d959e56092e2eca1e5511c2d2f0ad6807a - | () https://git.kernel.org/stable/c/e93fc8d959e56092e2eca1e5511c2d2f0ad6807a - Patch
References | () https://git.kernel.org/stable/c/f03c714a0fdd1f93101a929d0e727c28a66383fc - | () https://git.kernel.org/stable/c/f03c714a0fdd1f93101a929d0e727c28a66383fc - Patch
CVSS | v2 : unknown, v3 : unknown | v2 : unknown, v3 : 5.5
First Time | | Linux, Linux linux Kernel

20 Jun 2024, 12:44

Type Values Removed Values Added
Summary
  • (es) In the Linux kernel, the following vulnerability has been resolved: net/mlx5: Reload only IB representors upon lag disable/enable. On lag disable, the bond IB device along with all of its representors are destroyed, and then the slaves' representors get reloaded. In case the slave IB representor load fails, the eswitch error flow unloads all representors, including ethernet representors, where the netdevs get detached and removed from lag bond. Such flow is inaccurate as the lag driver is not responsible for loading/unloading ethernet representors. Furthermore, the flow described above begins by holding lag lock to prevent bond changes during disable flow. However, when reaching the ethernet representors detachment from lag, the lag lock is required again, triggering the following deadlock:

    Call trace:
      __switch_to+0xf4/0x148
      __schedule+0x2c8/0x7d0
      schedule+0x50/0xe0
      schedule_preempt_disabled+0x18/0x28
      __mutex_lock.isra.13+0x2b8/0x570
      __mutex_lock_slowpath+0x1c/0x28
      mutex_lock+0x4c/0x68
      mlx5_lag_remove_netdev+0x3c/0x1a0 [mlx5_core]
      mlx5e_uplink_rep_disable+0x70/0xa0 [mlx5_core]
      mlx5e_detach_netdev+0x6c/0xb0 [mlx5_core]
      mlx5e_netdev_change_profile+0x44/0x138 [mlx5_core]
      mlx5e_netdev_attach_nic_profile+0x28/0x38 [mlx5_core]
      mlx5e_vport_rep_unload+0x184/0x1b8 [mlx5_core]
      mlx5_esw_offloads_rep_load+0xd8/0xe0 [mlx5_core]
      mlx5_eswitch_reload_reps+0x74/0xd0 [mlx5_core]
      mlx5_disable_lag+0x130/0x138 [mlx5_core]
      mlx5_lag_disable_change+0x6c/0x70 [mlx5_core] // hold ldev->lock
      mlx5_devlink_eswitch_mode_set+0xc0/0x410 [mlx5_core]
      devlink_nl_cmd_eswitch_set_doit+0xdc/0x180
      genl_family_rcv_msg_doit.isra.17+0xe8/0x138
      genl_rcv_msg+0xe4/0x220
      netlink_rcv_skb+0x44/0x108
      genl_rcv+0x40/0x58
      netlink_unicast+0x198/0x268
      netlink_sendmsg+0x1d4/0x418
      sock_sendmsg+0x54/0x60
      __sys_sendto+0xf4/0x120
      __arm64_sys_sendto+0x30/0x40
      el0_svc_common+0x8c/0x120
      do_el0_svc+0x30/0xa0
      el0_svc+0x20/0x30
      el0_sync_handler+0x90/0xb8
      el0_sync+0x160/0x180

    Thus, upon lag enable/disable, load and unload only the IB representors of the slaves, preventing the deadlock mentioned above. While at it, refactor the mlx5_esw_offloads_rep_load() function to have a static helper method for its internal logic, in symmetry with the representor unload design.

19 Jun 2024, 14:15

Type Values Removed Values Added
New CVE

Information

Published : 2024-06-19 14:15

Updated : 2024-08-29 02:23


NVD link : CVE-2024-38557

Mitre link : CVE-2024-38557

CVE.ORG link : CVE-2024-38557



Products Affected

linux

  • linux_kernel
CWE
CWE-667

Improper Locking