CVE-2024-35795
drm/amdgpu: fix deadlock while reading mqd from debugfs
Severity Score
Exploit Likelihood
Affected Versions
Public Exploits
0Exploited in Wild
-Decision
Descriptions
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: fix deadlock while reading mqd from debugfs
An errant disk backup on my desktop got into debugfs and triggered the
following deadlock scenario in the amdgpu debugfs files. The machine
also hard-resets immediately after those lines are printed (although I
wasn't able to reproduce that part when reading by hand):
[ 1318.016074][ T1082] ======================================================
[ 1318.016607][ T1082] WARNING: possible circular locking dependency detected
[ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted
[ 1318.017598][ T1082] ------------------------------------------------------
[ 1318.018096][ T1082] tar/1082 is trying to acquire lock:
[ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80
[ 1318.019084][ T1082]
[ 1318.019084][ T1082] but task is already holding lock:
[ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.020607][ T1082]
[ 1318.020607][ T1082] which lock already depends on the new lock.
[ 1318.020607][ T1082]
[ 1318.022081][ T1082]
[ 1318.022081][ T1082] the existing dependency chain (in reverse order) is:
[ 1318.023083][ T1082]
[ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}:
[ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0
[ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90
[ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330
[ 1318.025683][ T1082] do_one_initcall+0x6a/0x350
[ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.026728][ T1082] kernel_init+0x15/0x1a0
[ 1318.027242][ T1082] ret_from_fork+0x2c/0x40
[ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.028281][ T1082]
[ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}:
[ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330
[ 1318.029790][ T1082] do_one_initcall+0x6a/0x350
[ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.030722][ T1082] kernel_init+0x15/0x1a0
[ 1318.031168][ T1082] ret_from_fork+0x2c/0x40
[ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.032011][ T1082]
[ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}:
[ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680
[ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0
[ 1318.033487][ T1082] __might_fault+0x58/0x80
[ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu]
[ 1318.034181][ T1082] full_proxy_read+0x55/0x80
[ 1318.034487][ T1082] vfs_read+0xa7/0x360
[ 1318.034788][ T1082] ksys_read+0x70/0xf0
[ 1318.035085][ T1082] do_syscall_64+0x94/0x180
[ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 1318.035664][ T1082]
[ 1318.035664][ T1082] other info that might help us debug this:
[ 1318.035664][ T1082]
[ 1318.036487][ T1082] Chain exists of:
[ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex
[ 1318.036487][ T1082]
[ 1318.037310][ T1082] Possible unsafe locking scenario:
[ 1318.037310][ T1082]
[ 1318.037838][ T1082] CPU0 CPU1
[ 1318.038101][ T1082] ---- ----
[ 1318.038350][ T1082] lock(reservation_ww_class_mutex);
[ 1318.038590][ T1082] lock(reservation_ww_class_acquire);
[ 1318.038839][ T1082] lock(reservation_ww_class_mutex);
[ 1318.039083][ T1082] rlock(&mm->mmap_lock);
[ 1318.039328][ T1082]
[ 1318.039328][ T1082] *** DEADLOCK ***
[ 1318.039328][ T1082]
[ 1318.040029][ T1082] 1 lock held by tar/1082:
[ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.040560][ T1082]
[ 1318.040560][ T1082] stack backtrace:
[
---truncated---
En el kernel de Linux, se resolvió la siguiente vulnerabilidad: drm/amdgpu: corrige el punto muerto al leer mqd desde debugfs Una copia de seguridad de disco errónea en mi escritorio entró en debugfs y desencadenó el siguiente escenario de punto muerto en los archivos amdgpu debugfs. La máquina también se reinicia inmediatamente después de imprimir esas líneas (aunque no pude reproducir esa parte cuando leí a mano): [ 1318.016074][ T1082] =============== ======================================= [ 1318.016607][ T1082] ADVERTENCIA: posible bloqueo circular dependencia detectada [ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 No contaminado [ 1318.017598][ T1082] ----------------------- ------------------------------- [ 1318.018096][ T1082] tar/1082 está intentando adquirir el bloqueo: [ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, en: __might_fault+0x40/0x80 [ 1318.019084][ T1082] [ 1318.019084][ T1082] pero la tarea ya mantiene el bloqueo: [ 1318.020052 ][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, en: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.020607][ T1082] [ 1318.020607][ T1082] el bloqueo ya depende del nuevo cerrar con llave. [ 1318.020607][ T1082] [ 1318.022081][ T1082] [ 1318.022081][ T1082] la cadena de dependencia existente (en orden inverso) es: [ 1318.023083][ T1082] [ 1318.023083][ T1082] -> #2 (reservation_ww_ clase_mutex){+ .+.}-{3:3}: [ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0 [ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90 [ 1318.025161][ T1082] dep+0x18a/0x330 [ 1318.025683 ][ T1082] do_one_initcall+0x6a/0x350 [ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.026728][ T1082] kernel_init+0x15/0x1a0 [ 1318.027242][ T1082] from_fork+0x2c/0x40 [ 1318.027759][ T1082] ret_from_fork_asm+ 0x11/0x20 [ 1318.028281][ T1082] [ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}: [ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330 [ 1 318.029790][ T1082] do_one_initcall+0x6a/0x350 [ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.030722][ T1082] kernel_init+0x15/0x1a0 [ 1318.031168][ T1082] bifurcación+0x2c/0x40 [ 1318.031598][ T1082] ret_from_fork_asm+0x11/ 0x20 [ 1318.032011][ T1082] [ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}: [ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.033141 ] [ T1082] lock_acquire+0xcd/0x2c0 [ 1318.033487][ T1082] __might_fault+0x58/0x80 [ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu] [ 1318.03418 1][T1082] lectura_proxy_completa+0x55/0x80 [1318.034487][T1082] vfs_read+0xa7/0x360 [ 1318.034788][ T1082] ksys_read+0x70/0xf0 [ 1318.035085][ T1082] do_syscall_64+0x94/0x180 [ 1318.035375][ T1082] wframe+0x46/0x4e [ 1318.035664][ T1082] [ 1318.035664][ T1082] Otra información que podría ayudarnos a depurar esto: [1318.035664] [T1082] [1318.036487] [T1082] La cadena existe de: [1318.036487] [T1082] & mm-> mmap_lock-> reservation_ww_class_acquire-> reservación_www_mutex. ] [ 1318.037310][T1082] Posible escenario de bloqueo inseguro: [ 1318.037310][ T1082] [ 1318.037838][ T1082] CPU0 CPU1 [ 1318.038101][ T1082] ---- ---- [ 1318.038350][ T1082] _class_mutex); [ 1318.038590][ T1082] bloqueo(reservation_ww_class_acquire); [ 1318.038839][ T1082] bloqueo(reservation_ww_class_mutex); [ 1318.039083][ T1082] rlock(&mm->mmap_lock); [ 1318.039328][ T1082] [ 1318.039328][ T1082] *** DEADLOCK *** [ 1318.039328][ T1082] [ 1318.040029][ T1082] 1 bloqueo retenido por tar/1082: [ 1318.040259][ T1082] #0: ffff98c4c13f55f8 ( reserve_ww_class_mutex){+.+.}-{3:3}, en: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.040560][ T1082] [ 1318.040560][ T1082] seguimiento de pila: [ ---truncado---
A flaw was found in the `drm/amdgpu` subsystem in the Linux kernel, involving a deadlock occurring when reading the Memory Queue Descriptor (MQD) from `debugfs`. This issue could cause the system to hang during debug operations.
CVSS Scores
SSVC
- Decision:Track
Timeline
- 2024-05-17 CVE Reserved
- 2024-05-17 CVE Published
- 2024-05-18 EPSS Updated
- 2024-12-19 CVE Updated
- ---------- Exploited in Wild
- ---------- KEV Due Date
- ---------- First Exploit
CWE
- CWE-400: Uncontrolled Resource Consumption
CAPEC
References (7)
URL | Tag | Source |
---|---|---|
https://git.kernel.org/stable/c/445d85e3c1dfd8c45b24be6f1527f1e117256d0e | Vuln. Introduced |
URL | Date | SRC |
---|
URL | Date | SRC |
---|---|---|
https://access.redhat.com/security/cve/CVE-2024-35795 | 2024-11-12 | |
https://bugzilla.redhat.com/show_bug.cgi?id=2281155 | 2024-11-12 |
Affected Vendors, Products, and Versions
Vendor | Product | Version | Other | Status | ||||||
---|---|---|---|---|---|---|---|---|---|---|
Vendor | Product | Version | Other | Status | <-- --> | Vendor | Product | Version | Other | Status |
Linux Search vendor "Linux" | Linux Kernel Search vendor "Linux" for product "Linux Kernel" | >= 6.5 < 6.6.24 Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.6.24" | en |
Affected
| ||||||
Linux Search vendor "Linux" | Linux Kernel Search vendor "Linux" for product "Linux Kernel" | >= 6.5 < 6.7.12 Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.7.12" | en |
Affected
| ||||||
Linux Search vendor "Linux" | Linux Kernel Search vendor "Linux" for product "Linux Kernel" | >= 6.5 < 6.8.3 Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.8.3" | en |
Affected
| ||||||
Linux Search vendor "Linux" | Linux Kernel Search vendor "Linux" for product "Linux Kernel" | >= 6.5 < 6.9 Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.9" | en |
Affected
|