// For flags

CVE-2024-35795

drm/amdgpu: fix deadlock while reading mqd from debugfs

Severity Score

5.5
*CVSS v3.1

Exploit Likelihood

*EPSS

Affected Versions

*CPE

Public Exploits

0
*Multiple Sources

Exploited in Wild

-
*KEV

Decision

Track
*SSVC
Descriptions

In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu: fix deadlock while reading mqd from debugfs

An errant disk backup on my desktop got into debugfs and triggered the
following deadlock scenario in the amdgpu debugfs files. The machine
also hard-resets immediately after those lines are printed (although I
wasn't able to reproduce that part when reading by hand):

[ 1318.016074][ T1082] ======================================================
[ 1318.016607][ T1082] WARNING: possible circular locking dependency detected
[ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted
[ 1318.017598][ T1082] ------------------------------------------------------
[ 1318.018096][ T1082] tar/1082 is trying to acquire lock:
[ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80
[ 1318.019084][ T1082]
[ 1318.019084][ T1082] but task is already holding lock:
[ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.020607][ T1082]
[ 1318.020607][ T1082] which lock already depends on the new lock.
[ 1318.020607][ T1082]
[ 1318.022081][ T1082]
[ 1318.022081][ T1082] the existing dependency chain (in reverse order) is:
[ 1318.023083][ T1082]
[ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}:
[ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0
[ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90
[ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330
[ 1318.025683][ T1082] do_one_initcall+0x6a/0x350
[ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.026728][ T1082] kernel_init+0x15/0x1a0
[ 1318.027242][ T1082] ret_from_fork+0x2c/0x40
[ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.028281][ T1082]
[ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}:
[ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330
[ 1318.029790][ T1082] do_one_initcall+0x6a/0x350
[ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310
[ 1318.030722][ T1082] kernel_init+0x15/0x1a0
[ 1318.031168][ T1082] ret_from_fork+0x2c/0x40
[ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20
[ 1318.032011][ T1082]
[ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}:
[ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680
[ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0
[ 1318.033487][ T1082] __might_fault+0x58/0x80
[ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu]
[ 1318.034181][ T1082] full_proxy_read+0x55/0x80
[ 1318.034487][ T1082] vfs_read+0xa7/0x360
[ 1318.034788][ T1082] ksys_read+0x70/0xf0
[ 1318.035085][ T1082] do_syscall_64+0x94/0x180
[ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 1318.035664][ T1082]
[ 1318.035664][ T1082] other info that might help us debug this:
[ 1318.035664][ T1082]
[ 1318.036487][ T1082] Chain exists of:
[ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex
[ 1318.036487][ T1082]
[ 1318.037310][ T1082] Possible unsafe locking scenario:
[ 1318.037310][ T1082]
[ 1318.037838][ T1082] CPU0 CPU1
[ 1318.038101][ T1082] ---- ----
[ 1318.038350][ T1082] lock(reservation_ww_class_mutex);
[ 1318.038590][ T1082] lock(reservation_ww_class_acquire);
[ 1318.038839][ T1082] lock(reservation_ww_class_mutex);
[ 1318.039083][ T1082] rlock(&mm->mmap_lock);
[ 1318.039328][ T1082]
[ 1318.039328][ T1082] *** DEADLOCK ***
[ 1318.039328][ T1082]
[ 1318.040029][ T1082] 1 lock held by tar/1082:
[ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.040560][ T1082]
[ 1318.040560][ T1082] stack backtrace:
[
---truncated---

En el kernel de Linux, se resolvió la siguiente vulnerabilidad: drm/amdgpu: corrige el punto muerto al leer mqd desde debugfs Una copia de seguridad de disco errónea en mi escritorio entró en debugfs y desencadenó el siguiente escenario de punto muerto en los archivos amdgpu debugfs. La máquina también se reinicia inmediatamente después de imprimir esas líneas (aunque no pude reproducir esa parte cuando leí a mano): [ 1318.016074][ T1082] =============== ======================================= [ 1318.016607][ T1082] ADVERTENCIA: posible bloqueo circular dependencia detectada [ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 No contaminado [ 1318.017598][ T1082] ----------------------- ------------------------------- [ 1318.018096][ T1082] tar/1082 está intentando adquirir el bloqueo: [ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, en: __might_fault+0x40/0x80 [ 1318.019084][ T1082] [ 1318.019084][ T1082] pero la tarea ya mantiene el bloqueo: [ 1318.020052 ][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, en: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.020607][ T1082] [ 1318.020607][ T1082] el bloqueo ya depende del nuevo cerrar con llave. [ 1318.020607][ T1082] [ 1318.022081][ T1082] [ 1318.022081][ T1082] la cadena de dependencia existente (en orden inverso) es: [ 1318.023083][ T1082] [ 1318.023083][ T1082] -> #2 (reservation_ww_ clase_mutex){+ .+.}-{3:3}: [ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0 [ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90 [ 1318.025161][ T1082] dep+0x18a/0x330 [ 1318.025683 ][ T1082] do_one_initcall+0x6a/0x350 [ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.026728][ T1082] kernel_init+0x15/0x1a0 [ 1318.027242][ T1082] from_fork+0x2c/0x40 [ 1318.027759][ T1082] ret_from_fork_asm+ 0x11/0x20 [ 1318.028281][ T1082] [ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}: [ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330 [ 1 318.029790][ T1082] do_one_initcall+0x6a/0x350 [ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.030722][ T1082] kernel_init+0x15/0x1a0 [ 1318.031168][ T1082] bifurcación+0x2c/0x40 [ 1318.031598][ T1082] ret_from_fork_asm+0x11/ 0x20 [ 1318.032011][ T1082] [ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}: [ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.033141 ] [ T1082] lock_acquire+0xcd/0x2c0 [ 1318.033487][ T1082] __might_fault+0x58/0x80 [ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu] [ 1318.03418 1][T1082] lectura_proxy_completa+0x55/0x80 [1318.034487][T1082] vfs_read+0xa7/0x360 [ 1318.034788][ T1082] ksys_read+0x70/0xf0 [ 1318.035085][ T1082] do_syscall_64+0x94/0x180 [ 1318.035375][ T1082] wframe+0x46/0x4e [ 1318.035664][ T1082] [ 1318.035664][ T1082] Otra información que podría ayudarnos a depurar esto: [1318.035664] [T1082] [1318.036487] [T1082] La cadena existe de: [1318.036487] [T1082] & mm-> mmap_lock-> reservation_ww_class_acquire-> reservación_www_mutex. ] [ 1318.037310][T1082] Posible escenario de bloqueo inseguro: [ 1318.037310][ T1082] [ 1318.037838][ T1082] CPU0 CPU1 [ 1318.038101][ T1082] ---- ---- [ 1318.038350][ T1082] _class_mutex); [ 1318.038590][ T1082] bloqueo(reservation_ww_class_acquire); [ 1318.038839][ T1082] bloqueo(reservation_ww_class_mutex); [ 1318.039083][ T1082] rlock(&mm->mmap_lock); [ 1318.039328][ T1082] [ 1318.039328][ T1082] *** DEADLOCK *** [ 1318.039328][ T1082] [ 1318.040029][ T1082] 1 bloqueo retenido por tar/1082: [ 1318.040259][ T1082] #0: ffff98c4c13f55f8 ( reserve_ww_class_mutex){+.+.}-{3:3}, en: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.040560][ T1082] [ 1318.040560][ T1082] seguimiento de pila: [ ---truncado---

A flaw was found in the `drm/amdgpu` subsystem in the Linux kernel, involving a deadlock occurring when reading the Memory Queue Descriptor (MQD) from `debugfs`. This issue could cause the system to hang during debug operations.

*Credits: N/A
CVSS Scores
Attack Vector
Local
Attack Complexity
Low
Privileges Required
Low
User Interaction
None
Scope
Unchanged
Confidentiality
None
Integrity
None
Availability
High
* Common Vulnerability Scoring System
SSVC
  • Decision:Track
Exploitation
None
Automatable
No
Tech. Impact
Partial
* Organization's Worst-case Scenario
Timeline
  • 2024-05-17 CVE Reserved
  • 2024-05-17 CVE Published
  • 2024-05-18 EPSS Updated
  • 2024-12-19 CVE Updated
  • ---------- Exploited in Wild
  • ---------- KEV Due Date
  • ---------- First Exploit
CWE
  • CWE-400: Uncontrolled Resource Consumption
CAPEC
Affected Vendors, Products, and Versions
Vendor Product Version Other Status
Vendor Product Version Other Status <-- --> Vendor Product Version Other Status
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 6.5 < 6.6.24
Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.6.24"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 6.5 < 6.7.12
Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.7.12"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 6.5 < 6.8.3
Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.8.3"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 6.5 < 6.9
Search vendor "Linux" for product "Linux Kernel" and version " >= 6.5 < 6.9"
en
Affected