CVE-2024-26939

drm/i915/vma: Fix UAF on destroy against retire race

Severity Score

5.5

*CVSS v3.1

Exploit Likelihood

*EPSS

Affected Versions

*CPE

Public Exploits

*Multiple Sources

Exploited in Wild

*KEV

Decision

Track*

*SSVC

Descriptions

In the Linux kernel, the following vulnerability has been resolved: drm/i915/vma: Fix UAF on destroy against retire race Object debugging tools were sporadically reporting illegal attempts to
free a still active i915 VMA object when parking a GT believed to be idle. [161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
...
[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
...
[161.361347] debug_object_free+0xeb/0x110
[161.361362] i915_active_fini+0x14/0x130 [i915]
[161.361866] release_references+0xfe/0x1f0 [i915]
[161.362543] i915_vma_parked+0x1db/0x380 [i915]
[161.363129] __gt_park+0x121/0x230 [i915]
[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915] That has been tracked down to be happening when another thread is
deactivating the VMA inside __active_retire() helper, after the VMA's
active counter has been already decremented to 0, but before deactivation
of the VMA's object is reported to the object debugging tool. We could prevent from that race by serializing i915_active_fini() with
__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
being used, e.g. from __i915_vma_retire() called at the end of
__active_retire(), after that VMA has been already freed by a concurrent
i915_vma_destroy() on return from the i915_active_fini(). Then, we should
rather fix the issue at the VMA level, not in i915_active. Since __i915_vma_parked() is called from __gt_park() on last put of the
GT's wakeref, the issue could be addressed by holding the GT wakeref long
enough for __active_retire() to complete before that wakeref is released
and the GT parked. I believe the issue was introduced by commit d93939730347 ("drm/i915:
Remove the vma refcount") which moved a call to i915_active_fini() from
a dropped i915_vma_release(), called on last put of the removed VMA kref,
to i915_vma_parked() processing path called on last put of a GT wakeref.
However, its visibility to the object debugging tool was suppressed by a
bug in i915_active that was fixed two weeks later with commit e92eb246feb9
("drm/i915/active: Fix missing debug object activation"). A VMA associated with a request doesn't acquire a GT wakeref by itself.
Instead, it depends on a wakeref held directly by the request's active
intel_context for a GT associated with its VM, and indirectly on that
intel_context's engine wakeref if the engine belongs to the same GT as the
VMA's VM. Those wakerefs are released asynchronously to VMA deactivation. Fix the issue by getting a wakeref for the VMA's GT when activating it,
and putting that wakeref only after the VMA is deactivated. However,
exclude global GTT from that processing path, otherwise the GPU never goes
idle. Since __i915_vma_retire() may be called from atomic contexts, use
async variant of wakeref put. Also, to avoid circular locking dependency,
take care of acquiring the wakeref before VM mutex when both are needed. v7: Add inline comments with justifications for: - using untracked variants of intel_gt_pm_get/put() (Nirmoy), - using async variant of _put(), - not getting the wakeref in case of a global GTT, - always getting the first wakeref outside vm->mutex.
v6: Since __i915_vma_active/retire() callbacks are not serialized, storing a wakeref tracking handle inside struct i915_vma is not safe, and there is no other good place for that. Use untracked variants of intel_gt_pm_get/put_async().
v5: Replace "tile" with "GT" across commit description (Rodrigo), - ---truncated---

En el kernel de Linux, se resolvió la siguiente vulnerabilidad: drm/i915/vma: Corrección de UAF al destruir contra ejecución de retirada. Las herramientas de depuración de objetos informaban esporádicamente intentos ilegales de liberar un objeto i915 VMA aún activo al estacionar un GT que se creía que estaba inactivo. [161.359441] ODEBUG: objeto activo libre (estado activo 0): ffff88811643b958 tipo de objeto: i915_active sugerencia: __i915_vma_active+0x0/0x50 [i915] [161.360082] ADVERTENCIA: CPU: 5 PID: 276 en lib/debugobjects.c:514 _imprimir_objeto+ 0x80/0xb0 ... [161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 No contaminado 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1 [161.360314] Nombre de hardware: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 21/04/2022 [161.360322] Cola de trabajo: i915 desordenado __intel_wakeref_put_work [i915] [161.360592] RIP 0010:debug_print_object+0x 80/0xb0... [161.361347] debug_object_free +0xeb/0x110 [161.361362] i915_active_fini+0x14/0x130 [i915] [161.361866] referencias de versión+0xfe/0x1f0 [i915] [161.362543] i915_vma_parked+0x1db/0x380 [i915] .363129] __gt_park+0x121/0x230 [i915] [161.363515 ] ____intel_wakeref_put_last+0x1f/0x70 [i915] Se ha rastreado que eso sucede cuando otro subproceso desactiva el VMA dentro del asistente __active_retire(), después de que el contador activo del VMA ya se haya reducido a 0, pero antes de que se desactive la desactivación del objeto del VMA. reportado a la herramienta de depuración de objetos. Podríamos evitar esa ejecución serializando i915_active_fini() con __active_retire() a través de ref->tree_lock, pero eso no impediría que se use VMA, por ejemplo, desde __i915_vma_retire() llamado al final de __active_retire(), después de ese VMA ya ha sido liberado por un i915_vma_destroy() concurrente al regresar de i915_active_fini(). Entonces, deberíamos solucionar el problema a nivel de VMA, no en i915_active. Dado que __i915_vma_parked() se llama desde __gt_park() en la última colocación del wakeref del GT, el problema podría solucionarse manteniendo el wakeref del GT el tiempo suficiente para que __active_retire() se complete antes de que se libere el wakeref y se estacione el GT. Creo que el problema fue introducido por el commit d93939730347 ("drm/i915: Eliminar el recuento de vma") que movió una llamada a i915_active_fini() desde un i915_vma_release() eliminado, llamado en la última colocación del kref de VMA eliminado, a i915_vma_parked() ruta de procesamiento llamada en la última colocación de un wakeref GT. Sin embargo, su visibilidad para la herramienta de depuración de objetos fue suprimida por un error en i915_active que se solucionó dos semanas después con el commit e92eb246feb9 ("drm/i915/active: Reparar la activación del objeto de depuración que falta"). Un VMA asociado con una solicitud no adquiere un wakeref GT por sí solo. En cambio, depende de un wakeref mantenido directamente por el intel_context activo de la solicitud para un GT asociado con su VM, e indirectamente del wakeref del motor de ese intel_context si el motor pertenece al mismo GT que la VM del VMA. Esos wakerefs se liberan de forma asincrónica con la desactivación de VMA. Solucione el problema obteniendo un wakeref para el GT del VMA al activarlo y colocando ese wakeref solo después de que se desactive el VMA. Sin embargo, excluya el GTT global de esa ruta de procesamiento; de lo contrario, la GPU nunca quedará inactiva. Dado que se puede llamar a __i915_vma_retire() desde contextos atómicos, utilice la variante asíncrona de wakeref put. Además, para evitar la dependencia del bloqueo circular, tenga cuidado de adquirir el wakeref antes del mutex de VM cuando ambos sean necesarios. v7: agregue comentarios en línea con justificaciones para: - usar variantes sin seguimiento de intel_gt_pm_get/put() (Nirmoy), - usar la variante asíncrona de _put(), - no obtener el wakeref en caso de un GTT global, - obtener siempre el primer wakeref fuera de vm->mutex. ---truncado---

A use-after-free flaw was found in drivers/gpu/drm/i915/i915_vma.c in the Linux kernel that may lead to a crash.

In the Linux kernel, the following vulnerability has been resolved: drm/i915/vma: Fix UAF on destroy against retire race Object debugging tools were sporadically reporting illegal attempts to free a still active i915 VMA object when parking a GT believed to be idle. [161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915] [161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0 ... [161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1 [161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 [161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915] [161.360592] RIP: 0010:debug_print_object+0x80/0xb0 ... [161.361347] debug_object_free+0xeb/0x110 [161.361362] i915_active_fini+0x14/0x130 [i915] [161.361866] release_references+0xfe/0x1f0 [i915] [161.362543] i915_vma_parked+0x1db/0x380 [i915] [161.363129] __gt_park+0x121/0x230 [i915] [161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915] That has been tracked down to be happening when another thread is deactivating the VMA inside __active_retire() helper, after the VMA's active counter has been already decremented to 0, but before deactivation of the VMA's object is reported to the object debugging tool. We could prevent from that race by serializing i915_active_fini() with __active_retire() via ref->tree_lock, but that wouldn't stop the VMA from being used, e.g. from __i915_vma_retire() called at the end of __active_retire(), after that VMA has been already freed by a concurrent i915_vma_destroy() on return from the i915_active_fini(). Then, we should rather fix the issue at the VMA level, not in i915_active. Since __i915_vma_parked() is called from __gt_park() on last put of the GT's wakeref, the issue could be addressed by holding the GT wakeref long enough for __active_retire() to complete before that wakeref is released and the GT parked. I believe the issue was introduced by commit d93939730347 ("drm/i915: Remove the vma refcount") which moved a call to i915_active_fini() from a dropped i915_vma_release(), called on last put of the removed VMA kref, to i915_vma_parked() processing path called on last put of a GT wakeref. However, its visibility to the object debugging tool was suppressed by a bug in i915_active that was fixed two weeks later with commit e92eb246feb9 ("drm/i915/active: Fix missing debug object activation"). A VMA associated with a request doesn't acquire a GT wakeref by itself. Instead, it depends on a wakeref held directly by the request's active intel_context for a GT associated with its VM, and indirectly on that intel_context's engine wakeref if the engine belongs to the same GT as the VMA's VM. Those wakerefs are released asynchronously to VMA deactivation. Fix the issue by getting a wakeref for the VMA's GT when activating it, and putting that wakeref only after the VMA is deactivated. However, exclude global GTT from that processing path, otherwise the GPU never goes idle. Since __i915_vma_retire() may be called from atomic contexts, use async variant of wakeref put. Also, to avoid circular locking dependency, take care of acquiring the wakeref before VM mutex when both are needed. v7: Add inline comments with justifications for: - using untracked variants of intel_gt_pm_get/put() (Nirmoy), - using async variant of _put(), - not getting the wakeref in case of a global GTT, - always getting the first wakeref outside vm->mutex. v6: Since __i915_vma_active/retire() callbacks are not serialized, storing a wakeref tracking handle inside struct i915_vma is not safe, and there is no other good place for that. Use untracked variants of intel_gt_pm_get/put_async(). v5: Replace "tile" with "GT" across commit description (Rodrigo), - ---truncated---

Several vulnerabilities have been discovered in the Linux kernel that may lead to a privilege escalation, denial of service or information leaks.

*Credits: N/A

Attack Vector

Local

Attack Complexity

Low

Privileges Required

Low

User Interaction

None

Scope

Unchanged

Confidentiality

None

Integrity

None

Availability

High

Attack Vector

Local

Attack Complexity

Low

Authentication

Single

Confidentiality

Complete

Integrity

Complete

Availability

Complete

* Common Vulnerability Scoring System

SSVC

Decision:Track*

Exploitation

None

Automatable

Tech. Impact

Total

* Organization's Worst-case Scenario

Timeline

2024-02-19 CVE Reserved
2024-05-01 CVE Published
2025-05-04 CVE Updated
2025-06-15 EPSS Updated
---------- Exploited in Wild
---------- KEV Due Date
---------- First Exploit

CWE

CWE-416: Use After Free

CAPEC

References (7)

URL	Tag	Source
https://git.kernel.org/stable/c/d93939730347360db0afe6a4367451b6f84ab7b1	Vuln. Introduced

URL	Date	SRC

URL	Date	SRC
https://git.kernel.org/stable/c/704edc9252f4988ae1ad7dafa23d0db8d90d7190	2024-04-27
https://git.kernel.org/stable/c/5e3eb862df9f972ab677fb19e0d4b9b1be8db7b5	2024-04-27
https://git.kernel.org/stable/c/59b2626dd8c8a2e13f18054b3530e0c00073d79f	2024-04-03
https://git.kernel.org/stable/c/0e45882ca829b26b915162e8e86dbb1095768e9e	2024-03-28

URL	Date	SRC
https://access.redhat.com/security/cve/CVE-2024-26939	2025-06-25
https://bugzilla.redhat.com/show_bug.cgi?id=2278220	2025-06-25

Affected Vendors, Products, and Versions

Vendor		Product				Version		Other		Status
Vendor	Product	Version	Other	Status	<-- -->	Vendor	Product	Version	Other	Status
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 5.19 < 6.1.88 Search vendor "Linux" for product "Linux Kernel" and version " >= 5.19 < 6.1.88"		en		Affected
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 5.19 < 6.6.29 Search vendor "Linux" for product "Linux Kernel" and version " >= 5.19 < 6.6.29"		en		Affected
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 5.19 < 6.8.3 Search vendor "Linux" for product "Linux Kernel" and version " >= 5.19 < 6.8.3"		en		Affected
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 5.19 < 6.9 Search vendor "Linux" for product "Linux Kernel" and version " >= 5.19 < 6.9"		en		Affected