CVE-2024-40927

xhci: Handle TD clearing for multiple streams case

Severity Score

6.1

*CVSS v3.1

Exploit Likelihood

*EPSS

Affected Versions

*CPE

Public Exploits

*Multiple Sources

Exploited in Wild

*KEV

Decision

Track

*SSVC

Descriptions

In the Linux kernel, the following vulnerability has been resolved: xhci: Handle TD clearing for multiple streams case When multiple streams are in use, multiple TDs might be in flight when
an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for
each, to ensure everything is reset properly and the caches cleared.
Change the logic so that any N>1 TDs found active for different streams
are deferred until after the first one is processed, calling
xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to
queue another command until we are done with all of them. Also change
the error/"should never happen" paths to ensure we at least clear any
affected TDs, even if we can't issue a command to clear the hardware
cache, and complain loudly with an xhci_warn() if this ever happens. This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct
assumptions about number of rings per endpoint.") early on in the XHCI
driver's life, when stream support was first added.
It was then identified but not fixed nor made into a warning in commit
674f8438c121 ("xhci: split handling halted endpoints into two steps"),
which added a FIXME comment for the problem case (without materially
changing the behavior as far as I can tell, though the new logic made
the problem more obvious). Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some
cached cancelled URBs."), it was acknowledged again. [Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached
cancelled URBs.") was a targeted regression fix to the previously mentioned
patch. Users reported issues with usb stuck after unmounting/disconnecting
UAS devices. This rolled back the TD clearing of multiple streams to its
original state.] Apparently the commit author was aware of the problem (yet still chose
to submit it): It was still mentioned as a FIXME, an xhci_dbg() was
added to log the problem condition, and the remaining issue was mentioned
in the commit description. The choice of making the log type xhci_dbg()
for what is, at this point, a completely unhandled and known broken
condition is puzzling and unfortunate, as it guarantees that no actual
users would see the log in production, thereby making it nigh
undebuggable (indeed, even if you turn on DEBUG, the message doesn't
really hint at there being a problem at all). It took me *months* of random xHC crashes to finally find a reliable
repro and be able to do a deep dive debug session, which could all have
been avoided had this unhandled, broken condition been actually reported
with a warning, as it should have been as a bug intentionally left in
unfixed (never mind that it shouldn't have been left in at all). > Another fix to solve clearing the caches of all stream rings with
> cancelled TDs is needed, but not as urgent. 3 years after that statement and 14 years after the original bug was
introduced, I think it's finally time to fix it. And maybe next time
let's not leave bugs unfixed (that are actually worse than the original
bug), and let's actually get people to review kernel commits please. Fixes xHC crashes and IOMMU faults with UAS devices when handling
errors/faults. Easiest repro is to use `hdparm` to mark an early sector
(e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop.
At least in the case of JMicron controllers, the read errors end up
having to cancel two TDs (for two queued requests to different streams)
and the one that didn't get cleared properly ends up faulting the xHC
entirely when it tries to access DMA pages that have since been unmapped,
referred to by the stale TDs. This normally happens quickly (after two
or three loops). After this fix, I left the `cat` in a loop running
overnight and experienced no xHC failures, with all read errors
recovered properly. Repro'd and tested on an Apple M1 Mac Mini
(dwc3 host). On systems without an IOMMU, this bug would instead silently corrupt
freed memory, making this a
---truncated---

In the Linux kernel, the following vulnerability has been resolved: xhci: Handle TD clearing for multiple streams case When multiple streams are in use, multiple TDs might be in flight when an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for each, to ensure everything is reset properly and the caches cleared. Change the logic so that any N>1 TDs found active for different streams are deferred until after the first one is processed, calling xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to queue another command until we are done with all of them. Also change the error/"should never happen" paths to ensure we at least clear any affected TDs, even if we can't issue a command to clear the hardware cache, and complain loudly with an xhci_warn() if this ever happens. This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.") early on in the XHCI driver's life, when stream support was first added. It was then identified but not fixed nor made into a warning in commit 674f8438c121 ("xhci: split handling halted endpoints into two steps"), which added a FIXME comment for the problem case (without materially changing the behavior as far as I can tell, though the new logic made the problem more obvious). Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs."), it was acknowledged again. [Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.") was a targeted regression fix to the previously mentioned patch. Users reported issues with usb stuck after unmounting/disconnecting UAS devices. This rolled back the TD clearing of multiple streams to its original state.] Apparently the commit author was aware of the problem (yet still chose to submit it): It was still mentioned as a FIXME, an xhci_dbg() was added to log the problem condition, and the remaining issue was mentioned in the commit description. The choice of making the log type xhci_dbg() for what is, at this point, a completely unhandled and known broken condition is puzzling and unfortunate, as it guarantees that no actual users would see the log in production, thereby making it nigh undebuggable (indeed, even if you turn on DEBUG, the message doesn't really hint at there being a problem at all). It took me *months* of random xHC crashes to finally find a reliable repro and be able to do a deep dive debug session, which could all have been avoided had this unhandled, broken condition been actually reported with a warning, as it should have been as a bug intentionally left in unfixed (never mind that it shouldn't have been left in at all). > Another fix to solve clearing the caches of all stream rings with > cancelled TDs is needed, but not as urgent. 3 years after that statement and 14 years after the original bug was introduced, I think it's finally time to fix it. And maybe next time let's not leave bugs unfixed (that are actually worse than the original bug), and let's actually get people to review kernel commits please. Fixes xHC crashes and IOMMU faults with UAS devices when handling errors/faults. Easiest repro is to use `hdparm` to mark an early sector (e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop. At least in the case of JMicron controllers, the read errors end up having to cancel two TDs (for two queued requests to different streams) and the one that didn't get cleared properly ends up faulting the xHC entirely when it tries to access DMA pages that have since been unmapped, referred to by the stale TDs. This normally happens quickly (after two or three loops). After this fix, I left the `cat` in a loop running overnight and experienced no xHC failures, with all read errors recovered properly. Repro'd and tested on an Apple M1 Mac Mini (dwc3 host). On systems without an IOMMU, this bug would instead silently corrupt freed memory, making this a ---truncated---

Chenyuan Yang discovered that the CEC driver driver in the Linux kernel contained a use-after-free vulnerability. A local attacker could use this to cause a denial of service or possibly execute arbitrary code. Chenyuan Yang discovered that the USB Gadget subsystem in the Linux kernel did not properly check for the device to be enabled before writing. A local attacker could possibly use this to cause a denial of service.

*Credits: N/A

Attack Vector

Local

Attack Complexity

Low

Privileges Required

Low

User Interaction

None

Scope

Unchanged

Confidentiality

None

Integrity

Low

Availability

High

Attack Vector

Local

Attack Complexity

Low

Authentication

Single

Confidentiality

Complete

Integrity

Complete

Availability

Complete

* Common Vulnerability Scoring System

SSVC

Decision:Track

Exploitation

None

Automatable

Tech. Impact

Partial

* Organization's Worst-case Scenario

Timeline

2024-07-12 CVE Reserved
2024-07-12 CVE Published
2025-05-04 CVE Updated
2025-06-16 EPSS Updated
---------- Exploited in Wild
---------- KEV Due Date
---------- First Exploit

CWE

CWE-820: Missing Synchronization

CAPEC

References (8)

URL	Tag	Source
https://git.kernel.org/stable/c/e9df17eb1408cfafa3d1844bfc7f22c7237b31b8	Vuln. Introduced

URL	Date	SRC

URL	Date	SRC
https://git.kernel.org/stable/c/26460c1afa311524f588e288a4941432f0de6228	2024-07-05
https://git.kernel.org/stable/c/633f72cb6124ecda97b641fbc119340bd88d51a9	2024-06-21
https://git.kernel.org/stable/c/949be4ec5835e0ccb3e2a8ab0e46179cb5512518	2024-06-21
https://git.kernel.org/stable/c/61593dc413c3655e4328a351555235bc3089486a	2024-06-21
https://git.kernel.org/stable/c/5ceac4402f5d975e5a01c806438eb4e554771577	2024-06-12

URL	Date	SRC
https://access.redhat.com/security/cve/CVE-2024-40927	2024-09-11
https://bugzilla.redhat.com/show_bug.cgi?id=2297511	2024-09-11

Affected Vendors, Products, and Versions

Vendor		Product				Version		Other		Status
Vendor	Product	Version	Other	Status	<-- -->	Vendor	Product	Version	Other	Status
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 2.6.35 < 5.15.162 Search vendor "Linux" for product "Linux Kernel" and version " >= 2.6.35 < 5.15.162"		en		Affected
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 2.6.35 < 6.1.95 Search vendor "Linux" for product "Linux Kernel" and version " >= 2.6.35 < 6.1.95"		en		Affected
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 2.6.35 < 6.6.35 Search vendor "Linux" for product "Linux Kernel" and version " >= 2.6.35 < 6.6.35"		en		Affected
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 2.6.35 < 6.9.6 Search vendor "Linux" for product "Linux Kernel" and version " >= 2.6.35 < 6.9.6"		en		Affected
Linux Search vendor "Linux"		Linux Kernel Search vendor "Linux" for product "Linux Kernel"				>= 2.6.35 < 6.10 Search vendor "Linux" for product "Linux Kernel" and version " >= 2.6.35 < 6.10"		en		Affected