// For flags

CVE-2023-52489

mm/sparsemem: fix race in accessing memory_section->usage

Severity Score

5.5
*CVSS v3.1

Exploit Likelihood

*EPSS

Affected Versions

*CPE

Public Exploits

0
*Multiple Sources

Exploited in Wild

-
*KEV

Decision

Track
*SSVC
Descriptions

In the Linux kernel, the following vulnerability has been resolved:

mm/sparsemem: fix race in accessing memory_section->usage

The below race is observed on a PFN which falls into the device memory
region with the system memory configuration where PFN's are such that
[ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL]. Since normal zone start and end
pfn contains the device memory PFN's as well, the compaction triggered
will try on the device memory PFN's too though they end up in NOP(because
pfn_to_online_page() returns NULL for ZONE_DEVICE memory sections). When
from other core, the section mappings are being removed for the
ZONE_DEVICE region, that the PFN in question belongs to, on which
compaction is currently being operated is resulting into the kernel crash
with CONFIG_SPASEMEM_VMEMAP enabled. The crash logs can be seen at [1].

compact_zone() memunmap_pages
------------- ---------------
__pageblock_pfn_to_page
......
(a)pfn_valid():
valid_section()//return true
(b)__remove_pages()->
sparse_remove_section()->
section_deactivate():
[Free the array ms->usage and set
ms->usage = NULL]
pfn_section_valid()
[Access ms->usage which
is NULL]

NOTE: From the above it can be said that the race is reduced to between
the pfn_valid()/pfn_section_valid() and the section deactivate with
SPASEMEM_VMEMAP enabled.

The commit b943f045a9af("mm/sparse: fix kernel crash with
pfn_section_valid check") tried to address the same problem by clearing
the SECTION_HAS_MEM_MAP with the expectation of valid_section() returns
false thus ms->usage is not accessed.

Fix this issue by the below steps:

a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage.

b) RCU protected read side critical section will either return NULL
when SECTION_HAS_MEM_MAP is cleared or can successfully access ->usage.

c) Free the ->usage with kfree_rcu() and set ms->usage = NULL. No
attempt will be made to access ->usage after this as the
SECTION_HAS_MEM_MAP is cleared thus valid_section() return false.

Thanks to David/Pavan for their inputs on this patch.

[1] https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/

On Snapdragon SoC, with the mentioned memory configuration of PFN's as
[ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we are able to see bunch of
issues daily while testing on a device farm.

For this particular issue below is the log. Though the below log is
not directly pointing to the pfn_section_valid(){ ms->usage;}, when we
loaded this dump on T32 lauterbach tool, it is pointing.

[ 540.578056] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
[ 540.578068] Mem abort info:
[ 540.578070] ESR = 0x0000000096000005
[ 540.578073] EC = 0x25: DABT (current EL), IL = 32 bits
[ 540.578077] SET = 0, FnV = 0
[ 540.578080] EA = 0, S1PTW = 0
[ 540.578082] FSC = 0x05: level 1 translation fault
[ 540.578085] Data abort info:
[ 540.578086] ISV = 0, ISS = 0x00000005
[ 540.578088] CM = 0, WnR = 0
[ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBSBTYPE=--)
[ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c
[ 540.579454] lr : compact_zone+0x994/0x1058
[ 540.579460] sp : ffffffc03579b510
[ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:000000000000000c
[ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640
[ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:0000000000000000
[ 540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140
[ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff
[ 540.579495] x14: 0000008000000000 x13: 0000000000000000 x12:0000000000000001
[ 540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 :ffffff897d2cd440
[ 540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 :ffffffc03579b5b4
[ 540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 :0000000000000
---truncated---

En el kernel de Linux, se resolvió la siguiente vulnerabilidad: mm/sparsemem: corrige la carrera al acceder a la sección_memoria->uso La siguiente carrera se observa en un PFN que cae en la región de memoria del dispositivo con la configuración de memoria del sistema donde los PFN son tales que [ ZONA_NORMAL ZONA_DISPOSITIVO ZONA_NORMAL]. Dado que el pfn de inicio y fin de zona normal también contiene los PFN de la memoria del dispositivo, la compactación activada probará también los PFN de la memoria del dispositivo aunque terminen en NOP (porque pfn_to_online_page() devuelve NULL para las secciones de memoria ZONE_DEVICE). Cuando desde otro núcleo, las asignaciones de sección se eliminan para la región ZONE_DEVICE, a la que pertenece el PFN en cuestión, en la que se está operando la compactación actualmente, se produce el bloqueo del kernel con CONFIG_SPASEMEM_VMEMAP habilitado. Los registros de fallos se pueden ver en [1]. compact_zone() memunmap_pages ------------- --------------- __pageblock_pfn_to_page ...... (a)pfn_valid(): valid_section()/ /return true (b)__remove_pages()-> sparse_remove_section()-> section_deactivate(): [Libere la matriz ms->usage y establezca ms->usage = NULL] pfn_section_valid() [Acceda a ms->usage que es NULL] NOTA: De lo anterior se puede decir que la carrera se reduce a entre pfn_valid()/pfn_section_valid() y la sección desactivada con SPASEMEM_VMEMAP habilitado. La confirmación b943f045a9af("mm/sparse: fix kernel crash with pfn_section_valid check") intentó solucionar el mismo problema borrando SECTION_HAS_MEM_MAP con la expectativa de que valid_section() devuelva false, por lo que no se accede a ms->usage. Solucione este problema siguiendo los pasos a continuación: a) Borre SECTION_HAS_MEM_MAP antes de liberar el ->uso. b) La sección crítica del lado de lectura protegida por RCU devolverá NULL cuando se borre SECTION_HAS_MEM_MAP o podrá acceder con éxito a ->uso. c) Libere ->usage con kfree_rcu() y establezca ms->usage = NULL. No se intentará acceder a ->uso después de esto, ya que SECTION_HAS_MEM_MAP se borra, por lo que valid_section() devuelve falso. Gracias a David/Pavan por sus aportes en este parche. [1] https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/ En Snapdragon SoC, con la configuración de memoria mencionada de PFN como [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], pueden ver una gran cantidad de problemas diariamente mientras realizan pruebas en una granja de dispositivos. Para este problema en particular, a continuación se encuentra el registro. Aunque el siguiente registro no apunta directamente a pfn_section_valid(){ ms->usage;}, cuando cargamos este volcado en la herramienta Lauterbach T32, sí apunta. [ 540.578056] No se puede manejar la desreferencia del puntero NULL del kernel en la dirección virtual 0000000000000000 [ 540.578068] Información de cancelación de memoria: [ 540.578070] ESR = 0x0000000096000005 [ 540.578073] EC = 0x25: DABT ( EL actual), IL = 32 bits [ 540.578077] SET = 0 , FnV = 0 [ 540.578080] EA = 0, S1PTW = 0 [ 540.578082] FSC = 0x05: error de traducción de nivel 1 [ 540.578085] Información de cancelación de datos: [ 540.578086] ISV = 0, ISS = 0x00000005 [ 540.578088 ] CM = 0, WnR = 0 [ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBSBTYPE=--) [ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c [ 540.579454] lr : compact_zone+0x994/0x10 58 [540.579460] sp: ffffffc03579b510 [ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:00000000000000000c [ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640 [ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:00000000000000000 [ 540.579483] x20: 0 000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140 [ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff [ 540.579495] x14: 0000008000000000 x13: 00000000000000 00 x12:0000000000000001 [ 540.579501]---truncado---

A race condition was found on a PFN in the Linux Kernel, which can fall into the device memory region with the system memory configuration. Normal zone start and end PFNs contain the device memory PFNs as well, and the compaction triggered will try on the device memory PFNs and end up in NOP. This may lead to compromised Availability.

*Credits: N/A
CVSS Scores
Attack Vector
Local
Attack Complexity
Low
Privileges Required
Low
User Interaction
None
Scope
Unchanged
Confidentiality
None
Integrity
None
Availability
High
* Common Vulnerability Scoring System
SSVC
  • Decision:Track
Exploitation
None
Automatable
No
Tech. Impact
Partial
* Organization's Worst-case Scenario
Timeline
  • 2024-02-20 CVE Reserved
  • 2024-02-29 CVE Published
  • 2024-03-01 EPSS Updated
  • 2024-08-02 CVE Updated
  • ---------- Exploited in Wild
  • ---------- KEV Due Date
  • ---------- First Exploit
CWE
  • CWE-362: Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition')
CAPEC
Affected Vendors, Products, and Versions
Vendor Product Version Other Status
Vendor Product Version Other Status <-- --> Vendor Product Version Other Status
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.3 < 5.10.210
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.3 < 5.10.210"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.3 < 5.15.149
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.3 < 5.15.149"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.3 < 6.1.76
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.3 < 6.1.76"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.3 < 6.6.15
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.3 < 6.6.15"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.3 < 6.7.3
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.3 < 6.7.3"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.3 < 6.8
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.3 < 6.8"
en
Affected