
CVE-2024-41010

bpf: Fix too early release of tcx_entry

Severity Score

5.5
*CVSS v3.1

Exploit Likelihood

*EPSS

Affected Versions

*CPE

Public Exploits

0
*Multiple Sources

Exploited in Wild

-
*KEV

Decision

Track
*SSVC
Descriptions

In the Linux kernel, the following vulnerability has been resolved:

bpf: Fix too early release of tcx_entry

Pedro Pinto and, later and independently, Hyunwoo Kim and Wongi Lee reported
that the tcx_entry can be released too early, leading to a use-after-free
(UAF) when an active old-style ingress or clsact qdisc with a shared tc
block is later replaced by another ingress or clsact instance.

Essentially, the sequence to trigger the UAF (one example) can be as follows:

1. A network namespace is created
2. An ingress qdisc is created. This allocates a tcx_entry, and
&tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the
same time, a tcf block with index 1 is created.
3. chain0 is attached to the tcf block. chain0 must be connected to
the block linked to the ingress qdisc to later reach the function
tcf_chain0_head_change_cb_del() which triggers the UAF.
4. A clsact qdisc is created and grafted. This causes the ingress qdisc
created in step 2 to be removed, thus freeing the previously linked
tcx_entry:

rtnetlink_rcv_msg()
  => tc_modify_qdisc()
    => qdisc_create()
      => clsact_init() [a]
    => qdisc_graft()
      => qdisc_destroy()
        => __qdisc_destroy()
          => ingress_destroy() [b]
            => tcx_entry_free()
              => kfree_rcu() // tcx_entry freed

5. Finally, the network namespace is closed. This registers the
cleanup_net worker, and during the process of releasing the
remaining clsact qdisc, it accesses the tcx_entry that was
already freed in step 4, causing the UAF to occur:

cleanup_net()
  => ops_exit_list()
    => default_device_exit_batch()
      => unregister_netdevice_many()
        => unregister_netdevice_many_notify()
          => dev_shutdown()
            => qdisc_put()
              => clsact_destroy() [c]
                => tcf_block_put_ext()
                  => tcf_chain0_head_change_cb_del()
                    => tcf_chain_head_change_item()
                      => clsact_chain_head_change()
                        => mini_qdisc_pair_swap() // UAF
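
The ordering problem in steps 2 to 5 can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct model_entry, model_qdisc_init(), model_qdisc_destroy() and model_replace_then_teardown() are hypothetical names invented for illustration and are not kernel code. The model assumes that both qdiscs share one entry guarded by a single boolean, and it records a "freed" marker instead of releasing memory so that the state can be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared tcx_entry (not the kernel struct). */
struct model_entry {
    bool miniq_active; /* pre-fix behaviour: one flag for all qdisc users */
    bool freed;        /* marker set in place of an actual free */
};

/* Models a qdisc init path marking the miniq as active. */
static void model_qdisc_init(struct model_entry *e)
{
    assert(!e->freed);
    e->miniq_active = true;
}

/* Models a qdisc destroy path: clear the flag, release if inactive. */
static void model_qdisc_destroy(struct model_entry *e)
{
    e->miniq_active = false;
    if (!e->miniq_active)
        e->freed = true; /* stands in for the deferred free */
}

/* Replays steps 2, [a] and [b]; returns true if the entry is already
 * released while the replacement qdisc still depends on it. */
static bool model_replace_then_teardown(void)
{
    struct model_entry e = { false, false };

    model_qdisc_init(&e);    /* step 2: ingress qdisc created */
    model_qdisc_init(&e);    /* [a]: replacement clsact initialised */
    model_qdisc_destroy(&e); /* [b]: old ingress destroyed */

    /* [c] would now touch the entry during clsact teardown. */
    return e.freed;
}
```

In this model, model_replace_then_teardown() returns true: a boolean cannot remember that a second user appeared at [a], so the destroy at [b] releases the entry while the clsact qdisc still points at it.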

There are also other variants. The gist is to add an ingress (or clsact)
qdisc with a specific shared block, then to replace that qdisc, wait
for the tcx_entry kfree_rcu() to be executed, and subsequently access
the currently active qdisc's miniq one way or another.

The correct fix is to turn the miniq_active boolean into a counter. With
the counter in place, the sequence above behaves as follows: at step 2 the
counter transitions from 0->1, at [a] from 1->2 (so that the miniq object
remains active during the replacement), at [b] from 2->1, and finally at
[c] from 1->0 with the eventual release. The reference counter in general
stays within the range [0,2], and it does not need to be atomic since all
access to the counter is protected by the rtnl mutex. With this in place,
the UAF no longer occurs and the tcx_entry is freed at the correct time.
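
The counter transitions described above can be traced with a minimal user-space sketch. All names here (struct model_entry_fixed, model_get(), model_put(), model_fixed_sequence()) are hypothetical and chosen for illustration; this is not the kernel patch. A plain unsigned counter is used, mirroring the statement that no atomics are needed because the rtnl mutex serialises access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the entry after the fix (not kernel code). */
struct model_entry_fixed {
    unsigned int miniq_active; /* counter instead of a boolean */
    bool freed;                /* marker set in place of an actual free */
};

/* A qdisc init path takes a reference on the miniq. */
static void model_get(struct model_entry_fixed *e)
{
    assert(!e->freed);
    e->miniq_active++;
}

/* A qdisc destroy path drops its reference; the last one releases. */
static void model_put(struct model_entry_fixed *e)
{
    assert(e->miniq_active > 0);
    e->miniq_active--;
    if (e->miniq_active == 0)
        e->freed = true;
}

/* Replays step 2, [a], [b] and [c], storing the counter after each step.
 * Returns true if the entry was released before the final put at [c]. */
static bool model_fixed_sequence(unsigned int counts[4])
{
    struct model_entry_fixed e = { 0, false };
    bool freed_early;

    model_get(&e); counts[0] = e.miniq_active; /* step 2: 0->1 */
    model_get(&e); counts[1] = e.miniq_active; /* [a]:    1->2 */
    model_put(&e); counts[2] = e.miniq_active; /* [b]:    2->1 */
    freed_early = e.freed;
    model_put(&e); counts[3] = e.miniq_active; /* [c]:    1->0 */

    return freed_early;
}
```

Calling model_fixed_sequence() with a four-element array fills it with 1, 2, 1, 0 and returns false, meaning the modelled entry is only released at [c], once the last qdisc has gone away.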


*Credits: N/A
CVSS Scores
Attack Vector
Local
Attack Complexity
Low
Privileges Required
Low
User Interaction
None
Scope
Unchanged
Confidentiality
None
Integrity
None
Availability
High
* Common Vulnerability Scoring System
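
As a worked check, the metric values listed above can be fed through the CVSS v3.1 base score equations for the Scope: Unchanged case. The sketch below uses the numeric weights from the CVSS v3.1 specification (Local 0.55, Low complexity 0.77, Low privileges 0.62, no user interaction 0.85, None 0.0, High 0.56); the function names cvss31_roundup() and cvss31_base_unchanged() are illustrative choices, not part of any library.

```c
#include <assert.h>

/* Roundup as defined in the CVSS v3.1 specification: the smallest value
 * with one decimal place that is >= the input, computed on integers to
 * avoid floating-point artefacts. */
static double cvss31_roundup(double x)
{
    long i;

    assert(x >= 0.0);
    i = (long)(x * 100000.0 + 0.5); /* round to nearest integer */
    if (i % 10000 == 0)
        return (double)i / 100000.0;
    return (double)(i / 10000 + 1) / 10.0;
}

/* Base score for Scope: Unchanged, given the numeric metric weights. */
static double cvss31_base_unchanged(double av, double ac, double pr,
                                    double ui, double c, double i,
                                    double a)
{
    double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    double impact = 6.42 * iss;
    double exploitability = 8.22 * av * ac * pr * ui;
    double sum = impact + exploitability;

    if (impact <= 0.0)
        return 0.0;
    if (sum > 10.0)
        sum = 10.0;
    return cvss31_roundup(sum);
}
```

For this CVE, cvss31_base_unchanged(0.55, 0.77, 0.62, 0.85, 0.0, 0.0, 0.56) gives an impact of about 3.60 and an exploitability of about 1.83, which rounds up to the 5.5 shown in the Severity Score field.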
SSVC
  • Decision: Track
Exploitation
None
Automatable
No
Tech. Impact
Partial
* Organization's Worst-case Scenario
Timeline
  • 2024-07-12 CVE Reserved
  • 2024-07-17 CVE Published
  • 2024-07-20 EPSS Updated
  • 2024-12-19 CVE Updated
  • ---------- Exploited in Wild
  • ---------- KEV Due Date
  • ---------- First Exploit
CWE
  • CWE-416: Use After Free
CAPEC
Affected Vendors, Products, and Versions
Vendor  Product       Version           Other  Status
Linux   Linux Kernel  >= 6.6 < 6.6.41   en     Affected
Linux   Linux Kernel  >= 6.6 < 6.9.10   en     Affected
Linux   Linux Kernel  >= 6.6 < 6.10     en     Affected