// For flags

CVE-2021-47041

nvmet-tcp: fix incorrect locking in state_change sk callback

Severity Score

"-"
*CVSS v-

Exploit Likelihood

*EPSS

Affected Versions

*CPE

Public Exploits

0
*Multiple Sources

Exploited in Wild

-
*KEV

Decision

Track
*SSVC
Descriptions

In the Linux kernel, the following vulnerability has been resolved:

nvmet-tcp: fix incorrect locking in state_change sk callback

We are not changing anything in the TCP connection state so
we should not take a write_lock but rather a read lock.

This caused a deadlock when running nvmet-tcp and nvme-tcp
on the same system, where state_change callbacks on the
host and on the controller side have causal relationship
and made lockdep report on this with blktests:

================================
WARNING: inconsistent lock state
5.12.0-rc3 #1 Tainted: G I
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-R} usage.
nvme/1324 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888363151000 (clock-AF_INET){++-?}-{2:2}, at: nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
{IN-SOFTIRQ-W} state was registered at:
__lock_acquire+0x79b/0x18d0
lock_acquire+0x1ca/0x480
_raw_write_lock_bh+0x39/0x80
nvmet_tcp_state_change+0x21/0x170 [nvmet_tcp]
tcp_fin+0x2a8/0x780
tcp_data_queue+0xf94/0x1f20
tcp_rcv_established+0x6ba/0x1f00
tcp_v4_do_rcv+0x502/0x760
tcp_v4_rcv+0x257e/0x3430
ip_protocol_deliver_rcu+0x69/0x6a0
ip_local_deliver_finish+0x1e2/0x2f0
ip_local_deliver+0x1a2/0x420
ip_rcv+0x4fb/0x6b0
__netif_receive_skb_one_core+0x162/0x1b0
process_backlog+0x1ff/0x770
__napi_poll.constprop.0+0xa9/0x5c0
net_rx_action+0x7b3/0xb30
__do_softirq+0x1f0/0x940
do_softirq+0xa1/0xd0
__local_bh_enable_ip+0xd8/0x100
ip_finish_output2+0x6b7/0x18a0
__ip_queue_xmit+0x706/0x1aa0
__tcp_transmit_skb+0x2068/0x2e20
tcp_write_xmit+0xc9e/0x2bb0
__tcp_push_pending_frames+0x92/0x310
inet_shutdown+0x158/0x300
__nvme_tcp_stop_queue+0x36/0x270 [nvme_tcp]
nvme_tcp_stop_queue+0x87/0xb0 [nvme_tcp]
nvme_tcp_teardown_admin_queue+0x69/0xe0 [nvme_tcp]
nvme_do_delete_ctrl+0x100/0x10c [nvme_core]
nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
kernfs_fop_write_iter+0x2c7/0x460
new_sync_write+0x36c/0x610
vfs_write+0x5c0/0x870
ksys_write+0xf9/0x1d0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
irq event stamp: 10687
hardirqs last enabled at (10687): [<ffffffff9ec376bd>] _raw_spin_unlock_irqrestore+0x2d/0x40
hardirqs last disabled at (10686): [<ffffffff9ec374d8>] _raw_spin_lock_irqsave+0x68/0x90
softirqs last enabled at (10684): [<ffffffff9f000608>] __do_softirq+0x608/0x940
softirqs last disabled at (10649): [<ffffffff9cdedd31>] do_softirq+0xa1/0xd0

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(clock-AF_INET);
<Interrupt>
lock(clock-AF_INET);

*** DEADLOCK ***

5 locks held by nvme/1324:
#0: ffff8884a01fe470 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0
#1: ffff8886e435c090 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x216/0x460
#2: ffff888104d90c38 (kn->active#255){++++}-{0:0}, at: kernfs_remove_self+0x22d/0x330
#3: ffff8884634538d0 (&queue->queue_lock){+.+.}-{3:3}, at: nvme_tcp_stop_queue+0x52/0xb0 [nvme_tcp]
#4: ffff888363150d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_shutdown+0x59/0x300

stack backtrace:
CPU: 26 PID: 1324 Comm: nvme Tainted: G I 5.12.0-rc3 #1
Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.10.0 11/12/2020
Call Trace:
dump_stack+0x93/0xc2
mark_lock_irq.cold+0x2c/0xb3
? verify_lock_unused+0x390/0x390
? stack_trace_consume_entry+0x160/0x160
? lock_downgrade+0x100/0x100
? save_trace+0x88/0x5e0
? _raw_spin_unlock_irqrestore+0x2d/0x40
mark_lock+0x530/0x1470
? mark_lock_irq+0x1d10/0x1d10
? enqueue_timer+0x660/0x660
mark_usage+0x215/0x2a0
__lock_acquire+0x79b/0x18d0
? tcp_schedule_loss_probe.part.0+0x38c/0x520
lock_acquire+0x1ca/0x480
? nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
? rcu_read_unlock+0x40/0x40
? tcp_mtu_probe+0x1ae0/0x1ae0
? kmalloc_reserve+0xa0/0xa0
? sysfs_file_ops+0x170/0x170
_raw_read_lock+0x3d/0xa0
? nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
nvme_tcp_state_change+0x21/0x150 [nvme_tcp]
? sysfs_file_ops
---truncated---

En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: nvmet-tcp: corrige el bloqueo incorrecto en la devolución de llamada de state_change sk No estamos cambiando nada en el estado de la conexión TCP, por lo que no debemos tomar un bloqueo de escritura sino un bloqueo de lectura. Esto causó un punto muerto al ejecutar nvmet-tcp y nvme-tcp en el mismo sistema, donde las devoluciones de llamada state_change en el host y en el lado del controlador tienen una relación causal y generaron un informe de lockdep sobre esto con blktests: ========= ======================= ADVERTENCIA: estado de bloqueo inconsistente 5.12.0-rc3 #1 Contaminado: GI ------------ -------------------- uso inconsistente de {IN-SOFTIRQ-W} -&gt; {SOFTIRQ-ON-R}. nvme/1324 [HC0[0]:SC0[0]:HE1:SE1] toma: ffff888363151000 (reloj-AF_INET){++-?}-{2:2}, en: nvme_tcp_state_change+0x21/0x150 [nvme_tcp] { IN-SOFTIRQ-W} el estado se registró en: __lock_acquire+0x79b/0x18d0 lock_acquire+0x1ca/0x480 _raw_write_lock_bh+0x39/0x80 nvmet_tcp_state_change+0x21/0x170 [nvmet_tcp] tcp_fin+0x2a8/0x780 tcp_data_queue +0xf94/0x1f20 tcp_rcv_establecido+0x6ba/0x1f00 tcp_v4_do_rcv +0x502/0x760 tcp_v4_rcv+0x257e/0x3430 ip_protocol_deliver_rcu+0x69/0x6a0 ip_local_deliver_finish+0x1e2/0x2f0 ip_local_deliver+0x1a2/0x420 ip_rcv+0x4fb/0x6b0 __netif_receive _skb_one_core+0x162/0x1b0 proceso_backlog+0x1ff/0x770 __napi_poll.constprop.0+0xa9/0x5c0 net_rx_action+ 0x7b3/0xb30 __do_softirq+0x1f0/0x940 do_softirq+0xa1/0xd0 __local_bh_enable_ip+0xd8/0x100 ip_finish_output2+0x6b7/0x18a0 __ip_queue_xmit+0x706/0x1aa0 __tcp_transmit_skb+0 x2068/0x2e20 tcp_write_xmit+0xc9e/0x2bb0 __tcp_push_pending_frames+0x92/0x310 inet_shutdown+0x158/0x300 __nvme_tcp_stop_queue+ 0x36/0x270 [nvme_tcp] nvme_tcp_stop_queue+0x87/0xb0 [nvme_tcp] nvme_tcp_teardown_admin_queue+0x69/0xe0 [nvme_tcp] nvme_do_delete_ctrl+0x100/0x10c [nvme_core] nv me_sysfs_delete.cold+0x8/0xd [nvme_core] kernfs_fop_write_iter+0x2c7/0x460 new_sync_write+0x36c/0x610 vfs_write+0x5c0/0x870 ksys_write+0xf9/0x1d0 do_syscall_64+0x33/0x40 Entry_SYSCALL_64_after_hwframe+0x44/0xae Sello de evento irq: 10687 hardirqs habilitado por última vez en (10687): [] _raw_spin_unlock _irqrestore+0x2d/0x40 hardirqs se deshabilitó por última vez en (10686) : [] _raw_spin_lock_irqsave+0x68/0x90 softirqs habilitado por última vez en (10684): [] __do_softirq+0x608/0x940 softirqs deshabilitado por última vez en (10649): [] do_softirq+0xa1 /0xd0 otra información que podría ayudarnos a depurar esto: Posible escenario de bloqueo inseguro: CPU0 ---- lock(clock-AF_INET); bloqueo(reloj-AF_INET); *** DEADLOCK *** 5 bloqueos retenidos por nvme/1324: #0: ffff8884a01fe470 (sb_writers#4){.+.+}-{0:0}, en: ksys_write+0xf9/0x1d0 #1: ffff8886e435c090 (&amp;of -&gt;mutex){+.+.}-{3:3}, en: kernfs_fop_write_iter+0x216/0x460 #2: ffff888104d90c38 (kn-&gt;active#255){++++}-{0:0}, en : kernfs_remove_self+0x22d/0x330 #3: ffff8884634538d0 (&amp;queue-&gt;queue_lock){+.+.}-{3:3}, en: nvme_tcp_stop_queue+0x52/0xb0 [nvme_tcp] #4: ffff888363150d30 (sk_lock-AF _INET){+ .+.}-{0:0}, en: inet_shutdown+0x59/0x300 seguimiento de pila: CPU: 26 PID: 1324 Comunicaciones: nvme Contaminado: GI 5.12.0-rc3 #1 Nombre de hardware: Dell Inc. PowerEdge R640/06NR82 , BIOS 2.10.0 12/11/2020 Seguimiento de llamadas: dump_stack+0x93/0xc2 mark_lock_irq.cold+0x2c/0xb3? verificar_lock_unused+0x390/0x390? stack_trace_consume_entry+0x160/0x160? lock_downgrade+0x100/0x100? save_trace+0x88/0x5e0? _raw_spin_unlock_irqrestore+0x2d/0x40 mark_lock+0x530/0x1470 ? mark_lock_irq+0x1d10/0x1d10? enqueue_timer+0x660/0x660 mark_usage+0x215/0x2a0 __lock_acquire+0x79b/0x18d0? tcp_schedule_loss_probe.part.0+0x38c/0x520 lock_acquire+0x1ca/0x480? nvme_tcp_state_change+0x21/0x150 [nvme_tcp] ? rcu_read_unlock+0x40/0x40? tcp_mtu_probe+0x1ae0/0x1ae0? kmalloc_reserve+0xa0/0xa0? sysfs_file_ops+0x170/0x170 _raw_read_lock+0x3d/0xa0 ? nvme_tcp_state_change+0x21/0x150 [nvme_tcp] nvme_tcp_state_change+0x21/0x150 [nvme_tcp] ? sysfs_file_ops ---truncado---

*Credits: N/A
CVSS Scores
Attack Vector
-
Attack Complexity
-
Privileges Required
-
User Interaction
-
Scope
-
Confidentiality
-
Integrity
-
Availability
-
* Common Vulnerability Scoring System
SSVC
  • Decision:Track
Exploitation
None
Automatable
No
Tech. Impact
Partial
* Organization's Worst-case Scenario
Timeline
  • 2024-02-27 CVE Reserved
  • 2024-02-28 CVE Published
  • 2024-02-29 EPSS Updated
  • 2024-08-04 CVE Updated
  • ---------- Exploited in Wild
  • ---------- KEV Due Date
  • ---------- First Exploit
CWE
CAPEC
Affected Vendors, Products, and Versions
Vendor Product Version Other Status
Vendor Product Version Other Status <-- --> Vendor Product Version Other Status
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.0 < 5.4.119
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.0 < 5.4.119"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.0 < 5.10.37
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.0 < 5.10.37"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.0 < 5.11.21
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.0 < 5.11.21"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.0 < 5.12.4
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.0 < 5.12.4"
en
Affected
Linux
Search vendor "Linux"
Linux Kernel
Search vendor "Linux" for product "Linux Kernel"
>= 5.0 < 5.13
Search vendor "Linux" for product "Linux Kernel" and version " >= 5.0 < 5.13"
en
Affected