A new type of spinlock for the BPF subsystem

The 6.15 merge window saw the inclusion of a new type of lock for BPF programs: a resilient queued spinlock that Kumar Kartikeya Dwivedi has been working on for some time. Eventually, he hopes to convert all of the spinlocks currently used in the BPF subsystem to his new lock. He gave a remote presentation about the design of the lock at the 2025 Linux Storage, Filesystem, Memory-Management, and BPF summit.

Dwivedi began by providing a bit of background on existing locking in BPF. In 2019, Alexei Starovoitov introduced bpf_spin_lock(), which allowed BPF programs to update map values atomically. But the lock came with the serious limitation that a BPF program could only hold one lock at a time, and could not perform any function calls while the lock was held. This let the verifier ensure that BPF programs could not deadlock, but was awkward to use, Dwivedi said.
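
To make the idea of updating a value atomically under a lock concrete, here is a minimal user-space sketch. It is an analogy only: it uses a C11 atomic_flag as a simple test-and-set spinlock rather than the kernel's bpf_spin_lock(), and the flow_stats structure and the stats_* function names are hypothetical, chosen for illustration. The critical section deliberately contains no function calls, mirroring the restriction described above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a map value with an embedded lock. */
struct flow_stats {
    atomic_flag lock;   /* plays the role of the lock stored in the value */
    uint64_t packets;
    uint64_t bytes;
};

/* Spin until the flag was previously clear, then own the lock. */
static void stats_lock(struct flow_stats *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;
}

/* Release the lock so that another thread can take it. */
static void stats_unlock(struct flow_stats *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/*
 * Update both fields as one unit. A reader that takes the same lock
 * never sees a packet counted without its bytes.
 */
static void stats_account(struct flow_stats *s, uint64_t len)
{
    stats_lock(s);
    s->packets += 1;
    s->bytes += len;
    stats_unlock(s);
}
```

A caller would initialize the structure with ATOMIC_FLAG_INIT for the lock field and then call stats_account() once per packet. Because only one lock is ever held and nothing is called while holding it, this pattern cannot deadlock, which is the property the verifier was enforcing.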

In 2022, sched_ext led to the introduction of more kernel data structures to BPF, including linked lists and red-black trees. The verifier was tasked with ensuring that the BPF program could lock and unlock those data structures correctly while manipulating them, but still only supported holding one lock at a time, and only allowed restricted operations while it was held. Some algorithms are much easier to express if the program is allowed to take two locks, Dwivedi explained. So this was a lot of friction to impose on BPF users, all for the sake of avoiding deadlocks.
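
One example of an algorithm that wants two locks is moving an element between two independently locked lists. The sketch below is a user-space analogy using POSIX mutexes, not BPF code; the locked_list type and the helper names are hypothetical. It sidesteps the classic AB-BA deadlock by always taking the two locks in address order, which is a conventional technique assumed here for illustration and not something the talk attributes to the verifier or to the new lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a lock-protected list. */
struct node {
    struct node *next;
    int value;
};

struct locked_list {
    pthread_mutex_t lock;
    struct node *head;
    size_t len;
};

static void list_init(struct locked_list *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
    l->len = 0;
}

/* Push onto the front of the list; the caller must hold l->lock. */
static void push_locked(struct locked_list *l, struct node *n)
{
    n->next = l->head;
    l->head = n;
    l->len++;
}

/* Pop from the front; the caller must hold l->lock. NULL if empty. */
static struct node *pop_locked(struct locked_list *l)
{
    struct node *n = l->head;

    if (n) {
        l->head = n->next;
        n->next = NULL;
        l->len--;
    }
    return n;
}

/*
 * Move one node from src to dst while holding both locks, so that a
 * thread holding either lock never catches the node in transit.
 * Locks are taken in address order so that two threads moving in
 * opposite directions cannot deadlock against each other.
 * Returns 1 if a node was moved, 0 if src was empty or src == dst.
 */
static int move_one(struct locked_list *src, struct locked_list *dst)
{
    struct locked_list *first = (uintptr_t)src < (uintptr_t)dst ? src : dst;
    struct locked_list *second = first == src ? dst : src;
    struct node *n;

    if (src == dst)
        return 0;

    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    n = pop_locked(src);
    if (n)
        push_locked(dst, n);
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
    return n != NULL;
}
```

With only one lock allowed at a time, the same move has to be split into a pop under one lock and a push under the other, leaving a window in which the node is on neither list. That is the kind of friction the one-lock rule imposes.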

The thing is, the syzbot kernel fuzzing system regularly finds deadlocks in th
