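I. Background
After the client-side lock split, there are many locks, the Client business logic is complex, and the resources those locks protect interact in many ways, so the locks inevitably nest inside one another. Nested locks must be taken in a consistent order. Consider the simplest cases:
1) Thread 1 takes A and then A again: the second acquisition blocks forever, and thread 2 can never get A either.
2) Thread 1 takes A then B while thread 2 takes B then A: each waits for the lock the other holds, and both deadlock.
3) Thread 1 takes A then B, thread 2 takes B then C, thread 3 takes C then A: this is a cycle just like the two-thread ABBA case, and all three deadlock.
4) A thread releases the same lock twice, which violates the locking rules.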
An earlier post covered the lockdep algorithm and its core detection algorithm in detail.
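However, lockdep has one major drawback: it hurts performance badly. It needs a global collection that records every thread's lock and unlock operations, and in a multithreaded environment that collection itself needs a lock. That lock becomes heavily contended and causes a severe performance drop.
lockdep is also an "after-the-fact" check: it detects the problem and asserts only after a deadlock has occurred in finished code. Given this and the performance cost, a different approach is used here: as long as each thread's own locking order is correct, deadlock is avoided as well, which amounts to a "before-the-fact" check. The check therefore only needs to record and consider the state of the current thread, and it consists of little more than some arithmetic, so its performance impact is small.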
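The per-thread idea can be sketched in a few lines. This is a minimal illustration, not the LockManager implementation: the names `held_weights`, `try_record_acquire` and `record_release` are hypothetical, each lock is assumed to carry an integer weight, and releases are assumed to happen in reverse order of acquisition.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-thread record: the weights of the locks this thread holds,
// in acquisition order. It is thread_local, so no global lock protects it.
thread_local std::vector<int> held_weights;

// A lock may be taken only if its weight is strictly larger than the weight
// of the most recently taken lock that is still held.
bool try_record_acquire(int weight) {
    if (!held_weights.empty() && held_weights.back() >= weight)
        return false;  // same level or reverse order: reject
    held_weights.push_back(weight);
    return true;
}

// Simplification for this sketch: locks are released in LIFO order.
void record_release() {
    if (!held_weights.empty())
        held_weights.pop_back();
}
```

Because every thread consults only its own record, the check costs a comparison and a push, with no shared state to contend on.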
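II. Lock Management State Machine
(1) Building the lock chain table
After the split, the Client has 24 locks in total. Sorting them out gives the following lock chain table: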
| client_lock | fh_lock | inode_map_lock | im_lock | mdsmap_lock | mds_sessions_lock | session_lock | mds_requests_lock | |
|---|---|---|---|---|---|---|---|---|
| doing_lock | trim_lock | cond_lock | | oc_lock | | | dn_lock | |
| | | | | root_lock | | | opened_session_caps_lock | |
| | | | | fd_or_odirs_lock | | | delayed_remove_caps_lock | |
| | | | | timer_lock | | | snap_realms_lock | snaprealm_lock |
| | | | | cr_inode_lock | | | | |
| | | | | free_inode_map_lock | | | | |
| | | | | generic_lock | | | | |
| | | | | qos_lock | | | | |
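In the table above, priority decreases from left to right. Locks in the same column have equal priority, and two locks of equal priority may not be held at the same time. For example, the table gives:
client_lock > im_lock > session_lock > dn_lock
client_lock > trim_lock
im_lock > oc_lock
fd_or_odirs_lock and timer_lock are unrelated
and so on.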
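The relations above can be read as a comparison of column positions. The following sketch assumes column numbers 1 to 9 counted from the left (these match the weights 10 to 90 in the state machine divided by ten); `column_of` and `may_nest` are hypothetical helper names, and only a handful of locks are listed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Column (priority level) of a few locks. Column 1 is the leftmost,
// highest-priority column of the lock chain table.
int column_of(const std::string& lock) {
    static const std::map<std::string, int> columns = {
        {"client_lock", 1},      {"trim_lock", 2},  {"im_lock", 4},
        {"oc_lock", 5},          {"timer_lock", 5}, {"fd_or_odirs_lock", 5},
        {"session_lock", 7},     {"dn_lock", 8}};
    return columns.at(lock);
}

// `inner` may be taken while `outer` is held only if it sits further right.
bool may_nest(const std::string& outer, const std::string& inner) {
    return column_of(outer) < column_of(inner);
}
```

Two locks in the same column, such as fd_or_odirs_lock and timer_lock, fail the test in both directions, which is exactly the "unrelated" case.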
(2) State machine definition
Based on the lock chain table above, the state machine is defined as follows:
const lock_state_t LockManager::lock_state[LOCK_MAX] = {
// larger recursive readwrite spinlock weight lock_ptr
[CLIENT_LOCK] = {0, false, false, false, 10, std::make_shared<LockWrapper<Client, Mutex>>(&Client::client_lock)},
[DOING_LOCK] = {0, false, false, false, 10, std::make_shared<LockWrapper<Client, Mutex>>(&Client::doing_lock)},
[FH_LOCK] = {CLIENT_LOCK, false, false, false, 20, std::make_shared<LockWrapper<Fh, Mutex>>(&Fh::fh_lock)},
[TRIM_LOCK] = {CLIENT_LOCK, false, false, false, 20, std::make_shared<LockWrapper<Client, Mutex>>(&Client::trim_lock)},
[INODE_MAP_LOCK] = {FH_LOCK, false, true, false, 30, std::make_shared<LockWrapper<Client, RWLock>>(&Client::inode_map_lock)},
[COND_LOCK] = {FH_LOCK, false, false, false, 30, std::make_shared<LockWrapper<MetaRequest, Mutex>>(&MetaRequest::cond_lock)},
[IM_LOCK] = {INODE_MAP_LOCK, true, false, false, 40, std::make_shared<LockWrapper<Inode, Mutex>>(&Inode::im_lock)},
[MDSMAP_LOCK] = {IM_LOCK, false, true, false, 50, std::make_shared<LockWrapper<Client, RWLock>>(&Client::mdsmap_lock)},
[OC_LOCK] = {IM_LOCK, false, false, false, 50, std::make_shared<LockWrapper<Client, Mutex>>(&Client::oc_lock)},
[ROOT_LOCK] = {IM_LOCK, false, true, false, 50, std::make_shared<LockWrapper<Client, RWLock>>(&Client::root_lock)},
[FD_OR_ODIRS_LOCK] = {IM_LOCK, false, false, true, 50, std::make_shared<LockWrapper<Client, spinlock>>(&Client::fd_or_odirs_lock)},
[TIMER_LOCK] = {IM_LOCK, false, false, false, 50, std::make_shared<LockWrapper<Client, Mutex>>(&Client::timer_lock)},
[CR_INODE_LOCK] = {IM_LOCK, false, false, false, 50, std::make_shared<LockWrapper<Client, Mutex>>(&Client::cr_inode_lock)},
[FREE_INODE_MAP_LOCK] = {IM_LOCK, false, false, true, 50, std::make_shared<LockWrapper<Client, spinlock>>(&Client::free_inode_map_lock)},
[GENERIC_LOCK] = {IM_LOCK, false, true, false, 50, std::make_shared<LockWrapper<Client, RWLock>>(&Client::generic_lock)},
[QOS_LOCK] = {IM_LOCK, false, false, false, 50, std::make_shared<LockWrapper<QoSClient, Mutex>>(&QoSClient::qos_lock)},
[MDS_SESSIONS_LOCK] = {MDSMAP_LOCK, false, true, false, 60, std::make_shared<LockWrapper<Client, RWLock>>(&Client::mds_sessions_lock)},
[SESSION_LOCK] = {MDS_SESSIONS_LOCK, false, false, true, 70, std::make_shared<LockWrapper<MetaSession, std::shared_ptr<spinlock>>>(&MetaSession::session_lock)},
[MDS_REQUESTS_LOCK] = {SESSION_LOCK, false, true, false, 80, std::make_shared<LockWrapper<Client, RWLock>>(&Client::mds_requests_lock)},
[DN_LOCK] = {SESSION_LOCK, false, false, false, 80, std::make_shared<LockWrapper<Dentry, Mutex>>(&Dentry::dn_lock)},
[OPENED_SESSION_CAPS_LOCK] = {SESSION_LOCK, false, false, false, 80, std::make_shared<LockWrapper<Client, Mutex>>(&Client::opened_session_caps_lock)},
[DELAYED_REMOVE_CAPS_LOCK] = {SESSION_LOCK, false, false, true, 80, std::make_shared<LockWrapper<Client, spinlock>>(&Client::delayed_remove_caps_lock)},
[SNAP_REALMS_LOCK] = {SESSION_LOCK, false, false, true, 80, std::make_shared<LockWrapper<Client, spinlock>>(&Client::snap_realms_lock)},
[SNAPREALM_LOCK] = {SNAP_REALMS_LOCK, true, false, false, 90, std::make_shared<LockWrapper<SnapRealm, Mutex>>(&SnapRealm::snaprealm_lock)}
};
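The leftmost column of the state machine is the array index, which increases from client_lock to snaprealm_lock (meaning priority decreases). In each entry, the first field is larger, the nearest lock that ranks above this one; client_lock and doing_lock rank highest, so their larger is 0. The second field says whether the lock is recursive; in the Client module only im_lock and snaprealm_lock are recursive at present. The third field says whether it is a read-write lock, and the fourth whether it is a spinlock. The fifth field is the lock's weight; locks of equal priority have equal weight. The sixth field is the lock pointer: once the check passes, locking and unlocking are invoked directly through it via the template wrapper.
(3) Lock manager
To put the state machine to use, a lock manager module, LockManager, is defined to manage all Client-side locks.
LockManager defines a thread-local variable that records the state of the current thread, that is, its lock statistics. It also provides the interfaces register_tl_static, destroy_tl_static, acquireLock and dropLock.
The Client module is a middle layer: whether it sits under ceph-fuse or ganesha, it is only an intermediary. Apart from the threads it starts itself to handle messages from the MDS and similar sources, nearly all threads are started by the upper-layer fuse or ganesha. To track those threads inside the Client, each one must first call register_tl_static on entering the Client module to register its own tl_static, then lock with acquireLock and unlock with dropLock wherever locking is needed. For example: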
acquireLock(INODE_MAP_LOCK, *this, READ);
acquireLock(IM_LOCK, *in);
acquireLock(SESSION_LOCK, *s);
dropLock(SESSION_LOCK, *s);
dropLock(INODE_MAP_LOCK, *this, READ);
//....
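Finally, before the thread returns its result to the upper layer, it calls destroy_tl_static to destroy its own statistics object.
A helper class, LockManagerScope, registers and destroys the record automatically, which is more convenient.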
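The post does not show LockManagerScope itself, so the following is only a sketch of how such an RAII helper could look. `MiniLockManager`, `MiniLockManagerScope` and `ThreadLockStats` are hypothetical stand-ins; only the method names register_tl_static, destroy_tl_static and has_register_tl_static come from the text.

```cpp
#include <cassert>
#include <memory>

// Hypothetical per-thread statistics object.
struct ThreadLockStats {
    int last_weight = 0;
    int total_weight = 0;
};

// Stand-in for the thread-local bookkeeping of LockManager (assumed layout).
struct MiniLockManager {
    static thread_local std::unique_ptr<ThreadLockStats> tl_static;

    void register_tl_static() {
        if (!tl_static)
            tl_static = std::make_unique<ThreadLockStats>();
    }
    void destroy_tl_static() { tl_static.reset(); }
    bool has_register_tl_static() const { return tl_static != nullptr; }
};
thread_local std::unique_ptr<ThreadLockStats> MiniLockManager::tl_static;

// Registers on construction and destroys on scope exit, including early returns.
class MiniLockManagerScope {
public:
    explicit MiniLockManagerScope(MiniLockManager& m) : mgr(m) { mgr.register_tl_static(); }
    ~MiniLockManagerScope() { mgr.destroy_tl_static(); }
    MiniLockManagerScope(const MiniLockManagerScope&) = delete;
    MiniLockManagerScope& operator=(const MiniLockManagerScope&) = delete;

private:
    MiniLockManager& mgr;
};
```

An entry point called from fuse or ganesha would declare one scope object as its first statement. Note that this simple version is not safe to nest: an inner scope would destroy the record that the outer scope still expects.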
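(4) LockGuard scoped locker
To support locking a whole function or block, a scoped locker is implemented. It is used much like std::lock_guard. Inside a block:
LockGuard<Inode, IM_LOCK> guard(lock_manager, *in);
The template arguments are the class that owns the lock and the lock name, and the constructor argument is the object to lock. The lock is taken automatically and released when the function or block exits.
(5) Automatic locker
LockManager implements an automatic locker that accepts a variable number of arguments, which must be a multiple of 4. Each group of four is: lock type, a reference to the object to lock, lock mode, and whether to skip the recursive-lock check. The last two must be passed even when they do not apply, using their default values. The function makes sure that every lock passed in ends up in the locked state. Example call: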
autoLock(IM_LOCK, std::ref(*in), -1, false,
INODE_MAP_LOCK, std::ref(*client), WRITE, false);
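For example, if IM_LOCK is held and INODE_MAP_LOCK is not held before the call, then after the call both INODE_MAP_LOCK and IM_LOCK are held.
This is mostly used where the code needs to take several locks at once, or where the locking order needs to be rearranged.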
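The effect of such a call can be modeled as "drop whatever is already held among the requested locks, then take all of them in ascending order". The sketch below only plans the operations on plain integer ids; `Op` and `plan_auto_lock` are hypothetical names, and the ids 4 and 6 for INODE_MAP_LOCK and IM_LOCK are assumed from their position in the state machine array.

```cpp
#include <cassert>
#include <set>
#include <vector>

struct Op {
    bool acquire;  // true = acquire, false = drop
    int lock;      // lock id; a smaller id means a higher priority
};

// Plans the operations needed so that every requested lock ends up held.
// `held` is updated to the final state.
std::vector<Op> plan_auto_lock(std::set<int>& held, const std::vector<int>& requested) {
    std::vector<Op> ops;
    std::set<int> order;  // std::set iterates in ascending id order
    for (int lock : requested) {
        if (held.erase(lock) > 0)  // already held: drop it first
            ops.push_back({false, lock});
        order.insert(lock);
    }
    for (int lock : order) {  // then take everything in priority order
        held.insert(lock);
        ops.push_back({true, lock});
    }
    return ops;
}
```

For the example above, the plan is: drop IM_LOCK, acquire INODE_MAP_LOCK, acquire IM_LOCK. The sketch ignores locks that are held but not listed in the call; in the real manager, the usual order check still applies to those.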
III. Detection
Each call to acquireLock or dropLock runs the corresponding lock or unlock validity check.
(1) acquireLock
The implementation is as follows:
template <typename T>
void acquireLock(LockType lock, T& object, int mode = -1, bool skip_recur_lock_check = false) {
    can_lock(lock);                        // overall order and count check
    if (!skip_recur_lock_check)
        can_recursive_lock(lock, object);  // recursive-lock validity check
    lock_state[lock].lock_info->lock(static_cast<void*>(&object), mode);  // take the lock
}
Here can_lock checks whether taking the lock is legal. If a recursive lock is involved, can_recursive_lock is triggered as well. As of now, only a few fixed places in the Client module take a recursive lock twice, and their correctness is guaranteed, so skip_recur_lock_check can be set to true there to disable the recursive-lock check. If new code later takes a recursive lock twice, it is best to run the check. Once the checks pass, the template interface lock is called to take the lock.
The core of can_lock is computing and comparing weights.
void LockManager::can_lock(LockType lock) {
    // A non-recursive lock may be held at most once.
    if (!lock_state[lock].recursive && (lock_static->lock_count.count(lock) > 0)) {
        lderr(client->cct) << "repeated lock, lock type: " << get_lock_name(lock) << dendl;
        ceph_assert("repeated lock" == 0);
    }
    // The new lock's weight must be >= the weight of the last lock taken.
    if (lock_static->last_weight <= lock_state[lock].weight) {
        // Equal weight is allowed only for a recursive lock.
        if ((lock_static->last_weight == lock_state[lock].weight) &&
            (!lock_state[lock].recursive)) {
            lderr(client->cct) << "lock type: " << get_lock_name(lock) << " is no interaction" << dendl;
            goto fail;
        }
        auto it = lock_static->lock_count.find(lock);
        if (it != lock_static->lock_count.end()) {  // already recorded: check the count
            if (!lock_state[lock].recursive) {      // a non-recursive lock may be recorded at most once
                lderr(client->cct) << "lock type: " << get_lock_name(lock)
                                   << " do not allow non-recursive to exceed the lock layer!" << dendl;
                ceph_assert("not recursive exceed layer" == 0);
            }
            if (it->second >= RECUR_LOCK_MAX_NUM) { // a recursive lock may be recorded at most twice
                lderr(client->cct) << "lock type: " << get_lock_name(lock)
                                   << " do not allow recursive to exceed the lock layers!" << dendl;
                ceph_assert(it->second == BASIC_LOCK_MAX_NUM);
            }
            it->second++;  // count check passed: bump the recursive lock's count
        } else {           // first acquisition, recursive or not: record a count of 1
            lock_static->lock_count[lock] = BASIC_LOCK_MAX_NUM;
            lock_static->total_weight += lock_state[lock].weight;  // add to the total; a recursive lock's second acquisition adds nothing
        }
        lock_static->last_weight = lock_state[lock].weight;  // remember this lock's weight as last_weight
        return;
    }
fail:
    // The checks above failed: reverse-order or repeated locking, so assert.
    lderr(client->cct) << "lock has last_weight: " << lock_static->last_weight
                       << " but acquire lock last_weight: " << lock_state[lock].weight
                       << " current lock: " << get_lock_name(lock)
                       << dendl;
    ceph_assert("lock order error!" == 0);
}
The recursive-lock check is implemented as follows:
template<typename T>
void can_recursive_lock(LockType lock, T& object) {
    if (!lock_state[lock].recursive)
        return;
    auto it = lock_static->lock_count.find(lock);
    if (it != lock_static->lock_count.end() && it->second == RECUR_LOCK_MAX_NUM) {  // second acquisition of a recursive lock
        if (lock == IM_LOCK) {  // ask the Client module whether the order is parent directory -> child
            ceph_assert(lock_static->recur_static.p_inode != nullptr);
            ceph_assert(lock_static->recur_static.p_inode != static_cast<void*>(&object));
            ceph_assert(client->inode_relation_check(static_cast<void*>(lock_static->recur_static.p_inode), static_cast<void*>(&object)));
        } else if (lock == SNAPREALM_LOCK) {  // ask the Client module whether the order is child snaprealm -> parent snaprealm
            ceph_assert(lock_static->recur_static.c_realm != nullptr);
            ceph_assert(lock_static->recur_static.c_realm != static_cast<void*>(&object));
            ceph_assert(client->realm_relation_check(static_cast<void*>(&object), static_cast<void*>(lock_static->recur_static.c_realm)));
        }
    }
    recursive_lock_record(lock, object);  // check passed: record the object
}
The locking step is implemented with the template specializations below. A call to lock dispatches on the lock type and invokes the matching lock operation:
struct LockWrapperBase {
    virtual void lock(void* object, int mode) const = 0;
    virtual void unlock(void* object, int mode) const = 0;
    virtual bool is_locked(void* object, int mode) const = 0;
    virtual bool is_locked_by_me(void* object) const = 0;
    virtual ~LockWrapperBase() = default;
};
template<typename T, typename LockType>
struct LockWrapper : public LockWrapperBase {  // adapts custom lock classes such as Mutex
    LockType T::*lock_ptr;
    LockWrapper(LockType T::*ptr) : lock_ptr(ptr) {}
    void lock(void* object, int mode) const override {
        (static_cast<T*>(object)->*lock_ptr).Lock();
    }
    void unlock(void* object, int mode) const override {
        (static_cast<T*>(object)->*lock_ptr).Unlock();
    }
    bool is_locked(void* object, int mode) const override {
        return (static_cast<T*>(object)->*lock_ptr).is_locked();
    }
    bool is_locked_by_me(void* object) const override {
        return (static_cast<T*>(object)->*lock_ptr).is_locked_by_me();
    }
};
template<typename T>
struct LockWrapper<T, RWLock> : public LockWrapperBase {  // adapts the read-write lock class
    RWLock T::*lock_ptr;
    LockWrapper(RWLock T::*ptr) : lock_ptr(ptr) {}
    void lock(void* object, int mode) const override {
        if (mode == RWMode::READ) {
            (static_cast<T*>(object)->*lock_ptr).get_read();
        } else {
            (static_cast<T*>(object)->*lock_ptr).get_write();
        }
    }
    void unlock(void* object, int mode) const override {
        if (mode == RWMode::READ) {
            (static_cast<T*>(object)->*lock_ptr).put_read();
        } else {
            (static_cast<T*>(object)->*lock_ptr).put_write();
        }
    }
    bool is_locked(void* object, int mode) const override {
        if (mode == RWMode::WRITE) {
            return (static_cast<T*>(object)->*lock_ptr).is_wlocked();
        }
        // mode == -1 or READ: report whether the lock is held in any mode,
        // so that every path returns a value.
        return (static_cast<T*>(object)->*lock_ptr).is_locked();
    }
    bool is_locked_by_me(void* object) const override {
        return false;
    }
};
// Further specializations exist for spinlock, std::shared_ptr<spinlock>, and so on.
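The dispatch mechanism above (a pointer-to-member stored behind a virtual interface, applied to a type-erased object) can be tried in isolation. The following is a reduced sketch with hypothetical names; `FlagLock` is a toy stand-in that only records its state and is not a real lock.

```cpp
#include <cassert>

struct DemoWrapperBase {
    virtual void lock(void* object) const = 0;
    virtual void unlock(void* object) const = 0;
    virtual ~DemoWrapperBase() = default;
};

// Stores which member of T is the lock; the object is supplied per call.
template <typename T, typename LockT>
struct DemoWrapper : DemoWrapperBase {
    LockT T::*lock_ptr;
    explicit DemoWrapper(LockT T::*ptr) : lock_ptr(ptr) {}
    void lock(void* object) const override {
        (static_cast<T*>(object)->*lock_ptr).lock();
    }
    void unlock(void* object) const override {
        (static_cast<T*>(object)->*lock_ptr).unlock();
    }
};

// Toy lock that only remembers whether it is held.
struct FlagLock {
    bool held = false;
    void lock() { held = true; }
    void unlock() { held = false; }
};

struct DemoInode {
    FlagLock im_lock;
};
```

One wrapper instance serves every DemoInode, which is why the state machine can keep a single wrapper per lock type while acquireLock passes the concrete object at call time. The cast from `void*` is only safe if the caller always passes an object of the type the wrapper was built for.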
(2) dropLock
The implementation is as follows:
template <typename T>
void dropLock(LockType lock, T& object, int mode = -1, bool skip_recur_lock_check = false) {
    if (!skip_recur_lock_check)
        can_drop_recursive_lock(lock, object);  // update and check the recursive-lock record
    can_drop_lock(lock);                        // overall check
    lock_state[lock].lock_info->unlock(static_cast<void*>(&object), mode);  // release the lock
}
In contrast to acquireLock, unlocking first updates the recursive-lock record (if the lock is recursive) and then checks whether the unlock is legal. The recursive-lock unlock is implemented as:
template<typename T>
void can_drop_recursive_lock(LockType lock, T& object) {
    if (lock == IM_LOCK) {
        ceph_assert(lock_static->recur_static.p_inode != nullptr);
        // Update the record, e.g. pinode-cinode -> pinode-nullptr, or cinode-nullptr -> nullptr-nullptr
        if (static_cast<void*>(&object) == lock_static->recur_static.p_inode) {
            recursive_unlock_record(lock, true);
        } else {
            recursive_unlock_record(lock, false);
        }
    } else if (lock == SNAPREALM_LOCK) {
        ceph_assert(lock_static->recur_static.c_realm != nullptr);
        // Update the record, e.g. crealm-prealm -> crealm-nullptr, or prealm-nullptr -> nullptr-nullptr
        if (static_cast<void*>(&object) == lock_static->recur_static.c_realm) {
            recursive_unlock_record(lock, true);
        } else {
            recursive_unlock_record(lock, false);
        }
    }
}
can_drop_lock is implemented as:
void LockManager::can_drop_lock(LockType lock) {
    auto it = lock_static->lock_count.find(lock);
    if (it != lock_static->lock_count.end()) {
        int count = --it->second;  // decrement the recorded count
        // The lock is fully released: erase the record and subtract its weight.
        // The unlock that matches a recursive lock's second acquisition carries no weight,
        // so it neither subtracts weight nor erases the record.
        if (count == NONE_LOCK_STATUS) {
            lock_static->lock_count.erase(it);
            lock_static->total_weight -= lock_state[lock].weight;
        }
        ceph_assert(lock_static->total_weight >= 0);  // a negative total means lock and unlock calls are unbalanced
        // If the total has dropped to last_weight or below, cap last_weight at the current total.
        if (lock_static->total_weight <= lock_static->last_weight) {
            lock_static->last_weight = lock_static->total_weight;
        }
        if (count == NONE_LOCK_STATUS) {
            // last_weight is no larger than the weight being released:
            // the lower-priority locks are gone and only higher-priority ones remain.
            if (lock_static->last_weight <= lock_state[lock].weight) {
                // With more than one lock left, last_weight becomes the weight of the last
                // (lowest-priority) lock in the record. With exactly one left, last_weight
                // already equals that lock's weight.
                if (lock_static->lock_count.size() > 1) {
                    auto new_it = std::prev(lock_static->lock_count.end());
                    lock_static->last_weight = lock_state[new_it->first].weight;  // assignment; the original listing had == here, which had no effect
                }
            }
        }
    } else {  // no lock record found on unlock: an unlock without a matching lock
        lderr(client->cct) << "unlock is not allowed when unlocked! lock type: " << get_lock_name(lock) << dendl;
        ceph_assert("cannot unlocked repeated!" == 0);
    }
}
The actual template wrapper is then called to perform the unlock.
(3) isLocked and isLockedByMe
The implementation is as follows:
template <typename T>
bool isLocked(LockType lock, T& object, int mode = -1) {
    return lock_state[lock].lock_info->is_locked(static_cast<void*>(&object), mode);
}
template <typename T>
bool isLockedByMe(LockType lock, T& object) {
    return lock_state[lock].lock_info->is_locked_by_me(static_cast<void*>(&object));
}
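(4) Recording recursive locks
At present the only recursive locks in the Client are IM_LOCK, which protects an Inode, and SNAPREALM_LOCK, which protects a snaprealm. Normally a recursive lock comes with few rules: by nature it can be taken several times, and the caller is responsible for avoiding deadlock. In the Client, however, unconditional recursive locking cannot be allowed. Take IM_LOCK: locking the root directory Inode 0x1 and then its child file /file1 (0x100001) is fine, but if another thread, with no rule in place, locks /file1 (0x100001) first and then 0x1, the result is again a deadlock. So recursive locks also need rules that fix their order.
For IM_LOCK, at most two levels may be held at once, and the order must be parent directory first, then child directory or file. Locking child before parent is not allowed, and neither is locking two unrelated Inodes in the same thread.
For SNAPREALM_LOCK, at most two levels may be held at once, and the order must be child realm first, then parent realm. Locking parent before child is not allowed, and neither is locking two unrelated realms in the same thread.
The record kept for the recursive-lock check therefore holds just one parent and one child per lock type. Two cases for IM_LOCK: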
| Step | parent | children |
|---|---|---|
| 1 Lock(P) | p_inode | nullptr |
| 2 Lock(C) | p_inode | c_inode |
| 3 Unlock(C) | p_inode | nullptr |
| 4 Unlock(P) | nullptr | nullptr |

| Step | parent | children |
|---|---|---|
| 1 Lock(P) | p_inode | nullptr |
| 2 Lock(C) | p_inode | c_inode |
| 3 Unlock(P) | c_inode | nullptr |
| 4 Lock(CC) | c_inode | cc_inode |
| 5 Unlock(C) | cc_inode | nullptr |
| 6 Unlock(CC) | nullptr | nullptr |
Each acquisition checks not only that IM_LOCK follows the recursive-lock rules, but also that it respects its ordering relative to the other locks.
The record is updated in recursive_lock_record and recursive_unlock_record, which can_recursive_lock and can_drop_recursive_lock call after their validity checks:
template<typename T>
void recursive_lock_record(LockType lock, T& object) {
    if (lock == IM_LOCK) {
        if (!lock_static->recur_static.p_inode)
            lock_static->recur_static.p_inode = static_cast<void*>(&object);
        else
            lock_static->recur_static.c_inode = static_cast<void*>(&object);
    } else if (lock == SNAPREALM_LOCK) {
        if (!lock_static->recur_static.c_realm)
            lock_static->recur_static.c_realm = static_cast<void*>(&object);
        else
            lock_static->recur_static.p_realm = static_cast<void*>(&object);
    }
}
void recursive_unlock_record(LockType lock, bool adjust_record) {
    if (lock == IM_LOCK) {
        if (adjust_record) {
            if (lock_static->recur_static.c_inode) {
                lock_static->recur_static.p_inode = lock_static->recur_static.c_inode;
                lock_static->recur_static.c_inode = nullptr;
            } else {
                lock_static->recur_static.p_inode = nullptr;
            }
        } else {
            lock_static->recur_static.c_inode = nullptr;
        }
    } else if (lock == SNAPREALM_LOCK) {
        if (adjust_record) {
            if (lock_static->recur_static.p_realm) {
                lock_static->recur_static.c_realm = lock_static->recur_static.p_realm;
                lock_static->recur_static.p_realm = nullptr;
            } else {
                lock_static->recur_static.c_realm = nullptr;
            }
        } else {
            lock_static->recur_static.p_realm = nullptr;
        }
    }
}
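When can_recursive_lock checks validity, it calls an interface implemented inside the Client module to decide whether the two objects are parent and child. The rule itself is a business rule, so the check has to go through the business layer.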
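The two-slot bookkeeping for IM_LOCK can be replayed on its own. This is a reduced sketch with hypothetical names (`RecurRecord`, `record_lock`, `record_unlock`); it covers only the slot updates and leaves out the parent/child relation check that the Client performs.

```cpp
#include <cassert>

// One parent slot and one child slot, as kept per thread for IM_LOCK.
struct RecurRecord {
    const void* parent = nullptr;
    const void* child = nullptr;
};

// First acquisition fills the parent slot, second fills the child slot.
void record_lock(RecurRecord& r, const void* obj) {
    if (!r.parent)
        r.parent = obj;
    else
        r.child = obj;
}

// Releasing the parent promotes the child (if any) to the parent slot;
// releasing anything else clears the child slot.
void record_unlock(RecurRecord& r, const void* obj) {
    if (obj == r.parent) {
        r.parent = r.child;
        r.child = nullptr;
    } else {
        r.child = nullptr;
    }
}
```

Replaying Lock(P), Lock(C), Unlock(P), Lock(CC), Unlock(C), Unlock(CC) with this sketch walks through the same slot contents as the hand-over-hand case described for IM_LOCK, ending with both slots empty.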
(5) Scoped locking with LockGuard
The scoped locker is implemented as follows:
template <typename T, LockType lock>
class LockGuard {
public:
    LockGuard(LockManager& manager, T& object, int mode = -1, bool skip_recur_lock_check = false)
        : mgr(&manager), obj(&object), md(mode), sk_recu_lock_check(skip_recur_lock_check) {
        mgr->acquireLock(lock, *obj, md, sk_recu_lock_check);
    }
    LockGuard(const LockGuard&) = delete;
    LockGuard& operator=(const LockGuard&) = delete;
    LockGuard(LockGuard&& other) noexcept
        : mgr(other.mgr), obj(other.obj), md(other.md), sk_recu_lock_check(other.sk_recu_lock_check) {
        other.obj = nullptr;  // the moved-from guard no longer owns the lock
    }
    LockGuard& operator=(LockGuard&& other) noexcept {
        if (this != &other) {
            release();  // drop the lock this guard currently owns, if any
            mgr = other.mgr;
            obj = other.obj;
            md = other.md;
            sk_recu_lock_check = other.sk_recu_lock_check;
            other.obj = nullptr;
        }
        return *this;
    }
    ~LockGuard() {
        release();
    }
private:
    void release() {
        if (obj)
            mgr->dropLock(lock, *obj, md, sk_recu_lock_check);
        obj = nullptr;
    }
    // Stored as pointers rather than references so that a guard can be moved from and left empty.
    LockManager* mgr;
    T* obj;
    int md;
    bool sk_recu_lock_check;
};
In short, the constructor calls acquireLock and the destructor calls dropLock.
(6) Automatic locking with autoLock
Automatic locking is implemented as follows:
template <typename... LockArgs>
void autoLock(LockArgs&&... args) {
    // The number of arguments must be a multiple of 4.
    static_assert(sizeof...(args) % 4 == 0, "Lock args must be in sets of four: (LockType, object, mode, skip_recur_lock_check)");
    auto lock_tuple = std::make_tuple(std::forward<LockArgs>(args)...);
    std::map<int, std::unique_ptr<auto_lock_base>> lock_order;
    dropLocks(lock_order, lock_tuple, std::make_index_sequence<sizeof...(args) / 4>());  // release any of the requested locks that are currently held
    acquireLocks(lock_order);  // take all requested locks in order
}
Here dropLocks uses fold expressions, available since C++17:
template <typename Tuple, std::size_t I>
void dropLockSet(std::map<int, std::unique_ptr<auto_lock_base>>& lock_order, Tuple& lock_tuple, std::integral_constant<std::size_t, I>) {
    LockType lock = std::get<4 * I>(lock_tuple);
    auto& object = std::get<4 * I + 1>(lock_tuple);
    int mode = std::get<4 * I + 2>(lock_tuple);
    bool skip_recur_lock_check = std::get<4 * I + 3>(lock_tuple);
    if (lock_static->lock_count.count(lock) > 0)
        dropLock(lock, object, mode, skip_recur_lock_check);  // the lock is currently held: release it
    // Also store the request in the map, which sorts it by lock type.
    lock_order.emplace(static_cast<int>(lock), std::make_unique<auto_lock<std::decay_t<decltype(object)>>>(object, mode, skip_recur_lock_check));
}
template <typename Tuple, std::size_t... I>
void dropLocks(std::map<int, std::unique_ptr<auto_lock_base>>& lock_order, Tuple& lock_tuple, std::index_sequence<I...>) {
    (dropLockSet(lock_order, lock_tuple, std::integral_constant<std::size_t, I>{}), ...);  // visit every group of four in the tuple
}
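The grouping technique used here (pack the arguments into a tuple, then fold over an index sequence that visits one group of four per index) can be exercised without the lock manager. The sketch below is illustrative only: `Request`, `collect` and `group_by_four` are hypothetical names, lock types are plain ints, and objects must be passed with std::ref so that the recorded address refers to the caller's object.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <tuple>
#include <utility>
#include <vector>

struct Request {
    int lock;
    const void* object;
    int mode;
    bool skip_check;
};

// Comma fold (C++17): one push_back per group index I.
template <typename Tuple, std::size_t... I>
void collect(std::vector<Request>& out, Tuple& t, std::index_sequence<I...>) {
    (out.push_back(Request{std::get<4 * I>(t), &std::get<4 * I + 1>(t),
                           std::get<4 * I + 2>(t), std::get<4 * I + 3>(t)}),
     ...);
}

template <typename... Args>
std::vector<Request> group_by_four(Args&&... args) {
    static_assert(sizeof...(Args) % 4 == 0, "arguments must come in groups of four");
    // std::make_tuple turns std::reference_wrapper<T> into a T& element.
    auto t = std::make_tuple(std::forward<Args>(args)...);
    std::vector<Request> out;
    collect(out, t, std::make_index_sequence<sizeof...(Args) / 4>{});
    return out;
}
```

A call such as `group_by_four(3, std::ref(a), -1, false, 2, std::ref(b), 1, true)` yields two requests in argument order; sorting by lock type is then a separate step, which autoLock gets for free from the std::map.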
acquireLocks is then called to take the locks:
void acquireLocks(std::map<int, std::unique_ptr<auto_lock_base>>& lock_order) {
    for (const auto& lo : lock_order) {  // walk the requests in sorted order
        if (lo.second) {
            int lock_class = lock_state[lo.first].lock_class;
            auto_lock_base *base_record = lo.second.get();
            switch(lock_class) {
            case CLIENT:  // cast back to the original type, then call acquireLock
                if (auto record = dynamic_cast<auto_lock<Client>*>(base_record)) {
                    auto& obj = record->object;
                    acquireLock(static_cast<LockType>(lo.first), obj, record->mode, record->skip_recur_lock_check);
                } else {
                    ceph_assert("convert to Client failed!" == 0);
                }
                break;
            case FH:
                if (auto record = dynamic_cast<auto_lock<Fh>*>(base_record)) {
                    auto& obj = record->object;
                    acquireLock(static_cast<LockType>(lo.first), obj, record->mode, record->skip_recur_lock_check);
                } else {
                    ceph_assert("convert to Fh failed!" == 0);
                }
                break;
            //...
            }
        }
    }
}
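IV. Worked Weight Example
The table below walks through an extreme sequence of lock and unlock calls (not real business behaviour) to show how each value changes and how each decision is made. Rows whose result ends in N are operations that fail the check.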
| Operation | weight | last_weight | total_weight | lock_count | recur_lock (p c) | result |
|---|---|---|---|---|---|---|
| acquireLock(CLIENT_LOCK) | 10 | 10 | 10 | [CLIENT_LOCK]=1 | - | 0<10->Y |
| acquireLock(TRIM_LOCK) | 20 | 20 | 30 | [TRIM_LOCK]=1 | - | 10<20->Y |
| acquireLock(IM_LOCK) (P) | 40 | 40 | 70 | [IM_LOCK]=1 | P n | 20<40->Y |
| acquireLock(IM_LOCK) (C) | 40 | 40 | 70 | [IM_LOCK]=2 | P C | 40==40&&recur&&can_recur->Y |
| acquireLock(DN_LOCK) | 80 | 80 | 150 | [DN_LOCK]=1 | - | 40<80->Y |
| dropLock(IM_LOCK) (P) | 40 | 80 | 150 | [IM_LOCK]=1 | C n | can_drop_recur->Y |
| acquireLock(IM_LOCK) (P) (reverse order) | 40 | 80 | 150 | × | × | 80>40->N |
| dropLock(DN_LOCK) | 80 | 40 | 70 | [DN_LOCK]=0,erase | - | Y |
| acquireLock(IM_LOCK) (P) (child before parent) | 40 | 40 | 70 | × | × | !can_recur->N |
| acquireLock(SNAP_REALMS_LOCK) | 80 | 80 | 150 | [SNAP_REALMS_LOCK]=1 | - | 40<80->Y |
| acquireLock(SNAPREALM_LOCK) (CS) | 90 | 90 | 240 | [SNAPREALM_LOCK]=1 | CS n | 80<90->Y |
| dropLock(SNAP_REALMS_LOCK) | 80 | 90 | 160 | [SNAP_REALMS_LOCK]=0,erase | - | Y |
| dropLock(TRIM_LOCK) | 20 | 90 | 140 | [TRIM_LOCK]=0,erase | - | Y |
| acquireLock(IM_LOCK) (CC) (reverse order) | 40 | 90 | 140 | × | × | 90>40->N |
| acquireLock(SNAPREALM_LOCK) (PS) | 90 | 90 | 140 | [SNAPREALM_LOCK]=2 | CS PS | 90==90&&recur&&can_recur->Y |
| dropLock(SNAPREALM_LOCK) (PS) | 90 | 90 | 140 | [SNAPREALM_LOCK]=1 | CS n | Y |
| dropLock(SNAPREALM_LOCK) (CS) | 90 | 40 | 50 | [SNAPREALM_LOCK]=0,erase | n n | 50(total)<90(last_weight)->last_weight=50; 50(last_weight)<90(weight)->lock_count==2>1, find max_weight=40->last_weight=40; Y |
| acquireLock(MDSMAP_LOCK) | 50 | 50 | 100 | [MDSMAP_LOCK]=1 | - | 40<50->Y |
| acquireLock(ROOT_LOCK) | 50 | 50 | × | × | × | 50==50&&!recur->N |
| dropLock(IM_LOCK) | 40 | 50 | 60 | [IM_LOCK]=0,erase | n n | Y |
| dropLock(MDSMAP_LOCK) | 50 | 10 | 10 | [MDSMAP_LOCK]=0,erase | - | 10(total)<50(last_weight)->last_weight=10; 10(last_weight)<50(weight)->lock_count==1, no search needed, last_weight stays 10; Y |
| dropLock(CLIENT_LOCK) | 10 | 0 | 0 | [CLIENT_LOCK]=0,erase | - | 0(total)<10(last_weight)->last_weight=0->Y |
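The arithmetic in the table can be reproduced with a small stand-alone model. This is a sketch under stated assumptions, not the LockManager code: `ThreadState`, `try_acquire`, `try_drop`, `demo_table` and the `D_*` ids are hypothetical, failures return false instead of asserting, the parent/child relation check for recursive locks is left out, and ids are assumed to be numbered so that a larger id never has a smaller weight.

```cpp
#include <cassert>
#include <iterator>
#include <map>

struct LockDesc { int weight; bool recursive; };

// Hypothetical ids, in the same relative order as the state machine array.
enum DemoLock { D_CLIENT, D_TRIM, D_IM, D_MDSMAP, D_ROOT, D_DN, D_SNAP_REALMS, D_SNAPREALM };

const std::map<int, LockDesc>& demo_table() {
    static const std::map<int, LockDesc> t = {
        {D_CLIENT, {10, false}}, {D_TRIM, {20, false}},        {D_IM, {40, true}},
        {D_MDSMAP, {50, false}}, {D_ROOT, {50, false}},        {D_DN, {80, false}},
        {D_SNAP_REALMS, {80, false}}, {D_SNAPREALM, {90, true}}};
    return t;
}

struct ThreadState {
    int last_weight = 0;
    int total_weight = 0;
    std::map<int, int> lock_count;  // lock id -> hold count, ordered by id
};

// Weight and count rules of can_lock, reduced to a yes/no answer.
bool try_acquire(ThreadState& s, const std::map<int, LockDesc>& table, int lock) {
    const LockDesc& d = table.at(lock);
    auto it = s.lock_count.find(lock);
    if (it != s.lock_count.end() && (!d.recursive || it->second >= 2))
        return false;  // repeated lock, or a third level of a recursive lock
    if (s.last_weight > d.weight)
        return false;  // reverse order
    if (s.last_weight == d.weight && !d.recursive)
        return false;  // same priority level
    if (it != s.lock_count.end()) {
        ++it->second;  // second level of a recursive lock adds no weight
    } else {
        s.lock_count[lock] = 1;
        s.total_weight += d.weight;
    }
    s.last_weight = d.weight;
    return true;
}

// Bookkeeping of can_drop_lock, with last_weight assigned in the final step.
bool try_drop(ThreadState& s, const std::map<int, LockDesc>& table, int lock) {
    auto it = s.lock_count.find(lock);
    if (it == s.lock_count.end())
        return false;  // unlock without a matching lock
    const int weight = table.at(lock).weight;
    const bool released = (--it->second == 0);
    if (released) {
        s.lock_count.erase(it);
        s.total_weight -= weight;
    }
    if (s.total_weight <= s.last_weight)
        s.last_weight = s.total_weight;
    if (released && s.last_weight <= weight && s.lock_count.size() > 1)
        s.last_weight = table.at(std::prev(s.lock_count.end())->first).weight;
    return true;
}
```

Feeding the operations from the table into `try_acquire` and `try_drop` in order gives the same last_weight and total_weight at each step, for example 40 and 70 after dropLock(DN_LOCK), and 40 and 50 after the final dropLock(SNAPREALM_LOCK).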
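V. Notes
(1) In the current Client code, recursive locking of IM_LOCK and SNAPREALM_LOCK happens only in a few fixed places, where the recursive-lock check can be skipped.
The interfaces that take IM_LOCK this way are:
link and unlink. In link, the business logic knows that the two Inodes are about to become parent and child, so this locking is allowed one step before the link. Conversely, once unlink has removed the parent-child relationship, the two Inodes may no longer be locked together.
The interfaces that take SNAPREALM_LOCK this way are:
adjust_realm_parent. This interface explicitly fetches the parent realm from a realm and operates on it, so the relationship between the realms is known.
In cases like these, where the parent-child relationship is known, skip_recur_lock_check can be passed as true to skip the relationship check. The check has to traverse the child entries under the parent directory, which costs performance, so skipping it here avoids unnecessary overhead. For new interfaces or features that need to lock several Inodes at once, however, the check is essential.
(2) Every interface called from the upper layer must call register_tl_static at its start and destroy_tl_static at its end, or use the automatic LockManagerScope.
Thread control belongs to the upper layer, so tracking a thread's state requires registering and destroying the statistics object by hand. This is similar to the existing perf timing statistics, where most interfaces start timing on entry and record the elapsed time to the logger on exit. Calling lock_manager to lock or unlock without registering first triggers an assertion.
During testing, if only some of the locks are to be checked, has_register_tl_static can be used before locking and unlocking the locks under test.
(3) Semaphore waits in the code do not need to be recorded.
The current detection mechanism works per thread. During a semaphore wait, the thread itself simply releases the lock and stops, then retakes the lock after being woken or timing out and carries on. While it is stopped it cannot possibly try to take any other lock, so this release and reacquisition do not need to be recorded in the statistics at all.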




