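Purpose
Analyze and resolve a backfill_toofull fault
Background
A Ceph capacity expansion had just been performed
The following fault information then appeared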
# ceph -s
cluster xxxxxxxx-8792-4948-b68f-2fcea75f53b9
health HEALTH_WARN 13 pgs backfill_toofull; 1 pgs degraded; 1 pgs stuck degraded; 13 pgs stuck unclean; 9 requests are blocked > 32 sec; recovery 190/54152986 objects degraded (0.000%); 47030/54152986 objects misplaced (0.087%); 2 near full osd(s); clock skew detected on mon.cephsvr25-128075
monmap e3: 5 mons at {cephsvr15-128055=xxx.30.128.55:6789/0,cephsvr17-128057=xxx.30.128.57:6789/0,cephsvr24-128074=xxx.30.128.74:6789/0,cephsvr25-128075=xxx.30.128.75:6789/0,cephsvr26-128076=xxx.30.128.76:6789/0}, election epoch 168, quorum 0,1,2,3,4 cephsvr15-128055,cephsvr17-128057,cephsvr24-128074,cephsvr25-128075,cephsvr26-128076
osdmap e23216: 100 osds: 100 up, 100 in
pgmap v11159189: 20544 pgs, 2 pools, 70024 GB data, 17620 kobjec
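
The health line above packs several independent warnings into one semicolon-separated string. As a rough sketch (not part of any Ceph tooling; the function name `parse_health_summary` is made up for illustration, and the format is assumed to match the older-style `ceph -s` output shown here), the per-state PG counts can be pulled out like this:

```python
import re


def parse_health_summary(line: str) -> dict:
    """Split an older-style `ceph -s` health line into status, warnings and PG counts.

    Hypothetical helper: assumes the "HEALTH_x a; b; c" layout seen in this post.
    """
    text = line.strip()
    # Drop the leading "health" label if the whole status line was pasted in.
    if text.startswith("health "):
        text = text[len("health "):]
    status, _, rest = text.partition(" ")
    warnings = [w.strip() for w in rest.split(";") if w.strip()]
    pg_counts = {}
    for warning in warnings:
        # Matches entries such as "13 pgs backfill_toofull" or "1 pgs stuck degraded".
        match = re.match(r"(\d+) pgs (.+)$", warning)
        if match:
            pg_counts[match.group(2)] = int(match.group(1))
    return {"status": status, "warnings": warnings, "pg_counts": pg_counts}


if __name__ == "__main__":
    sample = (
        "health HEALTH_WARN 13 pgs backfill_toofull; 1 pgs degraded; "
        "1 pgs stuck degraded; 13 pgs stuck unclean; "
        "9 requests are blocked > 32 sec; 2 near full osd(s)"
    )
    parsed = parse_health_summary(sample)
    print(parsed["status"], parsed["pg_counts"])
```

Entries that do not follow the "N pgs state" pattern (blocked requests, near full OSDs, clock skew) are kept in `warnings` but left out of `pg_counts`.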

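This article analyzes and resolves a backfill_toofull fault in a Ceph cluster. After a Ceph capacity expansion, the cluster reported 12 PGs in active+remapped+backfill_toofull and 1 PG in active+degraded+remapped+backfill_toofull. The root cause was insufficient disk space on OSD.24: roughly 40 GB of data could not be written because the disk was already 86% used, close to the configured osd near full threshold. There are two possible fixes: add new Ceph storage nodes to increase capacity, or temporarily raise the osd_near_full parameter to .95, although the latter carries considerable risk.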
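
To make the arithmetic behind this concrete, here is a minimal sketch of the headroom check. It is only an illustration of the reasoning, not how Ceph itself decides. The 86% usage, the roughly 40 GB of pending data and the .95 value come from the summary above. The 1000 GB disk size and the 0.85 starting threshold are assumptions, since the post gives neither the capacity of OSD.24 nor the configured ratio. Both function names are hypothetical.

```python
def backfill_headroom_gb(capacity_gb: float, used_ratio: float, full_ratio: float) -> float:
    """GB that can still be written before usage reaches full_ratio.

    A negative result means the disk is already past the threshold.
    """
    return capacity_gb * (full_ratio - used_ratio)


def can_accept_backfill(capacity_gb: float, used_ratio: float,
                        incoming_gb: float, full_ratio: float) -> bool:
    """True if incoming_gb fits without pushing usage above full_ratio."""
    return incoming_gb <= backfill_headroom_gb(capacity_gb, used_ratio, full_ratio)


if __name__ == "__main__":
    capacity_gb = 1000.0   # assumed disk size, not given in the post
    used_ratio = 0.86      # OSD.24 usage reported in the post
    incoming_gb = 40.0     # data that could not be written, per the post

    for ratio in (0.85, 0.95):  # 0.85 is an assumed threshold; .95 is the proposed workaround
        ok = can_accept_backfill(capacity_gb, used_ratio, incoming_gb, ratio)
        print(f"threshold {ratio}: backfill fits = {ok}")
```

Under these assumed numbers the backfill is refused at 0.85 and fits at 0.95. Raising the threshold leaves far less free space on the disk once the data lands, which is consistent with the workaround being described as risky.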




