Detecting Ceph OSD slow ops

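This article describes how to detect and handle slow operations on OSDs (Object Storage Daemons) in a Ceph storage cluster. It covers the message layer, OSD op preparation, FileStore problems, OSD events tied to the local disk, and how to read and change OSD configuration.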

Purpose

Common methods for detecting Ceph slow ops problems.

Reference output

ceph -s
  cluster:
    id:     22908555-e596-4c2d-a1f6-34fcf4d3e935
    health: HEALTH_WARN
            Degraded data redundancy: 46384/12805029 objects degraded (0.362%), 145 pgs degraded, 122 pgs undersized
            309 slow ops, oldest one blocked for 252 sec, daemons [osd.0,osd.10,osd.101,osd.105,osd.106,osd.107,osd.110,osd.111,osd.112,osd.116]... have slow ops.

  services:
    mon: 3 daemons, quorum gd15-ceph-mon-dbbackup-003,gd15-ceph-mon-dbbackup-001,gd15-ceph-mon-dbbackup-002 (age 4d)
    mgr: gd15-ceph-mon-dbbackup-001(active, since 4d), standbys: gd15-ceph-mon-dbbackup-003
    mds: dba_fs:1 {0=gd15-ceph-mds-dbbackup-002=up:active} 2 up:standby
    osd: 152 osds: 152 up (since 28m), 152 in (since 51m); 122 remapped pgs

  data:
    pools:   3 pools, 4353 pgs
    objects: 4.27M objects, 16 TiB
    usage:   75 TiB used, 784 TiB / 860 TiB avail
    pgs:     46384/12805029 objects degraded (0.362%)
             1260/12805029 objects misplaced (0.010%)
             4205 active+clean
             119  active+recovery_wait+undersized+degraded+remapped
             24   active+recovery_wait+degraded
             2    active+recovering+undersized+remapped
             1    active+recovering+degraded
             1    active+recovery_wait
             1    active+recovering+undersized+degraded+remapped

  io:
    client:   7.5 GiB/s wr, 0 op/s rd, 2.04k op/s wr
    recovery: 10 MiB/s, 2 objects/s
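
The slow ops warning in the status output above can be picked apart programmatically. The following is a minimal sketch, not an official tool: `parse_slow_ops` is a hypothetical helper name, and the regular expression only covers the wording shown in the sample ("N slow ops, oldest one blocked for X sec, daemons [...]"). Other Ceph releases, or a warning that names a single daemon, may be worded differently. Note also that the daemon list in the warning is truncated ("..."), so the result may not name every affected OSD.

```python
import re

# Matches the summary line as it appears in the sample `ceph -s` output, e.g.
# "309 slow ops, oldest one blocked for 252 sec, daemons [osd.0,osd.10]... have slow ops."
SLOW_OPS_RE = re.compile(
    r"(?P<count>\d+) slow ops, oldest one blocked for (?P<age>\d+) sec, "
    r"daemons \[(?P<daemons>[^\]]*)\]"
)


def parse_slow_ops(status_text):
    """Return (count, oldest_age_sec, daemons), or None if no slow ops line is found."""
    match = SLOW_OPS_RE.search(status_text)
    if match is None:
        return None
    # Split the bracketed list into individual daemon names.
    daemons = [d.strip() for d in match.group("daemons").split(",") if d.strip()]
    return int(match.group("count")), int(match.group("age")), daemons
```

Feeding it the text captured from `ceph -s` yields the daemons to inspect next through their admin sockets.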

Inspecting slow ops on an OSD

ceph daemon /var/run/ceph/vip-ceph-osd.0.asok dump_ops_in_flight 
ceph daemon /var/run/ceph/vip-ceph-osd.0.asok dump_historic_ops 
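
Both commands reply with JSON, so the output can be ranked by duration with a short script. This is a sketch under assumptions: `admin_socket` and `slowest_ops` are hypothetical helper names, and the code assumes the reply has a top-level "ops" list whose entries carry "duration" and "description" keys. Check the output of your own Ceph release before relying on it.

```python
import json
import subprocess


def admin_socket(asok_path, *command):
    """Run `ceph daemon <asok_path> <command...>` and return the parsed JSON reply."""
    result = subprocess.run(
        ["ceph", "daemon", asok_path, *command],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def slowest_ops(dump, limit=5):
    """Return (duration_seconds, description) pairs for the longest ops in a dump."""
    # Assumption: each entry in dump["ops"] has "duration" and "description" keys.
    ops = dump.get("ops", [])
    ranked = sorted(ops, key=lambda op: float(op.get("duration", 0)), reverse=True)
    return [(float(op.get("duration", 0)), op.get("description", "")) for op in ranked[:limit]]


# Example call on a host that runs the OSD (path taken from the commands above):
#   dump = admin_socket("/var/run/ceph/vip-ceph-osd.0.asok", "dump_historic_ops")
#   for seconds, description in slowest_ops(dump):
#       print(f"{seconds:8.3f}s  {description}")
```

Keeping the subprocess call separate from the ranking makes it possible to run the ranking on a dump saved to a file as well.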

Interpreting the returned events

message layer

Event        Description
header_read  When the messenger first started reading the message off the wire.
throttled    When the messenger tried to acquire memory throttle space to read the message into memory.
all_read     When the messenger finished reading the message off the wire.
dispatched   When the messenger gave the message to the OSD.
initiated    This is identical to header_read. The existence of both is a historical oddity.

osd prepares

Event                    Description
queued_for_pg            The op has been put into the queue for processing by its PG.
reached_pg               The PG has started doing the op.
waiting for *            The op is waiting for some other work to complete before it can proceed (e.g. a new OSDMap; for its object target to scrub; for the PG to finish peering; all as specified in the message).
started                  The op has been accepted as something the OSD should do and is now being performed.
waiting for subops from  The op has been sent to replica OSDs.

filestore problem

Event                            Description
commit_queued_for_journal_write  The op has been given to the FileStore.
write_thread_in_journal_buffer   The op is in the journal's buffer and waiting to be persisted (as the next disk write).
journaled_completion_queued      The op was journaled to disk and its callback queued for invocation.

OSD events related to the local disk

Event                                        Description
op_commit                                    The op has been committed by the primary OSD.
op_applied                                   The op has been written with write() to the backing FS on the primary.
sub_op_applied                               Same as op_applied, but for a replica's sub-op.
sub_op_committed                             Same as op_commit, but for a replica's sub-op (only for EC pools).
sub_op_commit_rec/sub_op_apply_rec from <X>  The primary marks this when it hears about the above, but for a particular replica (i.e. <X>).
commit_sent                                  We sent a reply back to the client (or primary OSD, for sub ops).
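
To see which of the events above an op was stuck on, the time between consecutive events can be computed. The sketch below makes several assumptions: each op exposes its events as a list of {"time": ..., "event": ...} entries in chronological order, and timestamps use one of the two layouts listed in `TIME_FORMATS`. The helper names are hypothetical, and both the field layout and the timestamp format vary between Ceph releases, so adjust them to match your own output.

```python
from datetime import datetime

# Assumed timestamp layouts; adjust to what your Ceph release prints.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S.%f%z")


def parse_time(text):
    """Parse an event timestamp using the first layout that fits."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {text!r}")


def event_gaps(events):
    """Return [(event_name, seconds_since_previous_event), ...] for one op."""
    gaps = []
    previous = None
    for entry in events:
        stamp = parse_time(entry["time"])
        # The first event has no predecessor, so its gap is reported as 0.
        delta = 0.0 if previous is None else (stamp - previous).total_seconds()
        gaps.append((entry["event"], delta))
        previous = stamp
    return gaps


def slowest_step(events):
    """Return the (event_name, seconds) pair with the largest gap before it."""
    return max(event_gaps(events), key=lambda pair: pair[1])
```

A large gap before reached_pg, for example, means the op spent that time waiting in the PG queue after queued_for_pg. A large gap before journaled_completion_queued points at the journal write instead.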

Getting the OSD configuration

ceph daemon /var/run/ceph/vip-ceph-osd.0.asok config show
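
The configuration dump is long, so it helps to filter it down to the options of interest. The sketch below assumes the reply has already been parsed into a flat mapping of option name to value. `filter_config` is a hypothetical helper, and the option values shown in the comment are illustrative rather than taken from this cluster.

```python
def filter_config(config, needle):
    """Return the options whose name contains `needle`, sorted by name."""
    return {name: config[name] for name in sorted(config) if needle in name}


# Illustrative input, shaped like a parsed `config show` reply:
#   config = {"osd_op_complaint_time": "30.000000", "osd_max_backfills": "1"}
#   filter_config(config, "complaint")
# would keep only the osd_op_complaint_time entry.
```

Filtering on a substring such as "osd_op_" narrows the output to the settings that govern op tracking, including the threshold after which an op is reported as slow.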

Changing a setting

ceph daemon /var/run/ceph/vip-ceph-osd.0.asok config set <name> <value>