Goal
Add the MDS (CephFS) service to an existing Ceph environment

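Concepts
Clients can access CephFS through NFSv4 or through the native CephFS client
CephFS follows the common POSIX standard
To create a CephFS you must create two pools in Ceph RADOS
The data pool stores the file data
The metadata pool stores the metadata of that data (it can be thought of as holding the files' inode information)
When a client wants to access a file on CephFS, it first connects to the MDS service
To operate on a file, the client first connects to the MDS server; the MDS journals the client's operations, looks up the inode information in the metadata pool and returns it to the client, and the client then goes to the data pool to access the file data
Environment
Ceph status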
# ceph -s
cluster:
id: 7e720238-7xxxxxxxxxxxxxxd9d9a49ac4e4
health: HEALTH_OK
services:
mon: 3 daemons, quorum ns-storage-020100,ns-storage-020101,ns-storage-020102
mgr: ns-storage-020100(active), standbys: ns-storage-020101, ns-storage-020102
osd: 18 osds: 18 up, 18 in
data:
pools: 3 pools, 1152 pgs
objects: 250 objects, 631 MB
usage: 40584 MB used, 66966 GB / 67006 GB avail
pgs: 1152 active+clean
Ceph OSD status
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-12 24.00000 root noah
-9 8.00000 host ns-storage-020100.vclound.com
12 hdd 4.00000 osd.12 up 1.00000 1.00000
13 hdd 4.00000 osd.13 up 1.00000 1.00000
-10 8.00000 host ns-storage-020101.vclound.com
14 hdd 4.00000 osd.14 up 1.00000 1.00000
15 hdd 4.00000 osd.15 up 1.00000 1.00000
-11 8.00000 host ns-storage-020102.vclound.com
16 4.00000 osd.16 up 1.00000 1.00000
17 4.00000 osd.17 up 1.00000 1.00000
-1 47.63620 root default
-2 15.63620 host ns-storage-020100
0 hdd 3.63620 osd.0 up 1.00000 1.00000
1 hdd 4.00000 osd.1 up 1.00000 1.00000
2 hdd 4.00000 osd.2 up 1.00000 1.00000
3 hdd 4.00000 osd.3 up 1.00000 1.00000
-3 16.00000 host ns-storage-020101
4 hdd 4.00000 osd.4 up 1.00000 1.00000
5 hdd 4.00000 osd.5 up 1.00000 1.00000
6 hdd 4.00000 osd.6 up 1.00000 1.00000
7 hdd 4.00000 osd.7 up 1.00000 1.00000
-4 16.00000 host ns-storage-020102
8 hdd 4.00000 osd.8 up 1.00000 1.00000
9 hdd 4.00000 osd.9 up 1.00000 1.00000
10 hdd 4.00000 osd.10 up 1.00000 1.00000
11 hdd 4.00000 osd.11 up 1.00000 1.00000
Create the MDS
Create the data directory on every node. Note that the ID cannot be a bare number; here the short hostname is used as the ID
e.g.: mkdir -p /var/lib/ceph/mds/ceph-{id}
Run on every machine
mkdir -p /var/lib/ceph/mds/ceph-$(hostname -s)
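The ID restriction above can be checked before any directory is created. The following is a minimal sketch rather than part of the original procedure: the helper names is_valid_mds_id and mds_data_dir are hypothetical, and treating "must not begin with a digit" as the rule is an assumed reading of the "no bare number" restriction.

```shell
#!/bin/sh
# Hypothetical helper: accept an MDS ID only if it is non-empty and does
# not begin with a digit (assumed reading of the "no bare number" rule).
is_valid_mds_id() {
    case "$1" in
        ''|[0-9]*) return 1 ;;   # empty, or begins with a digit
        *)         return 0 ;;
    esac
}

# Print the MDS data directory for a given ID (cluster name "ceph").
mds_data_dir() {
    printf '/var/lib/ceph/mds/ceph-%s\n' "$1"
}

# Short hostname, with a fallback for systems that lack `hostname -s`.
id=$(hostname -s 2>/dev/null || uname -n | cut -d. -f1)

if is_valid_mds_id "$id"; then
    echo "would create: $(mds_data_dir "$id")"
else
    echo "refusing MDS id '$id': it must not start with a digit" >&2
fi
```

The script only prints the path it would create; replacing the echo with mkdir -p "$(mds_data_dir "$id")" would turn it into the real step.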
Create a keyring for each machine
Run on every machine
ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-{id}/keyring --gen-key -n mds.{id}
Run:
# ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-$(hostname -s)/keyring --gen-key -n mds.$(hostname -s )
creating /var/lib/ceph/mds/ceph-ns-storage-020100/keyring
# ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-$(hostname -s)/keyring --gen-key -n mds.$(hostname -s )
creating /var/lib/ceph/mds/ceph-ns-storage-020102/keyring
# ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-$(hostname -s)/keyring --gen-key -n mds.$(hostname -s )
creating /var/lib/ceph/mds/ceph-ns-storage-020102/keyring
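Register each machine's key and caps
Run on every machine (this run registered the entities as mds.0, mds.1 and mds.2; because the daemon is later started with --id $(hostname -s), it authenticates as mds.$(hostname -s), so registering the key under that name is probably the safer choice)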
# ceph auth add mds.0 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-$(hostname -s)/keyring
added key for mds.0
# ceph auth add mds.1 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-$(hostname -s)/keyring
added key for mds.1
# ceph auth add mds.2 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-$(hostname -s)/keyring
added key for mds.2
Add the configuration on every server
ceph.conf
[mds.0]
host = ns-storage-020100
[mds.1]
host = ns-storage-020101
[mds.2]
host = ns-storage-020102
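The sections above are keyed by number, while the daemons are started with the short hostname as their ID. As a hedged alternative, the sketch below generates one section per host keyed by hostname; the function name gen_mds_sections is hypothetical, and whether hostname-keyed sections are wanted in a given cluster is an assumption.

```shell
#!/bin/sh
# Emit one [mds.<id>] section per host, using the short hostname as the id.
# (Hypothetical helper; the original ceph.conf uses mds.0, mds.1, mds.2.)
gen_mds_sections() {
    for host in "$@"; do
        printf '[mds.%s]\nhost = %s\n' "$host" "$host"
    done
}

gen_mds_sections ns-storage-020100 ns-storage-020101 ns-storage-020102
```

The output can be reviewed and then appended to ceph.conf on each server.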
Remember to fix the ownership
chown ceph:ceph /var/lib/ceph/mds -R
Start the service
# cp /usr/lib/systemd/system/ceph-mds@.service /usr/lib/systemd/system/ceph-mds@$(hostname -s ).service
# systemctl start ceph-mds@$(hostname -s )
# systemctl status ceph-mds@$(hostname -s)
● ceph-mds@ns-storage-020100.service - Ceph metadata server daemon
Loaded: loaded (/usr/lib/systemd/system/ceph-mds@ns-storage-020100.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2020-12-10 09:53:52 CST; 6s ago
Main PID: 60814 (ceph-mds)
CGroup: /system.slice/system-ceph\x2dmds.slice/ceph-mds@ns-storage-020100.service
└─60814 /usr/bin/ceph-mds -f --cluster ceph --id ns-storage-020100 --setuser ceph --setgroup ceph
Dec 10 09:53:52 ns-storage-020100.vclound.com systemd[1]: Started Ceph metadata server daemon.
Dec 10 09:53:52 ns-storage-020100.vclound.com systemd[1]: Starting Ceph metadata server daemon...
Dec 10 09:53:52 ns-storage-020100.vclound.com ceph-mds[60814]: starting mds.ns-storage-020100 at -
Manage the MDS
CephFS status (no file system exists yet, so every MDS is in the standby state)
# ceph fs status
+-------------------+
| Standby MDS |
+-------------------+
| ns-storage-020102 |
| ns-storage-020100 |
| ns-storage-020101 |
+-------------------+
Ceph status (no file system has been created yet, so no mds entry appears under services)
# ceph -s
cluster:
id: 7e72023xxxxxxxxxxxxxxxxxxxxxxxd9d9a49ac4e4
health: HEALTH_OK
services:
mon: 3 daemons, quorum ns-storage-020100,ns-storage-020101,ns-storage-020102
mgr: ns-storage-020100(active), standbys: ns-storage-020101, ns-storage-020102
osd: 18 osds: 18 up, 18 in
data:
pools: 3 pools, 1152 pgs
objects: 250 objects, 631 MB
usage: 40605 MB used, 66966 GB / 67006 GB avail
pgs: 1152 active+clean
Create the pools dedicated to CephFS
# ceph osd pool create cephfs_data 256 256 (dedicated to file data)
pool 'cephfs_data' created
# ceph osd pool create cephfs_metadata 256 256 (dedicated to metadata)
pool 'cephfs_metadata' created
Tag the pools for CephFS use
# ceph osd pool application enable cephfs_metadata cephfs
enabled application 'cephfs' on pool 'cephfs_metadata'
# ceph osd pool application enable cephfs_data cephfs
enabled application 'cephfs' on pool 'cephfs_data'
Reference syntax for creating a CephFS
fs new <fs_name> <metadata> <data> {--force} {--allow-dangerous-metadata-overlay}
Create the CephFS
# ceph fs new noah_fs cephfs_metadata cephfs_data
new fs with metadata pool 5 and data pool 4
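The pool creation, application tagging and fs new steps above can be collected into one script. This is a sketch under stated assumptions: the run wrapper, the DRY_RUN variable and the create_cephfs function are hypothetical names, and by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# create_cephfs <fs_name> <metadata_pool> <data_pool> <pg_num>
# Creates both pools with pg_num == pgp_num, tags them for cephfs,
# then creates the file system on top of them.
create_cephfs() {
    fs=$1 meta=$2 data=$3 pgs=$4
    for pool in "$data" "$meta"; do
        run ceph osd pool create "$pool" "$pgs" "$pgs"
        run ceph osd pool application enable "$pool" cephfs
    done
    run ceph fs new "$fs" "$meta" "$data"
}

create_cephfs noah_fs cephfs_metadata cephfs_data 256
```

Running it with DRY_RUN=0 on an admin node would execute the same five commands shown in the transcript, in a slightly different order (each pool is tagged right after it is created).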
Check the Ceph status again
# ceph -s
cluster:
id: 7e72xxxxxxxxxxxxxxxxxxxxxxxxxx9d9a49ac4e4
health: HEALTH_OK
services:
mon: 3 daemons, quorum ns-storage-020100,ns-storage-020101,ns-storage-020102
mgr: ns-storage-020100(active), standbys: ns-storage-020101, ns-storage-020102
mds: noah_fs-1/1/1 up {0=ns-storage-020101=up:active}, 2 up:standby
osd: 18 osds: 18 up, 18 in
data:
pools: 5 pools, 1664 pgs
objects: 271 objects, 631 MB
usage: 40595 MB used, 66966 GB / 67006 GB avail
pgs: 1664 active+clean
Query the CephFS information
# ceph fs ls
name: noah_fs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
Check the service status
# ceph fs status
noah_fs - 0 clients
=======
+------+--------+-------------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+-------------------+---------------+-------+-------+
| 0 | active | ns-storage-020101 | Reqs: 0 /s | 0 | 1 |
+------+--------+-------------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
| Pool | type | used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 2246 | 45.1T |
| cephfs_data | data | 0 | 45.1T |
+-----------------+----------+-------+-------+
+-------------------+
| Standby MDS |
+-------------------+
| ns-storage-020102 |
| ns-storage-020100 |
+-------------------+
MDS version: ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
The admin user had not been fully authorized for the MDS before, so it is authorized now. Caps before the change:
client.admin
key: AQD6FlpdpOhJHxAAzuWwYHkYC9NUKrT4GgM8iQ==
auid: 0
caps: [mds] allow
caps: [mgr] allow *
caps: [mon] allow *
caps: [osd] allow *
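How to grant the caps (ceph auth caps replaces the entity's whole capability set, so every cap is listed again)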
# ceph auth caps client.admin mon 'allow *' osd 'allow *' mds 'allow *' mgr 'allow *'
updated caps for client.admin
Verify the admin caps
client.admin
key: AQD6FlpdpOhJHxAAzuWwYHkYC9NUKrT4GgM8iQ==
auid: 0
caps: [mds] allow *
caps: [mgr] allow *
caps: [mon] allow *
caps: [osd] allow *
Grant auth caps to a client
# ceph fs authorize noah_fs client.terry /terry rw /backupdata rw
[client.terry]
key = AQCCl9FfljvBOBAAf+JKomWC8djGk3qjUqyQFA==
The terry directory does not exist by default, so a user that can access / has to be created as well
# ceph fs authorize noah_fs client.mary / rw
[client.mary]
key = AQBNwdFfZZGMERAAJ/CdMbLy7BvMqt49R2ywXg==
Query the caps
# ceph auth list | grep -A 6 terry
installed auth entries:
client.terry
key: AQCClxxxxxxxxxxxxxxxxxxxxxdjGk3qjUqyQFA==
caps: [mds] allow rw path=/terry, allow rw path=/backupdata
caps: [mon] allow r
caps: [osd] allow rw pool=cephfs_data
# ceph auth list | grep -A 6 mary
client.mary
key: AQBNwdFxxxxxxxxxxxxxxxxxxxxxxxBvMqt49R2ywXg==
caps: [mds] allow rw
caps: [mon] allow r
caps: [osd] allow rw pool=cephfs_data
CephFS configuration reference
Using CephFS from a client
Client configuration
The client needs a ceph.conf in order to connect to Ceph
[global]
fsid = 7e7202xxxxxxxxxxxxxxxxx9a49ac4e4
mon initial members = ns-storage-020100,ns-storage-020101,ns-storage-020102
mon host = IPADDR,IPADDR,IPADDR
public network = 1.1.1.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 2048
filestore xattr use omap = true
osd pool default size = 3
osd pool default min size = 1
osd pool default pg num = 256
osd pool default pgp num = 256
osd crush chooseleaf type = 1
[osd]
osd journal size = 2048
osd heartbeat grace = 20
osd heartbeat interval = 5
[mds.0]
host = ns-storage-020100
[mds.1]
host = ns-storage-020101
[mds.2]
host = ns-storage-020102
Configure the secret key
Export the keyring of client mary and copy the file to the client
ceph auth get client.mary -o ceph.client.mary.keyring
While the directories /terry and /backupdata do not exist, client terry cannot mount and gets the following error
mount error 2 = No such file or directory
Mount
Mount CephFS as client mary and create the /terry directory
# mount -t ceph IPADDR:6789,IPADDR:6789,IPADDR:6789:/ /mnt -o name=mary,secret=AQBNwdFfxxxxxxxxxxxxxxxxxxxXg==
Check the mount
# mount | grep mnt
IPADDR:6789,IPADDR:6789,IPADDR:6789:/ on /mnt type ceph (rw,relatime,name=mary,secret=<hidden>,acl,wsize=16777216)
Create the terry directory in noah_fs
# mkdir /mnt/terry
# umount /mnt
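The device string passed to mount above is a comma-separated list of monitor address:port pairs followed by a colon and the CephFS path. The sketch below assembles that string; the function name build_mount_src is hypothetical, port 6789 is taken from the commands above, and the 192.0.2.x addresses are documentation-only placeholders standing in for the elided IPADDR values.

```shell
#!/bin/sh
# build_mount_src <path> <mon> [<mon>...]
# Joins the monitors as "mon:6789,mon:6789,...:<path>".
build_mount_src() {
    path=$1; shift
    src=
    for mon in "$@"; do
        # Prepend a comma only when src already holds an entry.
        src="${src:+$src,}${mon}:6789"
    done
    printf '%s:%s\n' "$src" "$path"
}

# Placeholder monitor addresses, not the real cluster's.
build_mount_src / 192.0.2.10 192.0.2.11 192.0.2.12
```

The result can be used as the first argument of mount -t ceph, for example mount -t ceph "$(build_mount_src /terry 192.0.2.10)" /mnt -o name=terry,secretfile=./terry.key.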
Using a secret key file
If you do not want to type the secret key in clear text on the command line
Save the key
echo "xxxxxxxxxyour_key_stringxxxxxxxxxxxxxx" > mary.key
The mount command becomes
# mount -t ceph X.X.X.X:6789,X.X.X.X:6789,X.X.X.X:6789:/ /mnt -o name=mary,secretfile=./mary.key
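The secretfile option expects a file that holds only the base64 key, not a full keyring, which is why the key is saved on its own. As a sketch, the key can be pulled out of an exported keyring instead of being pasted by hand; the function names extract_key and save_secret are hypothetical, and the keyring written in the demo is a made-up placeholder.

```shell
#!/bin/sh
# extract_key <keyring-file>: print only the secret from a line of the
# form "key = <base64>" in a Ceph keyring.
extract_key() {
    awk '$1 == "key" && $2 == "=" { print $3; exit }' "$1"
}

# save_secret <keyring-file> <secret-file>: write the bare key to a file
# that is readable by its owner only.
save_secret() {
    ( umask 077; extract_key "$1" > "$2" )
}

# Demo with a placeholder keyring (not a real key).
tmp=$(mktemp)
printf '[client.mary]\n\tkey = EXAMPLEKEYEXAMPLEKEY==\n' > "$tmp"
extract_key "$tmp"
rm -f "$tmp"
```

On a node with admin access, ceph auth print-key client.mary prints the same bare key directly and could be redirected into the secret file instead.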
Testing another user
Test the permissions of user terry
# ceph auth get client.terry -o ceph.client.terry.keyring
exported keyring for client.terry
# cat ceph.client.terry.keyring
[client.terry]
key = AQA4vkNigzV9GBAAGHNifIRTCiFMsdzwzZQVmQ==
caps mds = "allow rw path=/terry, allow rw path=/backupdata"
caps mon = "allow r"
caps osd = "allow rw pool=cephfs_data"
Create the key file
echo "AQA4vkNigzV9GBAAGHNifIRTCiFMsdzwzZQVmQ==" > terry.key
Test the connection from the client
# ceph fs ls -c ./ceph.conf -n client.terry -k ceph.client.terry.keyring
name: noah_fs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
Mount
# mount -t ceph IPADDR:6789,IPADDR:6789,IPADDR:6789:/terry /mnt/ -o name=terry,secretfile=./terry.key
# mount | grep mnt
IPADDR:6789,IPADDR:6789,IPADDR:6789:/terry on /mnt type ceph (rw,relatime,name=terry,secret=<hidden>,acl,wsize=16777216)
FAQ
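How many replicas does CephFS use, and how is the storage underneath CephFS managed?
All CephFS data is actually stored in RADOS
A CephFS maps to a metadata pool and a data pool
Managing those two pools is all that is needed
Replicas
Query the current replica count of the pools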
# ceph osd pool get cephfs_data size
size: 3
# ceph osd pool get cephfs_metadata size
size: 3
To change it (3 is the replica count)
# ceph osd pool set your_pool_name size 3
rule
Query the current pool information
# ceph osd dump | grep "^pool" | grep "crush"
pool 1 'volumes' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 263 flags hashpspool stripe_width 0 application rbd
pool 2 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 59 flags hashpspool stripe_width 0 application rbd
pool 3 'noahpool' replicated size 3 min_size 1 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 350 flags hashpspool stripe_width 0 application rbd
pool 4 'cephfs_data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 17798 flags hashpspool stripe_width 0 application cephfs
pool 5 'cephfs_metadata' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 17798 flags hashpspool stripe_width 0 application cephfs
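The pool lines above are long, and only the replica size and the crush_rule matter for what follows. The sketch below reduces ceph osd dump output to those fields; the function name pool_rules is hypothetical, and the parsing assumes the Luminous-style line layout shown above.

```shell
#!/bin/sh
# pool_rules: read `ceph osd dump` text on stdin and print
# "<pool-name> size=<n> crush_rule=<id>" for every pool line.
pool_rules() {
    awk -v q="'" '$1 == "pool" {
        name = $3
        gsub(q, "", name)                 # strip the single quotes
        size = rule = "?"
        for (i = 4; i < NF; i++) {
            if ($i == "size")       size = $(i + 1)   # exact match, so min_size is skipped
            if ($i == "crush_rule") rule = $(i + 1)
        }
        print name, "size=" size, "crush_rule=" rule
    }'
}

# Demo on one of the lines from the transcript above.
pool_rules <<'EOF'
pool 4 'cephfs_data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 17798 flags hashpspool stripe_width 0 application cephfs
EOF
```

Against a live cluster the same function would be fed with ceph osd dump | pool_rules.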
Place cephfs_data and cephfs_metadata under different CRUSH roots
Query the current Ceph OSD tree
The plan is to place cephfs_data under the noah root
and to keep cephfs_metadata under the default root (the default, no change needed)
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-12 24.00000 root noah
-9 8.00000 host ns-storage-020100.vclound.com
12 hdd 4.00000 osd.12 up 1.00000 1.00000
13 hdd 4.00000 osd.13 up 1.00000 1.00000
-10 8.00000 host ns-storage-020101.vclound.com
14 hdd 4.00000 osd.14 up 1.00000 1.00000
15 hdd 4.00000 osd.15 up 1.00000 1.00000
-11 8.00000 host ns-storage-020102.vclound.com
16 4.00000 osd.16 up 1.00000 1.00000
17 4.00000 osd.17 up 1.00000 1.00000
-1 47.63620 root default
-2 15.63620 host ns-storage-020100
0 hdd 3.63620 osd.0 up 1.00000 1.00000
1 hdd 4.00000 osd.1 up 1.00000 1.00000
2 hdd 4.00000 osd.2 up 1.00000 1.00000
3 hdd 4.00000 osd.3 up 1.00000 1.00000
-3 16.00000 host ns-storage-020101
4 hdd 4.00000 osd.4 up 1.00000 1.00000
5 hdd 4.00000 osd.5 up 1.00000 1.00000
6 hdd 4.00000 osd.6 up 1.00000 1.00000
7 hdd 4.00000 osd.7 up 1.00000 1.00000
-4 16.00000 host ns-storage-020102
8 hdd 4.00000 osd.8 up 1.00000 1.00000
9 hdd 4.00000 osd.9 up 1.00000 1.00000
10 hdd 4.00000 osd.10 up 1.00000 1.00000
11 hdd 4.00000 osd.11 up 1.00000 1.00000
Two CRUSH rules were defined earlier
# ceph osd crush rule ls
replicated_rule (the default rule, uses the default root)
noah_rule (uses the noah root)
Move cephfs_data to the noah root
# ceph osd pool set cephfs_data crush_rule noah_rule
set pool 4 crush_rule to noah_rule
Confirm the rules
# ceph osd dump | grep "^pool" | grep "crush" | grep cephfs
pool 4 'cephfs_data' replicated size 3 min_size 1 crush_rule 1 object_hash rjenkins pg_num 256 pgp_num 256 last_change 17801 flags hashpspool stripe_width 0 application cephfs
pool 5 'cephfs_metadata' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 17799 flags hashpspool stripe_width 0 application cephfs
Once the CRUSH rule of a pool has been changed, its data is migrated between OSDs accordingly
# ceph -s
...
...
data:
pools: 5 pools, 1664 pgs
objects: 1295 objects, 4727 MB
usage: 53298 MB used, 66954 GB / 67006 GB avail
pgs: 4726/3885 objects degraded (121.647%)
1437 active+clean
223 active+recovery_wait+degraded
4 active+recovering+degraded
io:
recovery: 40888 kB/s, 9 objects/s
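The migration is finished once every PG is back to active+clean. The sketch below checks that condition from ceph -s text; the function name pgs_clean is hypothetical, and the parsing assumes the output layout shown in the transcripts above, which may differ in other Ceph releases.

```shell
#!/bin/sh
# pgs_clean: read `ceph -s` text on stdin; succeed only when the number of
# PGs reported as exactly "active+clean" equals the total PG count.
pgs_clean() {
    awk '
        $1 == "pools:"                       { total = $4 }  # pools: 5 pools, 1664 pgs
        $2 == "active+clean"                 { clean = $1 }  # 1437 active+clean
        $1 == "pgs:" && $3 == "active+clean" { clean = $2 }  # pgs: 1664 active+clean
        END { exit !(total > 0 && total == clean) }
    '
}

# Demo on the status captured during the migration above.
if pgs_clean <<'EOF'
pools: 5 pools, 1664 pgs
pgs: 4726/3885 objects degraded (121.647%)
1437 active+clean
223 active+recovery_wait+degraded
4 active+recovering+degraded
EOF
then
    echo "all PGs active+clean"
else
    echo "recovery still in progress"
fi
```

On a live cluster the check could be polled, for example until ceph -s | pgs_clean; do sleep 10; done.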
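This document describes how to manually create and configure the CephFS service in an existing Ceph environment: how CephFS works, the environment, creating the MDS daemons, setting up the pools, configuring and using a client, and troubleshooting. Creating a CephFS involves creating the data and metadata pools, starting and managing the MDS servers, mounting from clients and setting their permissions, and managing replicas and CRUSH rules.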




