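hdfs - balancer study notes

This article records in detail how data is balanced in Hadoop HDFS, including setting the balancing threshold, adjusting the bandwidth, and running the balancer command. It discusses how the balancer picks source and target nodes, and how to deal with problems caused by the standby node. It also analyzes why DFSUsed% differs between datanodes, and looks at what balancing actually targets: the ratio of DFSUsed to ConfiguredCapacity.

What is the balancer? As the name suggests, it is a balancing tool.

Its main job is to even out disk usage across the datanodes.

The write-ups you find online each sound more impressive than the last, yet some of them even misspell the commands... And do they tell you what exactly is being balanced?

So go straight to the official documentation.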

Apache Hadoop 3.2.2 – HDFS Commands Guide: https://hadoop.apache.org/docs/r3.2.2/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer

-- Check the balancer bandwidth, i.e. the maximum speed at which a datanode moves data to other datanodes while balancing

hdfs dfsadmin -getBalancerBandwidth node17:9867 

Balancer bandwidth is 10485760 bytes per second.  -- 10 MB/s felt too slow, so I set it to 20 MB/s (20971520 bytes)

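There was a permission problem at this step: the command requires superuser privilege, so I had to authenticate as hdfs (kinit hdfs). The first attempt in the transcript below also fails with a NumberFormatException, apparently because of a stray whitespace character in front of the number.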

hdfs dfsadmin -setBalancerBandwidth 20971520
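
Since -setBalancerBandwidth takes bytes per second, the conversion is easy to get wrong by hand. A minimal sketch of two helpers follows; the function names mib_to_bytes and bytes_to_mib are made up for illustration and are not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Hadoop) for converting balancer bandwidth values.

# Convert MiB/s to the bytes-per-second value that -setBalancerBandwidth expects.
mib_to_bytes() {
  local mib="$1"
  echo $(( mib * 1024 * 1024 ))
}

# Convert a bytes-per-second value, as printed by -getBalancerBandwidth, back to MiB/s.
bytes_to_mib() {
  local bytes="$1"
  echo $(( bytes / 1024 / 1024 ))
}

mib_to_bytes 20        # prints 20971520
bytes_to_mib 10485760  # prints 10
```

With these helpers the command could be written as hdfs dfsadmin -setBalancerBandwidth "$(mib_to_bytes 20)", which also avoids typing a stray character into the number.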

[root@worker01 /home/devuser]# hdfs dfsadmin -setBalancerBandwidth  20971520
NumberFormatException: For input string: " 20971520"
Usage: hdfs dfsadmin [-setBalancerBandwidth <bandwidth in bytes per second>]
[root@worker01 /home/devuser]# hdfs dfsadmin -setBalancerBandwidth 20971520
setBalancerBandwidth: Access denied for user hive. Superuser privilege is required
[root@worker01 /home/devuser]# kinit hdfs 
Password for hdfs@CDH.COM: 
[root@worker01 /home/devuser]# hdfs dfsadmin -setBalancerBandwidth 20971520
[root@worker01 /home/devuser]# hdfs dfsadmin -setBalancerBandwidth 20971520
Balancer bandwidth is set to 20971520 for master.data.com/9.134.64.234:8020
Balancer bandwidth is set to 20971520 for node01.data.com/9.134.66.48:8020

-- At this step I ran into a question: I know the IP, but which port is this? I then noticed that it is the ipc_port.

Now get ready to run the balancer

[root@worker01 /home/devuser]# hdfs balancer --help
Usage: hdfs balancer
	[-policy <policy>]	the balancing policy: datanode or blockpool
	[-threshold <threshold>]	Percentage of disk capacity
	[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]	Excludes the specified datanodes.
	[-include [-f <hosts-file> | <comma-separated list of hosts>]]	Includes only the specified datanodes.
	[-source [-f <hosts-file> | <comma-separated list of hosts>]]	Pick only the specified datanodes as source nodes.
	[-blockpools <comma-separated list of blockpool ids>]	The balancer will only run on blockpools included in this list.
	[-idleiterations <idleiterations>]	Number of consecutive idle iterations (-1 for Infinite) before exit.
	[-runDuringUpgrade]	Whether to run the balancer during an ongoing HDFS upgrade.This is usually not desired since it will not affect used space on over-utilized machines.

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
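
To try out combinations of the options listed in the help text without touching the cluster, one option is to build the command line and print it first. The sketch below only uses -threshold and -exclude from the usage output above; the function name balancer_cmd and the hostnames dn-a.example.com and dn-b.example.com are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints a balancer command line instead of running it.
# Arguments: <threshold percent> [comma-separated hosts to exclude]
balancer_cmd() {
  local threshold="$1" excluded="$2"
  local cmd="hdfs balancer -threshold ${threshold}"
  # Append -exclude only when a host list was given.
  if [ -n "$excluded" ]; then
    cmd="${cmd} -exclude ${excluded}"
  fi
  echo "$cmd"
}

balancer_cmd 5
balancer_cmd 10 "dn-a.example.com,dn-b.example.com"
```

Once the printed line looks right, it can be pasted into a shell that is authenticated as a user allowed to run the balancer.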

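At this point there is a question to consider: what counts as balanced?

Say there are 10 datanodes with 100 GB of capacity each, 1 TB in total, of which 200 GB is used. dn1 uses 1 KB and dn2 uses 99 GB.

How should this be balanced? Should dn1 and dn2 each end up at 20 GB or at 50 GB? I did not know either.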
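
To make the question concrete, the utilization figures for this example can be worked out. This is only a sketch and it assumes utilization means DFSUsed divided by ConfiguredCapacity, the ratio named in the introduction; the function name utilization is made up for illustration.

```shell
#!/usr/bin/env bash
# utilization <used> <capacity>: percentage of capacity in use, two decimals.
# Assumption: utilization = DFSUsed / ConfiguredCapacity (both in the same unit).
utilization() {
  awk -v used="$1" -v cap="$2" 'BEGIN { printf "%.2f\n", used * 100 / cap }'
}

# Numbers from the example, in GB: 10 datanodes x 100 GB, 200 GB used overall.
utilization 200 1000       # cluster as a whole: 20.00
utilization 99 100         # dn2: 99.00
utilization 0.000001 100   # dn1 (about 1 KB): 0.00
```

Under that assumption the cluster as a whole sits at 20%, so a balancer that pulls every node toward the cluster-wide figure would move these 100 GB nodes toward roughly 20 GB each rather than 50 GB.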

Start the experiment

threshold 

hdfs balancer -threshold 5  -- threshold = 5, i.e. each datanode's utilization may differ from the cluster-wide utilization by at most 5 percentage points
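
The effect of the threshold can be sketched as a simple comparison against the cluster-wide utilization. This is a simplified illustration, not the balancer's own code: the function name classify and the three labels are made up, and the 20% cluster utilization comes from the 10-datanode example above (200 GB used out of 1 TB).

```shell
#!/usr/bin/env bash
# classify <node_util_pct> <cluster_util_pct> <threshold_pct>
# Simplified, hypothetical check of one node against the threshold band.
classify() {
  awk -v u="$1" -v avg="$2" -v t="$3" 'BEGIN {
    if (u > avg + t)      print "over-utilized"
    else if (u < avg - t) print "under-utilized"
    else                  print "within-threshold"
  }'
}

classify 99 20 5   # dn2 at 99%: over-utilized
classify 0 20 5    # dn1 at about 0%: under-utilized
classify 23 20 5   # a node at 23%: within-threshold
```

With a threshold of 5 and a cluster at 20%, any node between 15% and 25% would count as balanced in this sketch; a smaller threshold narrows that band and generally means more data has to be moved.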

[root@worker01 /home/devuser]# hdfs balancer -threshold 5
22/06/27 11:29:18 INFO balancer.Balancer: Using a threshold of 5.0
22/06/27 11:29:18 INFO balancer.Balancer: namenodes  = [hdfs://s2cluster]
22/06/27 11:29:18 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 5.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]
22/06/27 11:29:18 INFO balancer.Balancer: included nodes = []
22/06/27 11:29:18 INFO balancer.Balancer: excluded nodes = []
22/06/27 11:29:18 INFO balancer.Balancer: source nodes = []
Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
22/06/27 11:29:19 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
22/06/27 11:29:19 INFO block.BlockTokenSecretManager: Setting block keys
22/06/27 11:29:19 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
22/06/27 11:29:19 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
22/06/27 11:29:19 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
22/06/27 11:29:19 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
22/06/27 11:29:19 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)
22/06/27 11:29:19 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
22/06/27 11:29:19 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
22/06/27 11:29:19 INFO block.BlockTokenSecretManager: Setting block keys
22/06/27 11:29:19 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
22/06/27 11:29:19 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.117.90:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.68.200:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.80.60:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.81.221:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.124.14:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.122.87:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.83.33:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.163.60:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.115.141:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.124.36:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.71.192:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.123.37:1004
22/06/27 11:29:19 INFO net.NetworkTopology: Adding a new node: /default/9.134.117.73:1004
22/06/27 11:29:19 INFO