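I recently ran into a problem: I needed to delete one specific column under a specific column family. The principle is simple: scan the HBase table, collect the row keys, and issue the deletes. The cell values are irrelevant here, so there is no point in returning them. But how do you make the scan fast? High performance is the first consideration. Without further ado, here is the code: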
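import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.util.Bytes;

public static void delete(Configuration config, String tableName, String cf, String qualifier) {
    byte[] family = Bytes.toBytes(cf);
    byte[] column = Bytes.toBytes(qualifier);
    Scan scan = new Scan();
    // Tune these values for your own workload; setting them speeds up the scan.
    scan.setCaching(10000);      // rows fetched per RPC
    scan.setCacheBlocks(false);  // a one-off full scan should not churn the block cache
    // setFilter() replaces any filter set earlier, so the three filters are combined
    // in one FilterList (MUST_PASS_ALL by default). KeyOnlyFilter returns the row key,
    // column family and qualifier but not the value, which a delete does not need.
    scan.setFilter(new FilterList(
            new FamilyFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(family)),
            new QualifierFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(column)),
            new KeyOnlyFilter()));
    // try-with-resources closes the scanner, table and connection even on failure.
    try (Connection connection = ConnectionFactory.createConnection(config);
         Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner scanner = table.getScanner(scan)) {
        final int batchSize = 100000;
        List<Delete> deletes = new ArrayList<>();
        for (Result result : scanner) {
            if (result.containsColumn(family, column)) {
                // addColumns removes every version of the column;
                // use addColumn instead to remove only the latest version.
                deletes.add(new Delete(result.getRow()).addColumns(family, column));
            }
            if (deletes.size() >= batchSize) {
                table.delete(deletes);
                deletes = new ArrayList<>();
            }
        }
        if (!deletes.isEmpty()) {
            table.delete(deletes); // flush the final partial batch
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}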
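
The batching in the method above (collect deletes up to a limit, send them, repeat until the scanner is exhausted) can be looked at in isolation. Below is a minimal sketch in plain Java with no HBase dependency. The class `BatchDrainSketch` and the method `drainInBatches` are hypothetical names made up for this illustration and are not part of any HBase API. The sink stands in for a call such as a batched table delete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only: drains an iterator in
// fixed-size batches and hands each non-empty batch to a sink.
final class BatchDrainSketch {

    // Returns the total number of items handed to the sink.
    static <T> long drainInBatches(Iterator<T> source, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long total = 0;
        List<T> batch = new ArrayList<>();
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                total += batch.size();    // count first: the sink may mutate the list
                sink.accept(batch);
                batch = new ArrayList<>(); // start a fresh list for the next batch
            }
        }
        if (!batch.isEmpty()) {            // flush the final partial batch
            total += batch.size();
            sink.accept(batch);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        long n = drainInBatches(Arrays.asList(1, 2, 3, 4, 5, 6, 7).iterator(), 3,
                batch -> sizes.add(batch.size()));
        System.out.println(n + " items in batches of " + sizes); // prints "7 items in batches of [3, 3, 1]"
    }
}
```

Looping until the iterator is empty, rather than for a fixed number of rounds, means no rows are silently skipped on a very large table, and memory use stays bounded by the batch size.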




