Thoughts on HFile: Creating and Parsing an HFile

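Reposted from: http://hbase.info/2011/07/22/think_about_hfile

Original link: http://blog.data-works.org/2011/07/关于HFile的思考/

The original author, Guo Peng (郭鹏), is a pioneer and practitioner of Cassandra in China. He is a senior software engineer who specializes in building and operating distributed applications and has extensive hands-on experience. Sina Weibo: @逖靖寒

---------- a divider for no particular reason ----------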

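HBase 0.90.x stores its data in HFiles.

For a detailed introduction to HFile, see this article: http://www.data-works.org/download/hfile.pdf

That article covers the following five topics:

What HFile is for.
The HFile format.
HFile performance.
Caveats when using HFile.
The HFile programming interface.

One very important HFile parameter is the block size. What happens if a value written to an HFile is larger than the block size?

The following test code tries exactly that: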

 
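    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RawLocalFileSystem;
    import org.apache.hadoop.hbase.io.hfile.Compression;
    import org.apache.hadoop.hbase.io.hfile.HFile;

    // create local file system
    FileSystem fs = new RawLocalFileSystem();
    fs.setConf(new Configuration());

    // block size = 1kb (the argument is in bytes, so 1 KB is 1024);
    // the cast on the first null selects the Compression.Algorithm overload
    HFile.Writer hwriter = new HFile.Writer(fs,
            new Path("hfile"), 1024, (Compression.Algorithm) null, null);

    // create key & value, the value is 8kb, larger than 1kb
    byte[] key = "www.data-works.org".getBytes();
    byte[] value = new byte[8 * 1024];
    for (int i = 0; i < 8 * 1024; i++) {
        value[i] = '0';
    }

    // add values to hfile (the same key/value pair, ten times)
    for (int i = 0; i < 10; i++) {
        hwriter.append(key, value);
    }

    // close hfile
    hwriter.close();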

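As the code above shows, every value is 8 KB, which already exceeds the 1 KB block size configured for the HFile.

What actually happens is that a value larger than the block size is simply written at its actual size.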
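To make "written at its actual size" concrete, here is a minimal sketch in plain Java of a buffer that treats the block size as a soft threshold rather than a hard cap. It is only an illustration under stated assumptions: the class `SoftBlockDemo` and its methods are hypothetical and are not HBase code, and the entry layout used here (a 4-byte key length, a 4-byte value length, then the key and value bytes) is assumed for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical toy model of a data block with a soft size threshold (not HBase code). */
public class SoftBlockDemo {
    private final int blockSize; // soft threshold, in bytes
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    public SoftBlockDemo(int blockSize) {
        this.blockSize = blockSize;
    }

    /** Appends one entry whole; the threshold never causes an entry to be split. */
    public void append(byte[] key, byte[] value) {
        writeInt(key.length);   // assumed layout: 4-byte key length
        writeInt(value.length); // assumed layout: 4-byte value length
        buf.write(key, 0, key.length);
        buf.write(value, 0, value.length);
    }

    private void writeInt(int v) {
        byte[] b = ByteBuffer.allocate(4).putInt(v).array();
        buf.write(b, 0, b.length);
    }

    /** Number of bytes currently held in this block. */
    public int size() {
        return buf.size();
    }

    /** True once the block has grown past the configured threshold. */
    public boolean exceedsBlockSize() {
        return buf.size() > blockSize;
    }

    public static void main(String[] args) {
        SoftBlockDemo block = new SoftBlockDemo(1024); // 1 KB threshold
        byte[] key = "www.data-works.org".getBytes(StandardCharsets.UTF_8);
        byte[] value = new byte[8 * 1024];             // 8 KB value
        Arrays.fill(value, (byte) '0');

        block.append(key, value);
        System.out.println("bytes in block: " + block.size());
        System.out.println("exceeds block size: " + block.exceedsBlockSize());
    }
}
```

In this toy model the single 8 KB entry ends up in one block that is far larger than the 1 KB threshold, because the threshold is only compared against the buffer size and is never used to cut an entry in two.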

After the test above finishes, the entire HFile contains only 1 data block.

The code for reading this HFile back is as follows:

 
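    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RawLocalFileSystem;
    import org.apache.hadoop.hbase.io.hfile.HFile;
    import org.apache.hadoop.hbase.io.hfile.HFileScanner;

    // create local file system (initialize() also sets the configuration)
    FileSystem fs = new RawLocalFileSystem();
    fs.initialize(URI.create("file:///"), new Configuration());

    // no block cache, not in-memory
    HFile.Reader hreader = new HFile.Reader(fs,
            new Path("hfile"), null, false);

    // loadFileInfo
    hreader.loadFileInfo();

    // cacheBlocks = false, pread = false
    HFileScanner hscanner = hreader.getScanner(false, false);

    // seek to the start position of the hfile.
    // seekTo() already positions the scanner on the first entry, so read
    // before calling next(); otherwise the first entry is skipped.
    if (hscanner.seekTo()) {
        // print values.
        int index = 1;
        do {
            System.out.println("index: " + index++);
            System.out.println("key: " + hscanner.getKeyString());
            System.out.println("value: " + hscanner.getValueString());
        } while (hscanner.next());
    }

    // close hfile.
    hreader.close();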
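
A detail worth spelling out for the read loop: with a cursor-style scanner, `seekTo()` positions the cursor on the first entry and `next()` advances to the following one, so a loop of the form `while (next())` that runs right after `seekTo()` never prints the first entry. The sketch below shows the difference in plain Java. `ListScanner` is a hypothetical stand-in that mimics those two calls over an in-memory list; it is not the HBase `HFileScanner` class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CursorLoopDemo {

    /** Hypothetical stand-in with seekTo()/next() cursor semantics (not HBase code). */
    public static final class ListScanner {
        private final List<String> entries;
        private int pos = -1;

        public ListScanner(List<String> entries) {
            this.entries = entries;
        }

        /** Positions on the first entry; returns false if there are no entries. */
        public boolean seekTo() {
            if (entries.isEmpty()) {
                return false;
            }
            pos = 0;
            return true;
        }

        /** Advances to the next entry; returns false when already at the last one. */
        public boolean next() {
            if (pos + 1 >= entries.size()) {
                return false;
            }
            pos++;
            return true;
        }

        public String current() {
            return entries.get(pos);
        }
    }

    /** Reads every entry: consume the current one first, then advance. */
    public static List<String> readAll(List<String> entries) {
        ListScanner s = new ListScanner(entries);
        List<String> seen = new ArrayList<>();
        if (s.seekTo()) {
            do {
                seen.add(s.current());
            } while (s.next());
        }
        return seen;
    }

    /** Advances before reading, so the first entry is never consumed. */
    public static List<String> readSkippingFirst(List<String> entries) {
        ListScanner s = new ListScanner(entries);
        List<String> seen = new ArrayList<>();
        s.seekTo();
        while (s.next()) {
            seen.add(s.current());
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList("kv1", "kv2", "kv3");
        System.out.println("do/while loop: " + readAll(entries));
        System.out.println("while(next()): " + readSkippingFirst(entries));
    }
}
```

For the ten entries written by the test, a `do { ... } while (next())` loop guarded by `seekTo()` visits all ten, while `while (next())` alone would visit only the last nine.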