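Which kinds of data can h5write and h5read handle, and how efficient are they? Today I ran a small experiment to find out.
Overall, for Float64 data the write and read performance is impressive, whether the data is a Vector or a Matrix.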
using HDF5;
A Dict{ASCIIString,Array{ASCIIString,1}} is not supported; h5write fails on it:
A = Dict{ASCIIString,Array{ASCIIString,1}}()
for i = 1:100000
    temp = ASCIIString[]
    for j = 1:100
        push!(temp, string(j))
    end
    A[string(i)] = temp
end
println("HDF5=>A: $(typeof(A))")
println("write data: ")
@time h5write("C://Users//Administrator//Desktop//test1.h5", "mygroup2/A", A)
println("read data:")
@time h5read("C://Users//Administrator//Desktop//test1.h5","mygroup2/A")
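Since h5write rejects the Dict directly, one possible workaround is to store each key as its own dataset inside a group and rebuild the Dict on read. The following is only a sketch under assumptions: it targets a recent Julia (1.x) and a recent HDF5.jl, where String replaces ASCIIString and create_group and keys are available; write_dict and read_dict are hypothetical helper names, not part of HDF5.jl or of the original test. It uses a tiny Dict, because creating one dataset per key is likely to be slow for 100000 keys.

```julia
using HDF5

# Hypothetical helper: write every Dict entry as a separate dataset under `group`.
function write_dict(path::AbstractString, group::AbstractString,
                    d::Dict{String,Vector{String}})
    h5open(path, "w") do file
        g = create_group(file, group)
        for (k, v) in d
            g[k] = v          # one string-array dataset per key
        end
    end
end

# Hypothetical helper: rebuild the Dict from the datasets found in `group`.
function read_dict(path::AbstractString, group::AbstractString)
    h5open(path, "r") do file
        g = file[group]
        Dict{String,Vector{String}}(k => read(g[k]) for k in keys(g))
    end
end

# Small round trip on a temporary file (assumed path, not the one used above).
d = Dict("1" => ["a", "b"], "2" => ["c"])
path = tempname() * ".h5"
write_dict(path, "A", d)
println(read_dict(path, "A") == d)
```

The trade-off is many small datasets instead of one large one, so this fits Dicts with a moderate number of keys better than the 100000-key case tested here.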
String arrays can be written and read:
B = ASCIIString[]
for i = 1:10000000
    push!(B, string(i))
end
println("HDF5=>B :$(typeof(B))")
println("write data: ")
@time h5write("C://Users//Administrator//Desktop//test2.h5", "mygroup2/B", B)
println("read data:")
@time h5read("C://Users//Administrator//Desktop//test2.h5","mygroup2/B")
Float64[] works as well; here with 100 million elements:
C = Float64[]
for i = 1:100000000
    push!(C, i*1.0)
end
println("HDF5=>C :$(typeof(C))")
println("write data: ")
@time h5write("C://Users//Administrator//Desktop//test3.h5", "mygroup2/C", C)
println("read data:")
@time h5read("C://Users//Administrator//Desktop//test3.h5","mygroup2/C")
Array{Float64,2} works too; here with a 10,000,000 x 11 matrix:
rd = Float64[]
for i = 1:10000000
    push!(rd, (10000+i)*1.0)
end
arrdata =[rd rd.+1.0 rd.+2.0 rd.+3.0 rd.+4.0 rd.+5.0 rd.+6.0 rd.+7.0 rd.+8.0 rd.+9.0 rd.+10.0]
println("HDF5=>arrdata :$(typeof(arrdata))")
println("write data: $(size(arrdata))")
@time h5write("C://Users//Administrator//Desktop//test8.h5", "mygroup2/test4", arrdata)
println("read data:")
@time h5read("C://Users//Administrator//Desktop//test8.h5","mygroup2/test4")
Output (the Dict case A raises an error):
HDF5=>A: Dict{ASCIIString,Array{ASCIIString,1}}
write data:
HDF5=>B :Array{ASCIIString,1}
write data:
2.375719 seconds (10.00 M allocations: 228.875 MB)
read data:
11.216828 seconds (60.00 M allocations: 1.788 GB, 37.99% gc time)
HDF5=>C :Array{Float64,1}
write data:
9.025300 seconds (18 allocations: 656 bytes)
read data:
3.681719 seconds (55 allocations: 762.941 MB, 84.90% gc time)
HDF5=>arrdata :Array{Float64,2}
write data: (10000000,11)
8.3300 seconds (18 allocations: 656 bytes)
read data:
0.911403 seconds (78 allocations: 839.236 MB, 3.43% gc time)
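As a sanity check on the numbers above, the memory reported for the two Float64 reads can be compared with the raw size of the data. This is a small sketch assuming Julia 1.x syntax and assuming that the "MB" printed by @time means 2^20 bytes; mib is a hypothetical helper name.

```julia
# Raw size in MiB of n Float64 values (8 bytes each).
mib(n) = n * sizeof(Float64) / 2^20

println(round(mib(100_000_000), digits=3))       # vector C: 762.939
println(round(mib(10_000_000 * 11), digits=3))   # matrix arrdata: 839.233
```

Both values are within a few kilobytes of the 762.941 MB and 839.236 MB reported for the reads, which suggests that h5read allocates little beyond the result array itself for Float64 data.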
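This post evaluates experimentally how well the HDF5 format reads and writes different types of data. For Float64 data, both vectors and matrices are written and read very efficiently. Arrays of ASCII strings can also be stored and read back, although more slowly, while a Dict mapping ASCII strings to ASCII string arrays could not be written with h5write.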
