OCR Data Generation: Scene Text with SynthText


Tags: #OCR #SynthText #data-generation

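Data generation is crucial for text recognition in natural scenes, because it greatly reduces the cost of manual annotation. This article covers installing and using SynthText in detail, generating images from your own background (bg) dataset, and extending the tool to render vertical text.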

Output of the official SynthText example

https://github.com/ankush-me/SynthText

Download the project together with the data released alongside it (SynthText.h5 and so on), then simply run python gen.py.

samples

I use the project's python3 branch here.
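Once gen.py has finished, the generated samples can be inspected from Python. The sketch below is a minimal illustration, not part of the project: it assumes the output file holds a `data` group whose datasets carry `wordBB` (shape 2 x 4 x N) and `txt` attributes, which should be checked against the gen.py version in use. The helper names `word_boxes_to_rects` and `iter_samples` are hypothetical.

```python
import numpy as np


def word_boxes_to_rects(word_bb):
    """Convert word quadrilaterals of shape (2, 4, N) to (N, 4) axis-aligned boxes.

    Each output row is (xmin, ymin, xmax, ymax).
    """
    word_bb = np.asarray(word_bb, dtype=float)
    if word_bb.ndim == 2:
        # Defensive assumption: a single word may be stored without the last axis.
        word_bb = word_bb[:, :, None]
    xs, ys = word_bb[0], word_bb[1]  # each has shape (4, N)
    return np.stack(
        [xs.min(axis=0), ys.min(axis=0), xs.max(axis=0), ys.max(axis=0)], axis=1
    )


def iter_samples(h5_path):
    """Yield (key, image, wordBB, txt) for every sample in a generated file.

    Assumption: samples live under the 'data' group with 'wordBB' and 'txt'
    attributes; verify this against the output of your gen.py.
    """
    import h5py  # third-party; imported lazily so the helper above works without it

    with h5py.File(h5_path, "r") as db:
        for key in sorted(db["data"].keys()):
            ds = db["data"][key]
            yield key, ds[...], ds.attrs["wordBB"], ds.attrs["txt"]
```

Axis-aligned boxes lose the rotation of the text, so keep the original quadrilaterals if the recognizer needs them.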

Adding New Images

Segmentation and depth-maps are required to use new images as background. Sample scripts for obtaining these are available here.

predict_depth.m: MATLAB script to regress a depth mask for a given RGB image; uses the network of Liu et al. However, more recent works (e.g., this) might give better results.
run_ucm.m and floodFill.py for getting segmentation masks using gPb-UCM.

For an explanation of the fields in dset.h5 (e.g.: seg,area,label), please check this comment.

To use your own bg data, first obtain the depth and seg maps, then merge them into a dset.h5 file, and finally call gen.py in SynthText to generate the data.
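The merge step can be sketched in Python with numpy and h5py. This is a minimal sketch under stated assumptions, not the project's own script: the group names `image`, `depth`, `seg` and the `area` and `label` attributes follow the fields mentioned above, region id 0 is assumed to mark unlabeled pixels, and the exact array layout that gen.py expects (for example the orientation of the depth map) is not covered and should be checked against gen.py. The function names are hypothetical.

```python
import numpy as np


def seg_labels_and_areas(seg):
    """Return the region ids present in an integer mask and their pixel counts."""
    labels, areas = np.unique(seg, return_counts=True)
    # Assumption: id 0 marks unlabeled or boundary pixels and is skipped.
    keep = labels != 0
    return labels[keep], areas[keep]


def add_background(db, name, image, depth, seg):
    """Store one background image, its depth map and segmentation in an open h5py.File."""
    labels, areas = seg_labels_and_areas(seg)
    db.require_group("image").create_dataset(name, data=image)
    db.require_group("depth").create_dataset(name, data=depth)
    seg_ds = db.require_group("seg").create_dataset(name, data=seg)
    # Per-region metadata is attached to the segmentation dataset.
    seg_ds.attrs["label"] = labels
    seg_ds.attrs["area"] = areas


def build_dset(path, samples):
    """Write an iterable of (name, image, depth, seg) tuples to an HDF5 file."""
    import h5py  # third-party; imported lazily so the helpers above work without it

    with h5py.File(path, "w") as db:
        for name, image, depth, seg in samples:
            add_background(db, name, image, depth, seg)
```

A call such as `build_dset("dset.h5", samples)` would then produce the file, where `samples` yields one tuple per background image.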

Generating depth

https://bitbucket.org/fayao/dcnf-fcsp/src/master/

I used the MATLAB environment (Matlab2016b) on my own Windows 10 laptop.

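To use this code, MatConvNet and VLFeat must be present under libs. The original project already bundles both, but matconvnet_20141015 has to be replaced with matconvnet-1.0-beta9 (download: https://www.vlfeat.org/matconvnet/download/), otherwise the program fails at runtime; VLFeat needs no changes. Also modify ./demo/demo_DCNF_FCSP_depths_pre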
