PaddleOCR: Installing and Using the PPOCRLabel Annotation Tool on Windows 10


Preface

Environment Requirements

Package                  Version
------------------------ -----------
aiohappyeyeballs         2.6.1
aiohttp                  3.13.2
aiosignal                1.4.0
aistudio-sdk             0.3.8
albucore                 0.0.24
albumentations           2.0.8
annotated-types          0.7.0
anyio                    4.11.0
async-timeout            4.0.3
attrs                    25.4.0
bce-python-sdk           0.9.57
beautifulsoup4           4.14.3
cachetools               6.2.4
certifi                  2025.10.5
chardet                  5.2.0
charset-normalizer       3.4.4
click                    8.3.1
colorama                 0.4.6
colorlog                 6.10.1
cssselect                1.3.0
cssutils                 2.11.1
Cython                   3.2.3
dataclasses-json         0.6.7
distro                   1.9.0
einops                   0.8.1
et_xmlfile               2.0.0
exceptiongroup           1.3.0
filelock                 3.20.1
frozenlist               1.8.0
fsspec                   2025.12.0
ftfy                     6.3.1
future                   1.0.0
greenlet                 3.3.0
h11                      0.16.0
hf-xet                   1.2.0
httpcore                 1.0.9
httpx                    0.28.1
httpx-sse                0.4.3
huggingface_hub          1.2.3
idna                     3.11
ImageIO                  2.37.2
imagesize                1.4.1
Jinja2                   3.1.6
jiter                    0.12.0
joblib                   1.5.3
jsonpatch                1.33
jsonpointer              3.0.0
langchain                0.3.27
langchain-community      0.3.31
langchain-core           0.3.81
langchain-openai         0.3.35
langchain-text-splitters 0.3.11
langsmith                0.5.2
lazy_loader              0.4
lmdb                     1.7.5
lxml                     6.0.2
MarkupSafe               3.0.3
marshmallow              3.26.2
modelscope               1.33.0
more-itertools           10.8.0
multidict                6.7.0
mypy_extensions          1.1.0
networkx                 3.4.2
numpy                    2.2.6
nvidia-cublas-cu11       11.11.3.6
nvidia-cuda-nvrtc-cu11   11.8.89
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cudnn-cu11        8.9.4.19
nvidia-cufft-cu11        10.9.0.58
nvidia-curand-cu11       10.3.0.86
nvidia-cusolver-cu11     11.4.1.48
nvidia-cusparse-cu11     11.7.5.86
openai                   2.14.0
opencv-contrib-python    4.10.0.84
opencv-python            4.12.0.88
opencv-python-headless   4.12.0.88
openpyxl                 3.1.5
opt-einsum               3.3.0
orjson                   3.11.5
packaging                25.0
paddleocr                3.3.2
paddlepaddle-gpu         3.2.0
paddlex                  3.3.12
pandas                   2.3.3
pillow                   11.2.1
pip                      23.0.1
PPOCRLabel               3.1.4
premailer                3.10.0
prettytable              3.17.0
propcache                0.4.1
protobuf                 6.31.1
psutil                   7.2.1
py-cpuinfo               9.0.0
pyclipper                1.4.0
pycryptodome             3.23.0
pydantic                 2.12.5
pydantic_core            2.41.5
pydantic-settings        2.12.0
pypdfium2                5.2.0
PyQt5                    5.15.11
PyQt5-Qt5                5.15.2
PyQt5_sip                12.17.2
python-bidi              0.6.7
python-dateutil          2.9.0.post0
python-docx              1.2.0
python-dotenv            1.2.1
pytz                     2025.2
PyYAML                   6.0.2
RapidFuzz                3.14.3
regex                    2025.11.3
requests                 2.32.5
requests-toolbelt        1.0.0
ruamel.yaml              0.18.17
ruamel.yaml.clib         0.2.15
safetensors              0.7.0
scikit-image             0.25.2
scikit-learn             1.7.2
scipy                    1.15.3
sentencepiece            0.2.1
setuptools               65.5.0
shapely                  2.1.2
shellingham              1.5.4
simsimd                  6.5.12
six                      1.17.0
sniffio                  1.3.1
soupsieve                2.8.1
SQLAlchemy               2.0.45
stringzilla              4.6.0
tenacity                 9.1.2
threadpoolctl            3.6.0
tifffile                 2025.5.10
tiktoken                 0.12.0
tokenizers               0.22.1
tqdm                     4.67.1
typer-slim               0.21.0
typing_extensions        4.15.0
typing-inspect           0.9.0
typing-inspection        0.4.2
tzdata                   2025.3
ujson                    5.11.0
urllib3                  2.6.2
uuid_utils               0.12.0
wcwidth                  0.2.14
yarl                     1.22.0
zstandard                0.25.0
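
The list above can be checked against the active environment programmatically. The following is a minimal sketch using only the standard library; the choice of which packages count as "key" and the helper name `check_versions` are assumptions made for illustration, not part of PPOCRLabel.

```python
from importlib import metadata

# Versions copied from the package list above; extend as needed.
EXPECTED = {
    "paddleocr": "3.3.2",
    "paddlepaddle-gpu": "3.2.0",
    "paddlex": "3.3.12",
    "PPOCRLabel": "3.1.4",
    "PyQt5": "5.15.11",
}


def check_versions(expected):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

An empty result means the selected packages match the versions listed above. A mismatch is not necessarily a problem; it only indicates that the environment differs from the one used in this article.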

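Background

  • Python is a cross-platform programming language: a high-level scripting language that combines interpreted, compiled, interactive, and object-oriented characteristics. It was originally designed for writing automation scripts (shell); as versions have been updated and new language features added, it is increasingly used for standalone, large-scale projects.
  • PyTorch is a deep learning framework that packages many networks and deep-learning tools ready to call, so they do not have to be written one by one. It comes in CPU and GPU versions; other frameworks include TensorFlow and Caffe. PyTorch was released by Facebook AI Research (FAIR) on the basis of Torch. It is a Python-based scientific computing package that provides two high-level features: 1. tensor computation with strong GPU acceleration (similar to NumPy); 2. automatic differentiation for building deep neural networks.
  • PPOCRLabel is a semi-automatic graphical annotation tool for the OCR field, with a built-in PP-OCR model that automatically detects and re-recognizes data. It is written in Python 3 and PyQt5 and supports rectangular box, table, irregular text, and key information annotation modes. The exported annotations can be used directly to train PP-OCR detection and recognition models.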

Installing and Using the PPOCRLabel Annotation Tool

Download and install PPOCRLabel

pip install PPOCRLabel -i https://mirrors.aliyun.com/pypi/simple

If no errors are reported, the installation succeeded.

Run the PPOCRLabel annotation tool

PPOCRLabel --lang ch
# or
python venv\lib\site-packages\PPOCRLabel\PPOCRLabel.py --lang ch
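
The second command hard-codes a `venv\lib\site-packages` path that depends on where the virtual environment lives. As a rough sketch, the script location can instead be looked up in whichever environment is active. The helper name `find_package_script` is hypothetical, and the sketch assumes the installed package directory is named `PPOCRLabel` and contains `PPOCRLabel.py`, as the path in the command above suggests.

```python
import importlib.util
from pathlib import Path


def find_package_script(package, script):
    """Return the path to `script` inside an installed package, or None."""
    # find_spec locates a top-level package without executing its code.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / script
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_package_script("PPOCRLabel", "PPOCRLabel.py")
    print(path or "PPOCRLabel.py not found in this environment")
```

The printed path can then be passed to `python` together with `--lang ch`, as in the command above.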


Prepare the dataset


OCR Text Annotation

Open the image folder


Annotate the text


Export in PaddleOCR format


myocr_data/2026-01-04_162613_038.png	[{"transcription": "CSDn", "points": [[112, 18], [183, 18], [183, 43], [112, 43]], "difficult": false}, {"transcription": "在线机", "points": [[442, 17], [503, 17], [503, 43], [442, 43]], "difficult": false}, {"transcription": "Q搜索", "points": [[1121, 18], [1188, 18], [1188, 44], [1121, 44]], "difficult": false}, {"transcription": "Al搜索", "points": [[1244, 18], [1310, 18], [1310, 43], [1244, 43]], "difficult": false}, {"transcription": "O.O", "points": [[1451, 24], [1477, 24], [1477, 35], [1451, 35]], "difficult": false}, {"transcription": "会员中心低价", "points": [[1524, 18], [1648, 16], [1649, 41], [1524, 43]], "difficult": false}, {"transcription": "消息", "points": [[1668, 16], [1713, 16], [1713, 43], [1668, 43]], "difficult": false}, {"transcription": "十创作", "points": [[1755, 16], [1825, 16], [1825, 45], [1755, 45]], "difficult": false}, {"transcription": "首页", "points": [[78, 87], [118, 87], [118, 112], [78, 112]], "difficult": false}, {"transcription": "全部", "points": [[371, 83], [417, 83], [417, 112], [371, 112]], "difficult": false}, {"transcription": "资讯", "points": [[456, 83], [500, 83], [500, 112], [456, 112]], "difficult": false}, {"transcription": "MCP", "points": [[539, 81], [588, 81], [588, 108], [539, 108]], "difficult": false}, {"transcription": "DeepSeek", "points": [[630, 82], [720, 82], [720, 108], [630, 108]], "difficult": false}, {"transcription": "运维", "points": [[760, 84], [806, 84], [806, 112], [760, 112]], "difficult": false}, {"transcription": "操作系统", "points": [[846, 83], [926, 83], [926, 112], [846, 112]], "difficult": false}, {"transcription": "人工智能", "points": [[966, 83], [1046, 83], [1046, 112], [966, 112]], "difficult": false}, {"transcription": "Java", "points": [[1085, 82], [1131, 85], [1130, 108], [1084, 105]], "difficult": false}, {"transcription": "C++", "points": [[1173, 83], [1215, 83], [1215, 106], [1173, 106]], "difficult": false}, {"transcription": "Python", "points": [[1257, 83], [1319, 83], 
[1319, 109], [1257, 109]], "difficult": false}, {"transcription": "数据结构与算法", "points": [[1362, 86], [1490, 83], [1490, 107], [1362, 110]], "difficult": false}, {"transcription": "前端", "points": [[1533, 83], [1577, 83], [1577, 112], [1533, 112]], "difficult": false}, {"transcription": "后端", "points": [[1619, 83], [1663, 83], [1663, 112], [1619, 112]], "difficult": false}, {"transcription": "HarmonyOsc)", "points": [[1726, 82], [1843, 82], [1843, 108], [1726, 108]], "difficult": false}, {"transcription": "国", "points": [[33, 136], [60, 139], [58, 160], [30, 157]], "difficult": false}, {"transcription": "博客", "points": [[76, 135], [119, 135], [119, 162], [76, 162]], "difficult": false}, {"transcription": "资讯头条", "points": [[358, 131], [453, 131], [453, 160], [358, 160]], "difficult": false}, {"transcription": "更多资讯>", "points": [[1295, 132], [1390, 132], [1390, 157], [1295, 157]], "difficult": false}, {"transcription": "广告×", "points": [[1784, 131], [1845, 128], [1846, 155], [1785, 157]], "difficult": false}, {"transcription": "4", "points": [[32, 187], [57, 187], [57, 209], [32, 209]], "difficult": false}, {"transcription": "下载", "points": [[79, 187], [117, 185], [119, 209], [81, 212]], "difficult": false}, {"transcription": "日0", "points": [[797, 206], [830, 206], [830, 217], [797, 217]], "difficult": false}, {"transcription": "“极客头条“", "points": [[1200, 203], [1342, 203], [1342, 235], [1200, 235]], "difficult": false}, {"transcription": "CSDN镜像创作福利", "points": [[1473, 198], [1828, 198], [1828, 235], [1473, 235]], "difficult": false}, {"transcription": "学习", "points": [[77, 234], [117, 234], [117, 261], [77, 261]], "difficult": false}, {"transcription": "技术人的新间圈!", "points": [[1215, 237], [1320, 237], [1320, 254], [1215, 254]], "difficult": false}, {"transcription": "因老板的“业余项目”踩雷,", "points": [[375, 263], [575, 263], [575, 285], [375, 285]], "difficult": false}, {"transcription": "前Oracle工程师被裁失业两", "points": [[636, 264], [851, 264], [851, 285], [636, 285]], 
"difficult": false}, {"transcription": "ListenHub 完成 200 万美", "points": [[896, 263], [1102, 263], [1102, 285], [896, 285]], "difficult": false}, {"transcription": "雷军回应拆车原因:希望大", "points": [[1162, 263], [1376, 262], [1376, 283], [1162, 285]], "difficult": false}, {"transcription": "社区", "points": [[75, 282], [120, 285], [118, 312], [73, 309]], "difficult": false}, {"transcription": "创作官方指定镜像", "points": [[1483, 273], [1739, 273], [1739, 305], [1483, 305]], "difficult": false}, {"transcription": "间", "points": [[31, 289], [57, 285], [61, 305], [35, 310]], "difficult": false}, {"transcription": "IT团队被迫跨年加班:所..", "points": [[374, 288], [585, 288], [585, 309], [374, 309]], "difficult": false}, {"transcription": "年,40岁靠淘日货倒卖糊..", "points": [[636, 287], [844, 287], [844, 308], [636, 308]], "difficult": false}, {"transcription": "元天使+轮融资:以“万物..", "points": [[899, 288], [1113, 288], [1113, 309], [899, 309]], "difficult": false}, {"transcription": "家能说一些公道话;苹果..", "points": [[1161, 286], [1375, 286], [1375, 310], [1161, 310]], "difficult": false}, {"transcription": "必得30-80元现金奖励", "points": [[1479, 318], [1823, 320], [1822, 354], [1479, 352]], "difficult": false}, {"transcription": "对话玉伯:“前端之神”的A新战事|万有引力", "points": [[360, 353], [728, 354], [728, 375], [360, 374]], "difficult": false}, {"transcription": "GPU编程新机遇!TritonNext 2026大会来袭,首批嘉宾与议..", "points": [[903, 353], [1382, 353], [1382, 376], [903, 376]], "difficult": false}, {"transcription": "GPU算力", "points": [[77, 363], [154, 363], [154, 386], [77, 386]], "difficult": false}, {"transcription": "被库克怒告泄密,他直接“摆烂”:折叠屏iPhone全细节曝光...", "points": [[372, 400], [851, 401], [851, 425], [372, 424]], "difficult": false}, {"transcription": "AI搜索", "points": [[76, 410], [136, 410], [136, 439], [76, 439]], "difficult": false}, {"transcription": "AI一封感谢信惹怒程序员圈:Go创始人连飙脏话,Python之..", "points": [[888, 403], [1385, 403], [1385, 423], [888, 423]], "difficult": false}, {"transcription": "立即参与>", "points": [[1496, 402], [1613, 404], [1612, 
436], [1495, 433]], "difficult": false}, {"transcription": "GPU", "points": [[1707, 415], [1773, 381], [1786, 408], [1720, 441]], "difficult": false}, {"transcription": "2025 美团技术团队热门技术文章汇总", "points": [[362, 454], [668, 454], [668, 472], [362, 472]], "difficult": false}, {"transcription": "■涌现、AI带来裁员的结果都是必然", "points": [[887, 451], [1181, 451], [1181, 475], [887, 475]], "difficult": false}, {"transcription": "GitCode", "points": [[79, 461], [147, 461], [147, 484], [79, 484]], "difficult": false}, {"transcription": "对话张笑宇|万有引力", "points": [[1214, 452], [1389, 452], [1389, 474], [1214, 474]], "difficult": false}, {"transcription": "InsCode", "points": [[77, 512], [148, 512], [148, 534], [77, 534]], "difficult": false}, {"transcription": "开源项目", "points": [[358, 533], [451, 533], [451, 558], [358, 558]], "difficult": false}, {"transcription": "更多开源项目>", "points": [[1266, 534], [1388, 534], [1388, 555], [1266, 555]], "difficult": false}, {"transcription": "B", "points": [[37, 567], [50, 567], [50, 581], [37, 581]], "difficult": false}, {"transcription": "技术会议", "points": [[75, 561], [152, 561], [152, 586], [75, 586]], "difficult": false}, {"transcription": "Langflow:这个拖拽式AI工作流神器", "points": [[544, 595], [873, 595], [873, 618], [544, 618]], "difficult": false}, {"transcription": "CHATERM AI:开启云资源氛围管理", "points": [[1048, 593], [1376, 594], [1376, 618], [1047, 617]], "difficult": false}, {"transcription": "ク", "points": [[451, 610], [520, 610], [520, 667], [451, 667]], "difficult": false}, {"transcription": "盟", "points": [[1814, 607], [1852, 607], [1852, 645], [1814, 645]], "difficult": false}, {"transcription": "正在颠覆传统编程", "points": [[543, 619], [707, 621], [707, 647], [543, 644]], "difficult": false}, {"transcription": "新篇章!", "points": [[1047, 619], [1123, 619], [1123, 647], [1047, 647]], "difficult": false}, {"transcription": "同", "points": [[32, 639], [58, 639], [58, 663], [32, 663]], "difficult": false}, {"transcription": "订阅", "points": [[75, 637], [119, 637], 
[119, 665], [75, 665]], "difficult": false}, {"transcription": "查看详情→", "points": [[796, 653], [901, 656], [901, 681], [796, 678]], "difficult": false}, {"transcription": "查看详情-", "points": [[1301, 654], [1394, 651], [1395, 679], [1302, 682]], "difficult": false}, {"transcription": "社区推荐", "points": [[1447, 645], [1544, 645], [1544, 674], [1447, 674]], "difficult": false}, {"transcription": "⚫人工智能", "points": [[542, 656], [636, 656], [636, 681], [542, 681]], "difficult": false}, {"transcription": " 60.5K", "points": [[657, 655], [730, 655], [730, 680], [657, 680]], "difficult": false}, {"transcription": "⚫人工智能", "points": [[1046, 655], [1140, 655], [1140, 680], [1046, 680]], "difficult": false}, {"transcription": "59.4K", "points": [[1163, 657], [1236, 657], [1236, 679], [1163, 679]], "difficult": false}, {"transcription": "更多)", "points": [[1794, 648], [1850, 645], [1851, 668], [1795, 671]], "difficult": false}, {"transcription": "&", "points": [[32, 682], [61, 682], [61, 717], [32, 717]], "difficult": false}, {"transcription": "关注", "points": [[76, 686], [120, 686], [120, 716], [76, 716]], "difficult": false}, {"transcription": "Qualceww", "points": [[1451, 710], [1488, 710], [1488, 720], [1451, 720]], "difficult": false}, {"transcription": "高通开发者中文社区", "points": [[1509, 706], [1670, 706], [1670, 726], [1509, 726]], "difficult": false}, {"transcription": "闪", "points": [[35, 739], [56, 739], [56, 766], [35, 766]], "difficult": false}, {"transcription": "收藏", "points": [[77, 735], [121, 738], [119, 767], [75, 764]], "difficult": false}, {"transcription": "Better Auth:一个面向 TypeScript", "points": [[541, 744], [864, 745], [863, 769], [541, 768]], "difficult": false}, {"transcription": "MiniMax-M2.1:MiniMax-AI开源大", "points": [[1049, 744], [1368, 746], [1368, 769], [1048, 767]], "difficult": false}, {"transcription": "H", "points": [[451, 767], [515, 767], [515, 812], [451, 812]], "difficult": false}, {"transcription": "的全面身份验证库", "points": [[541, 770], [708, 770], 
[708, 797], [541, 797]], "difficult": false}, {"transcription": "模型,赋能高效智能应用开发", "points": [[1050, 772], [1312, 772], [1312, 795], [1050, 795]], "difficult": false}, {"transcription": "HarmonyOs开发者社区", "points": [[1510, 766], [1700, 766], [1700, 786], [1510, 786]], "difficult": false}, {"transcription": ">", "points": [[1835, 767], [1847, 767], [1847, 782], [1835, 782]], "difficult": false}, {"transcription": "历史", "points": [[77, 787], [119, 787], [119, 815], [77, 815]], "difficult": false}, {"transcription": "●服务器", "points": [[541, 803], [619, 803], [619, 832], [541, 832]], "difficult": false}, {"transcription": "G41.4K", "points": [[638, 805], [713, 805], [713, 830], [638, 830]], "difficult": false}, {"transcription": "查看详情 →", "points": [[797, 805], [900, 805], [900, 830], [797, 830]], "difficult": false}, {"transcription": "• Python", "points": [[1048, 807], [1123, 807], [1123, 829], [1048, 829]], "difficult": false}, {"transcription": "查看详情-", "points": [[1303, 805], [1396, 805], [1396, 830], [1303, 830]], "difficult": false}, {"transcription": "M", "points": [[28, 833], [61, 833], [61, 866], [28, 866]], "difficult": false}, {"transcription": "鲲鹏昇腾开发者社区", "points": [[1509, 825], [1669, 825], [1669, 847], [1509, 847]], "difficult": false}, {"transcription": "会员中心", "points": [[78, 836], [153, 836], [153, 866], [78, 866]], "difficult": false}, {"transcription": "`", "points": [[1837, 828], [1846, 828], [1846, 839], [1837, 839]], "difficult": false}, {"transcription": "", "points": [[35, 880], [60, 886], [52, 916], [27, 910]], "difficult": false}, {"transcription": "创作中心", "points": [[78, 888], [153, 888], [153, 913], [78, 913]], "difficult": false}, {"transcription": "intel", "points": [[1456, 889], [1482, 889], [1482, 901], [1456, 901]], "difficult": false}, {"transcription": "英特尔开发人员专区", "points": [[1510, 885], [1669, 885], [1669, 906], [1510, 906]], "difficult": false}, {"transcription": "-", "points": [[1837, 889], [1846, 889], [1846, 901], [1837, 901]], 
"difficult": false}]

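The exported record above consists of an image path, a tab character, and a JSON list in which every item carries a `transcription`, four `points`, and a `difficult` flag. The sketch below shows one way to read such a record and derive an axis-aligned bounding box for each text region. It assumes each image's record occupies a single line in the exported file (the record above is wrapped for display); the helper names `parse_label_line` and `bounding_box` are illustrative only.

```python
import json


def parse_label_line(line):
    """Split one 'path<TAB>json' record into (image_path, annotations)."""
    image_path, payload = line.rstrip("\n").split("\t", 1)
    return image_path, json.loads(payload)


def bounding_box(points):
    """Axis-aligned (x_min, y_min, x_max, y_max) enclosing a polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    # One region taken from the exported record above.
    sample = (
        "myocr_data/2026-01-04_162613_038.png\t"
        '[{"transcription": "Java", "points": [[1085, 82], [1131, 85], '
        '[1130, 108], [1084, 105]], "difficult": false}]'
    )
    path, annotations = parse_label_line(sample)
    for item in annotations:
        print(path, item["transcription"], bounding_box(item["points"]))
```

The bounding box is useful for slightly rotated quadrilaterals such as the "Java" region, where the four corners do not share exact x or y coordinates.
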
More Features

References

[1] https://github.com/PFCCLab/PPOCRLabel.git
[2] https://github.com/PaddlePaddle/PaddleOCR.git
