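PaddleOCR: Installing and Using the PPOCRLabel Annotation Tool on Windows 10
Foreword
- My knowledge is limited, so errors and omissions are likely; corrections and feedback are welcome.
- For more content, see the Python Daily Tips column, the OpenCV-Python Mini Applications column, the YOLO Series column, the Natural Language Processing column, the AI Mixed Programming Practice column, or my personal homepage.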
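- AI Mixed Programming Practice: Calling Python ONNX from C++ for YOLOv8 Inference
- AI Mixed Programming Practice: Calling a Packaged DLL from C++ for YOLOv8 Instance Segmentation
- AI Mixed Programming Practice: Calling Python ONNX from C++ for Image Super-Resolution Reconstruction
- AI Mixed Programming Practice: Calling Python AgentOCR from C++ for Text Recognition
- Understanding PatchCore Anomaly Detection Through a Worked Calculation Example
- Converting a YOLO-Format Instance Segmentation Dataset to COCO Format in Python
- YOLOv8 Ultralytics: Training an RT-DETR Real-Time Object Detection Model with the Ultralytics Framework
- DETR-Based Face Disguise Detection
- Training YOLOv7 on Your Own Dataset (Mask Detection)
- Training YOLOv8 on Your Own Dataset (Football Detection)
- YOLOv5: Accelerating YOLOv5 Model Inference with TensorRT
- YOLOv5: IoU, GIoU, DIoU, CIoU, EIoU
- Fun with Jetson Nano (Part 5): Accelerating YOLOv5 Object Detection with TensorRT
- YOLOv5: Adding SE, CBAM, CoordAtt, and ECA Attention Mechanisms
- YOLOv5: Understanding the yolov5s.yaml Configuration File and Adding a Small-Object Detection Layer
- Converting a COCO-Format Instance Segmentation Dataset to YOLO Format in Python
- YOLOv5: Training Your Own Instance Segmentation Model with Version 7.0 (Vehicles, Pedestrians, Road Signs, Lane Lines, etc.)
- Trying the Open-Source Stable Diffusion Project for Free with Kaggle GPU Resources
- Stable Diffusion: Deploying Stable Diffusion WebUI on a Server for AI Image Generation (v2.0)
- Stable Diffusion: Fine-Tuning a LoRA Model on Your Own Dataset (v2.0)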
Environment Requirements
Package Version
------------------------ -----------
aiohappyeyeballs 2.6.1
aiohttp 3.13.2
aiosignal 1.4.0
aistudio-sdk 0.3.8
albucore 0.0.24
albumentations 2.0.8
annotated-types 0.7.0
anyio 4.11.0
async-timeout 4.0.3
attrs 25.4.0
bce-python-sdk 0.9.57
beautifulsoup4 4.14.3
cachetools 6.2.4
certifi 2025.10.5
chardet 5.2.0
charset-normalizer 3.4.4
click 8.3.1
colorama 0.4.6
colorlog 6.10.1
cssselect 1.3.0
cssutils 2.11.1
Cython 3.2.3
dataclasses-json 0.6.7
distro 1.9.0
einops 0.8.1
et_xmlfile 2.0.0
exceptiongroup 1.3.0
filelock 3.20.1
frozenlist 1.8.0
fsspec 2025.12.0
ftfy 6.3.1
future 1.0.0
greenlet 3.3.0
h11 0.16.0
hf-xet 1.2.0
httpcore 1.0.9
httpx 0.28.1
httpx-sse 0.4.3
huggingface_hub 1.2.3
idna 3.11
ImageIO 2.37.2
imagesize 1.4.1
Jinja2 3.1.6
jiter 0.12.0
joblib 1.5.3
jsonpatch 1.33
jsonpointer 3.0.0
langchain 0.3.27
langchain-community 0.3.31
langchain-core 0.3.81
langchain-openai 0.3.35
langchain-text-splitters 0.3.11
langsmith 0.5.2
lazy_loader 0.4
lmdb 1.7.5
lxml 6.0.2
MarkupSafe 3.0.3
marshmallow 3.26.2
modelscope 1.33.0
more-itertools 10.8.0
multidict 6.7.0
mypy_extensions 1.1.0
networkx 3.4.2
numpy 2.2.6
nvidia-cublas-cu11 11.11.3.6
nvidia-cuda-nvrtc-cu11 11.8.89
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cudnn-cu11 8.9.4.19
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.3.0.86
nvidia-cusolver-cu11 11.4.1.48
nvidia-cusparse-cu11 11.7.5.86
openai 2.14.0
opencv-contrib-python 4.10.0.84
opencv-python 4.12.0.88
opencv-python-headless 4.12.0.88
openpyxl 3.1.5
opt-einsum 3.3.0
orjson 3.11.5
packaging 25.0
paddleocr 3.3.2
paddlepaddle-gpu 3.2.0
paddlex 3.3.12
pandas 2.3.3
pillow 11.2.1
pip 23.0.1
PPOCRLabel 3.1.4
premailer 3.10.0
prettytable 3.17.0
propcache 0.4.1
protobuf 6.31.1
psutil 7.2.1
py-cpuinfo 9.0.0
pyclipper 1.4.0
pycryptodome 3.23.0
pydantic 2.12.5
pydantic_core 2.41.5
pydantic-settings 2.12.0
pypdfium2 5.2.0
PyQt5 5.15.11
PyQt5-Qt5 5.15.2
PyQt5_sip 12.17.2
python-bidi 0.6.7
python-dateutil 2.9.0.post0
python-docx 1.2.0
python-dotenv 1.2.1
pytz 2025.2
PyYAML 6.0.2
RapidFuzz 3.14.3
regex 2025.11.3
requests 2.32.5
requests-toolbelt 1.0.0
ruamel.yaml 0.18.17
ruamel.yaml.clib 0.2.15
safetensors 0.7.0
scikit-image 0.25.2
scikit-learn 1.7.2
scipy 1.15.3
sentencepiece 0.2.1
setuptools 65.5.0
shapely 2.1.2
shellingham 1.5.4
simsimd 6.5.12
six 1.17.0
sniffio 1.3.1
soupsieve 2.8.1
SQLAlchemy 2.0.45
stringzilla 4.6.0
tenacity 9.1.2
threadpoolctl 3.6.0
tifffile 2025.5.10
tiktoken 0.12.0
tokenizers 0.22.1
tqdm 4.67.1
typer-slim 0.21.0
typing_extensions 4.15.0
typing-inspect 0.9.0
typing-inspection 0.4.2
tzdata 2025.3
ujson 5.11.0
urllib3 2.6.2
uuid_utils 0.12.0
wcwidth 0.2.14
yarl 1.22.0
zstandard 0.25.0
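Background
- Python is a cross-platform programming language: a high-level scripting language that combines interpreted, compiled, interactive, and object-oriented features. It was originally designed for writing automation scripts (shell); as new versions and language features were added, it has increasingly been used to develop standalone, large-scale projects.
- PyTorch is a deep learning framework that packages many networks and deep-learning utilities so that they can be called directly instead of being written from scratch. It comes in CPU and GPU versions; other frameworks include TensorFlow and Caffe. PyTorch was released by Facebook AI Research (FAIR) on the basis of Torch. It is a Python-based scientific computing package that provides two high-level features: (1) tensor computation with strong GPU acceleration (similar to NumPy), and (2) automatic differentiation for building deep neural networks.
- PPOCRLabel is a semi-automatic graphical annotation tool for the OCR domain, with a built-in PP-OCR model that automatically detects and re-recognizes data. It is written in Python 3 and PyQt5 and supports rectangular-box, table, irregular-text, and key-information annotation modes. The annotations can be used directly to train PP-OCR detection and recognition models.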
Installing and Using the PPOCRLabel Annotation Tool
Downloading and Installing PPOCRLabel
pip install PPOCRLabel -i https://mirrors.aliyun.com/pypi/simple
If no errors are reported, the installation succeeded.
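As an optional extra check beyond "no errors", the installed versions can be queried from Python. This is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package names are taken from the environment list above.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names as they appear in the environment list of this post.
    wanted = ["PPOCRLabel", "paddleocr", "paddlepaddle-gpu", "PyQt5"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED in the active virtual environment would need to be installed before launching the tool.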

Running the PPOCRLabel Annotation Tool
PPOCRLabel --lang ch
# or
python venv\lib\site-packages\PPOCRLabel\PPOCRLabel.py --lang ch
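The second command depends on where the virtual environment lives, so the `venv\lib\site-packages` prefix may differ on your machine. The sketch below looks up the script path without importing the package; `find_package_script` is a hypothetical helper, and it assumes the package directory contains `PPOCRLabel.py`, as the command above suggests.

```python
from importlib import util
from pathlib import Path


def find_package_script(package, script):
    """Return the path of `script` inside an installed package, or None."""
    spec = util.find_spec(package)  # locates the package without importing it
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / script
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_package_script("PPOCRLabel", "PPOCRLabel.py")
    print(path if path else "PPOCRLabel.py not found in this environment")
```

If a path is printed, it can be passed to `python` followed by `--lang ch`, in place of the hard-coded path.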

Preparing the Dataset

OCR Text Annotation
Loading the Image Folder



Text Annotation

Exporting in PaddleOCR Format
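Each exported record, like the sample in this section, consists of an image path followed by a JSON list of text regions with `transcription`, `points`, and `difficult` fields. The sketch below parses one such record and converts each quadrilateral into an axis-aligned bounding box. It assumes the path and the JSON are separated by a tab and falls back to the first " [" when the tab has been turned into a space; `parse_label_line` and `bounding_box` are hypothetical helper names, and the sample string reuses one region from the record in this post.

```python
import json


def parse_label_line(line):
    """Split one label record into (image_path, list of annotation dicts)."""
    line = line.strip()
    if "\t" in line:
        image_path, payload = line.split("\t", 1)
    else:
        # Fallback for text where the tab separator became a space.
        idx = line.index(" [")
        image_path, payload = line[:idx], line[idx + 1:]
    return image_path, json.loads(payload)


def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) for a list of [x, y] points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# One region copied from the sample record in this post.
sample = (
    'myocr_data/2026-01-04_162613_038.png\t'
    '[{"transcription": "Java", "points": [[1085, 82], [1131, 85], '
    '[1130, 108], [1084, 105]], "difficult": false}]'
)

path, annotations = parse_label_line(sample)
for ann in annotations:
    print(path, ann["transcription"], bounding_box(ann["points"]))
```

Records whose path itself contains " [" would need the tab-separated form to parse correctly.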



myocr_data/2026-01-04_162613_038.png [{"transcription": "CSDn", "points": [[112, 18], [183, 18], [183, 43], [112, 43]], "difficult": false}, {"transcription": "在线机", "points": [[442, 17], [503, 17], [503, 43], [442, 43]], "difficult": false}, {"transcription": "Q搜索", "points": [[1121, 18], [1188, 18], [1188, 44], [1121, 44]], "difficult": false}, {"transcription": "Al搜索", "points": [[1244, 18], [1310, 18], [1310, 43], [1244, 43]], "difficult": false}, {"transcription": "O.O", "points": [[1451, 24], [1477, 24], [1477, 35], [1451, 35]], "difficult": false}, {"transcription": "会员中心低价", "points": [[1524, 18], [1648, 16], [1649, 41], [1524, 43]], "difficult": false}, {"transcription": "消息", "points": [[1668, 16], [1713, 16], [1713, 43], [1668, 43]], "difficult": false}, {"transcription": "十创作", "points": [[1755, 16], [1825, 16], [1825, 45], [1755, 45]], "difficult": false}, {"transcription": "首页", "points": [[78, 87], [118, 87], [118, 112], [78, 112]], "difficult": false}, {"transcription": "全部", "points": [[371, 83], [417, 83], [417, 112], [371, 112]], "difficult": false}, {"transcription": "资讯", "points": [[456, 83], [500, 83], [500, 112], [456, 112]], "difficult": false}, {"transcription": "MCP", "points": [[539, 81], [588, 81], [588, 108], [539, 108]], "difficult": false}, {"transcription": "DeepSeek", "points": [[630, 82], [720, 82], [720, 108], [630, 108]], "difficult": false}, {"transcription": "运维", "points": [[760, 84], [806, 84], [806, 112], [760, 112]], "difficult": false}, {"transcription": "操作系统", "points": [[846, 83], [926, 83], [926, 112], [846, 112]], "difficult": false}, {"transcription": "人工智能", "points": [[966, 83], [1046, 83], [1046, 112], [966, 112]], "difficult": false}, {"transcription": "Java", "points": [[1085, 82], [1131, 85], [1130, 108], [1084, 105]], "difficult": false}, {"transcription": "C++", "points": [[1173, 83], [1215, 83], [1215, 106], [1173, 106]], "difficult": false}, {"transcription": "Python", "points": [[1257, 83], [1319, 83], 
[1319, 109], [1257, 109]], "difficult": false}, {"transcription": "数据结构与算法", "points": [[1362, 86], [1490, 83], [1490, 107], [1362, 110]], "difficult": false}, {"transcription": "前端", "points": [[1533, 83], [1577, 83], [1577, 112], [1533, 112]], "difficult": false}, {"transcription": "后端", "points": [[1619, 83], [1663, 83], [1663, 112], [1619, 112]], "difficult": false}, {"transcription": "HarmonyOsc)", "points": [[1726, 82], [1843, 82], [1843, 108], [1726, 108]], "difficult": false}, {"transcription": "国", "points": [[33, 136], [60, 139], [58, 160], [30, 157]], "difficult": false}, {"transcription": "博客", "points": [[76, 135], [119, 135], [119, 162], [76, 162]], "difficult": false}, {"transcription": "资讯头条", "points": [[358, 131], [453, 131], [453, 160], [358, 160]], "difficult": false}, {"transcription": "更多资讯>", "points": [[1295, 132], [1390, 132], [1390, 157], [1295, 157]], "difficult": false}, {"transcription": "广告×", "points": [[1784, 131], [1845, 128], [1846, 155], [1785, 157]], "difficult": false}, {"transcription": "4", "points": [[32, 187], [57, 187], [57, 209], [32, 209]], "difficult": false}, {"transcription": "下载", "points": [[79, 187], [117, 185], [119, 209], [81, 212]], "difficult": false}, {"transcription": "日0", "points": [[797, 206], [830, 206], [830, 217], [797, 217]], "difficult": false}, {"transcription": "“极客头条“", "points": [[1200, 203], [1342, 203], [1342, 235], [1200, 235]], "difficult": false}, {"transcription": "CSDN镜像创作福利", "points": [[1473, 198], [1828, 198], [1828, 235], [1473, 235]], "difficult": false}, {"transcription": "学习", "points": [[77, 234], [117, 234], [117, 261], [77, 261]], "difficult": false}, {"transcription": "技术人的新间圈!", "points": [[1215, 237], [1320, 237], [1320, 254], [1215, 254]], "difficult": false}, {"transcription": "因老板的“业余项目”踩雷,", "points": [[375, 263], [575, 263], [575, 285], [375, 285]], "difficult": false}, {"transcription": "前Oracle工程师被裁失业两", "points": [[636, 264], [851, 264], [851, 285], [636, 285]], 
"difficult": false}, {"transcription": "ListenHub 完成 200 万美", "points": [[896, 263], [1102, 263], [1102, 285], [896, 285]], "difficult": false}, {"transcription": "雷军回应拆车原因:希望大", "points": [[1162, 263], [1376, 262], [1376, 283], [1162, 285]], "difficult": false}, {"transcription": "社区", "points": [[75, 282], [120, 285], [118, 312], [73, 309]], "difficult": false}, {"transcription": "创作官方指定镜像", "points": [[1483, 273], [1739, 273], [1739, 305], [1483, 305]], "difficult": false}, {"transcription": "间", "points": [[31, 289], [57, 285], [61, 305], [35, 310]], "difficult": false}, {"transcription": "IT团队被迫跨年加班:所..", "points": [[374, 288], [585, 288], [585, 309], [374, 309]], "difficult": false}, {"transcription": "年,40岁靠淘日货倒卖糊..", "points": [[636, 287], [844, 287], [844, 308], [636, 308]], "difficult": false}, {"transcription": "元天使+轮融资:以“万物..", "points": [[899, 288], [1113, 288], [1113, 309], [899, 309]], "difficult": false}, {"transcription": "家能说一些公道话;苹果..", "points": [[1161, 286], [1375, 286], [1375, 310], [1161, 310]], "difficult": false}, {"transcription": "必得30-80元现金奖励", "points": [[1479, 318], [1823, 320], [1822, 354], [1479, 352]], "difficult": false}, {"transcription": "对话玉伯:“前端之神”的A新战事|万有引力", "points": [[360, 353], [728, 354], [728, 375], [360, 374]], "difficult": false}, {"transcription": "GPU编程新机遇!TritonNext 2026大会来袭,首批嘉宾与议..", "points": [[903, 353], [1382, 353], [1382, 376], [903, 376]], "difficult": false}, {"transcription": "GPU算力", "points": [[77, 363], [154, 363], [154, 386], [77, 386]], "difficult": false}, {"transcription": "被库克怒告泄密,他直接“摆烂”:折叠屏iPhone全细节曝光...", "points": [[372, 400], [851, 401], [851, 425], [372, 424]], "difficult": false}, {"transcription": "AI搜索", "points": [[76, 410], [136, 410], [136, 439], [76, 439]], "difficult": false}, {"transcription": "AI一封感谢信惹怒程序员圈:Go创始人连飙脏话,Python之..", "points": [[888, 403], [1385, 403], [1385, 423], [888, 423]], "difficult": false}, {"transcription": "立即参与>", "points": [[1496, 402], [1613, 404], [1612, 
436], [1495, 433]], "difficult": false}, {"transcription": "GPU", "points": [[1707, 415], [1773, 381], [1786, 408], [1720, 441]], "difficult": false}, {"transcription": "2025 美团技术团队热门技术文章汇总", "points": [[362, 454], [668, 454], [668, 472], [362, 472]], "difficult": false}, {"transcription": "■涌现、AI带来裁员的结果都是必然", "points": [[887, 451], [1181, 451], [1181, 475], [887, 475]], "difficult": false}, {"transcription": "GitCode", "points": [[79, 461], [147, 461], [147, 484], [79, 484]], "difficult": false}, {"transcription": "对话张笑宇|万有引力", "points": [[1214, 452], [1389, 452], [1389, 474], [1214, 474]], "difficult": false}, {"transcription": "InsCode", "points": [[77, 512], [148, 512], [148, 534], [77, 534]], "difficult": false}, {"transcription": "开源项目", "points": [[358, 533], [451, 533], [451, 558], [358, 558]], "difficult": false}, {"transcription": "更多开源项目>", "points": [[1266, 534], [1388, 534], [1388, 555], [1266, 555]], "difficult": false}, {"transcription": "B", "points": [[37, 567], [50, 567], [50, 581], [37, 581]], "difficult": false}, {"transcription": "技术会议", "points": [[75, 561], [152, 561], [152, 586], [75, 586]], "difficult": false}, {"transcription": "Langflow:这个拖拽式AI工作流神器", "points": [[544, 595], [873, 595], [873, 618], [544, 618]], "difficult": false}, {"transcription": "CHATERM AI:开启云资源氛围管理", "points": [[1048, 593], [1376, 594], [1376, 618], [1047, 617]], "difficult": false}, {"transcription": "ク", "points": [[451, 610], [520, 610], [520, 667], [451, 667]], "difficult": false}, {"transcription": "盟", "points": [[1814, 607], [1852, 607], [1852, 645], [1814, 645]], "difficult": false}, {"transcription": "正在颠覆传统编程", "points": [[543, 619], [707, 621], [707, 647], [543, 644]], "difficult": false}, {"transcription": "新篇章!", "points": [[1047, 619], [1123, 619], [1123, 647], [1047, 647]], "difficult": false}, {"transcription": "同", "points": [[32, 639], [58, 639], [58, 663], [32, 663]], "difficult": false}, {"transcription": "订阅", "points": [[75, 637], [119, 637], 
[119, 665], [75, 665]], "difficult": false}, {"transcription": "查看详情→", "points": [[796, 653], [901, 656], [901, 681], [796, 678]], "difficult": false}, {"transcription": "查看详情-", "points": [[1301, 654], [1394, 651], [1395, 679], [1302, 682]], "difficult": false}, {"transcription": "社区推荐", "points": [[1447, 645], [1544, 645], [1544, 674], [1447, 674]], "difficult": false}, {"transcription": "⚫人工智能", "points": [[542, 656], [636, 656], [636, 681], [542, 681]], "difficult": false}, {"transcription": " 60.5K", "points": [[657, 655], [730, 655], [730, 680], [657, 680]], "difficult": false}, {"transcription": "⚫人工智能", "points": [[1046, 655], [1140, 655], [1140, 680], [1046, 680]], "difficult": false}, {"transcription": "59.4K", "points": [[1163, 657], [1236, 657], [1236, 679], [1163, 679]], "difficult": false}, {"transcription": "更多)", "points": [[1794, 648], [1850, 645], [1851, 668], [1795, 671]], "difficult": false}, {"transcription": "&", "points": [[32, 682], [61, 682], [61, 717], [32, 717]], "difficult": false}, {"transcription": "关注", "points": [[76, 686], [120, 686], [120, 716], [76, 716]], "difficult": false}, {"transcription": "Qualceww", "points": [[1451, 710], [1488, 710], [1488, 720], [1451, 720]], "difficult": false}, {"transcription": "高通开发者中文社区", "points": [[1509, 706], [1670, 706], [1670, 726], [1509, 726]], "difficult": false}, {"transcription": "闪", "points": [[35, 739], [56, 739], [56, 766], [35, 766]], "difficult": false}, {"transcription": "收藏", "points": [[77, 735], [121, 738], [119, 767], [75, 764]], "difficult": false}, {"transcription": "Better Auth:一个面向 TypeScript", "points": [[541, 744], [864, 745], [863, 769], [541, 768]], "difficult": false}, {"transcription": "MiniMax-M2.1:MiniMax-AI开源大", "points": [[1049, 744], [1368, 746], [1368, 769], [1048, 767]], "difficult": false}, {"transcription": "H", "points": [[451, 767], [515, 767], [515, 812], [451, 812]], "difficult": false}, {"transcription": "的全面身份验证库", "points": [[541, 770], [708, 770], 
[708, 797], [541, 797]], "difficult": false}, {"transcription": "模型,赋能高效智能应用开发", "points": [[1050, 772], [1312, 772], [1312, 795], [1050, 795]], "difficult": false}, {"transcription": "HarmonyOs开发者社区", "points": [[1510, 766], [1700, 766], [1700, 786], [1510, 786]], "difficult": false}, {"transcription": ">", "points": [[1835, 767], [1847, 767], [1847, 782], [1835, 782]], "difficult": false}, {"transcription": "历史", "points": [[77, 787], [119, 787], [119, 815], [77, 815]], "difficult": false}, {"transcription": "●服务器", "points": [[541, 803], [619, 803], [619, 832], [541, 832]], "difficult": false}, {"transcription": "G41.4K", "points": [[638, 805], [713, 805], [713, 830], [638, 830]], "difficult": false}, {"transcription": "查看详情 →", "points": [[797, 805], [900, 805], [900, 830], [797, 830]], "difficult": false}, {"transcription": "• Python", "points": [[1048, 807], [1123, 807], [1123, 829], [1048, 829]], "difficult": false}, {"transcription": "查看详情-", "points": [[1303, 805], [1396, 805], [1396, 830], [1303, 830]], "difficult": false}, {"transcription": "M", "points": [[28, 833], [61, 833], [61, 866], [28, 866]], "difficult": false}, {"transcription": "鲲鹏昇腾开发者社区", "points": [[1509, 825], [1669, 825], [1669, 847], [1509, 847]], "difficult": false}, {"transcription": "会员中心", "points": [[78, 836], [153, 836], [153, 866], [78, 866]], "difficult": false}, {"transcription": "`", "points": [[1837, 828], [1846, 828], [1846, 839], [1837, 839]], "difficult": false}, {"transcription": "①", "points": [[35, 880], [60, 886], [52, 916], [27, 910]], "difficult": false}, {"transcription": "创作中心", "points": [[78, 888], [153, 888], [153, 913], [78, 913]], "difficult": false}, {"transcription": "intel", "points": [[1456, 889], [1482, 889], [1482, 901], [1456, 901]], "difficult": false}, {"transcription": "英特尔开发人员专区", "points": [[1510, 885], [1669, 885], [1669, 906], [1510, 906]], "difficult": false}, {"transcription": "-", "points": [[1837, 889], [1846, 889], [1846, 901], [1837, 901]], 
"difficult": false}]
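More Features
- For additional features, see the documentation in the official project repository and explore them yourself.
References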
[1] https://github.com/PFCCLab/PPOCRLabel.git
[2] https://github.com/PaddlePaddle/PaddleOCR.git




