marsena committed · Commit 266e861 · verified · 1 Parent(s): 2957604

Upload README.md with huggingface_hub

  1. README.md +125 -0
README.md ADDED
@@ -0,0 +1,125 @@
+ ---
+ license: apache-2.0
+ tags:
+ - onnx
+ - paddleocr
+ - ocr
+ - computer-vision
+ - text-recognition
+ - text-detection
+ library_name: onnxruntime
+ pipeline_tag: image-to-text
+ base_model:
+ - PaddlePaddle/PP-OCRv5_server_det
+ - PaddlePaddle/PP-OCRv5_server_rec
+ ---
+
+ # PaddleOCR ONNX Models
+
+ PP-OCRv5 models from PaddleOCR, converted to ONNX format for OCR inference with ONNX Runtime.
+
+ ## Model Files
+
+ | File | Description |
+ |------|-------------|
+ | `PP-OCRv5_server_det_infer.onnx` | Text detection model |
+ | `PP-OCRv5_server_rec_infer.onnx` | Text recognition model |
+ | `PP-LCNet_x1_0_textline_ori_infer.onnx` | Text line orientation classification |
+ | `PP-LCNet_x1_0_doc_ori_infer.onnx` | Document orientation correction |
+ | `UVDoc_infer.onnx` | Document unwarping |
+ | `PP-OCRv5_server_rec_infer.yml` | Character dictionary config |
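
The recognition model's raw output still has to be mapped to text through the character dictionary. A minimal sketch of greedy CTC decoding is shown here. It assumes the output for one text line has shape `(time_steps, num_classes)`, that class index 0 is the CTC blank, and that `characters` holds the dictionary entries in order. The function name `ctc_greedy_decode` and the toy dictionary are hypothetical, and none of these assumptions are confirmed by this repository.

```python
import numpy as np

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(logits, characters):
    """Greedy CTC decoding: argmax per step, merge repeats, drop blanks."""
    best = np.argmax(logits, axis=1)
    decoded = []
    previous = None
    for idx in best:
        idx = int(idx)
        if idx != previous and idx != BLANK:
            # Class 0 is the blank, so class i maps to dictionary entry i - 1.
            decoded.append(characters[idx - 1])
        previous = idx
    return "".join(decoded)


# Toy example with a three-character dictionary (hypothetical data).
characters = ["a", "b", "c"]
steps = [1, 1, 0, 2, 2, 0, 2, 3]
logits = np.eye(4, dtype=np.float32)[steps]  # one-hot scores per time step
text = ctc_greedy_decode(logits, characters)
print(text)
```

A blank between two identical classes keeps both characters, which is why the blank class is needed to represent doubled letters.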
+
+ ## Source Models
+
+ These ONNX models are converted from the official PaddlePaddle PP-OCRv5 models:
+
+ - **Detection Model**: [PaddlePaddle/PP-OCRv5_server_det](https://huggingface.co/PaddlePaddle/PP-OCRv5_server_det)
+ - **Recognition Model**: [PaddlePaddle/PP-OCRv5_server_rec](https://huggingface.co/PaddlePaddle/PP-OCRv5_server_rec)
+ - **Official Documentation**: [PP-OCRv5 Introduction](https://paddlepaddle.github.io/PaddleOCR/main/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5.html)
+
+ ## Usage
+
+ ### Download Specific Model
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # Download detection model
+ det_model_path = hf_hub_download(
+     repo_id="marsena/paddleocr-onnx-models",
+     filename="PP-OCRv5_server_det_infer.onnx"
+ )
+
+ # Download recognition model
+ rec_model_path = hf_hub_download(
+     repo_id="marsena/paddleocr-onnx-models",
+     filename="PP-OCRv5_server_rec_infer.onnx"
+ )
+ ```
+
+ ### Download All Models
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download all model files to a local directory
+ snapshot_download(
+     repo_id="marsena/paddleocr-onnx-models",
+     local_dir="./paddleocr_onnx"
+ )
+ ```
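
After downloading, it can be useful to confirm that every file from the Model Files table is present before building a pipeline. The following is a small sketch using only the standard library. The helper name `missing_model_files` is hypothetical, and the directory matches the `local_dir` used above.

```python
from pathlib import Path

# File names taken from the Model Files table.
EXPECTED_FILES = [
    "PP-OCRv5_server_det_infer.onnx",
    "PP-OCRv5_server_rec_infer.onnx",
    "PP-LCNet_x1_0_textline_ori_infer.onnx",
    "PP-LCNet_x1_0_doc_ori_infer.onnx",
    "UVDoc_infer.onnx",
    "PP-OCRv5_server_rec_infer.yml",
]


def missing_model_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


missing = missing_model_files("./paddleocr_onnx")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files are present.")
```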
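+
+ ### ONNX Runtime Inference
+
+ ```python
+ import onnxruntime as ort
+ import numpy as np
+
+ # Load model
+ session = ort.InferenceSession("PP-OCRv5_server_det_infer.onnx")
+
+ # Placeholder input: a preprocessed image batch in NCHW float32 layout.
+ # The 640x640 size is an assumption; replace this with real preprocessed data.
+ input_data = np.zeros((1, 3, 640, 640), dtype=np.float32)
+
+ # Run inference
+ input_name = session.get_inputs()[0].name
+ output = session.run(None, {input_name: input_data})
+ ```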
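
The `input_data` passed to `session.run` has to be a preprocessed tensor, not a raw image. The sketch below shows one plausible preprocessing step for the detection model using only NumPy. The ImageNet mean and standard deviation, the rounding of each side to a multiple of 32, and the function name `preprocess_for_detection` are assumptions and are not taken from this repository; the channel order of the input image must also match what the model was trained on. A real pipeline would normally use an image library for resizing instead of the nearest-neighbour indexing used here.

```python
import numpy as np

# ImageNet statistics; assumed to match the detection model's training setup.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_for_detection(image, multiple=32):
    """Convert an HxWx3 uint8 image into a 1x3xH'xW' float32 tensor."""
    height, width = image.shape[:2]
    # Round each side to the nearest multiple of `multiple` (at least one multiple).
    new_h = max(multiple, int(round(height / multiple)) * multiple)
    new_w = max(multiple, int(round(width / multiple)) * multiple)

    # Nearest-neighbour resize with plain NumPy indexing.
    rows = np.arange(new_h) * height // new_h
    cols = np.arange(new_w) * width // new_w
    resized = image[rows][:, cols]

    # Scale to [0, 1], normalize per channel, then reorder HWC -> CHW.
    scaled = resized.astype(np.float32) / 255.0
    normalized = (scaled - MEAN) / STD
    chw = normalized.transpose(2, 0, 1)
    return np.ascontiguousarray(chw[np.newaxis, ...])


# Dummy image standing in for a decoded JPEG or PNG.
dummy = np.random.randint(0, 256, size=(100, 150, 3), dtype=np.uint8)
input_data = preprocess_for_detection(dummy)
print(input_data.shape, input_data.dtype)
```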
+
+ ## Model Specifications
+
+ - **Languages**: Simplified Chinese, Traditional Chinese, English, Japanese
+ - **Text Types**: Printed text, handwriting, vertical text, rotated text
+ - **Input Format**: Images (JPEG, PNG)
+ - **Output Format**: Bounding boxes + recognized text
+ - **Runtime**: ONNX Runtime 1.16+
+ - **Hardware**: CPU and GPU inference supported
95
+ ## License
96
+
97
+ These models follow the **Apache License 2.0**, consistent with the original PaddleOCR project.
98
+
99
+ - **PaddleOCR Repository**: https://github.com/PaddlePaddle/PaddleOCR
100
+ - **License Details**: [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)
101
+
102
+ ## Conversion Information
103
+
104
+ - **Conversion Tool**: Paddle2ONNX
105
+ - **ONNX Version**: 1.12+
106
+ - **Source Framework**: PaddlePaddle 2.5+
107
+ - **Conversion Date**: January 2025
108
+
+ ## Citation
+
+ If you use these models in your research, please cite the original PaddleOCR project:
+
+ ```bibtex
+ @misc{paddleocr2020,
+   title={PaddleOCR: Awesome multilingual OCR toolkits},
+   author={PaddlePaddle Authors},
+   howpublished = {\url{https://github.com/PaddlePaddle/PaddleOCR}},
+   year={2020}
+ }
+ ```
+
+ ## Issues
+
+ For model usage issues, please report them to the original PaddleOCR repository:
+ - **PaddleOCR Issues**: https://github.com/PaddlePaddle/PaddleOCR/issues