thanhtantran committed
Commit e812464 · verified · 1 Parent(s): b16f305

Update README.md

  1. README.md +38 -57
README.md CHANGED
@@ -1,6 +1,4 @@
  ---
- base_model:
- - openbmb/MiniCPM-V-2_6
  tags:
  - rknn
  - rkllm
@@ -39,63 +37,46 @@ python multiprocess_inference.py

  If the performance is not satisfactory, you can change the CPU scheduler to keep the CPU running at the highest frequency, and bind the inference program to the big core cluster (`taskset -c 4-7 python multiprocess_inference.py`).

- test.jpg:
+ man.jpg:
  ![man.jpg](./man.jpg)

- >```
- >Start loading language model (size: 7810.02 MB)
- >
- >I rkllm: rkllm-runtime version: 1.1.2, rknpu driver version: 0.9.8, platform: RK3588
- >
- >W rknn-toolkit-lite2 version: 2.2.0
- >Start loading vision encoder model (size: 942.29 MB)
- >Vision encoder loaded in 10.22 seconds
- >I RKNN: [02:28:20.939] RKNN Runtime Information, librknnrt version: 2.1.0 (967d001cc8@2024-08-07T19:28:19)
- >I RKNN: [02:28:20.939] RKNN Driver Information, version: 0.9.8
- >I RKNN: [02:28:20.940] RKNN Model Information, version: 6, toolkit version: 2.2.0(compiler version: 2.2.0 (c195366594@2024-09-14T12:24:14)), target: RKNPU v2, target platform: rk3588, framework name: ONNX, framework layout: NCHW, model inference type: dynamic_shape
- >W RKNN: [02:28:20.940] RKNN Model version: 2.2.0 not match with rknn runtime version: 2.1.0
- >Received ready signal: vision_ready
- >Language model loaded in 29.21 seconds
- >Received ready signal: llm_ready
- >All models loaded, starting interactive mode...
- >
- >Enter your input (3 empty lines to start inference, Ctrl+C to exit, for example:
- >详细描述一下{{./test.jpg}}这张图片
- >What is the weather in {{./test.jpg}}?
- >How many people are in {{./test.jpg}}?
- >):
- >
- >Describe the image: {{test.jpg}} in every detail.
- >
- >
- >
- >Start vision inference...
- >
- >Vision encoder inference time: 3.26 seconds
- >
- >Time to first token: 1.72 seconds
- >
- >The image depicts an urban street scene with various elements that contribute to its bustling atmosphere.
- >
- >A person, likely male based on appearance, is walking across the crosswalk carrying a blue and white checked umbrella. He's dressed casually yet stylishly, wearing a beige jacket over what appears to be dark pants or leggings paired with patterned slip-on shoes in shades of gray, black, and yellow.
- >
- >The street itself features multiple lanes filled with vehicles; there are cars visible on both sides, including a prominent SUV that is parked by the roadside. The presence of these automobiles adds to the sense of movement and activity within this urban setting.
- >
- >In terms of infrastructure, the crosswalk has clear pedestrian markings for safety, and an adjacent railing provides support or boundary along one side of the street. Beyond the immediate foreground where pedestrians traverse, there's a sidewalk lined with lush green trees which add natural beauty to the otherwise concrete-dominated environment.
- >
- >The sky is visible in parts through breaks in clouds above, indicating fair weather conditions that contribute positively to outdoor activities like walking down this cityscape path.
- >
- >Overall, it appears as though an ordinary day unfolds within this urban setting, capturing moments of daily life and movement.
- >
- >(finished)
- >
- >--------------------------------------------------------------------------------------
- > Stage Total Time (ms) Tokens Time per Token (ms) Tokens per Second
- >--------------------------------------------------------------------------------------
- > Prefill 1714.78 94 18.24 54.82
- > Generate 58689.71 236 249.75 4.00
- >--------------------------------------------------------------------------------------
- >```
+ ```bash
+ admin@orangepi5:~/MiniCPM-V-2_6-rkllm$ python multiprocess_inference.py
+ Start loading language model (size: 7810.02 MB)
+ I rkllm: rkllm-runtime version: 1.1.4, rknpu driver version: 0.9.8, platform: RK3588
+
+ W rknn-toolkit-lite2 version: 2.3.0
+ Start loading vision encoder model (size: 942.29 MB)
+ Vision encoder loaded in 4.95 seconds
+ I RKNN: [13:13:11.477] RKNN Runtime Information, librknnrt version: 2.3.0 (c949ad889d@2024-11-07T11:35:33)
+ I RKNN: [13:13:11.477] RKNN Driver Information, version: 0.9.8
+ I RKNN: [13:13:11.478] RKNN Model Information, version: 6, toolkit version: 2.2.0(compiler version: 2.2.0 (c195366594@2024-09-14T12:24:14)), target: RKNPU v2, target platform: rk3588, framework name: ONNX, framework layout: NCHW, model inference type: dynamic_shape
+ Received ready signal: vision_ready
+ Language model loaded in 30.56 seconds
+ Received ready signal: llm_ready
+ All models loaded, starting interactive mode...
+
+ Enter your input :
+
+ Describe the person in the image {{./man.jpg}}?
+ How many people are in he image {{./man.jpg}}?
+
+
+
+ Start vision inference...
+ Vision encoder inference time: 3.61 seconds
+ Time to first token: 2.09 seconds
+ There is one person in the image the second image . The focus of this photograph seems to be on an individual, as there are no other discernible figures present. Please note that I can only provide descriptions based on what's visible within the images you've shared. If you have any specific questions about the content or need further details, feel free to ask!
+
+ (finished)
+
+ --------------------------------------------------------------------------------------
+ Stage Total Time (ms) Tokens Time per Token (ms) Tokens per Second
+ --------------------------------------------------------------------------------------
+ Prefill 2040.37 112 18.22 54.89
+ Generate 22301.36 72 310.92 3.22
+ --------------------------------------------------------------------------------------
+ ```

  ## Model Conversion
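
Note on the unchanged context line about performance: the README suggests binding the inference program to the big core cluster with `taskset -c 4-7`. The same pinning could probably be done from inside Python on Linux. The following is only a minimal sketch, not part of this commit or of `multiprocess_inference.py`; the helper names `pick_cores` and `pin_current_process` are hypothetical, and the core IDs 4-7 are taken from the README's `taskset` example.

```python
import os

# Core IDs taken from the README's `taskset -c 4-7` example (assumed to be
# the big core cluster on the target board).
BIG_CORES = {4, 5, 6, 7}


def pick_cores(available, preferred=BIG_CORES):
    """Return the preferred cores that are actually available.

    Falls back to every available core when none of the preferred
    cores exist, so the process is never left with an empty CPU set.
    """
    chosen = set(available) & set(preferred)
    return chosen if chosen else set(available)


def pin_current_process(preferred=BIG_CORES):
    """Pin the calling process (pid 0 means "self") to the preferred cores.

    Returns the set of cores in use afterwards, or None on platforms
    without os.sched_setaffinity (it is only provided on Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    available = os.sched_getaffinity(0)
    chosen = pick_cores(available, preferred)
    os.sched_setaffinity(0, chosen)
    return chosen


if __name__ == "__main__":
    pinned = pin_current_process()
    print("pinned to cores:", sorted(pinned) if pinned else "unchanged")
```

Child processes started after the call inherit the affinity mask, so calling such a helper before spawning workers would have roughly the same effect as launching the script under `taskset`. Keeping the CPU at its highest frequency is a separate, system-level setting and is not covered by this sketch.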