CLLBJ16 · nielsr (HF Staff) committed
Commit 1cd9ef9 · verified · 1 Parent(s): c1ea139

Add link to project page (#1)


- Add link to project page (51108235e5f7598f4dc60264f2e8eccd32363b8e)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1):
1. README.md (+19 -15)
README.md CHANGED
@@ -1,22 +1,21 @@
 ---
-license: mit
-pipeline_tag: image-text-to-text
-library_name: transformers
 base_model:
-- OpenGVLab/InternViT-300M-448px
-- internlm/internlm2-chat-1_8b
-base_model_relation: merge
+- OpenGVLab/InternViT-300M-448px
+- internlm/internlm2-chat-1_8b
 language:
-- multilingual
+- multilingual
+library_name: transformers
+license: mit
+pipeline_tag: image-text-to-text
 tags:
-- internvl
-- custom_code
+- internvl
+- custom_code
+base_model_relation: merge
 ---
 
 # CoMemo-2B
 
-[\[📂 GitHub\]](https://github.com/LALBJ/CoMemo) [\[📜 Paper\]](https://arxiv.org/pdf/2506.06279) [\[🚀 Quick Start\]](#quick-start)
-
+[\[📂 GitHub\]](https://github.com/LALBJ/CoMemo) [\[📜 Paper\]](https://arxiv.org/pdf/2506.06279) [\[🚀 Quick Start\]](#quick-start) [\[🌐 Project Page\]](https://lalbj.github.io/projects/CoMemo/)
 
 ## Introduction
 
@@ -145,13 +144,15 @@ pixel_values = pixel_values.to(torch.bfloat16).cuda()
 generation_config = dict(max_new_tokens=1024, do_sample=True)
 
 # single-image single-round conversation (单图单轮对话)
-question = '<image>\nPlease describe the image shortly.'
+question = '<image>
+Please describe the image shortly.'
 target_aspect_ratio = [target_aspect_ratio]
 # Use RoPE-DHR
 response = model.chat(tokenizer, pixel_values, question, generation_config, target_aspect_ratio=target_aspect_ratio)
 # # Use Original Rope
 # response = model.chat(tokenizer, pixel_values, question, generation_config, target_aspect_ratio=target_aspect_ratio)
-print(f'User: {question}\nAssistant: {response}')
+print(f'User: {question}
+Assistant: {response}')
 
 # multi-image single-round conversation, separate images (多图多轮对话,独立图像)
 pixel_values1, target_aspect_ratio1 = load_image('./assets/image1.jpg', max_num=12)
@@ -162,14 +163,17 @@ pixel_values = torch.cat((pixel_values1, pixel_values2), dim=0)
 target_aspect_ratio = [target_aspect_ratio1, target_aspect_ratio2]
 num_patches_list = [pixel_values1.size(0), pixel_values2.size(0)]
 
-question = 'Image-1: <image>\nImage-2: <image>\nWhat are the similarities and differences between these two images.'
+question = 'Image-1: <image>
+Image-2: <image>
+What are the similarities and differences between these two images.'
 # Use RoPE-DHR
 response = model.chat(tokenizer, pixel_values, question, generation_config,
                       num_patches_list=num_patches_list, target_aspect_ratio=target_aspect_ratio)
 # # Use Original RoPE
 # response = model.chat(tokenizer, pixel_values, question, generation_config,
 #                       num_patches_list=num_patches_list, target_aspect_ratio=target_aspect_ratio)
-print(f'User: {question}\nAssistant: {response}')
+print(f'User: {question}
+Assistant: {response}')
 
 ```
 
 ## License
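The front-matter portion of the diff only reorders metadata keys and leaves their values untouched, so the parsed model-card metadata is unchanged. A quick check of that claim, assuming PyYAML is installed (both literals are copied from the diff):

```python
import yaml

# Front matter before the commit, as shown on the removed side of the diff.
old_meta = """
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
base_model:
- OpenGVLab/InternViT-300M-448px
- internlm/internlm2-chat-1_8b
base_model_relation: merge
language:
- multilingual
tags:
- internvl
- custom_code
"""

# Front matter after the commit, with keys reordered.
new_meta = """
base_model:
- OpenGVLab/InternViT-300M-448px
- internlm/internlm2-chat-1_8b
language:
- multilingual
library_name: transformers
license: mit
pipeline_tag: image-text-to-text
tags:
- internvl
- custom_code
base_model_relation: merge
"""

# YAML mappings compare by key/value, not by key order.
assert yaml.safe_load(old_meta) == yaml.safe_load(new_meta)
```

The only substantive README change outside the front matter is therefore the added Project Page link.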
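On the added side of the diff, the prompt strings render across several physical lines; when reproducing them in Python, an `\n` escape or adjacent-literal concatenation keeps the string a single valid expression. A minimal sketch, with the prompt text taken from the multi-image example in the diff:

```python
# The multi-image prompt from the README example, written with \n escapes.
question_escaped = ('Image-1: <image>\nImage-2: <image>\n'
                    'What are the similarities and differences between these two images.')

# Equivalent form: adjacent string literals concatenate, keeping each
# line of the prompt visible in the source.
question_concat = (
    'Image-1: <image>\n'
    'Image-2: <image>\n'
    'What are the similarities and differences between these two images.'
)

assert question_escaped == question_concat
assert question_escaped.count('<image>') == 2  # one placeholder per image
```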