CLLBJ16 and nielsr (HF Staff) committed
Commit facb795 · verified · parent 16f8125

Add project page link (#1)


- Add project page link (ce03a181f7e6e6badfc553d087ef5665d7fbce47)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1): README.md (+19, -14)
README.md CHANGED
@@ -1,21 +1,21 @@
 ---
-license: mit
-pipeline_tag: image-text-to-text
-library_name: transformers
 base_model:
-- OpenGVLab/InternViT-300M-448px
-- internlm/internlm2-chat-7b
-base_model_relation: merge
+- OpenGVLab/InternViT-300M-448px
+- internlm/internlm2-chat-7b
 language:
-- multilingual
+- multilingual
+library_name: transformers
+license: mit
+pipeline_tag: image-text-to-text
 tags:
-- internvl
-- custom_code
+- internvl
+- custom_code
+base_model_relation: merge
 ---
 
 # CoMemo-9B
 
-[\[📂 GitHub\]](https://github.com/LALBJ/CoMemo) [\[📜 Paper\]](https://arxiv.org/pdf/2506.06279) [\[🚀 Quick Start\]](#quick-start)
+[\[📂 GitHub\]](https://github.com/LALBJ/CoMemo) [\[📜 Paper\]](https://arxiv.org/pdf/2506.06279) [\[🌐 Project Page\]](https://lalbj.github.io/projects/CoMemo/) [\[🚀 Quick Start\]](#quick-start)
 
 
 ## Introduction
@@ -148,13 +148,15 @@ pixel_values = pixel_values.to(torch.bfloat16).cuda()
 generation_config = dict(max_new_tokens=1024, do_sample=True)
 
 # single-image single-round conversation (单图单轮对话)
-question = '<image>\nPlease describe the image shortly.'
+question = '<image>
+Please describe the image shortly.'
 target_aspect_ratio = [target_aspect_ratio]
 # Use RoPE-DHR
 response = model.chat(tokenizer, pixel_values, question, generation_config, target_aspect_ratio=target_aspect_ratio)
 # # Use Original Rope
 # response = model.chat(tokenizer, pixel_values, question, generation_config, target_aspect_ratio=target_aspect_ratio)
-print(f'User: {question}\nAssistant: {response}')
+print(f'User: {question}
+Assistant: {response}')
 
 # multi-image single-round conversation, separate images (多图多轮对话,独立图像)
 pixel_values1, target_aspect_ratio1 = load_image('./assets/image1.jpg', max_num=12)
@@ -165,14 +167,17 @@ pixel_values = torch.cat((pixel_values1, pixel_values2), dim=0)
 target_aspect_ratio = [target_aspect_ratio1, target_aspect_ratio2]
 num_patches_list = [pixel_values1.size(0), pixel_values2.size(0)]
 
-question = 'Image-1: <image>\nImage-2: <image>\nWhat are the similarities and differences between these two images.'
+question = 'Image-1: <image>
+Image-2: <image>
+What are the similarities and differences between these two images.'
 # Use RoPE-DHR
 response = model.chat(tokenizer, pixel_values, question, generation_config,
 num_patches_list=num_patches_list, target_aspect_ratio=target_aspect_ratio)
 # # Use Original RoPE
 # response = model.chat(tokenizer, pixel_values, question, generation_config,
 # num_patches_list=num_patches_list, target_aspect_ratio=target_aspect_ratio)
-print(f'User: {question}\nAssistant: {response}')
+print(f'User: {question}
+Assistant: {response}')
 ```
 
 ## License
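
The hunks above edit the Quick Start snippets of the model card. For readability, the sketch below reassembles the single-image and multi-image `model.chat` calls as runnable Python, with each prompt kept as a single string using `\n` escapes. It is a minimal sketch, not the card's full example: the repo id `CLLBJ16/CoMemo-9B`, the `trust_remote_code` loading, the second image path, and the `load_image` helper (defined earlier in the README and not reproduced in this diff) are assumptions.

```python
# Minimal sketch reassembling the Quick Start calls shown in the diff above.
# Assumptions (not part of this diff): the repo id, trust_remote_code loading,
# the second image path, and load_image(), which the full README defines and
# which returns (pixel_values, target_aspect_ratio) for RoPE-DHR.
import torch
from transformers import AutoModel, AutoTokenizer

path = 'CLLBJ16/CoMemo-9B'  # assumed repo id
model = AutoModel.from_pretrained(path, torch_dtype=torch.bfloat16,
                                  trust_remote_code=True).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

generation_config = dict(max_new_tokens=1024, do_sample=True)

# Single-image, single-round conversation.
pixel_values, target_aspect_ratio = load_image('./assets/image1.jpg', max_num=12)
pixel_values = pixel_values.to(torch.bfloat16).cuda()
question = '<image>\nPlease describe the image shortly.'  # one string; '\n' is an escape
response = model.chat(tokenizer, pixel_values, question, generation_config,
                      target_aspect_ratio=[target_aspect_ratio])  # RoPE-DHR path
print(f'User: {question}\nAssistant: {response}')

# Multi-image, single-round conversation with separate images.
pixel_values1, target_aspect_ratio1 = load_image('./assets/image1.jpg', max_num=12)
pixel_values2, target_aspect_ratio2 = load_image('./assets/image2.jpg', max_num=12)  # assumed path
pixel_values = torch.cat((pixel_values1, pixel_values2), dim=0).to(torch.bfloat16).cuda()
target_aspect_ratio = [target_aspect_ratio1, target_aspect_ratio2]
num_patches_list = [pixel_values1.size(0), pixel_values2.size(0)]
question = ('Image-1: <image>\nImage-2: <image>\n'
            'What are the similarities and differences between these two images.')
response = model.chat(tokenizer, pixel_values, question, generation_config,
                      num_patches_list=num_patches_list, target_aspect_ratio=target_aspect_ratio)
print(f'User: {question}\nAssistant: {response}')
```

The commented-out calls in the diff mark the original-RoPE alternative; this sketch keeps only the RoPE-DHR calls, which are the uncommented path in the card.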