nielsr (HF staff) committed
Commit 23171d2 · verified · Parent: a959e37

Add link to paper in model card


This PR links the model card to the paper, ensuring users can easily find it at [Qwen2.5-VL Technical Report](https://huggingface.co/papers/2502.13923).

Files changed (1): README.md (+10, −9)
README.md CHANGED
@@ -1,16 +1,15 @@
-
  ---
+ base_model:
+ - Qwen/Qwen2.5-VL-72B-Instruct
+ language:
+ - en
+ library_name: transformers
  license: other
  license_name: qwen
  license_link: https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct-AWQ/blob/main/LICENSE
- language:
- - en
  pipeline_tag: image-text-to-text
  tags:
  - multimodal
- library_name: transformers
- base_model:
- - Qwen/Qwen2.5-VL-72B-Instruct
  ---

  # Qwen2.5-VL-72B-Instruct-AWQ
@@ -18,6 +17,8 @@ base_model:
  <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
  </a>

+ This repository contains the model described in the paper [Qwen2.5-VL Technical Report](https://huggingface.co/papers/2502.13923).
+
  ## Introduction

  In the past five months since Qwen2-VL’s release, numerous developers have built new models on the Qwen2-VL vision-language models, providing us with valuable feedback. During this period, we focused on building more useful vision-language models. Today, we are excited to introduce the latest addition to the Qwen family: Qwen2.5-VL.
@@ -84,7 +85,7 @@ KeyError: 'qwen2_5_vl'
  We offer a toolkit to help you handle various types of visual input more conveniently, as if you were using an API. This includes base64, URLs, and interleaved images and videos. You can install it using the following command:

  ```bash
- # It's highly recommanded to use `[decord]` feature for faster video loading.
+ # It's highly recommended to use `[decord]` feature for faster video loading.
  pip install qwen-vl-utils[decord]==0.0.8
  ```

@@ -95,7 +96,7 @@ If you are not using Linux, you might not be able to install `decord` from PyPI.
  Here we show a code snippet to show you how to use the chat model with `transformers` and `qwen_vl_utils`:

  ```python
- from transformers import Qwen2_5_VLForConditionalGeneration, AutoTokenizer, AutoProcessor
+ from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
  from qwen_vl_utils import process_vision_info

  # default: Load the model on the available device(s)
@@ -210,7 +211,7 @@ The model supports a wide range of resolution inputs. By default, it uses the na
  min_pixels = 256 * 28 * 28
  max_pixels = 1280 * 28 * 28
  processor = AutoProcessor.from_pretrained(
-     "Qwen/Qwen2.5-VL-72B-Instruct-AWQ", min_pixels=min_pixels, max_pixels=max_pixels
+     "Qwen/Qwen2.5-VL-7B-Instruct", min_pixels=min_pixels, max_pixels=max_pixels
  )
  ```

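For context when reviewing the hunks above: the diff only shows the changed lines of the README's quickstart. Below is a minimal sketch of how the touched import and processor lines fit into a full inference call, following the `transformers`/`qwen_vl_utils` usage the README describes; the checkpoint id, demo image URL, and prompt are illustrative placeholders, not content from this commit.

```python
# Sketch only: reconstructs the README quickstart around the lines touched by this diff.
# The checkpoint id, image URL, and prompt below are illustrative placeholders.
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Qwen/Qwen2.5-VL-72B-Instruct-AWQ"

# Load the quantized model and its processor (optionally capping visual tokens per image).
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
min_pixels = 256 * 28 * 28
max_pixels = 1280 * 28 * 28
processor = AutoProcessor.from_pretrained(
    model_id, min_pixels=min_pixels, max_pixels=max_pixels
)

# One user turn mixing an image and a text instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# Render the chat template and extract the visual inputs with qwen_vl_utils.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt",
).to(model.device)

# Generate and decode only the newly produced tokens.
generated_ids = model.generate(**inputs, max_new_tokens=128)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```

The `min_pixels`/`max_pixels` bounds, which the final hunk touches, constrain how large each image is resized before being tokenized into visual patches, trading recognition quality against memory and speed.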