ZhangYuanhan committed cc1e179 (parent: ecc105e): Update README.md
README.md CHANGED

```diff
@@ -117,7 +117,7 @@ base_model:
 - lmms-lab/llava-onevision-qwen2-7b-si
 ---
 
-# LLaVA-Video-7B-Qwen2
+# LLaVA-NeXT-Video-7B-Qwen2
 
 ## Table of Contents
 
@@ -130,7 +130,7 @@ base_model:
 
 ## Model Summary
 
-The LLaVA-Video models are 7/72B parameter models trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
+The LLaVA-NeXT-Video models are 7/72B parameter models trained on [LLaVA-NeXT-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
 
 - **Repository:** [LLaVA-VL/LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT?tab=readme-ov-file)
 - **Point of Contact:** [Yuanhan Zhang](https://zhangyuanhan-ai.github.io/)
@@ -141,7 +141,7 @@ The LLaVA-Video models are 7/72B parameter models trained on [LLaVA-Video-178K](
 
 ### Intended use
 
-The model was trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data) and have the ability to interact with images, multi-image and videos, but specific to videos.
+The model was trained on [LLaVA-NeXT-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data) and have the ability to interact with images, multi-image and videos, but specific to videos.
 
 **Feel free to share your generations in the Community tab!**
 
@@ -182,7 +182,7 @@ def load_video(self, video_path, max_frames_num,fps=1,force_sample=False):
     spare_frames = vr.get_batch(frame_idx).asnumpy()
     # import pdb;pdb.set_trace()
     return spare_frames,frame_time,video_time
-pretrained = "lmms-lab/LLaVA-Video-7B-Qwen2"
+pretrained = "lmms-lab/LLaVA-NeXT-Video-7B-Qwen2"
 model_name = "llava_qwen"
 device = "cuda"
 device_map = "auto"
```
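The changed line 185 sits inside the README's quick-start snippet, where `pretrained`, `model_name`, `device`, and `device_map` feed the model loader and `load_video` samples frames with decord. Below is a minimal sketch of how the renamed checkpoint would be used end to end; it is not part of this commit. The helper APIs (`load_pretrained_model`, `tokenizer_image_token`, `conv_templates`, the `torch_dtype` and `modalities` arguments) are assumed from the LLaVA-VL/LLaVA-NeXT repository and may differ in signature, the video path is hypothetical, and the frame sampler is simplified to uniform sampling.

```python
# Minimal sketch (not part of the commit): load the renamed checkpoint and ask
# one question about a video. Helper APIs assumed from LLaVA-VL/LLaVA-NeXT
# (pip install git+https://github.com/LLaVA-VL/LLaVA-NeXT.git).
import copy
import numpy as np
import torch
from decord import VideoReader, cpu
from llava.constants import DEFAULT_IMAGE_TOKEN, IMAGE_TOKEN_INDEX
from llava.conversation import conv_templates
from llava.mm_utils import tokenizer_image_token
from llava.model.builder import load_pretrained_model


def load_video(video_path, max_frames_num=64):
    """Uniformly sample up to max_frames_num RGB frames from the video."""
    vr = VideoReader(video_path, ctx=cpu(0), num_threads=1)
    frame_idx = np.linspace(0, len(vr) - 1, max_frames_num, dtype=int).tolist()
    return vr.get_batch(frame_idx).asnumpy()  # shape (frames, H, W, 3)


pretrained = "lmms-lab/LLaVA-NeXT-Video-7B-Qwen2"  # checkpoint name after this commit
model_name = "llava_qwen"
device = "cuda"
device_map = "auto"

# The LLaVA-NeXT builder returns (tokenizer, model, image_processor, context_len).
tokenizer, model, image_processor, _ = load_pretrained_model(
    pretrained, None, model_name, torch_dtype="bfloat16", device_map=device_map
)
model.eval()

frames = load_video("sample_video.mp4", max_frames_num=64)  # hypothetical local file
video = image_processor.preprocess(frames, return_tensors="pt")["pixel_values"]
video = video.to(device, dtype=torch.bfloat16)

# Build a Qwen-style chat prompt containing the image placeholder token.
conv = copy.deepcopy(conv_templates["qwen_1_5"])
conv.append_message(conv.roles[0], DEFAULT_IMAGE_TOKEN + "\nPlease describe this video in detail.")
conv.append_message(conv.roles[1], None)
input_ids = tokenizer_image_token(
    conv.get_prompt(), tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt"
).unsqueeze(0).to(device)

with torch.inference_mode():
    output_ids = model.generate(
        input_ids,
        images=[video],
        modalities=["video"],
        do_sample=False,
        max_new_tokens=256,
    )
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
```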