Update README.md
README.md (changed):

```diff
@@ -43,7 +43,7 @@ The primary use of LLaVA-UHD v2 is research on large multimodal models and chatbots.
 The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
 
 ## Training dataset
-
+- VDIM Pretrain: MS-COCO stuff 2017
 - Pretrain: LLaVA-Pretrain 558K (filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.)
 - SFT: 858k-mixed dataset in https://huggingface.co/datasets/YipengZhang/LLaVA-UHD-v2-SFT-Data
```