Merge branch 'main' of hf.co:spaces/HuggingFaceM4/FineVision
app/src/content/article.mdx CHANGED
@@ -421,7 +421,7 @@ Even though our first proposal to judge the quality of multimodal data on a per-
 ---
 
 ### Should you train in multiple stages?
-The standard training procedure of a VLM usually follows at least two stages. First, you train only the connecting module, potentially in addition to the image encoder, and then you train the whole model in a second stage. Some work has even introduced an additional Stage 2.5, where you train the full model on a smaller subset of higher-quality data. To investigate this on small models, we experiment with single-, dual-, and triple-stage training.
+The standard training procedure of a VLM usually follows at least two stages. First, you train only the connecting module, potentially in addition to the image encoder, and then you train the whole model in a second stage. Some work has even introduced an additional Stage 2.5 [@li2025eagle2buildingposttraining], where you train the full model on a smaller subset of higher-quality data. To investigate this on small models, we experiment with single-, dual-, and triple-stage training.
 
 ---
 #### 1 Stage vs 2 Stages
app/src/content/bibliography.bib CHANGED
@@ -2089,4 +2089,14 @@
 archivePrefix={arXiv},
 primaryClass={cs.CV},
 url={https://arxiv.org/abs/2004.01804},
 }
+
+@misc{li2025eagle2buildingposttraining,
+  title={Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models},
+  author={Zhiqi Li and Guo Chen and Shilong Liu and Shihao Wang and Vibashan VS and Yishen Ji and Shiyi Lan and Hao Zhang and Yilin Zhao and Subhashree Radhakrishnan and Nadine Chang and Karan Sapra and Amala Sanjay Deshmukh and Tuomas Rintamaki and Matthieu Le and Ilia Karmanov and Lukas Voegtle and Philipp Fischer and De-An Huang and Timo Roman and Tong Lu and Jose M. Alvarez and Bryan Catanzaro and Jan Kautz and Andrew Tao and Guilin Liu and Zhiding Yu},
+  year={2025},
+  eprint={2501.14818},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV},
+  url={https://arxiv.org/abs/2501.14818},
+}