Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning
Paper • 2503.13383 • Published
A multi-modal SFT data selection method, the first to scale to a million-level data pool, achieving 99.1% of full performance with 30% of LLaVA-OVSI. (Under construction.)