aimagelab/LLaVA_MORE-llama_3_1-8B-S2-siglip-finetuning Image-Text-to-Text • Updated about 4 hours ago • 2
aimagelab/LLaVA_MORE-llama_3_1-8B-S2-siglip-pretrain Image-Text-to-Text • Updated about 4 hours ago
aimagelab/LLaVA_MORE-llama_3_1-8B-S2-finetuning Image-Text-to-Text • Updated about 4 hours ago • 1
aimagelab/LLaVA_MORE-llama_3_1-8B-siglip-finetuning Image-Text-to-Text • Updated about 4 hours ago • 4 • 1
aimagelab/LLaVA_MORE-llama_3_1-8B-siglip-pretrain Image-Text-to-Text • Updated about 4 hours ago • 3
aimagelab/LLaVA_MORE-llama_3_1-8B-finetuning Image-Text-to-Text • Updated about 4 hours ago • 215 • 9
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 13 days ago • 121
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters • 2.51k
ReflectiVA Collection Models and data for ReflectiVA: Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering [CVPR 2025] • 3 items • Updated 19 days ago