Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage Paper • 2412.15484 • Published 6 days ago • 13
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published 7 days ago • 73
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 3 days ago • 32
Revisiting In-Context Learning with Long Context Language Models Paper • 2412.16926 • Published 4 days ago • 19
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 3 days ago • 34
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Paper • 2412.05237 • Published 19 days ago • 46
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment Paper • 2412.04814 • Published 20 days ago • 45
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published 20 days ago • 48
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published 19 days ago • 121
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs Paper • 2412.04144 • Published 21 days ago • 4
Maya: An Instruction Finetuned Multilingual Multimodal Model Paper • 2412.07112 • Published 16 days ago • 25
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published 17 days ago • 68
VisionArena: 230K Real World User-VLM Conversations with Preference Labels Paper • 2412.08687 • Published 14 days ago • 13
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens Paper • 2412.10208 • Published 13 days ago • 19
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain Paper • 2412.13018 • Published 9 days ago • 40