B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 3 days ago • 34
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling Paper • 2412.15084 • Published 6 days ago • 12
Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models Paper • 2412.12606 • Published 9 days ago • 41
ColorFlow: Retrieval-Augmented Image Sequence Colorization Paper • 2412.11815 • Published 10 days ago • 26
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models Paper • 2412.09645 • Published 15 days ago • 35
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published 11 days ago • 25
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published 10 days ago • 33
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published 19 days ago • 121
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Paper • 2406.08587 • Published Jun 12 • 15
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Paper • 2407.01284 • Published Jul 1 • 75
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation Paper • 2410.09584 • Published Oct 12 • 47