Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models • Paper • 2505.02686 • Published 19 days ago • 15
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information • Paper • 2505.06046 • Published 15 days ago • 13
VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge • Paper • 2504.10342 • Published Apr 14 • 11
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens • Paper • 2504.07096 • Published Apr 9 • 73
OmniSVG: A Unified Scalable Vector Graphics Generation Model • Paper • 2504.06263 • Published Apr 8 • 164
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages • Paper • 2503.23542 • Published Mar 30 • 10
Large Language Model Agent: A Survey on Methodology, Applications and Challenges • Paper • 2503.21460 • Published Mar 27 • 77
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content • Paper • 2503.16031 • Published Mar 20 • 3
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait • Paper • 2503.12963 • Published Mar 17 • 7