GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published Apr 3 • 57
VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models Paper • 2504.15279 • Published Apr 21 • 74
Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation Paper • 2505.13215 • Published 5 days ago • 27
LightLab: Controlling Light Sources in Images with Diffusion Models Paper • 2505.09608 • Published 10 days ago • 29
Improving Editability in Image Generation with Layer-wise Memory Paper • 2505.01079 • Published 23 days ago • 28
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models Paper • 2505.02735 • Published 19 days ago • 31
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published 19 days ago • 31
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published 4 days ago • 52