MB-ORES: A Multi-Branch Object Reasoner for Visual Grounding in Remote Sensing Paper • 2503.24219 • Published 3 days ago • 1
m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models Paper • 2504.00869 • Published 2 days ago • 7
Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features Paper • 2504.00557 • Published 3 days ago • 12
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization Paper • 2503.23733 • Published 4 days ago • 10
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning Paper • 2504.01005 • Published 2 days ago • 12
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems? Paper • 2504.00509 • Published 3 days ago • 15
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts Paper • 2503.22952 • Published 6 days ago • 17
CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis Paper • 2503.23145 • Published 5 days ago • 29
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 3 days ago • 57
Unicorn: Text-Only Data Synthesis for Vision Language Model Training Paper • 2503.22655 • Published 6 days ago • 32
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization Paper • 2503.20286 • Published 9 days ago • 3
Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code Paper • 2503.18809 • Published 10 days ago • 9
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection Paper • 2503.24115 • Published 3 days ago • 9
Efficient Inference for Large Reasoning Models: A Survey Paper • 2503.23077 • Published 5 days ago • 39