Submitted by learn3r 84 WebSailor: Navigating Super-human Reasoning for Web Agent · 19 authors 1.86k 3
Submitted by Warrieryes 72 Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers · 15 authors 645 2
Submitted by Liuff23 52 LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion · 7 authors 200 1
Submitted by chrisliu298 47 Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy · 12 authors 49 7
Submitted by ai-alanov 38 Heeding the Inner Voice: Aligning ControlNet Training via Intermediate Features Feedback · 4 authors 1
Submitted by amanchadha 38 Energy-Based Transformers are Scalable Learners and Thinkers · 10 authors 116 7
Submitted by siqisun 34 IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction · 7 authors 88 3
Submitted by jinjiajie 20 Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search · 8 authors 1
Submitted by yilunzhao 18 Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers · 5 authors 1
Submitted by hba123 14 Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving · 6 authors 1
Submitted by SivilTaram 10 ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention · 9 authors 1
Submitted by kenhktsui 9 Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs · 1 author 2 3
Submitted by Facico 7 Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models · 3 authors 3 1
Submitted by JJ-TMT 5 AsyncFlow: An Asynchronous Streaming RL Framework for Efficient LLM Post-Training · 19 authors 1
Submitted by yxl66666 2 CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation · 8 authors 9 1
Submitted by SivanSX 2 HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation · 6 authors 1