Submitted by learn3r 60 WebSailor: Navigating Super-human Reasoning for Web Agent · 19 authors 1.32k 2
Submitted by Liuff23 46 LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion · 7 authors 102 1
Submitted by ai-alanov 34 Heeding the Inner Voice: Aligning ControlNet Training via Intermediate Features Feedback · 4 authors 1
Submitted by siqisun 32 IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction · 7 authors 55 3
Submitted by chrisliu298 32 Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy · 12 authors 23 6
Submitted by Warrieryes 28 Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers · 15 authors 554 2
Submitted by jinjiajie 15 Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search · 8 authors 1
Submitted by hba123 12 Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving · 6 authors 1
Submitted by yilunzhao 12 Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers · 5 authors 1
Submitted by amanchadha 9 Energy-Based Transformers are Scalable Learners and Thinkers · 10 authors 1
Submitted by kenhktsui 6 Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs · 1 author 2 3
Submitted by Facico 6 Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models · 3 authors 2 1
Submitted by JJ-TMT 5 AsyncFlow: An Asynchronous Streaming RL Framework for Efficient LLM Post-Training · 19 authors 1
Submitted by SivilTaram 5 ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention · 9 authors 1
Submitted by yxl66666 2 CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation · 8 authors 6 1
Submitted by SivanSX 1 HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation · 6 authors 1