Collections
Collections including paper arxiv:2507.00432
-
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Paper • 2504.13837 • Published • 129 -
TTRL: Test-Time Reinforcement Learning
Paper • 2504.16084 • Published • 116 -
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
Paper • 2503.24235 • Published • 54 -
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Paper • 2506.01939 • Published • 165
-
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99 -
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
Paper • 2502.03544 • Published • 44 -
FoNE: Precise Single-Token Number Embeddings via Fourier Features
Paper • 2502.09741 • Published • 15 -
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
Paper • 2502.20545 • Published • 22
-
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
Paper • 2410.23743 • Published • 64 -
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 69 -
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
Paper • 2411.03884 • Published • 29 -
MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models
Paper • 2502.00698 • Published • 24
-
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 55 -
Solving Inequality Proofs with Large Language Models
Paper • 2506.07927 • Published • 20 -
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published • 44
-
Solving Inequality Proofs with Large Language Models
Paper • 2506.07927 • Published • 20 -
Mathesis: Towards Formal Theorem Proving from Natural Languages
Paper • 2506.07047 • Published • 5 -
Pre-trained Large Language Models Learn Hidden Markov Models In-context
Paper • 2506.07298 • Published • 25 -
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published • 44
-
Describe Anything: Detailed Localized Image and Video Captioning
Paper • 2504.16072 • Published • 61 -
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment
Paper • 2410.09604 • Published -
Geospatial Mechanistic Interpretability of Large Language Models
Paper • 2505.03368 • Published • 9 -
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation
Paper • 2505.02836 • Published • 7
-
No More Adam: Learning Rate Scaling at Initialization is All You Need
Paper • 2412.11768 • Published • 44 -
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper • 2412.14161 • Published • 52 -
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments
Paper • 2408.10945 • Published • 11 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 54
-
Instruction Following without Instruction Tuning
Paper • 2409.14254 • Published • 31 -
Baichuan Alignment Technical Report
Paper • 2410.14940 • Published • 52 -
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
Paper • 2410.16256 • Published • 61 -
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
Paper • 2410.18558 • Published • 20
-
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
Paper • 2403.09029 • Published • 56 -
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 26 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 73 -
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 78