Collections

Discover the best community collections!

Collections including paper arxiv:2507.00432

about 23 hours ago

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published 28 days ago • 124
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published 3 days ago • 44

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 129
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 116
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Paper • 2503.24235 • Published Mar 31 • 54
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 165

about 18 hours ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 99
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2

Paper • 2502.03544 • Published Feb 5 • 44
FoNE: Precise Single-Token Number Embeddings via Fourier Features

Paper • 2502.09741 • Published Feb 13 • 15
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers

Paper • 2502.20545 • Published Feb 27 • 22

What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Paper • 2410.23743 • Published Oct 31, 2024 • 64
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

Paper • 2411.03884 • Published Nov 6, 2024 • 29
MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models

Paper • 2502.00698 • Published Feb 2 • 24

Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 55
Solving Inequality Proofs with Large Language Models

Paper • 2506.07927 • Published 24 days ago • 20
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published 3 days ago • 44

Solving Inequality Proofs with Large Language Models

Paper • 2506.07927 • Published 24 days ago • 20
Mathesis: Towards Formal Theorem Proving from Natural Languages

Paper • 2506.07047 • Published 26 days ago • 5
Pre-trained Large Language Models Learn Hidden Markov Models In-context

Paper • 2506.07298 • Published 25 days ago • 25
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published 3 days ago • 44

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22 • 61
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment

Paper • 2410.09604 • Published Oct 12, 2024
Geospatial Mechanistic Interpretability of Large Language Models

Paper • 2505.03368 • Published May 6 • 9
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

Paper • 2505.02836 • Published May 5 • 7

about 21 hours ago

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 44
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published Dec 18, 2024 • 52
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments

Paper • 2408.10945 • Published Aug 20, 2024 • 11
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 54

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21, 2024 • 31
Baichuan Alignment Technical Report

Paper • 2410.14940 • Published Oct 19, 2024 • 52
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Paper • 2410.16256 • Published Oct 21, 2024 • 61
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Paper • 2410.18558 • Published Oct 24, 2024 • 20

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Paper • 2403.09029 • Published Mar 14, 2024 • 56
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Paper • 2403.12968 • Published Mar 19, 2024 • 26
RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15, 2024 • 73
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 78
