-
Competitive Programming with Large Reasoning Models
Paper • 2502.06807 • Published • 70
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Paper • 2502.06703 • Published • 149
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
Paper • 2502.06781 • Published • 61
LIMO: Less is More for Reasoning
Paper • 2502.03387 • Published • 61
Collections including paper arxiv:2503.24290
-
MLLM-as-a-Judge for Image Safety without Human Labeling
Paper • 2501.00192 • Published • 30
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 107
Xmodel-2 Technical Report
Paper • 2412.19638 • Published • 27
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 101
-
Phi-4 Technical Report
Paper • 2412.08905 • Published • 116
Evaluating and Aligning CodeLLMs on Human Preference
Paper • 2412.05210 • Published • 51
Evaluating Language Models as Synthetic Data Generators
Paper • 2412.03679 • Published • 49
Yi-Lightning Technical Report
Paper • 2412.01253 • Published • 29
-
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights
Paper • 2410.09008 • Published • 17
answerdotai/ModernBERT-base
Fill-Mask • Updated • 3.29M • 819
answerdotai/ModernBERT-large
Fill-Mask • Updated • 55.7k • 379
microsoft/phi-4
Text Generation • Updated • 555k • 1.97k
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 29
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 41
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 54
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
-
How to Train Data-Efficient LLMs
Paper • 2402.09668 • Published • 43
A Survey on Data Selection for LLM Instruction Tuning
Paper • 2402.05123 • Published • 3
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation
Paper • 2409.12941 • Published • 25
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Paper • 2503.24290 • Published • 59
-
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Paper • 2309.14509 • Published • 18
LLM Augmented LLMs: Expanding Capabilities through Composition
Paper • 2401.02412 • Published • 39
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 54
Tuning Language Models by Proxy
Paper • 2401.08565 • Published • 24
-
Contrastive Decoding Improves Reasoning in Large Language Models
Paper • 2309.09117 • Published • 39
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Paper • 2310.08491 • Published • 55
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
Paper • 2411.04282 • Published • 36
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Paper • 2411.14432 • Published • 26