Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs Paper • 2506.19290 • Published 2 days ago • 42
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning Paper • 2506.18841 • Published 3 days ago • 45
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published 10 days ago • 239
Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering Paper • 2505.23604 • Published 28 days ago • 24
From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition Paper • 2505.16972 • Published May 22 • 9
From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition Paper • 2505.16972 • Published May 22 • 9 • 2
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 132
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16, 2024 • 32
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4, 2024 • 74
Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent Paper • 2407.21646 • Published Jul 31, 2024 • 18
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25, 2024 • 34
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25, 2024 • 34
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25, 2024 • 34 • 4
Learning Multi-Step Reasoning by Solving Arithmetic Tasks Paper • 2306.01707 • Published Jun 2, 2023 • 1
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Paper • 2407.11963 • Published Jul 16, 2024 • 45