Predictive Data Selection: The Data That Predicts Is the Data That Teaches Paper • 2503.00808 • Published 8 days ago • 52
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data Paper • 2302.12822 • Published Feb 24, 2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment Paper • 2304.06767 • Published Apr 13, 2023 • 2
FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation Paper • 2408.12168 • Published Aug 22, 2024
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 341
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published Dec 23, 2024 • 46
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published Dec 23, 2024 • 43
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 62
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation Paper • 2304.05977 • Published Apr 12, 2023 • 2
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving Paper • 2407.13690 • Published Jun 18, 2024 • 2
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 16
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning Paper • 2402.09136 • Published Feb 14, 2024 • 1
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Paper • 2406.08587 • Published Jun 12, 2024 • 16
Aqulia-Med LLM: Pioneering Full-Process Open-Source Medical Language Models Paper • 2406.12182 • Published Jun 18, 2024