LLM+Self-Play RL - a VoladorLuYu Collection

VoladorLuYu 's Collections

Research on LLM

Generative Multiple Modality

Super Alignment

Foundation Machine Learning

Graph Foundation Multimodal Models

Symbolic LLM Reasoning

Data-efficient LLMs

Understanding LLM

synthetic code generation

Diffusion Models

LLM+Architecture

LLM+Self-Play RL

LLM+Self-Play RL

updated 22 days ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 139
Recursive Introspection: Teaching Language Model Agents How to Self-Improve

Paper • 2407.18219 • Published Jul 25, 2024 • 3
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

Paper • 2408.16293 • Published Aug 29, 2024 • 27
Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models

Paper • 2409.04787 • Published Sep 7, 2024 • 1
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives

Paper • 2401.02009 • Published Jan 4, 2024 • 1
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published Mar 10 • 45
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning

Paper • 2504.19162 • Published Apr 27 • 15