Self-Distillation - a lasgroup Collection

Self-Distillation

updated 17 days ago

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 43

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 27

Aligning Language Models from User Interactions

Paper • 2603.12273 • Published Feb 18