view article Article Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial By open-r1 • Jan 31 • 50
view article Article Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU By edbeeching and 5 others • Mar 9, 2023 • 54
view article Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • Feb 10 • 58