Alifian candra
Alian95
AI & ML interests
None yet
Recent Activity
replied to burtenshaw's post 27 days ago
NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm behind DeepSeek R1 with an advanced and hands-on guide to interpreting GRPO.
🔗 https://huggingface.co/reasoning-course
This unit is super useful if you’re tuning models with reinforcement learning. It will help with:
- interpreting loss and reward progression during training runs
- selecting effective parameters for training
- reviewing and defining effective reward functions
This unit also leads smoothly into the existing practical exercises from @mlabonne and Unsloth.
📣 Shout out to @ShirinYamani who wrote the unit. Follow for more great content.
new activity 28 days ago
huggingface/InferenceSupport:Alian95/Alian95
published a model 28 days ago
Alian95/Alian95
Organizations
None yet