Dominic Nyambane

coderfpv

AI & ML interests

Reinforcement Learning, Robotics

Organizations

Masakhane NLP, noverdesk, Smol Community

coderfpv's activity

reacted to burtenshaw's post with 🤗 5 days ago
NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm behind DeepSeek R1 with an advanced and hands-on guide to interpreting GRPO.

🔗 reasoning-course

This unit is super useful if you’re tuning models with reinforcement learning. It will help with:

- interpreting loss and reward progression during training runs
- selecting effective parameters for training
- reviewing and defining effective reward functions

This unit also leads smoothly into the existing practical exercises from @mlabonne and Unsloth.

📣 Shout out to @ShirinYamani who wrote the unit. Follow for more great content.
updated a collection 3 months ago