AI & ML interests

None defined yet.

Recent Activity

argilla-warehouse's activity

burtenshaw posted an update 2 days ago
MCP course is now LIVE! We just dropped quizzes, videos, and live streams to make it a fully interactive course:

🔗 join in now: mcp-course

- It’s still free!
- Video 1 walks you through onboarding to the course
- The first live session is next week!
- You can now get a certificate via the exam app
- We improved the written material with interactive quizzes

If you’re studying MCP and want a live, interactive, visual, certified course, then join us on the hub!

loubnabnl posted an update 8 days ago

burtenshaw posted an update 8 days ago
We're thrilled to announce the launch of our comprehensive Model Context Protocol (MCP) Course! This free program is designed to take learners from foundational understanding to practical application of MCP in AI.

Follow the course on the hub: mcp-course

In this course, you will:
📖 Study Model Context Protocol in theory, design, and practice.
🧑‍💻 Learn to use established MCP SDKs and frameworks.
💾 Share your projects and explore applications created by the community.
🏆 Participate in challenges and evaluate your MCP implementations.
🎓 Earn a certificate of completion.

At the end of this course, you'll understand how MCP works and how to build your own AI applications that leverage external data and tools using the latest MCP standards.

burtenshaw posted an update 23 days ago
Qwen 3 fine-tuning >> MoE. I updated the experiment thread to include the config and script for fine-tuning the Qwen3-30B-A3B model.

The goal is to make a low-latency, non-thinking model to use as a daily driver for coding, so 3 billion active parameters should be perfect.

✔️ training running
✔️ evals running
⏭️ improve dataset

The MoE isn't going to fit into Colab's A100 even with quantization (🙏 @UnslothAI ). So I've been working on HF Spaces' H100s for this. Everything is available in the thread and I'll share more tomorrow.

burtenshaw/Qwen3-Code-Lite#1

burtenshaw posted an update about 1 month ago
The rebooted LLM course starts today with an overhauled chapter 1 on Transformers:

👉 Follow the org to join the course: huggingface-course

We’re starting from the foundations of modern generative AI by looking at transformers. This chapter has been expanded in depth and features, so it contains new material like:

- a FREE and CERTIFIED exam on the fundamentals of transformers
- deeper exploration of transformer architectures and attention mechanisms
- end-to-end exploration of inference strategies for prefill and decode steps

The course has leveled up in complexity and depth, so this is a great time to join in if you want to build your own AI models.

burtenshaw posted an update about 1 month ago
Hacked my presentation building with inference providers, Cohere Command A, and sheer simplicity. Use this script if you’re burning too much time on presentations:

🔗 https://github.com/burtenshaw/course_generator/blob/main/scripts/create_presentation.py

This is what it does:
- uses Command A to generate slides and speaker notes based on some material
- renders the material in the open remark format and imports all images, tables, etc.
- lets you review the slides as markdown and iterate
- exports to either pdf or pptx using backslide

🚀 Next steps are: add text to speech for the audio and generate a video. This should make Hugging Face educational content scale to a billion AI learners.

burtenshaw posted an update about 2 months ago
NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm behind DeepSeek R1 with an advanced and hands-on guide to interpreting GRPO.

🔗 reasoning-course

This unit is super useful if you’re tuning models with reinforcement learning. It will help with:

- interpreting loss and reward progression during training runs
- selecting effective parameters for training
- reviewing and defining effective reward functions
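
For a feel of what "reward progression" means in GRPO, here is a minimal sketch of the group-relative advantage step: rewards for several completions of the same prompt are normalized against their own group. The function name, the `eps` value, and the use of the population standard deviation are assumptions for illustration, not the course's or TRL's exact code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper: eps and the population std are illustrative choices.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # Completions above the group mean get a positive advantage, below get negative.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt: two scored 0.0, two scored 1.0.
rewards = [0.0, 0.0, 1.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

One thing this makes visible: if every completion in a group gets the same reward, all advantages are zero and that group contributes no learning signal, which is one reason a flat reward curve is worth investigating.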

This unit also works up smoothly toward the existing practical exercises from @mlabonne and Unsloth.

📣 Shout out to @ShirinYamani who wrote the unit. Follow for more great content.

burtenshaw posted an update 2 months ago
The Hugging Face Agents Course now includes three major agent frameworks!

🔗 agents-course

This includes LlamaIndex, LangChain, and our very own smolagents. We've worked to integrate the three frameworks in distinctive ways so that learners can reflect on when and where to use each.

This also means that you can follow the course if you're already familiar with one of these frameworks, and soak up some of the fundamental knowledge in earlier units.

Hopefully, this makes the agents course open to as many people as possible.

burtenshaw posted an update 2 months ago
The Open LLM Leaderboard is completed, retired, dead, ‘ascended to a higher plane’. And in its shadow we have an amazing range of leaderboards built and maintained by the community.

In this post, I just want to list some of those great leaderboards that you should bookmark for staying up to date:

- Chatbot Arena LLM Leaderboard is the first port of call for checking out the best model. It’s not the fastest because humans need to use the models to produce scores, but it’s worth the wait. lmarena-ai/chatbot-arena-leaderboard

- OpenVLM Leaderboard is great for getting scores on vision language models. opencompass/open_vlm_leaderboard

- Ai2 are doing a great job on RewardBench and I hope they keep it up because reward models are the unsexy workhorse of the field. allenai/reward-bench

- The GAIA leaderboard is great for evaluating agent applications. gaia-benchmark/leaderboard

🤩 This seems like such a sustainable way of building for the long term, where rather than leaning on a single company to evaluate all LLMs, we share the load.

burtenshaw posted an update 2 months ago
Still speed running Gemma 3 to think. Today I focused on setting up GPU-poor hardware to run GRPO.

This is a plain TRL and PEFT notebook which works on Apple silicon Macs or a Colab T4. It uses the 1B variant of Gemma 3 and a reasoning version of the GSM8K dataset.

🧑‍🍳 There’s more still in the oven, like releasing models, an Unsloth version, and deeper tutorials, but hopefully this should bootstrap your projects.

Here’s a link to the 1B notebook: https://colab.research.google.com/drive/1mwCy5GQb9xJFSuwt2L_We3eKkVbx2qSt?usp=sharing

burtenshaw posted an update 2 months ago
everybody and their dog is fine-tuning Gemma 3 today, so I thought I'd do a longer post on the tips and sharp edges I find. let's go!

1. you have to install everything from main and nightly. this is what I'm working with to get unsloth and TRL running

git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft


plus this with --no-deps

git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly


2. will brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb

3. with a learning rate of 5e-6 rewards and loss stayed flat for the first 100 or so steps.

4. so far none of my runs have undermined the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters.

from trl import GRPOConfig

training_args = GRPOConfig(
    learning_rate = 5e-6,
    adam_beta1 = 0.9,
    adam_beta2 = 0.99,
    weight_decay = 0.1,
    warmup_ratio = 0.1,
    lr_scheduler_type = "cosine",
    optim = "adamw_8bit",
    logging_steps = 1,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 1,
    num_generations = 2,
    max_prompt_length = 256,
    max_completion_length = 1024 - 256,
    num_train_epochs = 1,  # not the effective limit: a positive max_steps overrides it
    max_steps = 250,
    save_steps = 250,
    max_grad_norm = 0.1,
    report_to = "none",
)
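
to see what learning rate a run like this is on at a given step, here's a minimal sketch of linear warmup followed by half-cosine decay, using the values from the config above (5e-6 peak, 250 steps, 10% warmup). it's written from the usual formula, not copied from the Trainer, and the function name is made up for illustration.

```python
import math


def lr_at_step(step, base_lr=5e-6, max_steps=250, warmup_ratio=0.1):
    """Learning rate under linear warmup then cosine decay (illustrative sketch)."""
    warmup_steps = math.ceil(max_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 25, 100, 250):
    print(step, lr_at_step(step))
```

under these assumptions warmup ends at step 25, and by step 100 the rate has already decayed to about 75% of its peak.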


5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but no need to load the model differently in transformers or Unsloth

from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it")


if you want an introduction to GRPO, check out the reasoning course, it walks you through the algorithm, theory, and implementation in a smooth way.

reasoning-course

burtenshaw posted an update 2 months ago
Here’s a notebook to make Gemma reason with GRPO & TRL. I made this whilst prepping the next unit of the reasoning course.

In this notebook I combine Google’s model with some community tooling:

- First, I load the model from the Hugging Face hub with the latest transformers release for Gemma 3
- I use PEFT and bitsandbytes to get it running on Colab
- Then, I took Will Brown’s processing and reward functions to make reasoning chains from GSM8k
- Finally, I used TRL’s GRPOTrainer to train the model
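
To give a rough idea of what reward functions for this kind of setup look like, here is a minimal sketch. It is not Will Brown's code: the tag layout, the reward values, and the `answer` column name are all assumptions for illustration. The functions take a list of completion strings plus keyword arguments and return one float per completion, which is the general shape GRPOTrainer expects for plain-text completions as far as I know.

```python
import re

# Hypothetical completion layout: <reasoning>...</reasoning> then <answer>...</answer>
FORMAT_RE = re.compile(
    r"<reasoning>.*?</reasoning>\s*<answer>(.*?)</answer>", re.DOTALL
)


def extract_answer(text):
    """Return the text inside <answer> tags, or None if the layout is missing."""
    match = FORMAT_RE.search(text)
    return match.group(1).strip() if match else None


def format_reward(completions, **kwargs):
    """Small reward for following the reasoning/answer layout at all."""
    return [0.5 if extract_answer(c) is not None else 0.0 for c in completions]


def correctness_reward(completions, answer, **kwargs):
    """Larger reward when the extracted answer matches the reference string."""
    return [
        2.0 if extract_answer(c) == a.strip() else 0.0
        for c, a in zip(completions, answer)
    ]


completions = [
    "<reasoning>2 + 2 = 4</reasoning>\n<answer>4</answer>",
    "The answer is 4",
]
print(format_reward(completions))
print(correctness_reward(completions, answer=["4", "4"]))
```

Splitting format and correctness into separate functions makes it easier to see in the logs which behaviour the model is picking up first.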

Next step is to bring Unsloth AI in, then ship it in the reasoning course. Links to notebook below.

https://colab.research.google.com/drive/1Vkl69ytCS3bvOtV9_stRETMthlQXR4wX?usp=sharing

eliebak posted an update 2 months ago
Google just dropped an exciting technical report for the brand-new Gemma 3 model! 🚀 Here are my personal notes highlighting the most intriguing architectural innovations, design choices, and insights from this release:

1) Architecture choices:
> No more soft-capping, replaced by QK-Norm
> Both Pre AND Post Norm
> Wider MLP than Qwen2.5, ~ same depth
> SWA with a 5:1 local-to-global layer ratio and a 1024-token window (very small, and a cool ablation in the paper!)
> No MLA to save KV cache, SWA does the job!
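
A minimal sketch of why the 5:1 sliding-window pattern saves KV cache, assuming five local layers followed by one global layer and a causal window of 1024 tokens. The exact layer ordering, the helper names, and the 12-layer example are assumptions for illustration, not taken from the report.

```python
def layer_types(num_layers, local_per_global=5):
    """Assumed layout: every (local_per_global + 1)-th layer is global."""
    period = local_per_global + 1
    return ["global" if (i + 1) % period == 0 else "local" for i in range(num_layers)]


def sliding_window_mask(seq_len, window):
    """Causal mask where token i sees itself and the previous window - 1 tokens."""
    return [[(j <= i) and (i - j < window) for j in range(seq_len)] for i in range(seq_len)]


def kv_cache_tokens(context_len, num_layers, window=1024, local_per_global=5):
    """Cached token positions summed over layers; local layers cap at the window."""
    return sum(
        min(context_len, window) if kind == "local" else context_len
        for kind in layer_types(num_layers, local_per_global)
    )


print(layer_types(12))
print(sliding_window_mask(5, 3)[4])
print(kv_cache_tokens(32768, 12), "vs", 32768 * 12, "with all-global layers")
```

In this toy setup only the two global layers pay for the full 32k context, so the cache holds 75,776 token positions instead of 393,216.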

2) Long context
> Only increase the rope in the global layer (to 1M)
> Confirmation that it's harder to do long context for smol models, no 128k for the 1B
> Pretrained with 32k context? seems very high
> No yarn nor llama3 like rope extension
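
A quick sketch of what raising the RoPE base does: each rotary frequency pair has a wavelength measured in token positions, and a larger base stretches the longest wavelengths. The note above only gives the 1M base for global layers; the 10k base for local layers and `head_dim=128` are assumptions for illustration.

```python
import math


def rope_wavelengths(head_dim, base):
    """Wavelength in token positions for each rotary frequency pair.

    Uses the standard RoPE frequencies base ** (-2i / head_dim).
    """
    return [2 * math.pi * base ** (2 * i / head_dim) for i in range(head_dim // 2)]


local = rope_wavelengths(head_dim=128, base=10_000)       # assumed local-layer base
global_ = rope_wavelengths(head_dim=128, base=1_000_000)  # 1M base from the note above

print("longest local wavelength:", round(max(local)))
print("longest global wavelength:", round(max(global_)))
```

The shortest wavelength is the same in both cases; only the slow-rotating dimensions stretch, which is what lets the global layers tell distant positions apart.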

3) Distillation
> Only keep the first 256 logits for the teacher
> Ablation on the teacher gap (tl;dr you need some "patience" to see that using a small teacher is better)
> On policy distillation yeahh (by @agarwl_ et al), not sure if the teacher gap behaves the same here, curious if someone has more info?

4) Others
> Checkpoint with QAT, that's very cool
> RL using an improved version of BOND, WARM/WARP, good excuse to look at @ramealexandre papers
> Only use Zero3, no TP/PP if I understand correctly?
> Training budget relatively similar to Gemma 2

burtenshaw posted an update 3 months ago
I’m super excited to work with @mlabonne to build the first practical example in the reasoning course.

🔗 reasoning-course

Here's a quick walk through of the first drop of material that works toward the use case:

- A fundamental introduction to reinforcement learning, answering questions like ‘what is a reward?’ and ‘how do we create an environment for a language model?’

- Then it focuses on DeepSeek R1 by walking through the paper and highlighting key aspects. This is an old school way to learn ML topics, but it always works.

- Next, it takes you to TRL (Transformer Reinforcement Learning) and demonstrates potential reward functions you could use. This is cool because it uses Marimo notebooks to visualise the reward.

- Finally, Maxime walks us through a real training notebook that uses GRPO to reduce generation length. I’m really into this because it works, and Maxime took the time to validate it and share assets and logging from his own runs for you to compare with.

Maxime’s work and notebooks have been a major part of the open source community over the last few years. I, like everyone, have learnt so much from them.

andito posted an update 3 months ago
Extremely bullish on @CohereForAI 's Aya Vision (8B & 32B) - new SOTA open-weight VLMs

- 8B wins up to 81% of the time in its class, better than Gemini Flash
- 32B beats Llama 3.2 90B!
- Covers 23 languages, excels in image captioning, VQA & more
- Integrated on transformers from Day 0!

Efficient multimodal models are here to stay!! 🔥
Check out their blog! https://huggingface.co/blog/aya-vision