AI & ML interests

Data Quality

frascuchonΒ 
posted an update 6 days ago
frascuchonΒ 
posted an update 8 days ago
view post
Post
2772
Extending datasets just got a whole lot easier! πŸš€ With Sheets, I was able to create a Spanish version of the popular fka/awesome-chatgpt-prompts dataset in just a few minutes ⏱️.

Check out the resulting dataset: frascuchon/fka_awesome_chatgpt_es πŸ“Š

Want to try it out for yourself? Head over to the Sheets space and see how easy it is to extend and modify existing datasets 🀯. The possibilities are endless! 🌐
burtenshawΒ 
posted an update 8 days ago
view post
Post
677
You don't need remote APIs for a coding copliot, or the MCP Course! Set up a fully local IDE with MCP integration using Continue. In this tutorial Continue guides you through setting it up.

This is what you need to do to take control of your copilot:

1. Get the Continue extension from the [VS Code marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue) to serve as the AI coding assistant.

2. Serve the model with an OpenAI compatible server in Llama.cpp / LmStudio/ etc.

llama-server -hf unsloth/Devstral-Small-2505-GGUF:Q4_K_M

3. Create a .continue/models/llama-max.yaml file in your project to tell Continue how to use the local Ollama model.
name: Llama.cpp model
    version: 0.0.1
    schema: v1
    models:
      - provider: llama.cpp
        model: unsloth/Devstral-Small-2505-GGUF
        apiBase: http://localhost:8080
        defaultCompletionOptions:
          contextLength: 8192 
    # Adjust based on the model
        name: Llama.cpp Devstral-Small
        roles:
          - chat
          - edit


4. Create a .continue/mcpServers/playwright-mcp.yaml file to integrate a tool, like the Playwright browser automation tool, with your assistant.

name: Playwright mcpServer
    version: 0.0.1
    schema: v1
    mcpServers:
      - name: Browser search
        command: npx
        args:
          - "@playwright/mcp@latest"


Check out the full tutorial in the [the MCP course](https://huggingface.co/learn/mcp-course/unit2/continue-client)
burtenshawΒ 
posted an update 13 days ago
view post
Post
1537
Brand new MCP Course has units are out, and now it's getting REAL! We've collaborated with Anthropic to dive deep into production ready and autonomous agents using MCP

πŸ”— mcp-course

This is what the new material covers and includes:

- Use Claude Code to build an autonomous PR agent
- Integrate your agent with Slack and Github to integrate it with you Team
- Get certified on your use case and share with the community
- Build an autonomous PR cleanup agent on the Hugging Face hub and deploy it with spaces

The material goes deep into these problems and helps you to build applications that work. We’re super excited to see what you build with it.
burtenshawΒ 
posted an update 14 days ago
view post
Post
1428
Super excited to release Autotrain MCP. This is an MCP server for training AI models, so you can use your AI tools to train your AI models 🀯.

πŸ”— burtenshaw/autotrain-mcp

Use this MCP server with tools like Claude Desktop, Cursor, VSCode, or Continue to do this:

- Define an ML problem like Image Classification, LLM fine-tuning, Text Classification, etc.
- The AI can retrieve models and datasets from the hub using the hub MCP.
- Training happens on a Hugging Face space, so no worries about hardware restraints.
- Models are pushed to the hub to be used inference tools like Llama.cpp, vLLM, MLX, etc.
- Built on top of the AutoTrain library, so it has full integration with transformers and other libraries.

Everything is still under active development, but I’m super excited to hear what people build, and I’m open to contributions!
  • 1 reply
Β·
frascuchonΒ 
posted an update 14 days ago
view post
Post
1304
Unlock the full potential of your datasets with SHEETS! It's incredibly easy to extend existing datasets and unlock new insights.

Leverage open-source models to translate, summarize, classify, and more - all directly within your existing columns.

Ready to give it a try? Explore the possibilities here: aisheets/sheets
  • 2 replies
Β·
dvilasueroΒ 
posted an update 15 days ago
view post
Post
2529
Super excited to launch Hugging Face Sheets: Spreadsheets meet AI and unstructured data.

A few months ago, we started imagining new ways to build and transform datasets with the latest open-source models.

Today, I'm thrilled to introduce our first step in this direction.


In a nutshell:

πŸ“ Effortlessly run prompts and models over your data.
🌐 Agentic search for accuracy and real-time information.
πŸ–ΌοΈ Familiar, minimalistic interface for interacting with data.
🎯 Human feedback 2.0: Your input directly improves generated data.
πŸ’― Access hundreds of open models and leading inference providers.

Go to this space to try it out!

aisheets/sheets

Leave your questions below, we're just getting started!
  • 2 replies
Β·
frascuchonΒ 
posted an update 23 days ago
view post
Post
2988
Hey! I built RAG MCP Server Space, a simple Gradio MCP server for RAG systems that allows you to search relevant results without passing huge contexts to your LLM.

You can use this space to integrate with your agents and improve the efficiency of your search results. Feel free to try it out and let me know if you have any feedback or questions!

frascuchon/rag-mcp-server

Thanks for checking it out!
burtenshawΒ 
posted an update about 1 month ago
view post
Post
2602
MCP course is now LIVE! We just dropped quizzes, videos, and live streams to make it a fully interactive course:

πŸ”— join in now: mcp-course

- It’s still free!
- Video 1 walks you through onboarding to the course
- The first live session is next week!
- You can now get a certificate via exam app
- We improved and written material with interactive quizzes

If you’re studying MCP and want a live, interactive, visual, certified course, then join us on the hub!
burtenshawΒ 
posted an update about 1 month ago
view post
Post
3209
We're thrilled to announce the launch of our comprehensive Model Context Protocol (MCP) Course! This free program is designed to take learners from foundational understanding to practical application of MCP in AI.

Follow the course on the hub: mcp-course

In this course, you will:
πŸ“– Study Model Context Protocol in theory, design, and practice.
πŸ§‘β€πŸ’» Learn to use established MCP SDKs and frameworks.
πŸ’Ύ Share your projects and explore applications created by the community.
πŸ† Participate in challenges and evaluate your MCP implementations.
πŸŽ“ Earn a certificate of completion.

At the end of this course, you'll understand how MCP works and how to build your own AI applications that leverage external data and tools using the latest MCP standards.
  • 1 reply
Β·
burtenshawΒ 
posted an update about 2 months ago
view post
Post
2291
Qwen 3 Fine tuning >> MoE. Update the experiment thread to include config and script for fine-tuning the Qwen3-30B-A3B model.

The goal is to make a low latency non-thinking model for a daily driver coding, so 3 billion parameters active should be perfect.

βœ”οΈ training running
βœ”οΈ evals running
⏭️ improve dataset

The moe isn't going to fit into colab's A100 even with quantization (πŸ™ @UnslothAI ). So I've been working on HF spaces' H100s for this. Everything is available in the tread and I'll share more tomorrow.

burtenshaw/Qwen3-Code-Lite#1
burtenshawΒ 
posted an update 2 months ago
view post
Post
2638
The rebooted LLM course starts today with an overhauled chapter 1 on Transformers:

πŸ‘‰ Follow the org to join the course: huggingface-course

We’re starting from the foundations of modern generative AI by looking at transformers. This chapter is expanded in depth and features so contains new material like:

FREE and CERTIFIED exam on fundamentals of transformers
deeper exploration of transformer architectures and attention mechanisms
end -to-end exploration of inference strategies for prefill and decode steps

The course has leveled up in complexity and depth, so this a great time to join in if you want to build you own AI models.
burtenshawΒ 
posted an update 2 months ago
view post
Post
2082
Hacked my presentation building with inference providers, Cohere command a, and sheer simplicity. Use this script if you’re burning too much time on presentations:

πŸ”— https://github.com/burtenshaw/course_generator/blob/main/scripts/create_presentation.py

This is what it does:
- uses command a to generates slides and speaker notes based on some material.
- it renders the material in remark open format and imports all images, tables, etc
- you can then review the slides as markdown and iterate
- export to either pdf or pptx using backslide

πŸš€ Next steps are: add text to speech for the audio and generate a video. This should make Hugging Face educational content scale to a billion AI Learners.
  • 1 reply
Β·
burtenshawΒ 
posted an update 3 months ago
view post
Post
3342
NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm behind DeepSeek R1 with an advanced and hands-on guide to interpreting GRPO.

πŸ”— reasoning-course

This unit is super useful if you’re tuning models with reinforcement learning. It will help with:

- interpreting loss and reward progression during training runs
- selecting effective parameters for training
- reviewing and defining effective reward functions

This unit also works up smoothly toward the existing practical exercises form @mlabonne and Unsloth.

πŸ“£ Shout out to @ShirinYamani who wrote the unit. Follow for more great content.
  • 1 reply
Β·