AI & ML interests

Webhooks are now publicly available on Hugging Face!

Recent Activity

chansung 
posted an update 5 days ago
view post
Post
3372
YAML engineering becomes more and more important than ever from infra provisioning to model training (recipes).

Here, I built a simple editor first for @dstackai , and I will share the live endpoint this week. Let me know what you think about this approach.

Based on this approach, if people think this is useful, I am going to do the same thing for the LLM training recipes for popular frameworks such as Hugging Face open-r1, Axolotl, and so on. Let me hear.
victor 
posted an update about 1 month ago
view post
Post
3086
Open Source Avengers, Assemble! Ask an expert AI agent team to solve complex problems together 🔥

Consilium brings together multiple agents that debate and use live research (web, arXiv, SEC) to reach a consensus. You set the strategy, they find the answer.

Credit to @azettl for this awesome demo: Agents-MCP-Hackathon/consilium_mcp
  • 2 replies
·
davanstrien 
posted an update about 1 month ago
view post
Post
2937
Inspired by Hugging Face's official MCP server, I've developed a complementary tool that exposes my semantic search API to enhance discovery across the HF platform.

Key capabilities:

- AI-powered semantic search for models and datasets
- Parameter count analysis via safetensors metadata
- Trending content discovery
- Find similar models/datasets functionality
- 11 tools total for enhanced ecosystem navigation

The semantic search goes beyond simple keyword matching, understanding context and relationships between different models and datasets.

Example query: "Find around 10 reasoning Hugging Face datasets published in 2025 focusing on topics other than maths and science. Show a link and a short summary for each dataset." (results in video!)

https://github.com/davanstrien/hub-semantic-search-mcp
mrfakename 
posted an update 2 months ago
view post
Post
4539
Hi everyone,

I just launched TTS Arena V2 - a platform for benchmarking TTS models by blind A/B testing. The goal is to make it easy to compare quality between open-source and commercial models, including conversational ones.

What's new in V2:

- **Conversational Arena**: Evaluate models like CSM-1B, Dia 1.6B, and PlayDialog in multi-turn settings
- **Personal Leaderboard**: Optional login to see which models you tend to prefer
- **Multi-speaker TTS**: Random voices per generation to reduce speaker bias
- **Performance Upgrade**: Rebuilt from Gradio → Flask. Much faster with fewer failed generations.
- **Keyboard Shortcuts**: Vote entirely via keyboard

Also added models like MegaTTS 3, Cartesia Sonic, and ElevenLabs' full lineup.

I'd love any feedback, feature suggestions, or ideas for models to include.

TTS-AGI/TTS-Arena-V2
·
victor 
posted an update 3 months ago
view post
Post
4898
DIA TTS is just amazing - please share your funniest gens (here is mine) 😂
nari-labs/Dia-1.6B
davanstrien 
posted an update 3 months ago
view post
Post
2269
Came across a very nice submission from @marcodsn for the reasoning datasets competition (https://huggingface.co/blog/bespokelabs/reasoning-datasets-competition).

The dataset distils reasoning chains from arXiv research papers in biology and economics. Some nice features of the dataset:

- Extracts both the logical structure AND researcher intuition from academic papers
- Adopts the persona of researchers "before experiments" to capture exploratory thinking
- Provides multi-short and single-long reasoning formats with token budgets - Shows 7.2% improvement on MMLU-Pro Economics when fine-tuning a 3B model

It's created using the Curator framework with plans to scale across more scientific domains and incorporate multi-modal reasoning with charts and mathematics.

I personally am very excited about datasets like this, which involve creativity in their creation and don't just rely on $$$ to produce a big dataset with little novelty.

Dataset can be found here: marcodsn/academic-chains (give it a like!)
davanstrien 
posted an update 3 months ago
view post
Post
1702
I've created a v1 dataset ( davanstrien/reasoning-required) and model ( davanstrien/ModernBERT-based-Reasoning-Required) to help curate "wild text" data for generating reasoning examples beyond the usual code/math/science domains.

- I developed a "Reasoning Required" dataset with a 0-4 scoring system for reasoning complexity
- I used educational content from HuggingFaceFW/fineweb-edu, adding annotations for domains, reasoning types, and example questions

My approach enables a more efficient workflow: filter text with small models first, then use LLMs only on high-value content.

This significantly reduces computation costs while expanding reasoning dataset domain coverage.
mrfakename 
posted an update 3 months ago
view post
Post
2929
Papla P1 from Papla Media is now available on the TTS Arena!

Try out Papla's new ultra-realistic TTS model + compare it with other leading models on the TTS Arena: TTS-AGI/TTS-Arena
awacke1 
posted an update 3 months ago
view post
Post
1999
AI Vision & SFT Titans 🌟 Turns PDFs into text, snaps pics, and births AI art.

https://huggingface.co/spaces/awacke1/TorchTransformers-Diffusion-CV-SFT

1. OCR a grocery list or train a titan while sipping coffee? ☕
2. Camera Snap 📷: Capture life’s chaos—your cat’s face or that weird receipt. Proof you’re a spy!
3. OCR 🔍: PDFs beg for mercy as GPT-4o extracts text.
4. Image Gen 🎨: Prompt “neon superhero me”
5. PDF 📄: Double-page OCR Single-page sniping

Build Titans 🌱: Train tiny AI models. 💪Characters🧑‍🎨: Craft quirky heroes.
🎥

chansung 
posted an update 4 months ago
view post
Post
3909
simple guide on the recipe for GRPO on Open-R1 which is built on top of TRL

I think FastAPI wrapper of vLLM with WeightSyncWorker is pretty cool feature. Also, we have many predefined reward functions out of the box!
·
emre 
posted an update 4 months ago
view post
Post
3542
having trouble with auto train
hello there this is the first time i am testing auto train with a 1.8k SFT dataset. Howevery i am not quite sure the training is going smooth. Logs seem quite confusing, token did not match can not auth, generates confusing train splits, do you know how i can check my running job properly?
what is being used for training as data?
any ideas?
  • 1 reply
·