
Bluesky Community
AI & ML interests: Tools for Bluesky 🦋
bluesky-community's activity
Post
It's just become easier to share your apps on the biggest AI app store (aka Hugging Face Spaces) for unlimited storage, more visibility, and community interactions.
Just pick a React, Svelte, or Vue template when you create your space, or add app_build_command: npm run build and app_file: build/index.html in your README's YAML block.
Or follow this link: https://huggingface.co/new-space?sdk=static
Let's build!
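For reference, here is what that README front matter could look like as a block; app_build_command, app_file, and sdk: static come straight from the post and link above, while the title field is a hypothetical placeholder:

```yaml
---
title: My Static App          # hypothetical placeholder metadata
sdk: static                   # the static SDK from the link above
app_build_command: npm run build
app_file: build/index.html
---
```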

cfahlgren1 posted an update 12 days ago
Post
Yesterday, we dropped a new conversational viewer for datasets on the Hub! 💬
Actually being able to view and inspect your data is extremely important. This is a big step in making data more accessible and actionable for everyone.
Here are some datasets you can try it out on:
• mlabonne/FineTome-100k
• Salesforce/APIGen-MT-5k
• open-thoughts/OpenThoughts2-1M
• allenai/tulu-3-sft-mixture
Any other good ones?
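If you also want to inspect one of these outside the viewer, a minimal sketch with the datasets library (the "train" split is an assumption; check each dataset card):

```python
from datasets import load_dataset

# Load one of the conversational datasets listed above
ds = load_dataset("mlabonne/FineTome-100k", split="train")
print(ds)     # row count and column names
print(ds[0])  # inspect the first conversation record
```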
Post

Check if there's a LeRobot hackathon in your city here: LeRobot-worldwide-hackathon/worldwide-map
Post
The meta-llama org just crossed 40,000 followers on Hugging Face. Grateful for all their impact on the field, sharing the Llama weights openly and much more!
We need more of this from other big tech companies to make AI more open, collaborative, and beneficial to all!

davanstrien posted an update about 1 month ago
Post
Came across a very nice submission from @marcodsn for the reasoning datasets competition (https://huggingface.co/blog/bespokelabs/reasoning-datasets-competition).
The dataset distils reasoning chains from arXiv research papers in biology and economics. Some nice features of the dataset:
- Extracts both the logical structure AND researcher intuition from academic papers
- Adopts the persona of researchers "before experiments" to capture exploratory thinking
- Provides multi-short and single-long reasoning formats with token budgets
- Shows a 7.2% improvement on MMLU-Pro Economics when fine-tuning a 3B model
It's created using the Curator framework with plans to scale across more scientific domains and incorporate multi-modal reasoning with charts and mathematics.
I personally am very excited about datasets like this, which involve creativity in their creation and don't just rely on $$$ to produce a big dataset with little novelty.
Dataset can be found here: marcodsn/academic-chains (give it a like!)
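A quick way to peek at the dataset locally (the "train" split and field layout are assumptions; check the dataset card for the actual configuration):

```python
from datasets import load_dataset

chains = load_dataset("marcodsn/academic-chains", split="train")
print(chains.column_names)  # find the fields holding the reasoning formats
print(chains[0])            # one distilled reasoning chain
```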
Post
Energy is a massive constraint for AI, but do you even know how much energy your ChatGPT convos are using?
We're trying to change this by releasing ChatUI-energy, the first interface where you see in real time what energy your AI conversations consume. Great work from @jdelavande, powered by Spaces & TGI, available for a dozen open-source models like Llama, Mistral, Qwen, Gemma, and more.
jdelavande/chat-ui-energy
Should all chat interfaces have this? Just like ingredients have to be shown on products you buy, we need more transparency in AI for users!
Post
Just crossed half a million public apps on Hugging Face. A new public app is created every minute these days 🤯🤯🤯
What's your favorite? http://hf.co/spaces

davanstrien posted an update about 2 months ago
Post
I've created a v1 dataset (davanstrien/reasoning-required) and model (davanstrien/ModernBERT-based-Reasoning-Required) to help curate "wild text" data for generating reasoning examples beyond the usual code/math/science domains.
- I developed a "Reasoning Required" dataset with a 0-4 scoring system for reasoning complexity
- I used educational content from HuggingFaceFW/fineweb-edu, adding annotations for domains, reasoning types, and example questions
My approach enables a more efficient workflow: filter text with small models first, then use LLMs only on high-value content.
This significantly reduces computation costs while expanding reasoning dataset domain coverage.
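A minimal sketch of that filtering step, assuming the checkpoint loads as a standard text-classification model (the pipeline task and score scale are assumptions; check the model card):

```python
from transformers import pipeline

# Assumption: the checkpoint exposes a standard sequence-classification head.
reasoning_scorer = pipeline(
    "text-classification",
    model="davanstrien/ModernBERT-based-Reasoning-Required",
)

candidates = [
    "The capital of France is Paris.",
    "To estimate elasticity, compare how demand shifts as prices change across markets.",
]
for text in candidates:
    # Route only high-scoring texts to the expensive LLM generation step
    print(text, "->", reasoning_scorer(text))
```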

BrigitteTousi posted an update about 2 months ago
Post
AI agents are transforming how we interact with technology, but how sustainable are they? 🌍
Design choices — like model size and structure — can massively impact energy use and cost. ⚡💰 The key takeaway: smaller, task-specific models can be far more efficient than large, general-purpose ones.
🔑 Open-source models offer greater transparency, allowing us to track energy consumption and make more informed decisions on deployment. 🌱 Open-source = more efficient, eco-friendly, and accountable AI.
Read our latest, led by @sasha with assists from myself + @yjernite 🤗
https://huggingface.co/blog/sasha/ai-agent-sustainability
Post
Llama 4 is in transformers!
Fun example using the instruction-tuned Maverick model responding about two images, using tensor parallel for maximum speed.
From https://huggingface.co/blog/llama4-release
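The code itself isn't reproduced in this feed, but here is a minimal sketch in the same spirit, assuming a recent transformers release with Llama 4 support and tensor-parallel loading via tp_plan (the class name, checkpoint id, and image URLs below are assumptions or placeholders; see the linked blog for the authoritative version):

```python
import torch
from transformers import AutoProcessor, Llama4ForConditionalGeneration

model_id = "meta-llama/Llama-4-Maverick-17B-128E-Instruct"  # assumed checkpoint id

processor = AutoProcessor.from_pretrained(model_id)
model = Llama4ForConditionalGeneration.from_pretrained(
    model_id,
    tp_plan="auto",              # tensor parallel across available GPUs
    torch_dtype=torch.bfloat16,
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/first.jpg"},   # placeholder URLs
        {"type": "image", "url": "https://example.com/second.jpg"},
        {"type": "text", "text": "What do these two images have in common?"},
    ],
}]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens
print(processor.batch_decode(out[:, inputs["input_ids"].shape[-1]:]))
```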
Post
Llama models (arguably the most successful open AI models of all time) represented just 3% of total model downloads on Hugging Face in March.
People and media like winner-takes-all stories of one model/company to rule them all, but the reality is much more nuanced than that!
Kudos to all the small AI builders out there!
Post
Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible—just look at the “T” in ChatGPT, which comes from the Transformer architecture openly shared by Google.
Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.
With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization—powered by openness and collaboration, in the US and around the world.
This is incredibly exciting. Let’s go, open science and open-source AI!