It's been a bit since I took a step back and looked at the xet-team's progress migrating Hugging Face from Git LFS to Xet, but every time I do, it boggles the mind.
A month ago there were 5,500 users/orgs on Xet with 150K repos and 4PB. Today? 🤗 700,000 users/orgs 📈 350,000 repos 🚀 15PB
Meanwhile, our migrations have pushed throughput to numbers that are bonkers. In June, we hit upload speeds of 577 Gb/s (crossing 500 Gb/s for the first time).
These are hard numbers to put into context, but let's try:
The latest run of the Common Crawl from commoncrawl was 471 TB.
We now have ~32 crawls stored in Xet. At peak upload speed we could move the latest crawl into Xet in about two hours.
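For the curious, here's the back-of-the-envelope math behind that estimate - a sketch assuming we could sustain the June peak for the whole transfer:

```python
# Rough transfer-time estimate: latest Common Crawl run at peak Xet upload speed.
crawl_tb = 471      # latest Common Crawl run, in terabytes
peak_gbps = 577     # peak upload throughput, in gigabits per second

total_gigabits = crawl_tb * 1000 * 8          # TB -> GB -> gigabits
hours = total_gigabits / peak_gbps / 3600     # seconds -> hours
print(f"~{hours:.1f} hours")                  # ~1.8 hours
```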
We're moving to a new phase in the process, so stay tuned.
This shift in gears means it's also time to roll up our sleeves and look at all the bytes we have and the value we're adding to the community.
I already have some homework from @RichardErkhov to look at the dedupe across their uploads, and I'll be doing the same for other early adopters, big models/datasets, and frequent uploaders (looking at you @bartowski 👀)
Let me know if there's anything you're interested in; happy to dig in!
Open Source Avengers, Assemble! Ask an expert AI agent team to solve complex problems together 🔥
Consilium brings together multiple agents that debate and use live research (web, arXiv, SEC) to reach a consensus. You set the strategy, they find the answer.
Note: some things like persistent models/storage/custom LoRAs might not fully work out of the box. If you need those, you may have to dig into the Wan2GP codebase and see how to tweak the storage folder. Happy hacking!
With major model families like Qwen and all of Llama from meta-llama on Xet, the time is right for new users and organizations to say goodbye to LFS on the Hub.
Xet is now the default storage for new AI builders 🚀 🚀 🚀
Just sign up for an account, create a new model or dataset, pip install huggingface_hub and you're off to the races!
And for everyone with existing repositories, just sign up here https://huggingface.co/join/xet - we'll migrate all existing repositories to Xet and all new repos you create will be Xet-backed by default.
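For anyone new to the Hub, here's a minimal sketch of that flow (the repo id and file name are placeholders; with a Xet-backed account the upload is chunked and deduplicated transparently):

```python
from huggingface_hub import HfApi

api = HfApi()  # picks up the token from `huggingface-cli login`

# Create a repo and push a file - no Xet-specific code needed.
api.create_repo("your-username/my-model", repo_type="model", exist_ok=True)
api.upload_file(
    path_or_fileobj="model.safetensors",   # placeholder local file
    path_in_repo="model.safetensors",
    repo_id="your-username/my-model",
)
```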
Heyo @RichardErkhov the xet-team at Hugging Face was wondering if you wanted to join the fun and jump over to Xet storage. 🤗
We've been onboarding folks (https://huggingface.co/blog/xet-on-the-hub), we know the backend can scale (Llama 4 and Qwen 3 are on Xet) and that it's great for working with quants (see xet-team/quantization-dedup), and we're pushing to invite impactful orgs and users on the Hub. You fit the bill.
We'd love to onboard you, get some feedback, and create some excitement π
The steps are pretty straightforward - join the waitlist at hf.co/join/xet and we'll take care of the rest.
The system is fully backward compatible, so you shouldn't notice a thing. BUT to get the best experience when uploading/downloading, make sure you have hf_xet installed alongside the latest huggingface_hub.
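Concretely, that setup is just two packages from PyPI; a quick sketch (the repo below is an arbitrary public one, used only for illustration):

```python
# pip install -U huggingface_hub hf_xet
import hf_xet  # fails with ImportError if the Xet backend isn't installed
from huggingface_hub import hf_hub_download

# Downloads work exactly as before; Xet-backed repos simply transfer faster.
path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")
print(path)
```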
At xet-team we've been hard at work bringing a new generation of storage to the Hugging Face community, and we've crossed some major milestones:
🏗️ Over 2,000 builders and nearing 100 organizations with access to Xet
📈 Over 70,000 model and dataset repositories are Xet-backed
🤯 1.4 petabytes managed by Xet
As we move repos from LFS to Xet for everyone we onboard, we're pushing our content-addressed store (CAS). Check out the chart below 👇 of CAS hitting up to 150 Gb/s throughput this past week.
All of this growth is helping us build richer insights. We expanded our repo graph, which maps how Xet-backed repositories on the Hub share bytes with each other.
Check out the current network in the image below (nodes are repositories, edges are where repos share bytes) and visit the space to see how different versions of Qwen, Llama, and Phi models are grouped together xet-team/repo-graph
As xet-team infrastructure begins backing hundreds of repositories on the Hugging Face Hub, we're getting to put on our researcher hats and peer into the bytes. 🔍 🤓
IMO, one of the most interesting ideas Xet storage introduces is a globally shared store of data.
When you upload a file through Xet, the contents are split into ~64KB chunks and deduplicated, but what if those same chunks already exist in another repo on the Hub?
When they do, we store them only once, so different repositories share the bytes we store. That opens up something cool - we can draw a graph of which repos actually share data at the chunk level, where:
- Nodes = repositories
- Edges = shared chunks
- Edge thickness = how much they overlap
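To make the chunk-level idea concrete, here's a toy sketch of content-defined chunking and overlap counting. It is not Xet's actual chunker (which uses a production-grade rolling hash), just an illustration of why files that share long byte runs end up sharing chunks:

```python
import hashlib

def chunk(data: bytes, mask: int = 0xFFFF, min_size: int = 2048) -> list[bytes]:
    """Cut chunks where a toy rolling hash hits a boundary pattern.

    With mask=0xFFFF a boundary fires roughly once per 2^16 bytes,
    giving ~64KB chunks on average.
    """
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF      # toy rolling hash
        if (h & mask) == mask and i - start >= min_size:
            chunks.append(data[start : i + 1])
            start, h = i + 1, 0
    chunks.append(data[start:])
    return chunks

def chunk_hashes(data: bytes) -> set[str]:
    return {hashlib.sha256(c).hexdigest() for c in chunk(data)}

# Shared chunk hashes between two files ~ an edge weight in the repo graph.
# (model_a.bin / model_b.bin are hypothetical local files.)
a = open("model_a.bin", "rb").read()
b = open("model_b.bin", "rb").read()
print(len(chunk_hashes(a) & chunk_hashes(b)), "shared chunks")
```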
Come find the many BERT islands. Or see how datasets relate in practice, not just in theory. See how libraries or tasks can tie repositories together. You can play around with node size using storage/likes/downloads too.
The result is a super fun visualization from @saba9 and @znation that I've already lost way too much time to. I'm excited to see how the networks grow as we add more repositories!
What does it mean when models share the same bytes?
We've investigated some quants and seen that quantizations of the same model often share a substantial portion of their bytes - deduplicating them can save quantizers on the Hub considerable upload time.
Since going into production, the xet-team has migrated hundreds of repositories on the Hub to our storage layer, including classic "pre-Hub" open-source models like FacebookAI/xlm-roberta-large (XLM-R) from FacebookAI.
XLM-R, introduced in 2019, set new benchmarks for multilingual NLP by learning shared representations across 100 languages. It was then fine-tuned on English, Spanish, Dutch, and German, producing a language-specific model for each - check out the paper: Unsupervised Cross-lingual Representation Learning at Scale (arXiv:1911.02116)
These finetunes share much of the same architecture and layout as XLM-R with similar training methods and goals. It makes sense that they would share bytes, but it's still fascinating to see.
We put together a similar space to explore these models to see where they overlap - check it out for yourself xet-team/finetune-dedupe
The darker each block in the heatmap, the more bytes are shared. Clicking on a repo's blocks shows all other repos that share blocks.
It's been a wild few days, and especially 🤯 to see every tensor file with a Xet logo next to it instead of LFS.
The attached graph shows requests per second to our content-addressed store (CAS) right as the release went live.
yellow = GETs; dashed line = launch time.
You can definitely tell when the community started downloading 👀
h/t to @rajatarya for the graph, the entire Xet crew for bringing us to this point, and special shoutout to Rajat, @port8080, @brianronan, @seanses, and @znation who made sure the bytes kept flying all weekend ⚡️
Huge week for xet-team as Llama 4 is the first major model on Hugging Face uploaded with Xet providing the backing! Every byte downloaded comes through our infrastructure.
Using Xet on Hugging Face is the fastest way to download and iterate on open source models, and we've proved it with Llama 4: a boost of ~25% across all models.
We expect builders on the Hub to see even more improvements, helping power innovation across the community.
With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family. On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. The attached image shows a few selected models and how they perform on Xet.
Thanks to the meta-llama team for launching on Xet!
If you've been following along with the Xet Team's (xet-team) work, you know we've been working to migrate the Hugging Face Hub from Git LFS to Xet.
Recently, we launched a waitlist to join the movement to Xet (join here! https://huggingface.co/join/xet ) but getting to this point was a journey.
From the initial proof of concept in August, to launching on the Hub internally, to migrating a set of repositories and routing a small chunk of download traffic on the Hub through our infrastructure - every step of the way has been full of challenges, big and small, and well worth the effort.
Over the past few weeks, with real traffic flowing through our services, we've tackled some truly gnarly issues (unusual upload/download patterns, memory leaks, load imbalances, and more) and resolved each without major disruptions.
If you're curious about how this sliver of Hub infrastructure looked as we routed traffic through it for the first time (and want a deep dive full of Grafana and Kibana charts 🤓), I have a post for you.
Here's an inside look into the day of our first migrations and the weeks following, where we pieced together solutions in real time.
You can apply for yourself or your entire organization. Head over to your account settings for more information, or join from anywhere you see the Xet logo on a repository you know.
Have questions? Join the conversation below 👇 or open a discussion on the Xet team page xet-team/README
It comes complete with a section on open source AI (of obvious interest to the crowd here) and more than one mention of the Hugging Face community 🤗
In my opinion, one of the best parts is that it is a compendium for seminal and cutting-edge AI resources, with nearly 250 arXiv papers cited. I've done my best to collect them all in a single place, organized by chapter and by order in which they appear in the book: jsulz/ai-engineering-67c5abe02c8596b5c089934c
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!
They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.
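As a sketch, installing from one of those tags and loading a model looks like this (the checkpoint id below is illustrative; check the model cards for exact names):

```python
# pip install git+https://github.com/huggingface/transformers@v4.49.0-SigLIP-2
from transformers import AutoModel

model = AutoModel.from_pretrained("google/siglip2-base-patch16-224")
```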
This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).
Starting with SmolVLM-2 & SigLIP-2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.
Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.
Six months after joining Hugging Face, the Xet team is kicking off the first migrations from LFS to our storage for a number of repositories on the Hub.
More on the nitty gritty details behind the migration soon, but here are the big takeaways:
🤗 We've successfully completed the first migrations from LFS -> Xet to test the infrastructure and prepare for a wider release
✅ No action on your part needed - you can work with a Xet-backed repo like any other repo on the Hub (for now - major improvements on their way!)
👀 Keep an eye out for the Xet logo to see if a repo you know is on our infra! See the screenshots below to spot the difference 👇
🚀 Want Early Access? If you're curious and want to test out the bleeding edge that will power the development experience on the Hub, we'd love to partner with you. Let me know!
Toward the end of last year, the Xet team provided an inside look into the foundations of how we plan to enable rapid experimentation and iteration for the AI builders on the Hub: https://huggingface.co/blog/from-files-to-chunks
But it turns out chunks aren't all you need!
Our goal is to bring:
🚀 Faster uploads
⬇️ Speedy downloads
💪 All without sacrificing your workflow
To do that, we need the infrastructure and system design to back it up. As we prepare to roll out the first Xet-backed repositories on the Hub, we wrote up a post explaining the nitty-gritty details of the decisions that bring this to life https://huggingface.co/blog/from-chunks-to-blocks
Complete with an interactive visualization that shows the power of deduplication in action - taking a 191GB repo down to ~97GB and shaving hours off upload times.
The darker each block in the heatmap, the more we dedupe, the less we have to transfer. Clicking on a file's blocks shows all other files that share blocks.
Finally, an open-source AI that turns your lyrics into full songs is here: meet YuE! Unlike other tools that only create short clips, YuE can make entire songs (up to 5 minutes) with vocals, melody, and instruments all working together. Letsss go!