Rajat Arya
rajatarya
34 followers · 7 following
https://rajatarya.com
AI & ML interests
None yet
Recent Activity
upvoted an article about 8 hours ago
SmolLM3: smol, multilingual, long-context reasoner
reacted to jsulz's post with 🔥 12 days ago
It's been a bit since I took a step back and looked at https://huggingface.co/xet-team progress to migrate Hugging Face from Git LFS to Xet, but every time I do it boggles the mind.

A month ago there were 5,500 users/orgs on Xet with 150K repos and 4PB. Today?
🤗 700,000 users/orgs
350,000 repos
15PB

Meanwhile, our migrations have pushed throughput to numbers that are bonkers. In June, we hit upload speeds of 577Gb/s (crossing 500Gb/s for the first time).

These are hard numbers to put into context, but let's try:

The latest run of the Common Crawl from https://huggingface.co/commoncrawl was 471 TB.

We now have ~32 crawls stored in Xet. At peak upload speed we could move the latest crawl into Xet in about two hours.

We're moving to a new phase in the process, so stay tuned.

This shift in gears means it's also time to roll up our sleeves and look at all the bytes we have and the value we're adding to the community.

I already have some homework from @RichardErkhov to look at the dedupe across their uploads, and I'll be doing the same for other early adopters, big models/datasets, and frequent uploaders (looking at you @bartowski).

Let me know if there's anything you're interested in; happy to dig in!
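The "about two hours" figure in the post can be sanity-checked with a few lines of arithmetic. The sketch below assumes that "Gb/s" means gigabits per second, that "TB" is a decimal terabyte (10^12 bytes), and that the peak rate is sustained for the whole transfer; none of these assumptions is stated in the post, and the function name is purely illustrative.

```python
def transfer_time_hours(size_terabytes: float, rate_gigabits_per_s: float) -> float:
    """Hours needed to move `size_terabytes` at `rate_gigabits_per_s`.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Gb = 1e9 bits) and a
    constant transfer rate; both are assumptions, not facts from the post.
    """
    size_bits = size_terabytes * 1e12 * 8        # TB -> bytes -> bits
    rate_bits_per_s = rate_gigabits_per_s * 1e9  # Gb/s -> bits/s
    return size_bits / rate_bits_per_s / 3600    # seconds -> hours


# Figures quoted in the post: a 471 TB crawl at a peak of 577 Gb/s.
hours = transfer_time_hours(471, 577)
print(f"{hours:.2f} hours")
```

Under those assumptions the result is a little over 1.8 hours, which is consistent with the post's rounding to "about two hours".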
reacted to jsulz's post 12 days ago
Organizations
rajatarya's activity
New activity in xet-team/README (about 2 months ago)
Xet Storage Not Deduplicating for Even Simple Binary Files
8
#3 opened about 2 months ago by lyk

New activity in meta-llama/Llama-4-Maverick-17B-128E-Instruct (3 months ago)
[request for feedback] faster downloads with xet
5
#18 opened 3 months ago by clem

New activity in meta-llama/Llama-4-Scout-17B-16E-Instruct (3 months ago)
[request for feedback] Faster downloads with Xet
🔥 12
18
#16 opened 3 months ago by clem