AI & ML interests

Org page for Safetensors: Simple, safe way to store and distribute tensors

Recent Activity

safetensors's activity

Narsil 
posted an update 14 days ago
Performance leap: TGI v3 is out. It processes 3x more tokens and is 13x faster than vLLM on long prompts. Zero config!



3x more tokens

By reducing our memory footprint, we're able to ingest many more tokens, and more dynamically than before. A single L4 (24GB) can handle 30k tokens on llama 3.1-8B, while vLLM barely gets to 10k. A lot of work went into reducing the footprint of the runtime, and its effects are best seen in smaller, constrained environments.
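To get a feel for why memory footprint limits the number of tokens, here is a rough back-of-the-envelope sketch of KV-cache size. It assumes the published Llama 3.1 8B shape (32 layers, 8 key/value heads, head dimension 128) and 16-bit cache values; it is not TGI's own accounting and ignores weights, activations, and runtime overhead.

```python
def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, bytes_per_value: int) -> int:
    """Bytes of KV cache needed for one token (keys and values, all layers)."""
    # Factor 2: one key vector and one value vector per layer and KV head.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# Assumed shape: Llama 3.1 8B with a 16-bit (2-byte) cache.
per_token = kv_cache_bytes_per_token(num_layers=32, num_kv_heads=8,
                                     head_dim=128, bytes_per_value=2)
total_30k = per_token * 30_000

print(per_token)        # bytes per token
print(total_30k / 1e9)  # GB for 30k tokens
```

Under these assumptions each token costs 128 KiB of cache, so 30k tokens need roughly 3.9 GB on top of the model weights, which is why every gigabyte saved in the runtime matters on a 24GB card.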

13x faster

On long prompts (200k+ tokens), conversation replies take 27.5s in vLLM, while they take only 2s in TGI. How so? We keep the initial conversation around, so when a new reply comes in, we can answer almost instantly. The overhead of the lookup is ~5µs. Thanks to @Daniël de Kok for the beast of a data structure.
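The idea of keeping the earlier conversation around can be illustrated with a small prefix lookup. This is only a minimal sketch, assuming a plain trie keyed by token ids; it is not TGI's actual data structure, and `PrefixCache` and `cache_handle` are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class _Node:
    children: Dict[int, "_Node"] = field(default_factory=dict)
    # Hypothetical handle to cached attention state for the prefix ending here.
    cache_handle: Optional[str] = None


class PrefixCache:
    """Maps token-id prefixes to handles of already computed state."""

    def __init__(self) -> None:
        self._root = _Node()

    def insert(self, tokens: List[int], cache_handle: str) -> None:
        node = self._root
        for tok in tokens:
            node = node.children.setdefault(tok, _Node())
        node.cache_handle = cache_handle

    def longest_prefix(self, tokens: List[int]) -> Tuple[int, Optional[str]]:
        """Return (matched length, handle) for the longest cached prefix."""
        node = self._root
        best_len, best_handle = 0, None
        for i, tok in enumerate(tokens, start=1):
            node = node.children.get(tok)
            if node is None:
                break
            if node.cache_handle is not None:
                best_len, best_handle = i, node.cache_handle
        return best_len, best_handle


cache = PrefixCache()
conversation = [1, 15, 27, 99]           # tokens of the conversation so far
cache.insert(conversation, "kv-block-0")

follow_up = conversation + [42, 7]       # same conversation plus a new message
matched, handle = cache.longest_prefix(follow_up)
tokens_to_compute = follow_up[matched:]  # only the new tokens need processing
```

With a lookup like this, a follow-up message only pays for its new tokens, while the long shared prefix is reused.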

Zero config

That's it. Remove all the flags you are using and you're likely to get the best performance. By evaluating the hardware and model, TGI automatically selects the values that give the best performance. In production, we don't have any flags in our deployments anymore. We kept all existing flags, since they may come in handy in niche scenarios.

Read more: https://huggingface.co/docs/text-generation-inference/conceptual/chunking
julien-c 
posted an update 15 days ago
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free and (barring blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We continuously optimize our infrastructure to scale our storage for the coming years of growth in machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
julien-c 
posted an update 26 days ago
wow 😮

INTELLECT-1 is the first 10-billion-parameter language model collaboratively trained from scratch, on 1 trillion tokens of English text and code.

PrimeIntellect/INTELLECT-1-Instruct
victor 
posted an update 27 days ago
Qwen/QwQ-32B-Preview shows us the future (and it's going to be exciting)...

I tested it against some really challenging reasoning prompts and the results are amazing 🤯.

Check this dataset for the results: victor/qwq-misguided-attention
victor 
posted an update about 1 month ago
Perfect example of why Qwen/Qwen2.5-Coder-32B-Instruct is insane?

Introducing: AI Video Composer 🔥
huggingface-projects/ai-video-composer

Drag and drop your assets (images/videos/audios) to create any video you want using natural language!

It works by asking the model to output a valid FFmpeg command. This can be quite complex, but most of the time Qwen2.5-Coder-32B gets it right (that thing is a beast). It's an update of an old project made with GPT-4; back then (~1.5 years ago) it was almost impossible to make it work with open models, but not anymore. Let's go open weights 🚀.
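Since the model's output is then executed, it helps to check it first. The sketch below shows one possible way to do that; it is not the Space's actual code, and `parse_ffmpeg_command` is a hypothetical helper written for this example.

```python
import shlex
from typing import List

# Tokens that only make sense when a shell interprets the command.
_SHELL_OPERATORS = {";", "&&", "||", "|", ">", "<", "&"}


def parse_ffmpeg_command(model_output: str) -> List[str]:
    """Split model text into an argv list, accepting only a single ffmpeg call."""
    argv = shlex.split(model_output.strip())
    if not argv or argv[0] != "ffmpeg":
        raise ValueError("expected a command starting with 'ffmpeg'")
    if any(arg in _SHELL_OPERATORS for arg in argv):
        raise ValueError("shell operators are not allowed")
    return argv


argv = parse_ffmpeg_command("ffmpeg -i in.mp4 -vf scale=1280:-2 out.mp4")
print(argv)

# To actually run it (requires ffmpeg to be installed), pass the list
# directly so that no shell ever interprets the text:
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing a list rather than a string avoids shell injection, but the command can still read and write arbitrary paths, so running it in a sandboxed working directory is a sensible extra precaution.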
victor 
posted an update about 1 month ago
Qwen2.5-72B is now the default HuggingChat model.
This model is so good that you must try it! I often get better results on rephrasing with it than with Sonnet or GPT-4!
victor 
posted an update 2 months ago
victor 
posted an update 3 months ago
NEW - Inference Playground

Maybe, like me, you have always wanted a super easy way to compare llama3.2-1B vs. llama3.2-3B, or the same model with different temperatures?

Trying and comparing warm Inference API models has never been easier!
Just go to https://hf.co/playground, set your token and you're ready to go.
We'll keep improving, feedback welcome 😊
victor 
posted an update 4 months ago
🙋 Calling all Hugging Face users! We want to hear from YOU!

What feature or improvement would make the biggest impact on Hugging Face?

Whether it's the Hub, better documentation, new integrations, or something completely different – we're all ears!

Your feedback shapes the future of Hugging Face. Drop your ideas in the comments below! 👇
victor 
posted an update 4 months ago
How good are you at spotting AI-generated images?

Find out by playing Fake Insects 🐞, a game where you need to identify which insects are fake (AI-generated). Good luck, and share your best score in the comments!

victor/fake-insects
victor 
posted an update 5 months ago
Activity of famous organisations on Hugging Face. Guess which one has the word "Open" in it 😂
victor 
posted an update 6 months ago
victor 
posted an update 6 months ago
Together MoA is a really interesting approach based on open source models!

"We introduce Mixture of Agents (MoA), an approach to harness the collective strengths of multiple LLMs to improve state-of-the-art quality. And we provide a reference implementation, Together MoA, which leverages several open-source LLM agents to achieve a score of 65.1% on AlpacaEval 2.0, surpassing prior leader GPT-4o (57.5%)."
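As a rough illustration of the idea in the quote, here is a minimal sketch with a single layer of proposer models followed by one aggregator. It is not the Together MoA reference implementation (which may stack several layers); the `Model` type, the prompt wording, and the stub models are all assumptions made for the example.

```python
from typing import Callable, List

# Any callable that maps a prompt to a completion (an API call in practice).
Model = Callable[[str], str]


def mixture_of_agents(prompt: str, proposers: List[Model], aggregator: Model) -> str:
    """Ask every proposer, then let the aggregator synthesize one answer."""
    proposals = [propose(prompt) for propose in proposers]
    numbered = "\n".join(f"{i}. {p}" for i, p in enumerate(proposals, start=1))
    aggregate_prompt = (
        "Synthesize the candidate responses below into one answer.\n"
        f"Question: {prompt}\n"
        f"Candidates:\n{numbered}"
    )
    return aggregator(aggregate_prompt)


# Stub models standing in for real LLM calls.
seen_prompts: List[str] = []


def stub_aggregator(aggregate_prompt: str) -> str:
    seen_prompts.append(aggregate_prompt)
    return "Paris"


answer = mixture_of_agents(
    "What is the capital of France?",
    proposers=[lambda q: "Paris", lambda q: "It is Paris, France."],
    aggregator=stub_aggregator,
)
print(answer)
```

Swapping the stubs for real model clients keeps the same structure: the aggregator always sees the original question plus every candidate answer.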

Read more here: https://www.together.ai/blog/together-moa

PS: they provide some demo code (https://github.com/togethercomputer/MoA/blob/main/bot.py). If someone releases a Space for it, it could go 🚀
victor 
posted an update 7 months ago
victor 
posted an update 7 months ago
> We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024). (90.9% vs 90.2%).

@Bin12345 interested in a ZeroGPU Space for Bin12345/AutoCoder?
victor 
posted an update 7 months ago
✨ Tools are now available in HuggingChat (https://hf.co/chat)

In short, Tools let HuggingChat plug in any ZeroGPU Space as a tool it can use, offering limitless possibilities.

For the release, we plugged in 6 tools that you can use right now on command-R+, and we plan to expand to more models.

We'll also allow you to add your own tools (any ZeroGPU space is compatible). For more info check out this discussion: huggingchat/chat-ui#470

Kudos to @nsarrazin @Saghen and @mishig for the release <3
julien-c 
posted an update 7 months ago
Hey it was good meeting you yesterday @MaziyarPanahi 🔥

thanks @mishig for setting this up

Let's make the Hub as useful as possible for the community ❤️
Narsil 
posted an update 7 months ago
Narsil 
posted an update 8 months ago
victor 
posted an update 8 months ago
The hype is real: a mysterious gpt2-chatbot model has appeared on the LLM Arena Leaderboard 👀.
It seems to be at least on par with the top performing models (closed and open).

To try it out: https://chat.lmsys.org/ -> then click on the Direct Chat tab and select gpt2-chatbot.

Place your bets: what do you think it is?