Omar Sanseviero

osanseviero

AI & ML interests

Llamas, model merging, massive ASR for data collection, 3D ML, on-device ML, quantization, model judging, ML in browser, healthcare applications, education, intersection of art and ML.🦙

Recent Activity

liked a model about 11 hours ago
deepseek-ai/DeepSeek-V3-Base
liked a Space 1 day ago
Qwen/QVQ-72B-preview
liked a model 1 day ago
Qwen/QVQ-72B-Preview

Organizations

Google, Notebooks-explorers, scikit-learn-examples, BigScience Workshop, Neuropark, Spaces-explorers, Flax Community, Templates, Gensim, NLP en ES, Whisper Fine-Tuning Event, Keras, Hackathon Somos NLP 2023: Los LLMs hablan Español, Training Transformers Together, Spaces Examples, I Hackathon Somos NLP: PLN en Español, fast.ai community, SomosNLP, HugGAN Community, Gradio-Themes-Party, University of Groningen Workshop, AI Guru, Huggingface.js, Gradio-Blocks-Party, Data Days Zurich, Webhooks Explorers (BETA), JAX ♥️ Diffusers 🧨, Team 7, Open-Source AI Meetup, EuroPython 2022, fastai X Hugging Face Group 2022, ICML 2022, Language Tools, Platzi Community, Keras Dreambooth Event, Active Learning Example, CompVis Community, Stable Diffusion concepts library, DeepFloyd, Stable Diffusion Dreambooth Concepts Library, Whispering GPT, Open Generative AI, OpenShape, LocalCodeLLMs, Hugging Face Extreme-Scale, Hugging Face H4 Community, Blog-explorers, UniverseTBD, Hands-On Generative AI with Transformers and Diffusion Models, Editing Images, Hacktoberfest 2023, ICCV2023, huggingPartyParis, ZeroGPU Explorers, Editing Audio, T5 community, BERT community, gg-hf, Llamas, MLX Community, TTS AGI, Social Post Explorers, Kato, La Leaderboard, Dev Mode Explorers, Chinese LLMs on Hugging Face, Paris AI Running Club, gg-tt, ONNX Community, Distillation Hugs, Hugging Face Discord Community, Hugging Face Party @ PyTorch Conference, dummyosan, open/ acc, Data Is Better Together Contributor

osanseviero's activity

reacted to merve's post with ❤️ 18 days ago
This week in open-source AI was insane 🤠 A small recap🕺🏻 merve/dec-6-releases-67545caebe9fc4776faac0a3

Multimodal 🖼️
> Google shipped PaliGemma 2, a new iteration of PaliGemma with more sizes: 3B, 10B and 28B, with pre-trained and captioning variants 👏
> OpenGVLab released InternVL2.5, seven new vision LMs in different sizes, with a state-of-the-art checkpoint under the MIT license ✨
> Qwen team at Alibaba released the base models of Qwen2VL models with 2B, 7B and 72B ckpts

LLMs 💬
> Meta released a new iteration of Llama 70B: Llama 3.3 70B, trained further
> EuroLLM-9B-Instruct is a new multilingual LLM for European languages with Apache 2.0 license 🔥
> Dataset: CohereForAI released GlobalMMLU, a multilingual version of MMLU covering 42 languages, under the Apache 2.0 license
> Dataset: QwQ-LongCoT-130K is a new dataset to train reasoning models
> Dataset: FineWeb2 just landed with multilinguality update! 🔥 nearly 8TB pretraining data in many languages!

Image/Video Generation 🖼️
> Tencent released HunyuanVideo, a new photorealistic video generation model
> OminiControl is a new editing/control framework for image generation models like Flux

Audio 🔊
> Indic-Parler-TTS is a new text-to-speech model made by the community
reacted to ariG23498's post with 🚀 29 days ago
reacted to Xenova's post with 🔥 about 1 month ago
Have you tried out 🤗 Transformers.js v3? Here are the new features:
⚡ WebGPU support (up to 100x faster than WASM)
🔢 New quantization formats (dtypes)
🏛 120 supported architectures in total
📂 25 new example projects and templates
🤖 Over 1200 pre-converted models
🌐 Node.js (ESM + CJS), Deno, and Bun compatibility
🏡 A new home on GitHub and NPM

Get started with npm i @huggingface/transformers.

Learn more in our blog post: https://huggingface.co/blog/transformersjs-v3
reacted to maxiw's post with 🤗🚀👍🔥❤️ about 1 month ago
I was curious to see what people post here on HF so I created a dataset with all HF Posts: maxiw/hf-posts

Some interesting stats:

Top 5 Authors by Total Impressions:
-----------------------------------
@merve : 171,783 impressions (68 posts)
@fdaudens : 135,253 impressions (81 posts)
@singhsidhukuldeep : 122,591 impressions (81 posts)
@akhaliq : 119,526 impressions (78 posts)
@MonsterMMORPG : 112,500 impressions (45 posts)

Top 5 Users by Number of Reactions Given:
----------------------------------------
@osanseviero : 1278 reactions
@clem : 910 reactions
@John6666 : 899 reactions
@victor : 674 reactions
@samusenps : 655 reactions

Top 5 Most Used Reactions:
-------------------------
❤️: 7048 times
🔥: 5921 times
👍: 4856 times
🚀: 2549 times
🤗: 2065 times
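
Rankings like the ones above come down to a group-by and a count. A minimal sketch in plain Python, assuming each post is a record with `author`, `impressions`, and `reactions` fields; these field names and the sample records are hypothetical and may not match the actual schema of maxiw/hf-posts:

```python
from collections import Counter, defaultdict

# Hypothetical sample records; the real dataset's columns may be named differently.
posts = [
    {"author": "alice", "impressions": 1200, "reactions": ["🔥", "🚀"]},
    {"author": "bob", "impressions": 800, "reactions": ["🔥"]},
    {"author": "alice", "impressions": 300, "reactions": ["👍", "🔥"]},
    {"author": "carol", "impressions": 1600, "reactions": []},
]


def top_authors_by_impressions(records, n=5):
    """Return (author, total_impressions, post_count) tuples, highest total first."""
    totals = defaultdict(int)
    post_counts = Counter()
    for record in records:
        totals[record["author"]] += record["impressions"]
        post_counts[record["author"]] += 1
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(author, total, post_counts[author]) for author, total in ranked[:n]]


def most_used_reactions(records, n=5):
    """Return (reaction, count) tuples for the n most frequent reactions."""
    counter = Counter(r for record in records for r in record["reactions"])
    return counter.most_common(n)


for author, total, count in top_authors_by_impressions(posts):
    print(f"@{author} : {total:,} impressions ({count} posts)")
```

The same two functions would apply to the full dataset once its rows are mapped onto these assumed fields.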
reacted to tomaarsen's post with 🚀🔥 3 months ago
📣 Sentence Transformers v3.2.0 is out, marking the biggest release for inference in 2 years! 2 new backends for embedding models: ONNX (+ optimization & quantization) and OpenVINO, allowing for speedups up to 2x-3x AND Static Embeddings for 500x speedups at 10-20% accuracy cost.

1️⃣ ONNX Backend: This backend uses the ONNX Runtime to accelerate model inference on both CPU and GPU, reaching up to 1.4x-3x speedup depending on the precision. We also introduce 2 helper methods for optimizing and quantizing models for (much) faster inference.
2️⃣ OpenVINO Backend: This backend uses Intel's OpenVINO instead, outperforming ONNX in some situations on CPU.

Usage is as simple as SentenceTransformer("all-MiniLM-L6-v2", backend="onnx"). Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be autoexported for you. Thank me later 😉

🔒 Another major new feature is Static Embeddings: think word embeddings like GloVe and word2vec, but modernized. Static Embeddings are bags of token embeddings that are summed together to create text embeddings, allowing for lightning-fast embeddings that don't require any neural networks. They're initialized in one of 2 ways:

1️⃣ via Model2Vec, a new technique for distilling any Sentence Transformer models into static embeddings. Either via a pre-distilled model with from_model2vec or with from_distillation where you do the distillation yourself. It'll only take 5 seconds on GPU & 2 minutes on CPU, no dataset needed.
2️⃣ Random initialization. This requires finetuning, but finetuning is extremely quick (e.g. I trained with 3 million pairs in 7 minutes). My final model was 6.6% worse than bge-base-en-v1.5, but 500x faster on CPU.
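
To make the "bag of token embeddings summed together" idea concrete, here is a toy sketch in plain Python. It is not the Sentence Transformers implementation: the vocabulary, the 3-dimensional vectors, and the whitespace tokenizer are all made up for illustration, whereas a real static embedding model learns or distills its token vectors.

```python
import math

# Hypothetical token table with made-up 3-dimensional vectors.
TOKEN_VECTORS = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.0],
    "purr": [0.7, 0.0, 0.1],
    "bark": [0.6, 0.3, 0.1],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.0, 0.8],
}
DIM = 3


def embed(text):
    """Embed text by summing the vectors of its known tokens (no neural network)."""
    total = [0.0] * DIM
    for token in text.lower().split():
        vector = TOKEN_VECTORS.get(token)
        if vector is not None:  # unknown tokens are simply skipped
            total = [t + v for t, v in zip(total, vector)]
    return total


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


pets_a = embed("cats purr")
pets_b = embed("dogs bark")
finance = embed("stocks fell")

print("pets vs pets:   ", round(cosine(pets_a, pets_b), 3))
print("pets vs finance:", round(cosine(pets_a, finance), 3))
```

Because embedding a text is only a table lookup and a sum, the cost scales with the number of tokens rather than with model depth, which is where the large CPU speedups reported above come from.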

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.2.0
Documentation on Speeding up Inference: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
reacted to nyuuzyou's post with 👀 3 months ago
🎓 Introducing Doc4web.ru Documents Dataset - nyuuzyou/doc4web

Dataset highlights:
- 223,739 documents from doc4web.ru, a document hosting platform for students and teachers
- Primarily in Russian, with some English and potentially other languages
- Each entry includes: URL, title, download link, file path, and content (where available)
- Contains original document files in addition to metadata
- Data reflects a wide range of educational topics and materials
- Licensed under Creative Commons Zero (CC0) for unrestricted use

The dataset can be used for analyzing educational content in Russian, text classification tasks, and information retrieval systems. It's also valuable for examining trends in educational materials and document sharing practices in the Russian-speaking academic community. The inclusion of original files allows for in-depth analysis of various document formats and structures.
reacted to merve's post with 🔥 3 months ago
Meta AI vision has been cooking @facebook
They shipped multiple models and demos for their papers at @ECCV 🤗

Here's a compilation of my top picks:
- Sapiens is a family of foundation models for human-centric depth estimation, segmentation and more. All models have open weights, demos, and even TorchScript checkpoints 👏

A collection of models and demos: facebook/sapiens-66d22047daa6402d565cb2fc
- VFusion3D is a state-of-the-art consistent 3D generation model from images

Model: facebook/vfusion3d
Demo: facebook/VFusion3D

- CoTracker is the state-of-the-art point (pixel) tracking model

Demo: facebook/cotracker
Model: facebook/cotracker
reacted to fdaudens's post with 🧠🤗👀🔥 3 months ago
The Nobel Prize background for Hopfield and Hinton's work on neural networks is pure gold. It's a masterclass in explaining AI basics.

Key takeaways from the conclusion:
- ML applications are expanding rapidly. We're still figuring out which will stick.
- Ethical discussions are crucial as the tech develops.
- Physics 🤝 AI: A two-way street of innovation.

Some mind-blowing AI applications in physics:
- Discovering the Higgs particle
- Cleaning up gravitational wave data
- Hunting exoplanets
- Predicting molecular structures
- Designing better solar cells

We're just scratching the surface. The interplay between AI and physics is reshaping both fields.

Bonus: The illustrations accompanying the background document are really neat. (Credit: Johan Jarnestad/The Royal Swedish Academy of Sciences)

#AI #MachineLearning #Physics #Ethics #Innovation
reacted to reach-vb's post with 🔥👍 3 months ago
The on-device AI framework ecosystem is blooming these days:

1. llama.cpp - All things Whisper, LLMs & VLMs - run across Metal, CUDA and other backends (AMD/ NPU etc)
https://github.com/ggerganov/llama.cpp

2. MLC - Deploy LLMs across platforms especially WebGPU (fastest WebGPU LLM implementation out there)
https://github.com/mlc-ai/web-llm

3. MLX - Arguably the fastest general purpose framework (Mac only) - Supports all major Image Generation (Flux, SDXL, etc), Transcription (Whisper), LLMs
https://github.com/ml-explore/mlx-examples

4. Candle - Cross-platform general purpose framework written in Rust - wide coverage across model categories
https://github.com/huggingface/candle

Honorable mentions:

1. Transformers.js - JavaScript (WebGPU) implementation built on top of ONNX Runtime Web
https://github.com/xenova/transformers.js

2. Mistral rs - Rust implementation for LLMs & VLMs, built on top of Candle
https://github.com/EricLBuehler/mistral.rs

3. Ratchet - Cross-platform, Rust-based WebGPU framework built for battle-tested deployments
https://github.com/huggingface/ratchet

4. Zml - Cross-platform, Zig-based ML framework
https://github.com/zml/zml

Looking forward to seeing how the ecosystem looks a year from now. Quite bullish on the top 4 atm, but the open-source ecosystem changes quickly! 🤗

Also, which frameworks did I miss?
reacted to alielfilali01's post with 🔥 3 months ago
Why is nobody talking about the new training corpus released by MBZUAI today?

TxT360 is a 15+ trillion token corpus that outperforms FineWeb on several metrics. Ablation studies were done up to 1T tokens.

Read blog here : LLM360/TxT360
Dataset : LLM360/TxT360
reacted to lucifertrj's post with 👀 3 months ago
AI Agents with LlamaIndex in 40 minutes

The video covers code and workflow explanations for:

- Function Calling
- Function Calling Agents + Agent Runner
- Agentic RAG
- ReAct Agent: Build your own Search Assistant Agent

Watch: https://youtu.be/bHn4dLJYIqE