In a Training Loop 🔄

Omar Sanseviero

osanseviero

google

https://osanseviero.github.io/hackerllama/

AI & ML interests

Llamas, model merging, massive ASR for data collection, 3D ML, on-device ML, quantization, model judging, ML in browser, healthcare applications, education, intersection of art and ML.🦙

Recent Activity

liked a model 23 days ago

google/gemma-4-E2B-it

liked a model 23 days ago

google/gemma-4-E4B-it

liked a model 23 days ago

google/gemma-4-31B-it

Posts 19

Post

16667

Diaries of Open Source. Part 15 🤗

🕵️‍♀️Idefics 2 is out, a multimodal open-source model with very nice capabilities
Models, demo, and datasets: HuggingFaceM4/idefics2-661d1971b7c50831dd3ce0fe
Blog: https://hf.co/blog/idefics2

💾Snowflake released snowflake-arctic-embed, a family of powerful small embedding models
Model: Snowflake/snowflake-arctic-embed-m
Blog: https://www.snowflake.com/blog/introducing-snowflake-arctic-embed-snowflakes-state-of-the-art-text-embedding-family-of-models/

✨Pile-T5, EleutherAI's T5 model trained on 2T tokens
Blog: https://blog.eleuther.ai/pile-t5/
Models: EleutherAI/pile-t5-65a76a0d0022dd270b385a66
GitHub: https://github.com/EleutherAI/improved-t5

🤖CodeQwen1.5-7B base and chat models, trained on 3T tokens, with strong benchmark results for code generation, editing, and SQL
Blog post: https://qwenlm.github.io/blog/codeqwen1.5/
Demo: https://hf.co/spaces/Qwen/CodeQwen1.5-7b-Chat-demo
Models: Qwen/CodeQwen1.5-7B and Qwen/CodeQwen1.5-7B-Chat

Misc
🦉 DocOwl1.5: Unified Structure Learning for OCR-free Document Understanding mPLUG/DocOwl
👀Cerule - a tiny Vision LM model Tensoic/Cerule-v0.1
ChemLLM - an LLM for chemistry and molecule science ⚗️ https://hf.co/AI4Chem/ChemLLM-7B-Chat-1.5-DPO
Distil Whisper Large
📝New pdf/OCR datasets with 19 samples pixparse/pdf-document-ocr-datasets-660701430b0346f97c4bc628
🔥Gretel AI high quality text-to-sql synthetic dataset gretelai/synthetic_text_to_sql

Articles 26

Article

191

Llama can now see and run on your device - welcome Llama 3.2

Collections 13

Papers 6

arxiv:2509.20354

arxiv:2503.19786

arxiv:2310.16944

arxiv:2303.12582

Spaces 179

InstantCoder

Generate app code from your idea

Co2 Estimator

Estimate CO2 activities from an image

How Much Do I Cost

Distilabel Dataset Generator

Generate datasets for training models

Mistral Super Fast

Build your Whisper demo

Models 301

osanseviero/browser-actions

Text Generation • 0.3B • Updated Jan 12 • 20 • 3

osanseviero/qwen2.5-0.5b-instruct-q2_K

0.5B • Updated Oct 10, 2024 • 44 • 1

osanseviero/o-blob-3.2

1B • Updated Oct 10, 2024 • 17

osanseviero/test-in-go7

Updated Oct 8, 2024

osanseviero/test-in-go6

Updated Oct 8, 2024

osanseviero/test-in-go5

Updated Oct 8, 2024

osanseviero/Reflection-Llama-3.1-70B-GGUF

Text Generation • 71B • Updated Sep 16, 2024 • 111

osanseviero/test-in-go4

Updated Sep 13, 2024

osanseviero/test-in-go3

Updated Sep 13, 2024

osanseviero/test-in-go

Updated Sep 12, 2024

Datasets 39

osanseviero/browser-actions

Viewer • Updated Jan 15 • 1.83k • 9

osanseviero/super-fun-llamas

Viewer • Updated Sep 13, 2024 • 10 • 7 • 1

osanseviero/fun_llamas

Viewer • Updated Sep 12, 2024 • 50 • 10

osanseviero/my-llamas

Viewer • Updated Sep 11, 2024 • 100 • 12

osanseviero/bill_summary_us_chunks-similarity

Viewer • Updated Jul 12, 2024 • 2k • 14

osanseviero/bill_summary_us_chunks

Viewer • Updated Jul 12, 2024 • 3.45M • 14

osanseviero/testing_geospatial

Updated Jul 8, 2024 • 7

osanseviero/ag_misclassifications

Viewer • Updated Oct 8, 2023 • 200 • 28

osanseviero/test_hacks

Updated Apr 28, 2023 • 5

osanseviero/example_ola

Viewer • Updated Mar 24, 2023 • 2 • 3
