Juan CM

jucamohedano

AI & ML interests

AI Systems MSc at Trento 🚀🤖

Recent Activity

liked a Space about 14 hours ago

nanotron/ultrascale-playbook

2.8k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

upvoted a collection 17 days ago

Model Merging

Collection

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it!

30 items • Updated Jun 12, 2024 • 243

updated a collection about 1 month ago

Model search via model weights

Collection

2 items • Updated Jun 4

upvoted 2 papers about 1 month ago

Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights

Paper • 2502.09619 • Published Feb 13 • 36

Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28 • 46

upvoted a collection 5 months ago

🤖 Agents

Collection

21 items • Updated Dec 31, 2024 • 161

upvoted an article 5 months ago

Article

Introducing smolagents: simple agents that write actions in code.

and 2 others • Dec 31, 2024 • 1.08k

upvoted a paper 5 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 235

upvoted 2 articles 5 months ago

Article

Open-source DeepResearch – Freeing our search agents

and 4 others • Feb 4 • 1.27k

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

and 2 others • Jan 23 • 182

updated a model 8 months ago

jucamohedano/paligemma_a-okvqa

Updated Nov 15, 2024 • 2

updated a model 10 months ago

jucamohedano/char-lstm-shakespeare

Updated Sep 22, 2024

liked a dataset 10 months ago

karpathy/tiny_shakespeare

Updated Jan 18, 2024 • 4.25k • 57

updated a model 10 months ago

jucamohedano/char-lstm-shakespeare_

Updated Sep 21, 2024

liked a model about 1 year ago

microsoft/Phi-3-vision-128k-instruct

Text Generation • 4B • Updated Aug 20, 2024 • 21.2k • 960

upvoted an article about 1 year ago

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

and 2 others • May 14, 2024 • 259

reacted to merve's post with 🚀 about 1 year ago

Post

1778

New open Vision Language Model by @Google : PaliGemma 💙🤍

📝 Comes in 3B, pretrained, mix and fine-tuned models in 224, 448 and 896 resolution
🧩 Combination of Gemma 2B LLM and SigLIP image encoder
🤗 Supported in transformers

PaliGemma can do..
🧩 Image segmentation and detection! 🤯
📑 Detailed document understanding and reasoning
🙋 Visual question answering, captioning and any other VLM task!

Read our blog 🔖 hf.co/blog/paligemma
Try the demo 🪀 hf.co/spaces/google/paligemma
Check out the Spaces and the models all in the collection 📚 google/paligemma-release-6643a9ffbf57de2ae0448dda
Collection of fine-tuned PaliGemma models google/paligemma-ft-models-6643b03efb769dad650d2dda