Le Huy Hoang

splendor1811

huyhoang18112k2

AI & ML interests

Computer Vision

Recent Activity

upvoted an article 17 days ago

Vision Language Models (Better, Faster, Stronger)

updated a Space 2 months ago

splendor1811/AlfredAgent

published a Space 2 months ago

splendor1811/AlfredAgent

Organizations

None yet

splendor1811's activity

upvoted an article 17 days ago

Article

Vision Language Models (Better, Faster, Stronger)

and 4 others • 18 days ago • 388

updated a Space 2 months ago

AlfredAgent

📚

published a Space 2 months ago

AlfredAgent

📚

updated a model 3 months ago

splendor1811/gemma-2-2B-it-thinking_FC

Updated Mar 2

published a model 3 months ago

splendor1811/gemma-2-2B-it-thinking_FC

Updated Mar 2

liked a Space 3 months ago

2.63k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

upvoted a paper 3 months ago

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published Feb 11 • 54

updated a Space 4 months ago

First Agent Template

⚡

liked a Space 4 months ago

564

Scaling test-time compute

📈

Enhance math problem solving by scaling test-time compute

upvoted 2 articles 4 months ago

Article

Open-source DeepResearch – Freeing our search agents

and 4 others • Feb 4 • 1.25k

Article

SmolVLM - small yet mighty Vision Language Model

and 4 others • Nov 26, 2024 • 296

reacted to sometimesanotion's post with 🚀 4 months ago

Post

3339

**Update** Either I plugged in some wrong numbers when estimating benchmark scores from the comparator, or the benchmark changed. Virtuoso Small v2 at 41.07 average is still very impressive, especially for writing draft copy for business purposes, while Lamarck remains a chatty generalist-reasoning model.

I've felt confident that 14B Qwen finetunes and merges could break the 42.0 average, and Arcee **came close** with https://huggingface.co/arcee-ai/Virtuoso-Small-2. Congratulations to @arcee-ai!

Just two months ago, it was easy to think that 14B had plateaued, that you could have high IFEVAL or high MUSR/MATH/GPQA at 14B, but not both. That barrier is completely shattered. I see a pathway to even better results, and Virtuoso Small 2 is a big part of why. Very impressive work. This community would expect no less from Arcee.

Just look at this graph! Keep in mind, my merges here build on the first Virtuoso Small, and *-DS merges build on DeepSeek R1. There are some impressive merges in the pipe!