Mitko Vasilev
mitkox
335 followers · 23 following
Organizations: iotcoi, mitkox
AI & ML interests
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
Recent Activity
Posted an update 8 days ago
I just threw Qwen3-0.6B in BF16 into an on-device AI drag race on AMD Strix Halo with vLLM:
- 564 tokens/sec on short 100-token sprints
- 96 tokens/sec on 8K-token marathons
TL;DR: You don't just run AI on AMD. You negotiate with it. The hardware absolutely delivers. Spoiler alert: there is exactly ONE configuration in which vLLM + ROCm + Triton + PyTorch + drivers + the Ubuntu kernel all work at the same time, and finding it required the patience of a saint. Consumer AMD for AI inference is the ultimate "budget warrior" play: insane performance-per-euro, but you need hardcore technical skills that would make a senior sysadmin nod in quiet respect.
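The post doesn't include the actual invocation, but as a rough sketch, a BF16 deployment of this model with vLLM might be launched along these lines (the model ID and flags here are assumptions based on public vLLM usage, not taken from the post):

```shell
# Sketch only: serve Qwen3-0.6B in BF16 with vLLM on a ROCm build of PyTorch.
# Model ID and flags are assumptions; the post does not give its exact setup.
vllm serve Qwen/Qwen3-0.6B \
  --dtype bfloat16 \
  --max-model-len 8192   # long enough for the 8K-token runs mentioned above
```

The 100-token vs. 8K-token gap in the numbers above is expected: short prompts are compute-light and scheduler-bound, while long contexts stress memory bandwidth, which is where unified-memory APUs pay for their flexibility.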
Posted an update 20 days ago
I have just vibe coded a feature for ODA on-device AI with MiniMax M2, running locally on my Z8 Fury, and holy silicon, this thing SLAPS! TL;DR, the nerd stuff:
- Specialized in coding and agentic work
- 60 tokens/sec
- Ryzen AI is getting some serious ROCm 7.0.2 brain implants
- One extra script to rule them all and bind them to my GPU
- A vibe-coded feature implementation that actually worked on the first try. I know, I'm scared too.
Posted an update 24 days ago
I’m just reading that the Ryzen AI 395 is supposed to be 30% slower than DGX Spark at LLM inference, and to have only 96GB of GPU RAM… good thing I hadn’t RTFM’d upfront, so I made the AMD faster with 128GB of unified RAM 🫡 The Z2 mini G1a can run Qwen3 Coder 30B BF16 at 26.8 tok/sec in ~60GB of GPU RAM.
mitkox's datasets (2)
mitkox/aya_dataset · Updated Feb 13, 2024 · 1.75k · 10
mitkox/vulcan · Updated May 28, 2023 · 12