Piotr Wilkin
ilintar
Recent Activity
reacted to smirki's post with 🔥 15 days ago
✨ We’re live! Introducing TFrameX, the agentic framework for AI builders.
After nights of development, we’re finally open-sourcing TFrameX, a powerful AI agent communication and coordination library.
TFrameX lets you:
🤖 Run agents in dynamic flows
🔁 Compose reusable patterns like Sequential, Parallel, Router, and more
🧠 Enable agent-to-agent collaboration and delegation
⚡ Build modular, complex multi-agent systems that just work
👉 GitHub: TFrameX
https://github.com/TesslateAI/TFrameX
But we didn’t stop there.
We also built a sleek visual builder to design, deploy, and debug your agent patterns without writing boilerplate!
🧩 Visual Studio for TFrameX: https://github.com/TesslateAI/Studio
If you’re building agent frameworks, LLM tools, or agentic apps, TFrameX gives you the tools to move fast and reason deeply.
reacted to wolfram's post with 🔥 19 days ago
Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science).
A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:
1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.
2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.
3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.
4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.
5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50% performance cut-off).
All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.
**Conclusion:** Quantised 30B models now get you ~98% of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.
Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!
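The "~98%" figure in the conclusion can be checked directly from the scores quoted above. The snippet below is a minimal sketch of that arithmetic: it uses only the accuracy numbers from the post, and the dictionary layout and the `relative_accuracy` helper are illustrative names, not part of the original evaluation setup.

```python
# MMLU-Pro (Computer Science) accuracy in percent, as quoted in the post.
# The labels are shorthand chosen for this sketch.
accuracy = {
    "Qwen3-235B-A22B (Fireworks API)": 83.66,
    "Qwen3-30B-A3B (Unsloth quant)": 82.20,
    "Qwen3-32B": 82.20,
    "Qwen3-30B-A3B (MLX)": 79.51,
    "Qwen3-0.6B": 37.56,
}


def relative_accuracy(score: float, best: float) -> float:
    """Return `score` as a percentage of `best`."""
    return 100.0 * score / best


# The top score in the table serves as the "frontier-class" reference.
best = max(accuracy.values())

for name, score in accuracy.items():
    print(f"{name}: {score:.2f}% absolute, "
          f"{relative_accuracy(score, best):.1f}% of the top score")

# Throughput comparison from points 2 and 3: ~45 tok/s against <10 tok/s.
# Because 10 tok/s is only an upper bound for Qwen3-32B, this ratio is a
# lower bound on the speed-up, consistent with the "~5x" in the post.
min_speedup = 45 / 10
print(f"Unsloth 30B-A3B vs Qwen3-32B: at least {min_speedup:.1f}x faster")
```

On these numbers the 30B-A3B Unsloth quant sits a little above 98% of the 235B score, which is where the rounded figure in the conclusion comes from.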