Sarath Shekkizhar
shekkizh
AI & ML interests
None yet
Recent Activity
posted an update 2 days ago
🙋🏽‍♂️ Is your "multi-agent" system really multi-agentic? Or is it just a modular setup with a bunch of different prompts? 🤨
I’ve had this discussion way too often, so I finally wrote it all down. If you’re building with agents, you need to read this.
Here’s the TLDR:
✅ True multi-agent systems require:
• Persistent, private state per agent
• Memory that impacts future decisions
• Adaptation based on past experiences (see the sketch below)
❌ Just having modular components, function calls, or multiple LLMs doesn't cut it. That's not multi-agentic. It's just pipelining.
🤝 The magic is in evolving relationships, context retention, and behavioral shifts over time.
🧠 If your agents aren’t learning from each other or changing based on past experience… you are missing the point.
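To make the checklist concrete, here's a minimal sketch of "persistent, private state + memory that shapes future decisions." This isn't from the post; the class and all names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Toy agent: persistent, private state plus episodic memory."""
    name: str
    _memory: list = field(default_factory=list)  # private, survives across calls

    def act(self, message: str) -> str:
        # Memory impacts the decision: the same input produces different
        # behavior once the agent has seen it before.
        seen = message in self._memory
        self._memory.append(message)
        if seen:
            return f"{self.name}: we've covered this; refining my last answer."
        return f"{self.name}: new topic, exploring from scratch."

planner = Agent("planner")
print(planner.act("design a caching layer"))  # explores
print(planner.act("design a caching layer"))  # adapts: behavior shifts
```

A stateless pipeline returns the same output for the same input every time; the behavioral shift above, multiplied across agents exchanging messages, is what makes a system multi-agentic rather than a pipeline.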
What do you think? Curious what patterns you're experimenting with 🧐
👉 Full post: https://shekkizh.github.io/posts/2025/04/multi-agents/
posted an update 3 days ago
Think AGI is just around the corner? Not so fast.
When OpenAI released its Computer-Using Agent (CUA) API, I happened to be playing Wordle 🧩 and thought, why not see how the model handles it?
Spoiler: Wordle turned out to be a surprisingly effective benchmark.
So Romain Cosentino, Ph.D., and I dug in and analyzed the results of several hundred runs.
🔑 Takeaways
1️⃣ Even the best computer-using models struggle with simple, context-dependent tasks (see the Wordle feedback sketch below).
2️⃣ Visual perception and reasoning remain major hurdles for multimodal agents.
3️⃣ Real-world use cases reveal significant gaps between hype and reality. Perception accuracy drops to near zero by the last turn 📉
🔗 Read our arXiv article for more details: https://www.arxiv.org/abs/2504.15434
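To see why Wordle is so context-dependent: every new guess has to be consistent with the feedback from all previous guesses, so the constraint set grows each turn. Here's a minimal sketch of the feedback rule itself (illustrative only, not our evaluation code):

```python
from collections import Counter

def wordle_feedback(guess: str, answer: str) -> str:
    """G = right letter, right spot; Y = right letter, wrong spot; B = absent."""
    feedback = ["B"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and count the answer letters left over.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "G"
        else:
            unmatched[a] += 1
    # Second pass: yellows consume the leftover letters.
    for i, g in enumerate(guess):
        if feedback[i] == "B" and unmatched[g] > 0:
            feedback[i] = "Y"
            unmatched[g] -= 1
    return "".join(feedback)

print(wordle_feedback("crane", "cigar"))  # GYYBB
```

An agent that misreads the board or drops a past (guess, feedback) pair can end up proposing words that are already ruled out, so both perception and context retention get stressed at once.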
posted an update 17 days ago
Some interesting architectural choices made in Llama 4 models -- were these key to the 10M context? Possibly 🤔
🔍 Takeaways:
🧩 Interleaved Attention without position encoding
- Llama 4 removes explicit positional encoding in some attention layers to boost performance on longer contexts.
- The principle may be similar to that of residual connections: these layers let the model attend to early tokens directly, without positional decay (see the sketch below).
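Here's a rough sketch of what that interleaving could look like. This is my reading, not the released internals; the every-fourth-layer pattern and all names are assumptions:

```python
import math
import torch

NUM_LAYERS, HEAD_DIM = 8, 64
# Hypothetical pattern: every 4th layer skips rotary position encoding
# ("NoPE"), so attention there carries no positional signal and early
# tokens stay reachable without positional decay.
USE_ROPE = [(i + 1) % 4 != 0 for i in range(NUM_LAYERS)]

def rope(x: torch.Tensor, pos: torch.Tensor) -> torch.Tensor:
    # Standard rotary embedding: rotate pairs of dims by position-dependent angles.
    half = x.shape[-1] // 2
    freqs = pos[:, None] * torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    cos, sin = freqs.cos(), freqs.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

def attention(q, k, v, layer_idx, pos):
    if USE_ROPE[layer_idx]:  # position-aware layers
        q, k = rope(q, pos), rope(k, pos)
    # NoPE layers fall through: attention with no positional bias at all.
    scores = q @ k.T / math.sqrt(HEAD_DIM)
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(16, HEAD_DIM)
out = attention(q, k, v, layer_idx=3, pos=torch.arange(16.0))  # layer 3: NoPE
```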
⚖️ Scaled Softmax to increase attention at inference time
- The max attention value (output of softmax) decreases as context size increases.
- Llama 4 incorporates a context-size dependent temperature in the softmax function to modify the slope of softmax, allowing the model to focus better on relevant tokens.
- This is done only at inference time -- my guess is it was a choice made after observations on eval datasets (see the sketch below).
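A sketch of the idea: first the observation (softmax mass spreads out as token count grows, so the max attention weight shrinks), then the fix (scale query logits by a slowly growing, position-dependent factor). The log-stepped form and the constants below are assumptions loosely based on public Llama 4 code, not an exact reproduction:

```python
import torch

# Observation: the max softmax output decays toward uniform as context grows.
for n in (1_000, 100_000, 1_000_000):
    print(n, torch.softmax(torch.randn(n), dim=-1).max().item())

def scale_queries(q: torch.Tensor, pos: torch.Tensor,
                  floor_scale: float = 8192.0,
                  attn_scale: float = 0.1) -> torch.Tensor:
    # Inference-time temperature: the scale grows in log-steps with position
    # (it stays exactly 1.0 until pos reaches ~floor_scale), steepening the
    # softmax so attention stays peaked at long context.
    scale = torch.log(torch.floor((pos + 1.0) / floor_scale) + 1.0) * attn_scale + 1.0
    return q * scale[:, None]

q = torch.randn(100_000, 64)
pos = torch.arange(100_000.0)
q_scaled = scale_queries(q, pos)  # larger logits at late positions
```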
What did you think of these choices?
shekkizh's activity
Dataset loading failing with HF load_dataset (1) · #3 opened 11 months ago by shekkizh
great evals (1) · #2 opened 12 months ago by gblazex
Script to reproduce MT-Bench (2) · #1 opened 12 months ago by MaziyarPanahi
Evaluation for 70B model FAILED (tenyx/Llama3-TenyxChat-70B) (5) · #719 opened 12 months ago by shekkizh
Update README.md (1) · #1 opened over 1 year ago by mostafagv