happy8825's activity

reacted to codelion's post with 🔥 5 days ago
🧠 We just implemented Andrej Karpathy's "third paradigm" for LLM learning!

System Prompt Learning (SPL) enables LLMs to automatically learn problem-solving strategies from experience, rather than relying on static prompts.

🚀 How it works:
Your LLM builds a database of effective strategies, selects the best ones for each problem, and refines them over time based on success rates.
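
Conceptually, the loop looks something like the minimal Python sketch below. The class and function names are illustrative only and are not the actual optillm plugin API; they just show how a strategy database with per-strategy success rates could drive selection and refinement.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Strategy:
    text: str            # human-readable problem-solving strategy
    attempts: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts if self.attempts else 0.0

@dataclass
class StrategyStore:
    strategies: list[Strategy] = field(default_factory=list)

    def select(self, k: int = 3) -> list[Strategy]:
        # Pick the k strategies with the best track record so far.
        return sorted(self.strategies, key=lambda s: s.success_rate, reverse=True)[:k]

    def record(self, strategy: Strategy, solved: bool) -> None:
        # Fold the outcome back into the running success rate.
        strategy.attempts += 1
        strategy.successes += int(solved)

# Hypothetical usage: inject the selected strategies into the system prompt,
# call the model, judge the answer, and feed the outcome back into the store.
store = StrategyStore([
    Strategy("Work backwards from the desired answer."),
    Strategy("Break the problem into smaller subproblems."),
])
chosen = store.select(k=1)
system_prompt = "Useful strategies:\n" + "\n".join(s.text for s in chosen)
# ... call the LLM with system_prompt and check its answer here ...
solved = random.random() > 0.5   # stand-in for an actual correctness check
store.record(chosen[0], solved)
```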

📊 Results across math benchmarks:
Arena Hard: 29% → 37.6% (+8.6%)
AIME24: 23.33% → 30% (+6.67%)
OptILLMBench: 61% → 65% (+4%)

The best part? All strategies are human-readable and the system gets progressively better at problem types you use frequently.

✨ Key benefits:
🔄 Cumulative learning over time
📖 Transparent, inspectable strategies
🔌 Works with any OpenAI-compatible API
⚡ Simple integration: just add the "spl-" prefix to your model name (see the usage sketch after this list)
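
A minimal call sketch, assuming optillm is running locally as an OpenAI-compatible proxy; the base URL, API key, and underlying model name below are placeholders, not values from the post:

```python
from openai import OpenAI

# Point the standard OpenAI client at the optillm proxy instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local optillm endpoint
    api_key="sk-placeholder",             # placeholder; use whatever your proxy expects
)

# The "spl-" prefix routes the request through the System Prompt Learning plugin.
response = client.chat.completions.create(
    model="spl-gpt-4o-mini",  # hypothetical base model; keep the "spl-" prefix
    messages=[{"role": "user", "content": "What is the sum of the first 50 odd numbers?"}],
)
print(response.choices[0].message.content)
```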

Built as an open-source plugin in optillm. After 500 queries, our system developed 129 strategies and refined 97 of them!

This feels like a genuine step toward AI that learns from experience while staying completely interpretable.

🔗 GitHub: https://github.com/codelion/optillm/tree/main/optillm/plugins/spl
📖 Full article: https://huggingface.co/blog/codelion/system-prompt-learning
🐦 Original Karpathy tweet: https://x.com/karpathy/status/1921368644069765486

Have you experimented with advanced system prompting? What strategies would you want your LLM to learn?