Listener-Rewarded Thinking in VLMs for Image Preferences Paper • 2506.22832 • Published 6 days ago • 22
MARBLE: A Hard Benchmark for Multimodal Spatial Reasoning and Planning Paper • 2506.22992 • Published 6 days ago • 11
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published May 30 • 257
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Paper • 2407.04363 • Published Jul 5, 2024 • 34
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language By davidberenstein1957 and 5 others • Dec 16, 2024 • 130
Gemma 3 Collection A collection of lightweight, state-of-the-art open models built from the same research and technology that powers the Gemini 2.0 models • 32 items • Updated May 14 • 28
Article Personal Copilot: Train Your Own Coding Assistant By smangrul and 1 other • Oct 27, 2023 • 64
Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • Feb 10 • 58
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 870
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published Jan 21 • 62