Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 174
Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation Paper • 2502.00306 • Published Feb 1 • 5
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models Paper • 2410.14059 • Published Oct 17, 2024 • 62
Article Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others • Dec 31, 2024 • 1.07k
Article Open-source DeepResearch – Freeing our search agents By m-ric and 4 others • Feb 4 • 1.26k
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 870
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 91
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5, 2024 • 98
SLIM GGUF Collection Quantized GGUF 'tool' implementations of SLIM Models • 30 items • Updated Feb 23 • 11
SLIM Models Collection Structured Language Instruction Models (SLIMs) • 31 items • Updated Feb 10 • 32
CodeFusion: A Pre-trained Diffusion Model for Code Generation Paper • 2310.17680 • Published Oct 26, 2023 • 73
ChatGPT for Robotics: Design Principles and Model Abilities Paper • 2306.17582 • Published Feb 20, 2023 • 10