Quentin Gallouédec PRO

qgallouedec

AI & ML interests

None yet

Recent Activity

updated a Space about 8 hours ago

qgallouedec/trl-trackio

published a Space about 8 hours ago

qgallouedec/trl-trackio

updated a dataset about 16 hours ago

hf-doc-build/doc-build-dev

upvoted 2 papers 6 days ago

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published 16 days ago • 42

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published 20 days ago • 165

upvoted a collection 9 days ago

Testing datasets

5 items • Updated 9 days ago • 1

upvoted 3 papers 10 days ago

panda-gym: Open-source goal-conditioned environments for robotic learning

Paper • 2106.13687 • Published Jun 25, 2021 • 3

Cell-Free Latent Go-Explore

Paper • 2208.14928 • Published Aug 31, 2022 • 1

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Paper • 2402.03046 • Published Feb 5, 2024 • 7

upvoted a paper 12 days ago

Distributional Preference Alignment of LLMs via Optimal Transport

Paper • 2406.05882 • Published Jun 9, 2024 • 2

upvoted an article 13 days ago

Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training

Article • By … and 4 others • 20 days ago • 53

upvoted a paper 13 days ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published 19 days ago • 161

upvoted a paper 14 days ago

EMA Without the Lag: Bias-Corrected Iterate Averaging Schemes

Paper • 2508.00180 • Published 27 days ago • 1

upvoted a collection 18 days ago

Gemma 3 Release

28 items • Updated 16 days ago • 487

upvoted an article 20 days ago

Vision Language Model Alignment in TRL ⚡️

Article • By … and 4 others • 21 days ago • 75

upvoted a collection 22 days ago

gpt-oss

Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

2 items • Updated 20 days ago • 322

upvoted an article 22 days ago

Welcome GPT OSS, the new open-source model family from OpenAI!

Article • By … and 11 others • 23 days ago • 477

upvoted a paper 26 days ago

Reinforcement Learning from Human Feedback

Paper • 2504.12501 • Published Apr 16 • 4

upvoted an article 29 days ago

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

Article • By … and 4 others • 30 days ago • 161

upvoted 2 papers 29 days ago

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published Nov 15, 2024 • 87

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 177

upvoted a paper about 1 month ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 291

upvoted a collection about 1 month ago

Gemma 3n

4 items • Updated Jul 10 • 215