@sergiopaniego on Hugging Face: "Just included example scripts for aligning models using GSPO (including VLM…"

Post

1604

Just included example scripts for aligning models using GSPO (including VLM example) 🙆‍♂️🙆‍♂️

GSPO is the latest RL alignment algo by @Alibaba_Qwen and it's already supported in the latest TRL v0.20 release.

Super-easy-to-get-started example scripts below, GO run them!👩‍💻👩‍💻

🧑‍🎨 Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
🦄 VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
🧩 More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
🧙‍♂️ GSPO paper: Group Sequence Policy Optimization (2507.18071)