Latest TRL release brings major upgrades for multimodal alignment!
We dive into 3 new techniques to improve VLM post-training in our new blog:
- GRPO
- GSPO
- MPO
- vLLM integration for online training w/ transformers backend

Blog: https://huggingface.co/blog/trl-vlm-alignment