R1-Reward
Collection
Training Multimodal Reward Model Through Stable Reinforcement Learning
🔥 We are proud to open-source R1-Reward, a comprehensive project for improving reward modeling through reinforcement learning. This release includes:
If you find it useful for your research and applications, please cite related papers/blogs using this BibTeX:
@article{zhang2025r1,
title={R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning},
author={Zhang, Yi-Fan and Lu, Xingyu and Hu, Xiao and Fu, Chaoyou and Wen, Bin and Zhang, Tianke and Liu, Changyi and Jiang, Kaiyu and Chen, Kaibing and Tang, Kaiyu and others},
journal={arXiv preprint arXiv:2505.02835},
year={2025}
}