hanbin committed on
Commit 8ca22b3 · verified · 1 Parent(s): 7a07ccf

Update README.md

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -90,5 +90,19 @@ We achieved this with only 1/10 data and model resources compared with Qwen-Math
 
  ## Citation
 
+ ```latex
+ @misc{cui2024process,
+ title={Process Reinforcement through Implicit Rewards},
+ author={Ganqu Cui and Lifan Yuan and Zefan Wang and Hanbin Wang and Wendi Li and Bingxiang He and Yuchen Fan and Tianyu Yu and Qixin Xu and Weize Chen and Jiarui Yuan and Huayu Chen and Kaiyan Zhang and Xingtai Lv and Shuo Wang and Yuan Yao and Hao Peng and Yu Cheng and Zhiyuan Liu and Maosong Sun and Bowen Zhou and Ning Ding},
+ year={2025}
+ }
  ```
+
+ ```latex
+ @article{yuan2024implicitprm,
+ title={Free Process Rewards without Process Labels},
+ author={Lifan Yuan and Wendi Li and Huayu Chen and Ganqu Cui and Ning Ding and Kaiyan Zhang and Bowen Zhou and Zhiyuan Liu and Hao Peng},
+ journal={arXiv preprint arXiv:2412.01981},
+ year={2024}
+ }
  ```