arxiv:2602.10693
Xiang Cheng
FFFc2
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning
upvoted a paper about 2 months ago
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
upvoted a collection about 2 months ago
dots.ocr
Organizations
None yet