BadCat
Foresta
AI & ML interests
LLMs
Deep learning
Reinforcement learning
Recent Activity
upvoted a paper about 13 hours ago
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search
upvoted a paper 9 days ago
Evaluating Parameter Efficient Methods for RLVR
upvoted a paper 3 months ago
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
Organizations
None yet