MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems Paper • 2505.18943 • Published May 25 • 24
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs Paper • 2505.13529 • Published May 18 • 11
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study Paper • 2505.15404 • Published May 21 • 13
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! Paper • 2505.15656 • Published May 21 • 14
AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning Paper • 2505.11896 • Published May 17 • 58
view article Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 208
Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks Paper • 2407.02855 • Published Jul 3, 2024 • 13