How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study • arXiv:2505.15404 • Published May 21, 2025
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! • arXiv:2505.15656 • Published May 21, 2025
AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement • arXiv:2502.16776 • Published Feb 24, 2025
Agent-SafetyBench: Evaluating the Safety of LLM Agents • arXiv:2412.14470 • Published Dec 19, 2024
Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks • arXiv:2407.02855 • Published Jul 3, 2024