An Embarrassingly Simple Defense Against LLM Abliteration Attacks Paper • 2505.19056 • Published 3 days ago • 3 • 2
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think Paper • 2504.20708 • Published 29 days ago • 22 • 2
Towards Data-Efficient Pretraining for Atomic Property Prediction Paper • 2502.11085 • Published Feb 16 • 3 • 3
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch Paper • 2406.14563 • Published Jun 20, 2024 • 31 • 1