🔍 Today's pick in Interpretability & Analysis of LMs: Model Editing Can Hurt General Abilities of Large Language Models by J.C. Gu et al.
This work raises the concern that gains in factual knowledge from model editing can come at the cost of a significant degradation in the general abilities of LLMs. The authors evaluate four popular editing methods on two LLMs across eight representative tasks, showing that model editing substantially hurts general model abilities. They suggest prioritizing improvements in LLM robustness, developing more precise editing methods, and building better evaluation benchmarks.
📄 Paper: Model Editing Can Hurt General Abilities of Large Language Models (2401.04700)
💻 Code: https://github.com/JasonForJoy/Model-Editing-Hurt