🔍 Today's pick in Interpretability & Analysis of LMs: Rethinking Interpretability in the Era of Large Language Models
by C. Singh, J. P. Inala, @mgalley , R. Caruana, @wyngjf
In this opinion piece, the authors contend that the new capabilities of LLMs can deeply transform the scope of interpretability, moving from low-level explanations such as saliency maps to natural language explanations that enable direct interaction with users.
This ambitious goal is, however, hindered by LLMs' tendency to hallucinate, their large size, and their inherent opaqueness. The authors highlight in particular dataset explanations for knowledge discovery, the reliability of explanations, and interactive explanations as key priorities for the future of interpretability research.
📄 Paper: Rethinking Interpretability in the Era of Large Language Models (2402.01761)