Neel Nanda

NeelNanda
https://neelnanda.io
  • NeelNanda5
  • neelnanda-io

AI & ML interests

Mechanistic Interpretability

Recent Activity

authored a paper 18 days ago
Towards eliciting latent knowledge from LLMs with mechanistic interpretability
authored a paper 4 months ago
Open Problems in Mechanistic Interpretability
authored a paper 7 months ago
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Organizations

Science of Finetuning (Neel Nanda's MATS 7.0)

NeelNanda's activity

upvoted a paper over 1 year ago

AtP*: An efficient and scalable method for localizing LLM behaviour to components

Paper • 2403.00745 • Published Mar 1, 2024 • 14
upvoted a paper almost 2 years ago

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Paper • 2307.09458 • Published Jul 18, 2023 • 11