VConm

ArthurConmy

authored a paper 6 months ago

Open Problems in Mechanistic Interpretability

Paper • 2501.16496 • Published Jan 27 • 19

authored 3 papers 11 months ago

Successor Heads: Recurring, Interpretable Attention Heads In The Wild

Paper • 2312.09230 • Published Dec 14, 2023

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

Paper • 2211.00593 • Published Nov 1, 2022 • 2

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Paper • 2408.05147 • Published Aug 9, 2024 • 40

authored a paper 12 months ago

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

Paper • 2407.14435 • Published Jul 19, 2024 • 7

authored a paper over 1 year ago

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11, 2024 • 92