Single-Layer SAEs with Transformers Collection TopK SAEs trained on the residual stream activation vectors from a single transformer layer, together with the underlying transformers. • 43 items • Updated 3 days ago
Single-Layer SAEs Collection TopK SAEs trained on the residual stream activation vectors from a single transformer layer. • 43 items • Updated 3 days ago
Multi-Layer SAEs with Tuned Lens and Transformers Collection Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously using tuned lenses, together with the underlying transformers. • 17 items • Updated 3 days ago
Multi-Layer SAEs with Transformers Collection Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously, together with the underlying transformers. • 35 items • Updated 3 days ago
Multi-Layer SAEs with Tuned Lens Collection Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously using tuned lenses. • 17 items • Updated 3 days ago
Multi-Layer SAEs Collection Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously: https://arxiv.org/abs/2409.04185 • 35 items • Updated 3 days ago
Learning to Skip the Middle Layers of Transformers Collection Transformers with a novel gating mechanism that skips layers from the middle outward: https://arxiv.org/pdf/2506.21103 • 23 items • Updated 3 days ago