# SAEs for use with the SAELens library

This repository contains batch top-k Matryoshka SAEs for Gemma-3-1b. All SAEs have a width of 32,768 latents and were trained with k=40 on 750M tokens from the Pile using SAELens, with nested Matryoshka levels of width 128, 512, 2048, 8192, and 32768.
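Matryoshka training makes the latents nested: each prefix of the dictionary (here 128, 512, 2048, and 8192 latents) is itself trained to reconstruct the input, so a smaller sub-SAE can be sliced out of the full 32k one. Below is a minimal sketch of extracting such a prefix; it assumes the standard Matryoshka convention that the first `n` latents form each nested level, and uses the `W_enc`/`W_dec`/`b_enc`/`b_dec` parameters that SAELens SAEs expose:

```python
from sae_lens import SAE

def slice_matryoshka_prefix(sae: SAE, prefix_width: int):
    """Extract the weights of the nested sub-SAE formed by the first
    `prefix_width` latents (one of 128, 512, 2048, or 8192)."""
    W_enc = sae.W_enc[:, :prefix_width]  # (d_in, prefix_width)
    b_enc = sae.b_enc[:prefix_width]     # (prefix_width,)
    W_dec = sae.W_dec[:prefix_width, :]  # (prefix_width, d_in)
    b_dec = sae.b_dec                    # decoder bias is shared, (d_in,)
    return W_enc, b_enc, W_dec, b_dec
```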

The following SAEs are included:

| Layer | SAE ID | Width | L0 | Explained Variance |
|-------|--------|-------|----|--------------------|
| 0 | blocks.0.hook_resid_post | 32768 | 40 | 0.99118 |
| 1 | blocks.1.hook_resid_post | 32768 | 40 | 0.985819 |
| 2 | blocks.2.hook_resid_post | 32768 | 40 | 0.981468 |
| 3 | blocks.3.hook_resid_post | 32768 | 40 | 0.979252 |
| 4 | blocks.4.hook_resid_post | 32768 | 40 | 0.973719 |
| 5 | blocks.5.hook_resid_post | 32768 | 40 | 0.977229 |
| 6 | blocks.6.hook_resid_post | 32768 | 40 | 0.982247 |
| 7 | blocks.7.hook_resid_post | 32768 | 40 | 0.989271 |
| 8 | blocks.8.hook_resid_post | 32768 | 40 | 0.985447 |
| 9 | blocks.9.hook_resid_post | 32768 | 40 | 0.985869 |
| 10 | blocks.10.hook_resid_post | 32768 | 40 | 0.98235 |
| 11 | blocks.11.hook_resid_post | 32768 | 40 | 0.980853 |
| 12 | blocks.12.hook_resid_post | 32768 | 40 | 0.977682 |
| 13 | blocks.13.hook_resid_post | 32768 | 40 | 0.969005 |
| 14 | blocks.14.hook_resid_post | 32768 | 40 | 0.956484 |
| 15 | blocks.15.hook_resid_post | 32768 | 40 | 0.937399 |
| 16 | blocks.16.hook_resid_post | 32768 | 40 | 0.928849 |
| 17 | blocks.17.hook_resid_post | 32768 | 40 | 0.912209 |
| 18 | blocks.18.hook_resid_post | 32768 | 40 | 0.904198 |
| 19 | blocks.19.hook_resid_post | 32768 | 40 | 0.895405 |
| 20 | blocks.20.hook_resid_post | 32768 | 40 | 0.883044 |
| 21 | blocks.21.hook_resid_post | 32768 | 40 | 0.868396 |
| 22 | blocks.22.hook_resid_post | 32768 | 40 | 0.831975 |
| 23 | blocks.23.hook_resid_post | 32768 | 40 | 0.793732 |
| 24 | blocks.24.hook_resid_post | 32768 | 40 | 0.7452 |

Load these SAEs with SAELens as follows:

```python
from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained(
    "chanind/gemma-3-1b-batch-topk-matryoshka-saes-w-32k-l0-40",
    "<sae_id>",  # one of the SAE IDs listed in the table above
)
```
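For example, here is a minimal sketch of running a loaded SAE on activations (the `encode`/`decode` methods and `cfg.d_in` attribute are part of the SAELens `SAE` API; the random tensor is only a stand-in for real residual-stream activations from Gemma-3-1b):

```python
import torch
from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained(
    "chanind/gemma-3-1b-batch-topk-matryoshka-saes-w-32k-l0-40",
    "blocks.12.hook_resid_post",  # any SAE ID from the table above
)

# Stand-in for a batch of residual-stream activations at layer 12.
acts = torch.randn(8, sae.cfg.d_in)

latents = sae.encode(acts)   # sparse codes, shape (8, 32768); ~40 active per row
recon = sae.decode(latents)  # reconstruction, shape (8, sae.cfg.d_in)
```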