# SAEs for use with the SAELens library

This repository contains batch top-k Matryoshka SAEs for Gemma-3-1b. All SAEs have a width of 32,768 latents and were trained with k=40 on 750M tokens from the Pile using SAELens, with nested Matryoshka levels of width 128, 512, 2048, 8192, and 32768.
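Matryoshka training makes the latents nested: each prefix of the dictionary (here 128, 512, 2048, and 8192 latents) is itself trained to reconstruct the input, so a smaller sub-SAE can be sliced out of the full 32k one. Below is a minimal sketch of extracting such a prefix; it assumes the standard Matryoshka convention that the first `n` latents form each nested level, and uses the `W_enc`/`W_dec`/`b_enc`/`b_dec` parameters that SAELens SAEs expose:

```python
from sae_lens import SAE

def slice_matryoshka_prefix(sae: SAE, prefix_width: int):
    """Extract the weights of the nested sub-SAE formed by the first
    `prefix_width` latents (one of 128, 512, 2048, or 8192)."""
    W_enc = sae.W_enc[:, :prefix_width]  # (d_in, prefix_width)
    b_enc = sae.b_enc[:prefix_width]     # (prefix_width,)
    W_dec = sae.W_dec[:prefix_width, :]  # (prefix_width, d_in)
    b_dec = sae.b_dec                    # decoder bias is shared, (d_in,)
    return W_enc, b_enc, W_dec, b_dec
```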

The following SAEs are included:

| Layer | SAE ID | Width | L0 | Explained Variance |
|-------|--------|-------|----|--------------------|
| 0 | blocks.0.hook_resid_post | 32768 | 40 | 0.99118 |
| 1 | blocks.1.hook_resid_post | 32768 | 40 | 0.985819 |
| 2 | blocks.2.hook_resid_post | 32768 | 40 | 0.981468 |
| 3 | blocks.3.hook_resid_post | 32768 | 40 | 0.979252 |
| 4 | blocks.4.hook_resid_post | 32768 | 40 | 0.973719 |
| 5 | blocks.5.hook_resid_post | 32768 | 40 | 0.977229 |
| 6 | blocks.6.hook_resid_post | 32768 | 40 | 0.982247 |
| 7 | blocks.7.hook_resid_post | 32768 | 40 | 0.989271 |
| 8 | blocks.8.hook_resid_post | 32768 | 40 | 0.985447 |
| 9 | blocks.9.hook_resid_post | 32768 | 40 | 0.985869 |
| 10 | blocks.10.hook_resid_post | 32768 | 40 | 0.98235 |
| 11 | blocks.11.hook_resid_post | 32768 | 40 | 0.980853 |
| 12 | blocks.12.hook_resid_post | 32768 | 40 | 0.977682 |
| 13 | blocks.13.hook_resid_post | 32768 | 40 | 0.969005 |
| 14 | blocks.14.hook_resid_post | 32768 | 40 | 0.956484 |
| 15 | blocks.15.hook_resid_post | 32768 | 40 | 0.937399 |
| 16 | blocks.16.hook_resid_post | 32768 | 40 | 0.928849 |
| 17 | blocks.17.hook_resid_post | 32768 | 40 | 0.912209 |
| 18 | blocks.18.hook_resid_post | 32768 | 40 | 0.904198 |
| 19 | blocks.19.hook_resid_post | 32768 | 40 | 0.895405 |
| 20 | blocks.20.hook_resid_post | 32768 | 40 | 0.883044 |
| 21 | blocks.21.hook_resid_post | 32768 | 40 | 0.868396 |
| 22 | blocks.22.hook_resid_post | 32768 | 40 | 0.831975 |
| 23 | blocks.23.hook_resid_post | 32768 | 40 | 0.793732 |
| 24 | blocks.24.hook_resid_post | 32768 | 40 | 0.7452 |

Load these SAEs with SAELens as follows:

```python
from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained(
    "chanind/gemma-3-1b-batch-topk-matryoshka-saes-w-32k-l0-40",
    "<sae_id>",  # one of the SAE IDs listed in the table above
)
```
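For example, here is a minimal sketch of running a loaded SAE on activations (the `encode`/`decode` methods and `cfg.d_in` attribute are part of the SAELens `SAE` API; the random tensor is only a stand-in for real residual-stream activations from Gemma-3-1b):

```python
import torch
from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained(
    "chanind/gemma-3-1b-batch-topk-matryoshka-saes-w-32k-l0-40",
    "blocks.12.hook_resid_post",  # any SAE ID from the table above
)

# Stand-in for a batch of residual-stream activations at layer 12.
acts = torch.randn(8, sae.cfg.d_in)

latents = sae.encode(acts)   # sparse codes, shape (8, 32768); ~40 active per row
recon = sae.decode(latents)  # reconstruction, shape (8, sae.cfg.d_in)
```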