tylerachang committed on
Commit 3a5221b · verified · 1 Parent(s): a859d82

Upload README.md with huggingface_hub

Files changed (1): README.md (+34, -0)
README.md ADDED

---
license: apache-2.0
language:
- eng
---

# bigram-subnetworks-pythia-1b
We release bigram subnetworks as described in [Chang and Bergen (2025)](https://tylerachang.github.io/).
These are sparse subsets of model parameters that recreate bigram predictions (next-token predictions conditioned only on the current token) in Transformer language models.
This repository contains the bigram subnetwork for [EleutherAI/pythia-1b](https://huggingface.co/EleutherAI/pythia-1b).

## Format

A subnetwork file is a pickled Python dictionary that maps each original model parameter name to a numpy binary mask with the same shape as the corresponding parameter (1: keep, 0: drop).
For details on usage, see: https://github.com/tylerachang/bigram-subnetworks.
For details on how these subnetworks were trained, see the paper linked above.

For minimal usage, download the code at https://github.com/tylerachang/bigram-subnetworks (or just the file `circuit_loading_utils.py`) and run in Python:
```python
from circuit_loading_utils import load_bigram_subnetwork_dict, load_subnetwork_model
mask_dict = load_bigram_subnetwork_dict('EleutherAI/pythia-1b')
model, tokenizer, config = load_subnetwork_model('EleutherAI/pythia-1b', mask_dict)
```

## Citation
<pre>
@article{chang-bergen-2025-bigram,
  title={Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models},
  author={Chang, Tyler A. and Bergen, Benjamin K.},
  journal={Preprint},
  year={2025},
}
</pre>