schaeff committed on
Commit a258416 · verified · 1 Parent(s): 40cdc23

Update README.md

Files changed (1)
  1. README.md +12 -4
README.md CHANGED
@@ -59,8 +59,16 @@ This model is part of a collection of LayerNorm-free models. The table below pro
 
 ## Citation
 
-Title: *Transformers Don’t Need LayerNorm at Inference Time: Scaling LayerNorm Removal to GPT-2 XL and the Implications for Mechanistic Interpretability*
+If you have found our work useful please cite as:
 
-**BibTeX:**
-
-[TBD]
+```
+@misc{gpt2layernorm2025,
+author = {Baroni, Luca and Khara, Galvin and Schaeffer, Joachim and Subkhankulov, Marat and Heimersheim, Stefan},
+title = {Transformers Don't Need LayerNorm at Inference Time: Scaling LayerNorm Removal to GPT-2 XL and the Implications for Mechanistic Interpretability},
+year = {2025},
+eprint = {2507.02559},
+archivePrefix = {arXiv},
+primaryClass = {cs.LG},
+url = {https://arxiv.org/abs/2507.02559v1}
+}
+```