JunxiongWang committed on
Commit
8ce687d
·
verified ·
1 Parent(s): dcf65f8

Update README.md

  1. README.md +11 -0
README.md CHANGED
@@ -58,3 +58,14 @@ The following hyperparameters were used during training:
  - Pytorch 2.1.0+cu118
  - Datasets 2.20.0
  - Tokenizers 0.19.1
+
+ [MambaInLlama](arxiv.org/abs/2408.15237)
+
+ ```
+ @article{junxiongdaniele2024mambainllama,
+ title = {The Mamba in the Llama: Distilling and Accelerating Hybrid Models},
+ author = {Junxiong Wang and Daniele Paliotta and Avner May and Alexander M. Rush and Tri Dao},
+ journal = {arXiv preprint arXiv:2408.15237},
+ year = {2024}
+ }
+ ```