JunxiongWang committed
Commit e4e53e3
1 Parent(s): 6c2c854

Update README.md

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -76,3 +76,14 @@ The following hyperparameters were used during training:
  - Pytorch 2.1.1+cu118
  - Datasets 2.20.0
  - Tokenizers 0.19.1
+
+ [MambaInLlama](https://arxiv.org/abs/2408.15237)
+
+ ```
+ @article{junxiongdaniele2024mambainllama,
+   title   = {The Mamba in the Llama: Distilling and Accelerating Hybrid Models},
+   author  = {Junxiong Wang and Daniele Paliotta and Avner May and Alexander M. Rush and Tri Dao},
+   journal = {arXiv preprint arXiv:2408.15237},
+   year    = {2024}
+ }
+ ```