Update README.md
README.md
CHANGED
@@ -23,7 +23,7 @@ Model was trained by Auri.
 
 **Training notes**
 
-Model was trained on a dataset consisting of 77M tokens of synthetic RP and short story gen data for one epoch. Training took around 11 hours on 2xRTX 3090 workstation, generously provided by [OwenArli](https://huggingface.co/OwenArli). Went with some sane defaults for training config, QLoRA plus CCE
+Model was trained on a dataset consisting of 77M tokens of synthetic RP and short story gen data for one epoch. Training took around 11 hours on 2xRTX 3090 workstation, generously provided by [OwenArli](https://huggingface.co/OwenArli). Went with some sane defaults for training config, QLoRA plus CCE for a nice chunk of memory usage optimization, 16k fit on 48GB nicely with some room to spare. I seem to have a problem with Eval/Loss being broken, not sure why, otherwise it trained smoothly.
 
 Huge thanks to [ArliAI](https://www.arliai.com/) for providing compute and collaborating on this run!
 
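
For a rough sense of the throughput the training notes imply, here is a back-of-the-envelope sketch. It treats the quoted figures (77M tokens, one epoch, around 11 hours, 2 GPUs) as exact, and the per-GPU number simply divides by the GPU count, which is an assumption about how the work was split. The function name `tokens_per_second` is illustrative and does not come from the README.

```python
def tokens_per_second(total_tokens: float, hours: float, num_gpus: int = 1) -> float:
    """Average throughput in tokens per second, divided evenly across num_gpus."""
    if hours <= 0 or num_gpus <= 0:
        raise ValueError("hours and num_gpus must be positive")
    return total_tokens / (hours * 3600) / num_gpus


# Figures quoted in the training notes, treated as exact for this estimate.
TOTAL_TOKENS = 77e6  # 77M tokens, one epoch
HOURS = 11           # "around 11 hours"
NUM_GPUS = 2         # 2x RTX 3090

overall = tokens_per_second(TOTAL_TOKENS, HOURS)
per_gpu = tokens_per_second(TOTAL_TOKENS, HOURS, NUM_GPUS)  # assumes an even split

print(f"overall: {overall:.0f} tokens/s")  # overall: 1944 tokens/s
print(f"per GPU: {per_gpu:.0f} tokens/s")  # per GPU: 972 tokens/s
```

Since the run time is only given as "around 11 hours", these numbers are best read as an order-of-magnitude estimate rather than a benchmark.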