AuriAetherwiing committed · verified · Commit 3251a81 · Parent(s): 2c65aea

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED

@@ -23,7 +23,7 @@ Model was trained by Auri.

  **Training notes**

- Model was trained on a dataset consisting of 77M tokens of synthetic RP and short story gen data for one epoch. Training took around 11 hours on 2xRTX 3090 workstation, generously provided by [OwenArli](https://huggingface.co/OwenArli). Went with some sane defaults for training config, QLoRA plus CCE and sequence parallelism for nice chunk of memory usage optimization, 16k fit on 48GB nicely with some room to spare. I seem to have a problem with Eval/Loss being broken, not sure why, otherwise it trained smoothly.
+ Model was trained on a dataset consisting of 77M tokens of synthetic RP and short story gen data for one epoch. Training took around 11 hours on 2xRTX 3090 workstation, generously provided by [OwenArli](https://huggingface.co/OwenArli). Went with some sane defaults for training config, QLoRA plus CCE for a nice chunk of memory usage optimization, 16k fit on 48GB nicely with some room to spare. I seem to have a problem with Eval/Loss being broken, not sure why, otherwise it trained smoothly.

  Huge thanks to [ArliAI](https://www.arliai.com/) for providing compute and collaborating on this run!
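The training notes describe a QLoRA run with Cut Cross-Entropy (CCE) at 16k context on 2x RTX 3090 (48GB total). A setup like that could be sketched as an axolotl-style config; everything below is an illustrative assumption (base model choice, LoRA rank, batch sizes, learning rate), not the author's actual training config:

```yaml
# Hypothetical axolotl-style config sketching the described run:
# QLoRA + Cut Cross-Entropy, 16k sequence length, one epoch.
# All values are assumptions for illustration.
base_model: THUDM/glm-4-9b-chat     # assumed GLM4 checkpoint; substitute the real base

# QLoRA: 4-bit quantized base weights with LoRA adapters on linear layers
load_in_4bit: true
adapter: qlora
lora_r: 64
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true

# CCE via the axolotl plugin: computes cross-entropy without
# materializing the full logits tensor, cutting peak memory
plugins:
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true

sequence_len: 16384                 # "16k fit on 48GB nicely with some room to spare"
sample_packing: true
micro_batch_size: 1
gradient_accumulation_steps: 8
num_epochs: 1                       # one epoch over ~77M tokens

optimizer: adamw_torch
learning_rate: 1e-4
bf16: true
gradient_checkpointing: true
```

Gradient checkpointing plus the 4-bit base and CCE are the main levers that make a 16k-context fine-tune of a model this size plausible within 48GB; the LoRA and optimizer hyperparameters are just placeholder "sane defaults" in the spirit of the notes.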