arshiaafshani committed
Commit 19a2aef · verified · 1 Parent(s): ac22166

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -4,7 +4,7 @@ license: mit
  ---
  This is a Llama-architecture model with 500M parameters, created to generate code, text, and stories. It was pretrained for about 35 hours in total on fairly small datasets using a T4 GPU.
  After that, I spent about 5 hours training the model on the ShareGPT instruction chat template.
- I got a training loss of 1.8 ~ 2.1 after training, and it can go lower with more training. This model has great potential to be comparable with similar models (if it is trained further).
+ I got a training loss of 1.2 ~ 1.9 after training, and it can go lower with more training. This model has great potential to be comparable with similar models (if it is trained further).
  This model shouldn't be used as a project by itself; it must first be trained on larger datasets and then post-trained on conversational datasets.
  **I will do it soon!**