Update README.md
README.md CHANGED
@@ -4,7 +4,7 @@ license: mit
 ---
 This model is a Llama-architecture model with about 500M parameters, created to generate code, text, and stories. It was pretrained for roughly 35 hours on fairly small datasets using a T4 GPU.
 After that, I spent about 5 hours training the model on a ShareGPT-structured chat template.
-I've got 1.
+I got a training loss between 1.2 and 1.9, which could go lower with more training. This model has great potential compared to similar models (if it gets trained further).
 This model shouldn't be used as a project by itself; it should first be trained on larger datasets, then post-trained on conversational datasets.
 **I will do it soon!**
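The card quotes roughly 500M parameters. As a back-of-the-envelope sanity check, a Llama-style parameter count can be sketched as below; the specific hyperparameters (vocabulary size, hidden size, layer count, MLP width) are illustrative assumptions, not values taken from the card.

```python
# Rough parameter count for a Llama-style decoder-only model.
# All hyperparameters below are assumed for illustration; the model
# card only states ~500M total parameters.
vocab_size = 32_000
hidden = 1_280
layers = 24
intermediate = 3_520  # ~2.75x hidden, typical for Llama-style MLPs

embeddings = vocab_size * hidden           # token embeddings (tied lm_head)
attn_per_layer = 4 * hidden * hidden       # q, k, v, o projections
mlp_per_layer = 3 * hidden * intermediate  # gate, up, down projections
norms_per_layer = 2 * hidden               # two RMSNorm weight vectors

per_layer = attn_per_layer + mlp_per_layer + norms_per_layer
total = embeddings + layers * per_layer + hidden  # + final RMSNorm

print(f"{total / 1e6:.0f}M parameters")  # ~523M, in the quoted 500M ballpark
```

With these assumed settings the estimate lands near the stated half-billion mark; the real model's config may of course use different shapes to reach the same total.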