---
library_name: transformers
license: mit
---

This is a Llama-architecture model with 500M parameters, created to generate code, text, and stories. It was pretrained for about 35 hours on fairly small datasets using a T4 GPU. After that, I spent about 5 hours training the model on ShareGPT-style data structured with a chat template. The training loss ended up between 1.2 and 1.9, and it could go lower with more training. With further training, this model has great potential to compete with similar models.

This model should not be used as a finished project: it must first be trained on larger datasets, then post-trained on conversational datasets. **I will do this soon!**

# License

This model is licensed under MIT.
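# Usage

A minimal loading-and-generation sketch with `transformers`. The repo id below is a placeholder (not the model's real path), and the `apply_chat_template` call assumes the uploaded tokenizer ships the ShareGPT-style chat template described above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; replace it with the model's actual path on the Hub.
model_id = "your-username/llama-500m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# ShareGPT-style conversation, rendered through the tokenizer's chat template.
messages = [{"role": "user", "content": "Write a short story about a lonely robot."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.8)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```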