Br-T-GPT-1

This model is a "Fast-Transformer", a standard text-generation model. It can "talk" like ChatGPT, LLaMA, Falcon, etc. The model is open source and free.

Model Details

Model Description

Parameters:

Vocab Size: 67304
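
The card does not state a total parameter count. The sketch below is only an estimate that combines the vocabulary size above with the architecture figures listed under "Speeds, Sizes, Times" (4 blocks, model dimension 256, expansion factor 4, max sequence length 128). Everything else is an assumption rather than something the card confirms: learned positional embeddings, biases on every linear layer, two LayerNorms per block plus a final one, and the choice between tied and untied output embeddings. The function name `estimate_params` is hypothetical.

```python
def estimate_params(vocab_size=67304, d_model=256, n_blocks=4,
                    expansion=4, max_seq_len=128, tie_embeddings=True):
    """Rough parameter count for a GPT-style decoder-only Transformer.

    Assumptions (not confirmed by the model card): learned positional
    embeddings, biases on all linear layers, two LayerNorms per block,
    and one final LayerNorm.
    """
    d_ff = expansion * d_model

    token_emb = vocab_size * d_model
    pos_emb = max_seq_len * d_model

    # Q, K, V and output projections, each d_model x d_model plus a bias.
    attn = 4 * (d_model * d_model + d_model)
    # Two-layer feed-forward network: d_model -> d_ff -> d_model.
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    norms = 2 * (2 * d_model)
    block = attn + mlp + norms

    final_norm = 2 * d_model
    # An untied output projection adds another vocab_size x d_model matrix.
    head = 0 if tie_embeddings else vocab_size * d_model

    return token_emb + pos_emb + n_blocks * block + final_norm + head


if __name__ == "__main__":
    print("tied embeddings:  ", estimate_params(tie_embeddings=True))
    print("untied embeddings:", estimate_params(tie_embeddings=False))
```

Under these assumptions the estimate is about 20 million parameters with tied embeddings and about 38 million with a separate output projection. In both cases the embedding table dominates, because the vocabulary (67304) is large relative to the model dimension (256). The number of attention heads does not change the count.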

  • Developed by: Bertug Gunel
  • Model type: Decoder only Transformer
  • Language(s) (NLP): TR
  • License: CC-BY-NC-ND-4.0

Uses

The model can be used through its .safetensors file. A web GUI is coming soon!

Direct Use

Web GUI is coming soon!

Out-of-Scope Use

  • The model can only be used with Turkish, because more than 95% of the dataset is Turkish.
  • The model generates low-quality answers.

Bias, Risks, and Limitations

Risks: The model may generate political answers. No NSFW sentences were used in training.

Recommendations

Risks: The model may generate political answers. No NSFW sentences were used in training.

How to Get Started with the Model

Please download the .safetensors file first. Code is coming soon!
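
Until the official code is published, the downloaded file can at least be inspected. The sketch below is not the author's loading code. It relies only on the documented safetensors layout (an 8-byte little-endian header length followed by a JSON header) and uses the Python standard library to list tensor names, dtypes, and shapes. The file name `model.safetensors` and both function names are placeholders chosen for illustration.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The header itself is UTF-8 encoded JSON of exactly that length.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_tensors(header):
    """Map each tensor name to (dtype, shape), skipping the metadata entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }


if __name__ == "__main__":
    # Placeholder path: replace with the actual downloaded file name.
    path = Path("model.safetensors")
    if path.exists():
        summary = summarize_tensors(read_safetensors_header(path))
        for name, (dtype, shape) in summary.items():
            print(f"{name}: {dtype} {shape}")
    else:
        print(f"{path} not found; download the .safetensors file first.")
```

Reading only the header avoids loading the weights into memory, so it is a cheap way to check the layer names and shapes against the architecture described in this card.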

Training Details

Training Data

The model was trained on data that is more than 90% Turkish (Türkçe), but the data also contains many Japanese, English, and Arabic words, names, places, etc.

Training details:

  • Number of epochs: 1
  • Number of iterations: 2167
  • Training time: 3 minutes 41 seconds (221 seconds)
  • Training device: 1x T4 GPU (Google Colab)

Training Procedure

1x T4 GPU, used for 3 minutes and 50+ seconds (about 4 minutes).

Training Hyperparameters

Training Parameters:

  • Learning rate: 1e-4
  • Epochs: 1
  • Batch size: 16 (question-answer pairs)
  • Time: 3 minutes 41 seconds
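
As a quick sanity check, the reported figures can be combined into throughput numbers. The sketch below assumes that one iteration means one optimizer step on one batch of 16 question-answer pairs, which the card does not state explicitly. The function name `training_stats` is hypothetical, and the iteration count (2167) and duration (221 seconds) are taken from the training details reported in this card.

```python
def training_stats(iterations, batch_size, seconds):
    """Derive simple throughput figures from reported training numbers.

    Assumption: each iteration processes exactly one batch, so
    iterations * batch_size is an upper bound on the pairs seen
    (the final batch of an epoch may be smaller).
    """
    examples = iterations * batch_size
    return {
        "max_examples_seen": examples,
        "iterations_per_second": iterations / seconds,
        "examples_per_second": examples / seconds,
    }


# Values reported in this card: 2167 iterations, batch size 16, 221 seconds.
stats = training_stats(iterations=2167, batch_size=16, seconds=221)

if __name__ == "__main__":
    for key, value in stats.items():
        print(f"{key}: {value:.1f}")
```

With one epoch, this puts the dataset at no more than 34,672 question-answer pairs, processed at roughly 9.8 iterations per second.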

Speeds, Sizes, Times [optional]

Model Size:

  • Transformer blocks: 4
  • Attention heads: 4 per block (16 in total)
  • Expansion factor: 4x
  • Model dimension (d_model): 256
  • Max sequence length: 128

Testing Data, Factors & Metrics

Testing Data

Test results (MMLU, MMMU, HumanEval, AIME, Humanity's Last Exam, etc.) are coming soon!

Results

Coming soon!

Summary

The model can be used for generating low-quality sentences.

Environmental Impact

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

Masked, decoder-only Transformer (like ChatGPT)

Compute Infrastructure

Hardware

Software

