|
--- |
|
language: |
|
- hi |
|
tags: |
|
- hindi |
|
- text-generation |
|
- causal-lm |
|
- lm |
|
license: mit |
|
datasets: |
|
- custom_hindi_corpus |
|
--- |
|
|
|
# Hindi-CausalLM |
|
|
|
Hindi-CausalLM is a causal language model for Hindi text generation with the following specifications: |
|
|
|
## Model Architecture |
|
- **Type**: Causal Language Model with Transformer architecture |
|
- **Hidden size**: 768 |
|
- **Layers**: 12 |
|
- **Attention heads**: 16 |
|
- **Key-value heads**: 4 (using grouped-query attention) |
|
- **Vocabulary size**: 16000 |
|
- **Parameters**: ~74.1M |
|
- **Context window**: 512 tokens |
|
- **Trained on**: Large corpus of Hindi text |
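
The attention figures above imply a few derived shapes that are useful when loading or re-implementing the model. The sketch below only does arithmetic on the listed values. It assumes the common convention that the per-head dimension equals the hidden size divided by the number of attention heads, which this card does not state explicitly.

```python
# Values taken from the specification list above.
hidden_size = 768
num_layers = 12
num_attention_heads = 16
num_key_value_heads = 4
context_window = 512

# Assumption: head_dim = hidden_size / num_attention_heads (the usual convention).
head_dim = hidden_size // num_attention_heads

# With grouped-query attention, several query heads share one key/value head.
queries_per_kv_head = num_attention_heads // num_key_value_heads

# Output widths of the query and key/value projections.
q_proj_width = num_attention_heads * head_dim
kv_proj_width = num_key_value_heads * head_dim

# Elements held in the key/value cache for one full-length sequence
# (keys and values, for every layer), compared with standard multi-head attention.
gqa_cache_elements = 2 * num_layers * context_window * kv_proj_width
mha_cache_elements = 2 * num_layers * context_window * q_proj_width

print(f"head_dim={head_dim}, queries per KV head={queries_per_kv_head}")
print(f"q width={q_proj_width}, kv width={kv_proj_width}")
print(f"KV cache reduction vs. full multi-head attention: "
      f"{mha_cache_elements // gqa_cache_elements}x")
```

Under that assumption, each key/value head serves four query heads, so the key/value cache is a quarter of what standard multi-head attention would need at the same context length.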
|
|
|
## Training |
|
|
|
The model was trained on a large corpus of Hindi text using a cosine learning rate schedule with warmup. Training used mixed precision and distributed data parallelism across multiple GPUs. |
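
As an illustration of the schedule named above, here is a minimal sketch of a cosine learning rate schedule with linear warmup. The function name `lr_at_step` and the peak rate, warmup length, and step count in the usage lines are hypothetical values chosen for the example, not the settings used to train this model.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Learning rate at a zero-based training step.

    Linear warmup up to peak_lr, then cosine decay down to min_lr.
    """
    if step < warmup_steps:
        # Linear warmup: reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # The cosine factor falls from 1 to 0 as progress goes from 0 to 1.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


# Hypothetical settings, for illustration only.
for step in (0, 99, 100, 550, 1000):
    print(step, lr_at_step(step, total_steps=1000, warmup_steps=100, peak_lr=3e-4))
```

The warmup phase avoids large updates while optimizer statistics are still unreliable, and the cosine phase lowers the rate smoothly toward the end of training.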
|
|
|
## Capabilities |
|
|
|
This model can: |
|
- Generate coherent Hindi text |
|
- Continue text from a given prompt |
|
- Create stories, explanations, and other content in Hindi |
|
|
|
## Limitations |
|
|
|
- Performance varies based on the similarity of the input to the training data |
|
- May occasionally generate repetitive content for longer texts |
|
- May produce grammatically incorrect Hindi in some contexts |
|
- Has no knowledge of events beyond its training corpus |
|
|
|
## Intended Use |
|
|
|
This model is intended for Hindi text generation, creative writing assistance, and use as a foundation for fine-tuning on specific tasks. |
|
|
|
## Ethical Considerations |
|
|
|
Users should be aware that, like all language models, this model may reproduce biases or generate problematic content in certain contexts. |
|
|