---
language:
- hi
tags:
- hindi
- text-generation
- causal-lm
- lm
license: mit
datasets:
- custom_hindi_corpus
---
# Hindi-CausalLM
Hindi-CausalLM is a causal language model for generating Hindi text. Its specifications are listed below.
## Model Architecture
- **Type**: Causal Language Model with Transformer architecture
- **Hidden size**: 768
- **Layers**: 12
- **Attention heads**: 16
- **Key-value heads**: 4 (using grouped-query attention)
- **Vocabulary size**: 16000
- **Parameters**: ~74.1M
- **Context window**: 512 tokens
- **Trained on**: Large corpus of Hindi text
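To make the grouped-query attention figures concrete, the following sketch derives a few quantities from the numbers listed above. It assumes the per-head dimension equals the hidden size divided by the number of attention heads and that the projections carry no bias terms; neither detail is stated in this card, so treat the results as illustrative arithmetic rather than an exact breakdown of the ~74.1M parameters.

```python
# Values taken from the specification list above.
HIDDEN_SIZE = 768
NUM_LAYERS = 12
NUM_QUERY_HEADS = 16
NUM_KV_HEADS = 4
VOCAB_SIZE = 16_000

# Assumption (not stated in the card): head_dim = hidden_size / attention heads.
HEAD_DIM = HIDDEN_SIZE // NUM_QUERY_HEADS


def attention_projection_params(num_kv_heads: int) -> int:
    """Weights in the Q, K, V and output projections of one layer (biases ignored)."""
    q_proj = HIDDEN_SIZE * NUM_QUERY_HEADS * HEAD_DIM
    k_proj = HIDDEN_SIZE * num_kv_heads * HEAD_DIM
    v_proj = HIDDEN_SIZE * num_kv_heads * HEAD_DIM
    o_proj = NUM_QUERY_HEADS * HEAD_DIM * HIDDEN_SIZE
    return q_proj + k_proj + v_proj + o_proj


def kv_cache_values_per_token(num_kv_heads: int) -> int:
    """Key and value entries cached per generated token, summed over all layers."""
    return 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM


# Each key-value head is shared by this many query heads.
queries_per_kv_head = NUM_QUERY_HEADS // NUM_KV_HEADS

gqa_params = attention_projection_params(NUM_KV_HEADS)
# Hypothetical baseline: standard multi-head attention with 16 key-value heads.
mha_params = attention_projection_params(NUM_QUERY_HEADS)
embedding_params = VOCAB_SIZE * HIDDEN_SIZE

print(f"head_dim: {HEAD_DIM}")
print(f"query heads per key-value head: {queries_per_kv_head}")
print(f"attention weights per layer (4 KV heads): {gqa_params:,}")
print(f"attention weights per layer (16 KV heads): {mha_params:,}")
print(f"token embedding weights: {embedding_params:,}")
print(f"KV cache values per token (4 KV heads): {kv_cache_values_per_token(NUM_KV_HEADS):,}")
print(f"KV cache values per token (16 KV heads): {kv_cache_values_per_token(NUM_QUERY_HEADS):,}")
```

Under these assumptions, sharing each key-value head across four query heads cuts the key-value cache to a quarter of what 16 key-value heads would need, which is the usual motivation for grouped-query attention.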
## Training
The model was trained on a large corpus of Hindi text using a cosine learning rate schedule with warmup. Training used mixed precision and distributed data parallelism across multiple GPUs.
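A cosine schedule with warmup can be sketched as below. This is a minimal illustration of the general technique, assuming a linear warmup; the function name `lr_at_step` and every numeric value (peak learning rate, warmup length, total steps) are hypothetical, since this card does not report the actual training hyperparameters.

```python
import math


def lr_at_step(step: int, *, peak_lr: float, warmup_steps: int,
               total_steps: int, min_lr: float = 0.0) -> float:
    """Learning rate at a given optimizer step (0-indexed)."""
    if step < warmup_steps:
        # Linear warmup: ramp up to peak_lr over the first warmup_steps steps.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine decay from peak_lr down to min_lr.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


# Hypothetical settings, chosen only to show the shape of the curve.
settings = dict(peak_lr=3e-4, warmup_steps=100, total_steps=1000)
for step in (0, 50, 99, 100, 550, 1000):
    print(step, lr_at_step(step, **settings))
```

The rate rises linearly during warmup, reaches its peak at the last warmup step, and then follows half a cosine period down to `min_lr` at `total_steps`.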
## Capabilities
This model can:
- Generate coherent Hindi text
- Continue text from a given prompt
- Create stories, explanations, and other content in Hindi
## Limitations
- Performance varies based on the similarity of the input to the training data
- May occasionally generate repetitive content in longer outputs
- May produce grammatically incorrect Hindi in some contexts
- Has no knowledge of events beyond its training corpus
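One rough way to spot the repetition issue noted above is to measure the share of distinct n-grams in a generated text (often called distinct-n). The sketch below is only an illustration: it splits on whitespace rather than using the model's own tokenizer, the sample strings are made up, and any threshold for "too repetitive" would have to be chosen empirically.

```python
def distinct_n(tokens: list[str], n: int = 2) -> float:
    """Fraction of n-grams that are unique; values near 0 indicate heavy repetition."""
    if len(tokens) < n:
        # Too short to contain any n-gram, so nothing can repeat.
        return 1.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams)


# Made-up sample outputs; whitespace splitting is an assumption for illustration.
looping = "यह एक परीक्षण है यह एक परीक्षण है यह एक परीक्षण है".split()
varied = "आज मौसम बहुत सुहावना है".split()

print(distinct_n(looping))  # low score: the same phrase repeats three times
print(distinct_n(varied))   # every bigram is unique
```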
## Intended Use
This model is intended for Hindi language generation tasks, creative writing assistance, and as a foundation for fine-tuning on specific tasks.
## Ethical Considerations
Users should be aware that, like all language models, this model may reproduce biases or generate problematic content in certain contexts.