metadata
language:
- hi
tags:
- hindi
- text-generation
- causal-lm
- lm
license: mit
datasets:
- custom_hindi_corpus
Hindi-CausalLM
A Hindi language generation model with the following specifications:
Model Architecture
- Type: Causal Language Model with Transformer architecture
- Hidden size: 768
- Layers: 12
- Attention heads: 16
- Key-value heads: 4 (using grouped-query attention)
- Vocabulary size: 16000
- Parameters: ~74.1M
- Context window: 512 tokens
- Trained on: Large corpus of Hindi text
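The attention figures listed above imply a few derived quantities. The following is a minimal sketch that works them out, assuming bias-free query, key, value, and output projections; that layout is common but is not confirmed by this card, and the function name `attention_shape` is hypothetical.

```python
def attention_shape(hidden_size=768, num_heads=16, num_kv_heads=4):
    """Derive grouped-query attention dimensions from the listed specs.

    Assumes bias-free Q/K/V/O projections. This is an assumption made
    for illustration, not something the model card states.
    """
    if hidden_size % num_heads or num_heads % num_kv_heads:
        raise ValueError("hidden_size, num_heads and num_kv_heads must divide evenly")

    head_dim = hidden_size // num_heads          # width of one attention head
    queries_per_kv = num_heads // num_kv_heads   # query heads sharing one KV head
    kv_width = num_kv_heads * head_dim           # output width of the K and V projections

    # Q and O projections are hidden x hidden; K and V are hidden x kv_width.
    attn_params = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_width

    return {
        "head_dim": head_dim,
        "queries_per_kv_head": queries_per_kv,
        "kv_width": kv_width,
        "attention_params_per_layer": attn_params,
    }


if __name__ == "__main__":
    for name, value in attention_shape().items():
        print(f"{name}: {value}")
```

Under these assumptions each key-value head serves 4 query heads of dimension 48, so the keys and values cached per token take a quarter of the space that 16 full key-value heads would need.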
Training
The model was trained on a large corpus of Hindi text using a cosine learning rate schedule with warmup. Training used mixed precision and distributed data parallelism across multiple GPUs.
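For readers unfamiliar with the schedule, a cosine learning rate schedule with linear warmup can be sketched as follows. The peak learning rate, warmup length, and total step count are placeholder values chosen for illustration; the card does not report the settings actually used, and the function name `lr_at_step` is hypothetical.

```python
import math


def lr_at_step(step, peak_lr=3e-4, warmup_steps=1000, total_steps=100000, min_lr=0.0):
    """Linear warmup followed by cosine decay.

    All default values are placeholders, not the hyperparameters used
    to train this model.
    """
    if step < warmup_steps:
        # Ramp linearly so the last warmup step reaches peak_lr.
        return peak_lr * (step + 1) / warmup_steps

    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)

    # Cosine curve from peak_lr (progress = 0) down to min_lr (progress = 1).
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 500, 1000, 50500, 100000):
        print(s, lr_at_step(s))
```

The warmup phase avoids large updates while optimizer statistics are still unreliable, and the cosine phase lowers the rate smoothly toward `min_lr` by the final step.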
Capabilities
This model can:
- Generate coherent Hindi text
- Continue text from a given prompt
- Create stories, explanations, and other content in Hindi
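When continuing a prompt, the prompt and the generated continuation share the 512-token context window. The sketch below shows one way to trim a tokenized prompt so that both fit; it treats 512 as a hard limit and keeps the most recent tokens, both of which are assumptions rather than documented behaviour. The function name `fit_prompt` is hypothetical, and the token IDs are plain integers rather than output from the model's tokenizer.

```python
def fit_prompt(prompt_ids, max_new_tokens, context_window=512):
    """Trim a tokenized prompt so prompt + continuation fit the context window.

    Keeps the most recent tokens (left truncation). Whether the model
    tolerates longer inputs is not stated in the card, so 512 is
    treated here as a hard limit.
    """
    if not 0 < max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be between 1 and context_window - 1")

    # Tokens left for the prompt once room for the continuation is reserved.
    budget = context_window - max_new_tokens
    return list(prompt_ids[-budget:])


if __name__ == "__main__":
    long_prompt = list(range(600))          # stand-in for 600 token IDs
    trimmed = fit_prompt(long_prompt, max_new_tokens=112)
    print(len(trimmed), trimmed[0])         # number of tokens kept, first kept ID
```

Left truncation keeps the text closest to the point where generation starts, which usually matters most for a continuation; a different policy may suit prompts whose key instructions come first.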
Limitations
- Output quality depends on how closely the input resembles the training data
- May generate repetitive content in longer outputs
- May produce grammatically incorrect Hindi in some contexts
- Has no knowledge of events beyond its training corpus
Intended Use
This model is intended for Hindi text generation, creative writing assistance, and use as a foundation for fine-tuning on specific tasks.
Ethical Considerations
Users should be aware that, like all language models, this model may reproduce biases or generate problematic content in certain contexts.