language:
  - hi
tags:
  - hindi
  - text-generation
  - causal-lm
  - lm
license: mit
datasets:
  - custom_hindi_corpus

Hindi-CausalLM

A Hindi language generation model with the following specifications:

Model Architecture

  • Type: Causal Language Model with Transformer architecture
  • Hidden size: 768
  • Layers: 12
  • Attention heads: 16
  • Key-value heads: 4 (using grouped-query attention)
  • Vocabulary size: 16000
  • Parameters: ~74.1M
  • Context window: 512 tokens
  • Trained on: Large corpus of Hindi text
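To make the attention figures above concrete, the following is a minimal sketch of the arithmetic they imply. It assumes the common convention that the per-head dimension equals hidden size divided by the number of attention heads, which this card does not state explicitly. The class name `AttentionShape` and its methods are hypothetical and are not part of the released model code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttentionShape:
    """Figures copied from the specification list; the class itself is illustrative."""
    hidden_size: int = 768
    num_layers: int = 12
    num_heads: int = 16
    num_kv_heads: int = 4
    context_window: int = 512

    @property
    def head_dim(self) -> int:
        # Assumption: head_dim = hidden_size / num_heads (not stated in the card).
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads

    @property
    def queries_per_kv_head(self) -> int:
        # In grouped-query attention, several query heads share one key-value head.
        if self.num_heads % self.num_kv_heads != 0:
            raise ValueError("num_heads must be divisible by num_kv_heads")
        return self.num_heads // self.num_kv_heads

    def kv_cache_elements(self, kv_heads: Optional[int] = None) -> int:
        # Keys plus values, for every layer and every position of a full context.
        heads = self.num_kv_heads if kv_heads is None else kv_heads
        return 2 * self.num_layers * self.context_window * heads * self.head_dim


shape = AttentionShape()
gqa_cache = shape.kv_cache_elements()
mha_cache = shape.kv_cache_elements(kv_heads=shape.num_heads)
print(shape.head_dim, shape.queries_per_kv_head, mha_cache // gqa_cache)
```

Under these assumptions, each key-value head serves four query heads, so a full-context key-value cache holds a quarter of the elements that one key-value head per query head would need.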

Training

The model was trained on a large corpus of Hindi text using a cosine learning rate schedule with warmup. Training used mixed precision and distributed data-parallel training across multiple GPUs.
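As an illustration of what a cosine schedule with warmup looks like, here is a minimal sketch. The warmup shape (linear), the function name `cosine_lr_with_warmup`, and every hyperparameter value in the usage lines are assumptions for illustration only; the card does not report the actual settings used in training.

```python
import math


def cosine_lr_with_warmup(step: int, *, peak_lr: float, warmup_steps: int,
                          total_steps: int, min_lr: float = 0.0) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to `min_lr`."""
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must be greater than warmup_steps")
    if step < warmup_steps:
        # Assumed linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to 1.0 past the end.
    progress = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


# Hypothetical values, chosen only to show the shape of the curve.
for s in (0, 50, 100, 550, 1000):
    lr = cosine_lr_with_warmup(s, peak_lr=3e-4, warmup_steps=100, total_steps=1000)
    print(s, lr)
```

The rate rises to its peak at the end of warmup, passes through roughly half the peak midway through the decay phase, and reaches `min_lr` at the final step.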

Capabilities

This model can:

  • Generate coherent Hindi text
  • Continue text from a given prompt
  • Create stories, explanations, and other content in Hindi
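Because the context window is 512 tokens, the prompt and the generated continuation have to share that budget. The sketch below shows one way to trim a tokenized prompt before generation. It operates on plain lists of token IDs and makes no assumption about the tokenizer; the function name `fit_prompt` and the choice to keep the most recent tokens are illustrative, not part of the model's own code.

```python
from typing import List, Sequence

CONTEXT_WINDOW = 512  # from the specification list


def fit_prompt(prompt_ids: Sequence[int], max_new_tokens: int,
               context_window: int = CONTEXT_WINDOW) -> List[int]:
    """Trim a prompt so that prompt plus continuation fit in the context window."""
    if not 0 < max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be between 1 and context_window - 1")
    budget = context_window - max_new_tokens
    # Keep the most recent tokens: for continuation, the end of the prompt
    # usually matters more than its beginning (a design choice, not a rule).
    return list(prompt_ids[-budget:])


long_prompt = list(range(600))           # stand-in for real token IDs
trimmed = fit_prompt(long_prompt, max_new_tokens=112)
print(len(trimmed), trimmed[0], trimmed[-1])
```

Prompts that already fit within the budget are returned unchanged.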

Limitations

  • Output quality depends on how closely the input resembles the training data
  • May occasionally generate repetitive content for longer texts
  • May produce grammatically incorrect Hindi in some contexts
  • Has no knowledge of events beyond its training corpus
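The repetition noted above can be flagged with a simple heuristic. The following sketch measures the fraction of n-grams in a generated text that duplicate an earlier n-gram. Splitting on whitespace is an assumption made for illustration and is not the model's tokenizer, and the function name `repeated_ngram_fraction` is hypothetical.

```python
from typing import Sequence


def repeated_ngram_fraction(tokens: Sequence[str], n: int = 3) -> float:
    """Fraction of n-grams that repeat an earlier one (0.0 means no repetition)."""
    if n <= 0:
        raise ValueError("n must be positive")
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n tokens
    return 1.0 - len(set(ngrams)) / len(ngrams)


# Whitespace splitting is a stand-in for real tokenization.
looping = "यह एक परीक्षण है यह एक परीक्षण है यह एक परीक्षण है".split()
varied = "आज मौसम बहुत अच्छा है".split()
print(repeated_ngram_fraction(looping))
print(repeated_ngram_fraction(varied))
```

A high score on long outputs suggests the generation has fallen into a loop. What threshold counts as "high" depends on the application and would need to be chosen empirically.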

Intended Use

This model is intended for Hindi language generation tasks, creative writing assistance, and as a foundation for fine-tuning on specific tasks.

Ethical Considerations

Users should be aware that, like all language models, this model may reproduce biases or generate problematic content in certain contexts.