Discussion Regarding the model (Important)
#10, opened by UJJAWAL-TYAGI
You guys claim that your base model is Mistral 3.1 with 24B parameters, but Sarvam-M has 23.6B. Did you prune parameters, or did this reduction come from fine-tuning? Did you fine-tune with LoRA? Did you remove adapters or attention heads? Or are you just using system prompts + RAG for language adaptation?