Discussion Regarding the model (Important)

#10

by UJJAWAL-TYAGI - opened 25 days ago

You say your base model is Mistral 3.1 24B, but Sarvam-M has 23.6B parameters. Did you prune parameters? Or did the reduction come from fine-tuning? Did you fine-tune with LoRA? Did you remove adapters or attention heads? Or are you just using system prompts + RAG for language adaptation?
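
For anyone who wants to check where a gap like this comes from, here is a minimal sketch that sums parameter counts per top-level module from a name-to-shape mapping. The function name `count_params_by_prefix` and the toy module names and shapes are hypothetical and do not reflect the real checkpoint layout of either model; the shapes would have to be read from the actual checkpoints.

```python
from collections import defaultdict
from math import prod


def count_params_by_prefix(shapes):
    """Sum parameter counts per top-level module name.

    `shapes` maps tensor names (e.g. "module_a.layer0.weight")
    to shape tuples. Returns {prefix: total_parameter_count}.
    """
    totals = defaultdict(int)
    for name, shape in shapes.items():
        # Everything before the first dot is treated as the module name.
        prefix = name.split(".", 1)[0]
        totals[prefix] += prod(shape)
    return dict(totals)


# Hypothetical toy shapes for illustration only, NOT real model tensors.
toy_shapes = {
    "module_a.embed.weight": (10, 4),
    "module_a.layer0.weight": (4, 4),
    "module_b.proj.weight": (3, 4),
}

totals = count_params_by_prefix(toy_shapes)
for prefix, count in sorted(totals.items()):
    print(f"{prefix}: {count} parameters")
print(f"total: {sum(totals.values())} parameters")
```

Running the same breakdown on both checkpoints and comparing the per-module totals should show whether the 0.4B difference sits in one component or is spread across layers.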
