---
license: mit
datasets:
- Vikhrmodels/GrandMaster-PRO-MAX
language:
- en
- ru
tags:
- mistral
- chat
- conversational
- transformers
inference:
parameters:
temperature: 0
pipeline_tag: text-generation
base_model:
- mistralai/Mistral-Small-3.1-24B-Instruct-2503
- anthracite-core/Mistral-Small-3.1-24B-Instruct-2503-HF
library_name: vllm
---
# Zero-Mistral-Small-3.1-24B-Instruct-2503-beta
Zero-Mistral-Small-3.1 is an improved **text-only** version of [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503), built on the **vision-free** [anthracite-core/Mistral-Small-3.1-24B-Instruct-2503-HF](https://huggingface.co/anthracite-core/Mistral-Small-3.1-24B-Instruct-2503-HF) conversion and adapted primarily for Russian and English.

Training consisted of an SFT stage on the [GrandMaster-PRO-MAX](https://huggingface.co/datasets/Vikhrmodels/GrandMaster-PRO-MAX) dataset.

This is a beta release; benchmarks and further fine-tuning are coming soon.
Current status:
- Trained with `lm_head`
- Train loss: 0.564200
- Eval loss: 0.638504
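
For quick local testing, below is a minimal text-only inference sketch using 🤗 Transformers. This is a sketch under assumptions: the `MODEL_ID` is hypothetical (replace it with this repository's actual id), and the card's `temperature: 0` inference setting is mapped to greedy decoding via `do_sample=False`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id -- replace with this model's actual Hub id.
MODEL_ID = "Zero-Mistral-Small-3.1-24B-Instruct-2503-beta"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The model is adapted for Russian and English; a Russian prompt as an example.
messages = [{"role": "user", "content": "Привет! Расскажи о себе."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# do_sample=False gives greedy decoding, matching temperature=0 in the card.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Since the card lists `library_name: vllm`, the model can presumably also be served with vLLM for production use; the Transformers snippet above is only a convenient way to sanity-check outputs.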