Mistral-Small-3.1-DRAFT-0.5B-exl2
Original model: Mistral-Small-3.1-DRAFT-0.5B by alamios
Based on: Qwen2.5-0.5B by Qwen
Quants
4bpw h6 (main)
5bpw h6
6bpw h6
8bpw h8
Quantization notes
Quantized with Exllamav2 using its default calibration dataset.
These quants are meant to be used as a draft model for 24B Mistral models with the TabbyAPI app.
The 8bpw version with FP16 cache is probably the most reliable option for this purpose.
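As a sketch of how the draft model is wired in, the following is a hypothetical excerpt of TabbyAPI's config.yml; the field names follow recent config_sample.yml files but may differ in your TabbyAPI version, and the model directory names are placeholders:

```yaml
# Hypothetical TabbyAPI config.yml excerpt (check your own
# config_sample.yml -- keys can change between versions).
model:
  model_dir: models
  model_name: Mistral-Small-3.1-24B-Instruct-2503-exl2  # target 24B model

draft_model:
  draft_model_dir: models
  draft_model_name: Mistral-Small-3.1-DRAFT-0.5B-exl2   # this draft model
  # FP16 draft cache: pairing the 8bpw h8 quant with an uncompressed
  # cache is the combination suggested above as most reliable.
  draft_cache_mode: FP16
```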
Original model card
Mistral-Small-3.1-DRAFT-0.5B
This model is meant to be used as a draft model for speculative decoding with mistralai/Mistral-Small-3.1-24B-Instruct-2503 or mistralai/Mistral-Small-24B-Instruct-2501.
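The verify-and-accept loop at the heart of speculative decoding can be sketched with toy next-token functions standing in for the 0.5B draft and 24B target models. This is an illustrative greedy simplification, not TabbyAPI's or Mistral's implementation; all names here are hypothetical:

```python
def speculative_step(draft_next, target_next, context, k=4):
    """One greedy speculative-decoding step.

    draft_next / target_next: callables mapping a token sequence to the
    next token (toy stand-ins for the small draft and large target models).
    Returns the list of tokens actually emitted this step.
    """
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposal = []
    ctx = list(context)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)

    # 2. The target model verifies the proposals (in practice in one
    #    batched pass; simulated sequentially here): keep the longest
    #    prefix where the target agrees with the draft.
    accepted = []
    ctx = list(context)
    for tok in proposal:
        if target_next(ctx) == tok:
            accepted.append(tok)
            ctx.append(tok)
        else:
            break

    # 3. At the first mismatch (or after all k accepts) the target
    #    contributes one token of its own, so progress is always >= 1
    #    token per expensive target pass -- the source of the speedup.
    accepted.append(target_next(ctx))
    return accepted
```

A draft model that agrees often with the target (as this one is trained to, on the target's own outputs) makes step 2 accept long prefixes, amortizing the 24B model's cost over several tokens.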
Data info
The data consists of Mistral's outputs and includes all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.