This model was generated using DWQ quantization to bring the quality of the 4-bit quantization closer to 8-bit without increasing the model size. It was produced with mlx-lm version 0.26.3, using `--bits 4 --learning-rate 1e-7 --batch-size 1 --group-size 16`.
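
The quantized weights can be loaded with the standard mlx-lm Python API. A minimal usage sketch follows; the prompt is an arbitrary example, not part of the original card:

```python
# Requires: pip install mlx-lm
from mlx_lm import load, generate

# Load the DWQ-quantized model and its tokenizer from the Hugging Face Hub.
model, tokenizer = load("michaellin/Qwen2.5-Coder-7B-4bit-mlx-dwq-lr1e-7")

prompt = "Write a Python function that reverses a string."

# Apply the chat template if one is defined, so the model sees the
# conversation format it was trained with.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```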

Model size: 1.19B params (Safetensors)
Tensor types: BF16, U32

Model tree for michaellin/Qwen2.5-Coder-7B-4bit-mlx-dwq-lr1e-7

Base model: Qwen/Qwen2.5-7B (quantized to produce this model)