This model was generated using DWQ quantization to bring the quality of the 4-bit quantization closer to 8-bit without increasing the model size. It was produced with mlx-lm version 0.26.3, using `--bits 4 --learning-rate 1e-7 --batch-size 1 --group-size 16`.
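
For reference, a minimal sketch of the kind of command that could reproduce this quantization, assuming the `mlx_lm.dwq` entry point from mlx-lm 0.26.3 and a hypothetical source model path (`<source-model>` and `<output-path>` are placeholders, not the actual repos used):

```bash
# Sketch only: DWQ quantization with the parameters listed above.
# <source-model> and <output-path> are hypothetical placeholders.
mlx_lm.dwq \
    --model <source-model> \
    --mlx-path <output-path> \
    --bits 4 \
    --group-size 16 \
    --learning-rate 1e-7 \
    --batch-size 1
```

The resulting model can then be loaded and run with the standard mlx-lm tooling (e.g. `mlx_lm.load` / `mlx_lm.generate`).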