
A 0.5B parameter draft (speculative decoding) model for use with deepseek-ai/DeepSeek-V3-0324.

See jukofyork/DeepSeek-V3-0324-DRAFT-0.5B-v1.0 for the non-GGUF version and for a detailed explanation of how the model was created.
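
As a sketch of how a draft model like this is typically used, here is a hedged llama.cpp invocation (the quant file names and draft settings below are assumptions for illustration, not taken from this card): the large model generates the output, while the 0.5B draft model proposes tokens that the large model verifies in parallel.

```shell
# -m   target model: the large DeepSeek-V3-0324 GGUF (output is kept from this model)
# -md  draft model: this 0.5B GGUF, which speculatively proposes tokens
# --draft-max / --draft-min  bound how many tokens are drafted per step
./llama-server \
  -m  DeepSeek-V3-0324-Q4_K_M.gguf \
  -md DeepSeek-V3-0324-DRAFT-0.5B-v1.0-Q4_0.gguf \
  --draft-max 16 --draft-min 1
```

Because the draft model only proposes tokens and the target model verifies them, the output distribution is unchanged; the draft model's quality affects speed, not correctness.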

[Quant tables: quants without imatrix; quants with imatrix]

See DeepSeek-R1-DRAFT-0.5B-v1.0-GGUF for detailed PPL statistics and recommendations on which quant to use, etc.

I have included the imatrix file used to generate the Q4_0-Q6_K quants, along with the 1MB sample of the fine-tuning data used to create it.
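For reference, an imatrix file like the included one can be produced and applied with llama.cpp's tools; this is a minimal sketch, and the file names below are assumptions, not the actual files shipped with this repo:

```shell
# Compute an importance matrix from a calibration text sample
./llama-imatrix \
  -m DeepSeek-V3-0324-DRAFT-0.5B-v1.0-F16.gguf \
  -f calibration-sample.txt \
  -o imatrix.dat

# Use it when quantizing, so low-bit quants weight important tensors better
./llama-quantize --imatrix imatrix.dat \
  DeepSeek-V3-0324-DRAFT-0.5B-v1.0-F16.gguf \
  DeepSeek-V3-0324-DRAFT-0.5B-v1.0-Q4_0.gguf Q4_0
```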

Format: GGUF
Model size: 501M params
Architecture: qwen2

Quants available: 4-bit, 5-bit, 6-bit, 8-bit, 16-bit


Base model: Qwen/Qwen2.5-0.5B