Quantized Qwen2.5-pyEvaluator-3B-GGUF-Q4_K_M Model (GGUF Format)

This is a quantized, fine-tuned version of the Qwen2.5-Coder-3B model, converted to GGUF format and quantized to Q4_K_M using llama.cpp. The model was fine-tuned on a dataset built specifically for evaluating Python code. It is optimized for efficient inference on CPU-based systems.

Model Details

  • Model Name: Qwen2.5-pyEvaluator-3B-GGUF-Q4_K_M
  • Base Model: Qwen2.5-Coder-3B
  • Architecture: qwen2
  • Parameters: 3.09B
  • Quantization Type: Q4_K_M
  • Framework: llama.cpp
  • Model Size: 1.79 GB
  • Author: Moiz2517

Quantization Details

This model was quantized using llama.cpp with the following settings:

  • Quantization method: Q4_K_M
  • Bits: 4-bit

Q4_K_M is a 4-bit k-quant variant that substantially reduces the model's disk and memory footprint while keeping output quality close to the full-precision weights, making it a good default trade-off for CPU inference.
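
For reference, the sketch below shows how such a conversion and quantization is typically done with llama.cpp. These are not the author's exact commands: the checkpoint directory and output file names are placeholders, and the script and binary names can differ between llama.cpp versions.

# Convert the fine-tuned Hugging Face checkpoint to a 16-bit GGUF file
python convert_hf_to_gguf.py ./Qwen2.5-pyEvaluator-3B --outfile qwen2.5-pyevaluator-3b-f16.gguf

# Quantize the 16-bit GGUF down to Q4_K_M
./llama-quantize qwen2.5-pyevaluator-3b-f16.gguf qwen2.5-pyevaluator-3b-Q4_K_M.gguf Q4_K_M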

Usage

Using Ollama

You can run this model with Ollama by pulling the GGUF file directly from the Hugging Face Hub:

ollama run hf.co/Moiz2517/Qwen2.5-pyEvaluator-3B-GGUF-Q4_K_M
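
Using llama.cpp

If you prefer llama.cpp directly, a minimal sketch is shown below. It assumes you have built llama.cpp and downloaded the GGUF file locally; the file name and the prompt are illustrative placeholders, since the card does not document the exact prompt format used during fine-tuning.

# Ask the model to review a (deliberately buggy) Python function
./llama-cli -m Qwen2.5-pyEvaluator-3B-GGUF-Q4_K_M.gguf \
  -p "Evaluate the following Python code and report any bugs: def add(a, b): return a - b" \
  -n 256 --temp 0.2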
