Qwen 3 4B Multilingual Quantized Models

This repository contains quantized GGUF versions of the agentlans/Qwen3-4B-multilingual-sft model, optimized for efficient local inference with llama.cpp.

The models were quantized using an unofficial Docker image and calibrated on the first 100 rows of the LinguaNova dataset to maintain strong multilingual performance.

These quantized models share the same strengths and limitations as the original Qwen 3 4B multilingual model. They offer a lighter, faster alternative for inference with minor trade-offs in precision.
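
For local inference, any of the GGUF files can be loaded with llama.cpp or its Python bindings. The sketch below uses llama-cpp-python and assumes a 4-bit file has already been downloaded; the filename is an assumption and may differ from the files actually published in this repo.

```python
# Minimal sketch: run the quantized model with llama-cpp-python.
# Assumes `pip install llama-cpp-python` and a GGUF file already on disk.
# The filename is an assumption; check this repo for the real file names.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3-4B-multilingual-sft-Q4_K_M.gguf",  # hypothetical filename
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers to GPU if available; use 0 for CPU-only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Translate into French: The library opens at nine."}
    ],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```

The same files also work directly with the llama.cpp command-line tools.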

Format: GGUF
Model size: 4.02B parameters
Architecture: qwen3

Available quantizations: 4-bit, 5-bit, and 8-bit GGUF files.
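
To fetch a specific quantization level from the Hub, huggingface_hub can be used as sketched below. The filename is an assumption based on common GGUF naming conventions; check the repo's file listing for the actual names.

```python
# Minimal sketch: download one quantization level of this repo from the Hub.
# Assumes `pip install huggingface_hub`. The filename is an assumption;
# consult the repo's file listing for the real one.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="agentlans/Qwen3-4B-multilingual-sft-GGUF",
    filename="Qwen3-4B-multilingual-sft-Q8_0.gguf",  # hypothetical 8-bit file
)
print("Downloaded to:", local_path)
```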


Model tree for agentlans/Qwen3-4B-multilingual-sft-GGUF
Base model: Qwen/Qwen3-4B-Base
Fine-tuned: Qwen/Qwen3-4B
Quantized: this model
