Ameena Qwen3-8B e3 Quantized GGUF

This is a quantized version of a fine-tuned Qwen3-8B model, optimized for efficient inference.

Model Details

  • Base Model: Qwen/Qwen3-8B
  • Architecture: qwen3
  • Parameters: 8.19B
  • Quantization: Q4_K_M (4-bit with K-quant mixed precision)
  • Original Size: ~15.26 GB
  • Quantized Size: ~4.68 GB
  • Compression Ratio: ~3.3x
  • Format: GGUF (GPT-Generated Unified Format)

Usage

With llama-cpp-python

from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="Ameena_Qwen3-8B_e3.gguf",
    n_gpu_layers=-1,  # Offload all layers to the GPU; set to 0 for CPU-only
    n_ctx=4096,       # Context window size in tokens
    verbose=False
)

# Generate text; the call returns a completion dict
response = llm(
    "Your prompt here",
    max_tokens=512,
    temperature=0.7,
    top_p=0.9
)
print(response["choices"][0]["text"])
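
For chat-style prompts, llama-cpp-python also exposes an OpenAI-style chat API that applies the chat template stored in the GGUF metadata. A minimal sketch, reusing the llm object from above (the system prompt is illustrative):

# Chat-style generation; the chat template comes from the GGUF metadata
chat = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Your prompt here"},
    ],
    max_tokens=512,
    temperature=0.7,
)
print(chat["choices"][0]["message"]["content"])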

With huggingface_hub + llama.cpp

# Download the model
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="Tohirju/Ameena_Qwen3-8B_e3_Quantised_gguf",
    filename="Ameena_Qwen3-8B_e3.gguf"
)
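
The returned model_path points at the locally cached file and can be passed straight to the loader shown above:

from llama_cpp import Llama

llm = Llama(model_path=model_path, n_gpu_layers=-1, n_ctx=4096)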

Quantization Details

  • Method: Q4_K_M, llama.cpp's mixed-precision 4-bit K-quant scheme (the metadata embedded in the file can be inspected as shown below)
  • Quality: a good trade-off between file size and output quality; Q4_K_M is a commonly recommended default
  • Speed: runs efficiently on both CPU and GPU backends
  • Memory: substantially lower VRAM requirements than the FP16 weights
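
To confirm what a GGUF file actually contains, the gguf Python package from the llama.cpp project can read the embedded metadata. A minimal sketch, assuming pip install gguf:

from gguf import GGUFReader

reader = GGUFReader("Ameena_Qwen3-8B_e3.gguf")
# Print the metadata keys stored in the file (architecture, file type,
# tokenizer settings, and so on)
for key in reader.fields:
    print(key)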

Performance

  • Loading: ~3.3x faster model loading, since the file is ~3.3x smaller (verified in the snippet below)
  • Memory Usage: ~69% lower memory requirements than the FP16 weights
  • Quality: minimal quality loss compared to the FP16 version
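
Both figures follow directly from the sizes listed under Model Details:

original_gb = 15.26    # FP16 GGUF size
quantized_gb = 4.68    # Q4_K_M GGUF size

print(f"compression: {original_gb / quantized_gb:.1f}x")        # ~3.3x
print(f"size reduction: {1 - quantized_gb / original_gb:.0%}")  # ~69%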

Hardware Requirements

  • CPU: Any modern CPU (optimized for x86_64)
  • GPU: CUDA-compatible GPU recommended (RTX 3060 or better); CPU-only inference also works (see the sketch below)
  • RAM: 8GB minimum, 16GB recommended
  • Storage: ~5GB for the model file
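
On machines without a supported GPU, or with limited VRAM, the same file can run entirely on the CPU or with only part of the network offloaded. A minimal sketch; the layer count and thread count are illustrative and should be tuned to your hardware:

from llama_cpp import Llama

# CPU-only: keep all layers in system RAM
llm_cpu = Llama(
    model_path="Ameena_Qwen3-8B_e3.gguf",
    n_gpu_layers=0,
    n_threads=8,   # roughly match your physical core count
    n_ctx=4096,
)

# Partial offload: push some transformer layers to the GPU
llm_mixed = Llama(
    model_path="Ameena_Qwen3-8B_e3.gguf",
    n_gpu_layers=20,
    n_ctx=4096,
)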

License

This model is released under the Apache 2.0 license, inherited from the base Qwen3-8B model.
