---
license: apache-2.0
language:
- en
- zh
- es
base_model:
- shuttleai/shuttle-3.5
tags:
- Qwen
- Shuttle
- GGUF
- 32b
- quantized
- Q8_0
---
# Shuttle 3.5 – Q8_0 GGUF Quant
This repo contains a GGUF quantized version of [ShuttleAI's Shuttle 3.5 model](https://huggingface.co/shuttleai/shuttle-3.5), a high-performance instruction-tuned variant of Qwen 3 32B. This quant was built for efficient local inference without sacrificing quality.
## 🔗 Base Model
- **Original**: [shuttleai/shuttle-3.5](https://huggingface.co/shuttleai/shuttle-3.5)
- **Parent architecture**: Qwen 3 32B
- **Quantized by**: Lex-au
- **Quantization format**: GGUF Q8_0
## 📦 Model Size
| Format | Size |
|----------|----------|
| Original (safetensors, F16) | 65.52 GB |
| Q8_0 (GGUF) | 34.8 GB |
**Compression Ratio**: Q8_0 is ~53% of the original size
**Size Reduction**: ~47% (30.72 GB saved)
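The figures above can be checked with a quick calculation (sizes taken from the table):

```python
# Verify the size figures from the table above.
original_gb = 65.52  # F16 safetensors
quant_gb = 34.8      # Q8_0 GGUF

saved_gb = original_gb - quant_gb  # GB saved by quantizing
ratio = quant_gb / original_gb     # quant size as a fraction of original
reduction = 1 - ratio              # fractional size reduction

print(f"saved: {saved_gb:.2f} GB, ratio: {ratio:.0%}, reduction: {reduction:.0%}")
# → saved: 30.72 GB, ratio: 53%, reduction: 47%
```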
## 🧪 Quality
- Q8_0 is **near-lossless**, preserving almost all performance of the full-precision model.
- Ideal for high-quality inference on capable consumer hardware.
## 🚀 Usage
Compatible with all major GGUF-supporting runtimes, including:
- `llama.cpp`
- `KoboldCPP`
- `text-generation-webui`
- `llamafile`
- `LM Studio`
Example with `llama.cpp` (recent builds name the CLI binary `llama-cli`; older builds use `./main`):
```bash
./llama-cli -m shuttle-3.5.Q8_0.gguf --ctx-size 4096 --threads 16 --prompt "Describe the effects of quantum decoherence in plain English."
```
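For programmatic use, the same GGUF file loads via the `llama-cpp-python` bindings. A minimal sketch, assuming the file sits in the working directory (adjust `model_path`, context size, and thread count to your setup):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Load the Q8_0 quant with the same settings as the CLI example above.
llm = Llama(
    model_path="shuttle-3.5.Q8_0.gguf",
    n_ctx=4096,
    n_threads=16,
)

# Simple completion call; the result follows an OpenAI-style response dict.
out = llm(
    "Describe the effects of quantum decoherence in plain English.",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```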