---
license: apache-2.0
language:
- en
- zh
- es
base_model:
- shuttleai/shuttle-3.5
tags:
- Qwen
- Shuttle
- GGUF
- 32b
- quantized
- Q8_0
---

# Shuttle 3.5 — Q8_0 GGUF Quant

This repo contains a GGUF quantized version of [ShuttleAI's Shuttle 3.5 model](https://huggingface.co/shuttleai/shuttle-3.5), a high-performance instruction-tuned variant of Qwen 3 32B. This quant was built for efficient local inference without sacrificing quality.

## 🔗 Base Model

- **Original**: [shuttleai/shuttle-3.5](https://huggingface.co/shuttleai/shuttle-3.5)
- **Parent architecture**: Qwen 3 32B
- **Quantized by**: Lex-au
- **Quantization format**: GGUF Q8_0

## 📦 Model Size

| Format   | Size     |
|----------|----------|
| Original (safetensors, F16) | 65.52 GB |
| Q8_0 (GGUF)                 | 34.8 GB  |

**Compression Ratio**: Q8_0 is ~53% of the original size  
**Size Reduction**: ~47% (30.72 GB saved)
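As a quick sanity check, the savings follow directly from the two sizes in the table above:

```python
# Quick arithmetic check of the size figures in the table above.
original_gb = 65.52   # F16 safetensors
quant_gb = 34.8       # Q8_0 GGUF

saved_gb = original_gb - quant_gb
reduction = saved_gb / original_gb

print(f"Saved: {saved_gb:.2f} GB")    # ~30.72 GB
print(f"Reduction: {reduction:.0%}")  # ~47%
```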

## 🧪 Quality

- Q8_0 is **near-lossless**, preserving almost all performance of the full-precision model.
- Ideal for high-quality inference on capable consumer hardware.

## 🚀 Usage

Compatible with all major GGUF-supporting runtimes, including:

- `llama.cpp`
- `KoboldCPP`
- `text-generation-webui`
- `llamafile`
- `LM Studio`

Example with `llama.cpp`:

```bash
# Note: newer llama.cpp builds name this binary `llama-cli` instead of `main`.
./main -m shuttle-3.5.Q8_0.gguf --ctx-size 4096 --threads 16 \
  --prompt "Describe the effects of quantum decoherence in plain English."
```