---
license: apache-2.0
language:
- en
- zh
- es
base_model:
- shuttleai/shuttle-3.5
tags:
- Qwen
- Shuttle
- GGUF
- 32b
- quantized
- Q8_0
---
# Shuttle 3.5 Q8_0 GGUF Quant
This repo contains a GGUF quantized version of [ShuttleAI's Shuttle 3.5 model](https://huggingface.co/shuttleai/shuttle-3.5), a high-performance instruction-tuned variant of Qwen 3 32B. This quant was built for efficient local inference without sacrificing quality.
## Base Model
- **Original**: [shuttleai/shuttle-3.5](https://huggingface.co/shuttleai/shuttle-3.5)
- **Parent architecture**: Qwen 3 32B
- **Quantized by**: Lex-au
- **Quantization format**: GGUF Q8_0
## Model Size
| Format | Size |
|----------|----------|
| Original (safetensors, F16) | 65.52 GB |
| Q8_0 (GGUF) | 34.8 GB |
**Compression Ratio**: Q8_0 is ~53% of the original size
**Size Reduction**: ~47% (30.72 GB saved)
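The figures above follow from simple arithmetic on the sizes quoted in the table (a quick sanity check, values in GB):

```python
# Sanity-check the size figures quoted in the table above (values in GB).
original = 65.52   # F16 safetensors
quantized = 34.8   # Q8_0 GGUF

saved = original - quantized                  # absolute reduction
reduction_pct = saved / original * 100        # percent smaller
ratio_pct = quantized / original * 100        # quant size relative to original

print(f"Saved: {saved:.2f} GB "
      f"({reduction_pct:.0f}% smaller; Q8_0 is ~{ratio_pct:.0f}% of the original)")
# → Saved: 30.72 GB (47% smaller; Q8_0 is ~53% of the original)
```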
## Quality
- Q8_0 is **near-lossless**, preserving almost all performance of the full-precision model.
- Ideal for high-quality inference on capable consumer hardware.
## Usage
Compatible with all major GGUF-supporting runtimes, including:
- `llama.cpp`
- `KoboldCPP`
- `text-generation-webui`
- `llamafile`
- `LM Studio`
Example with `llama.cpp` (newer builds name the binary `llama-cli` rather than `main`):
```bash
./main -m shuttle-3.5.Q8_0.gguf --ctx-size 4096 --threads 16 --prompt "Describe the effects of quantum decoherence in plain English."
```
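For scripted use, the quant can also be loaded through the `llama-cpp-python` bindings. This is a minimal sketch, assuming the package is installed and the GGUF file sits in the current directory; adjust `MODEL_PATH` to wherever you saved the file:

```python
# Hypothetical sketch: load this quant via llama-cpp-python.
# Assumes `pip install llama-cpp-python` and the GGUF file in the
# current directory; adjust MODEL_PATH for your setup.
from pathlib import Path

MODEL_PATH = "shuttle-3.5.Q8_0.gguf"

if Path(MODEL_PATH).exists():
    from llama_cpp import Llama

    # Context size and thread count mirror the CLI example above.
    llm = Llama(model_path=MODEL_PATH, n_ctx=4096, n_threads=16)
    out = llm(
        "Describe the effects of quantum decoherence in plain English.",
        max_tokens=256,
    )
    print(out["choices"][0]["text"])
else:
    print(f"Model file {MODEL_PATH} not found; download it first.")
```

The same parameters (`n_ctx`, `n_threads`) map directly onto the `--ctx-size` and `--threads` flags shown in the CLI example.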