---
license: apache-2.0
language:
- en
- zh
- es
base_model:
- shuttleai/shuttle-3.5
tags:
- Qwen
- Shuttle
- GGUF
- 32b
- quantized
- Q8_0
---

# Shuttle 3.5 - Q8_0 GGUF Quant

This repo contains a GGUF-quantized version of [ShuttleAI's Shuttle 3.5 model](https://huggingface.co/shuttleai/shuttle-3.5), a high-performance instruction-tuned variant of Qwen 3 32B. This quant was built for efficient local inference with minimal quality loss.

## 🔗 Base Model

- **Original**: [shuttleai/shuttle-3.5](https://huggingface.co/shuttleai/shuttle-3.5)
- **Parent architecture**: Qwen 3 32B
- **Quantized by**: Lex-au
- **Quantization format**: GGUF Q8_0

## 📦 Model Size

| Format | Size |
|--------|------|
| Original (safetensors, F16) | 65.52 GB |
| Q8_0 (GGUF) | 34.8 GB |

**Size reduction**: ~47% (30.72 GB saved); the Q8_0 file is ~53% of the original size.

## 🧪 Quality

- Q8_0 is **near-lossless**, preserving almost all of the full-precision model's performance.
- Ideal for high-quality inference on capable consumer hardware.

## 🚀 Usage

Compatible with all major GGUF-supporting runtimes, including:

- `llama.cpp`
- `KoboldCPP`
- `text-generation-webui`
- `llamafile`
- `LM Studio`

Example with `llama.cpp` (newer builds name the binary `llama-cli` rather than `main`):

```bash
./main -m shuttle-3.5.Q8_0.gguf \
  --ctx-size 4096 \
  --threads 16 \
  --prompt "Describe the effects of quantum decoherence in plain English."
```
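As a sanity check on the figures in the size table, the savings can be derived directly from the two file sizes (this is plain arithmetic, not part of the quantization process itself):

```python
# File sizes from the model card's size table, in GB.
original_gb = 65.52   # safetensors, F16
q8_0_gb = 34.8        # GGUF Q8_0

saved_gb = original_gb - q8_0_gb      # absolute savings
reduction = saved_gb / original_gb    # fractional size reduction
remaining = q8_0_gb / original_gb     # quant size relative to original

print(f"Saved: {saved_gb:.2f} GB")    # 30.72 GB
print(f"Reduction: {reduction:.0%}")  # ~47%
print(f"Remaining: {remaining:.0%}")  # ~53%
```

This matches expectations for Q8_0, which stores weights as 8-bit integers plus per-block scales, so it lands a little above half the size of an F16 checkpoint.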