---
base_model: Qwen/Qwen3-32B
license: other
license_name: qwen-research
license_link: https://huggingface.co/Qwen/Qwen3-32B/blob/main/LICENSE
model_creator: Qwen
model_name: Qwen3-32B
quantized_by: Second State Inc.
---

# Qwen3-32B-GGUF

## Original Model

[Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B)

## Run with LlamaEdge

- LlamaEdge version:

  - Thinking: [v0.17.0](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.17.0) and above
  - No Thinking: [v0.18.2](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.18.2)

- Prompt template

  - Prompt type: `chatml` (for thinking)

  - Prompt string

    ```text
    <|im_start|>system
    {system_message}<|im_end|>
    <|im_start|>user
    {prompt}<|im_end|>
    <|im_start|>assistant
    ```

  - Prompt type: `qwen3-no-think` (for no thinking)

  - Prompt string

    ```text
    <|im_start|>system
    {system_message}<|im_end|>
    <|im_start|>user
    {user_message_1}<|im_end|>
    <|im_start|>assistant
    {assistant_message_1}<|im_end|>
    <|im_start|>user
    {user_message_2}<|im_end|>
    <|im_start|>assistant
    ```

- Context size: `128000`

- Run as LlamaEdge service

  ```bash
  wasmedge --dir .:. --nn-preload default:GGML:AUTO:Qwen3-32B-Q5_K_M.gguf \
    llama-api-server.wasm \
    --model-name Qwen3-32B \
    --prompt-template chatml \
    --ctx-size 128000
  ```

*Quantized with llama.cpp b5097*
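
To make the `chatml` prompt string listed above concrete, here is a minimal sketch of how a single-turn prompt could be assembled by hand. The `render_chatml` function name and the sample messages are hypothetical and for illustration only; LlamaEdge applies the template itself when `--prompt-template chatml` is set, so this is not part of its tooling.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of LlamaEdge): fill the chatml template
# with one system message and one user prompt.
#   $1 = system message, $2 = user prompt
render_chatml() {
  local system_message="$1"
  local prompt="$2"
  # The placeholders {system_message} and {prompt} from the template
  # are substituted through printf's %s slots.
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' \
    "$system_message" "$prompt"
}

# Example usage with made-up messages.
render_chatml "You are a helpful assistant." "What is the capital of France?"
```

The rendered text ends with an open `<|im_start|>assistant` turn, which is where the model begins generating its reply.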