unsloth-QwQ-32B-gguf-japanese-imatrix
Reviews of Qwen/QwQ-32B are sharply divided, most likely because the model is sensitive to its parameter settings.
This GGUF build aims to improve Japanese language capability by using improved quantization parameters.
Details are still being verified.
Current sample parameters:
temperature = 0.6
top-k = 40 (20 to 40 suggested)
min-p = 0.00 (optional, but 0.01 works well, llama.cpp default is 0.1)
top-p = 0.95
repetition-penalty = 1.0
dry-multiplier = 0.5
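As a rough sketch, the sampling values listed above can be turned into a llama.cpp command line as follows. The flag names are the llama.cpp CLI sampling options as I understand them, while the function name `build_llama_cli_args` and the model file name `QwQ-32B-Q4_K_M.gguf` are hypothetical placeholders; check the flags against your llama.cpp build and substitute the file you actually downloaded.

```python
import shlex

# Sampling values copied from the parameter list above.
SAMPLING = {
    "--temp": 0.6,
    "--top-k": 40,
    "--min-p": 0.0,
    "--top-p": 0.95,
    "--repeat-penalty": 1.0,
    "--dry-multiplier": 0.5,
}


def build_llama_cli_args(model_path, sampling=SAMPLING):
    """Return an argument list for llama-cli (hypothetical helper)."""
    args = ["llama-cli", "--model", model_path]
    for flag, value in sampling.items():
        args += [flag, str(value)]
    return args


# Hypothetical file name; replace it with the GGUF file you use.
command = build_llama_cli_args("QwQ-32B-Q4_K_M.gguf")
print(shlex.join(command))
```

Keeping the values in one dictionary makes it easy to try the suggested alternatives, for example a top-k of 20 or a min-p of 0.01.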
Chat template: <|im_start|>user\nCreate a Flappy Bird game in Python.<|im_end|>\n<|im_start|>assistant\n<think>\n
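The chat template above can be produced for any single user message with a small helper. This is a minimal sketch that assumes the `\n` sequences in the template stand for real newlines; the function name `build_prompt` is hypothetical, and a front end that applies the model's built-in chat template does not need it.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user message in the template shown above (hypothetical helper)."""
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
        "<think>\n"  # the template ends by opening the model's reasoning block
    )


prompt = build_prompt("Create a Flappy Bird game in Python.")
print(prompt)
```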
Reference information
Tutorial: How to Run QwQ-32B effectively