recommended generation parameters
https://huggingface.co/Qwen/QwQ-32B/blob/main/generation_config.json
Is this your recommendation?
"repetition_penalty": 1.0,
"temperature": 0.6,
"top_k": 40,
"top_p": 0.95,
Nah, the model thinks for much longer on neutral settings (everything set to one or zero). Trying these produced messy output and stray tokens.
Edit: yeah, that was a one-off; I drew bad conclusions.
Yeah, I'm also noticing much too long thinking, and second-guessing.
I am asking here for the recommended generation parameters.
But that actually does seem to be their recommendation, as it's also stated in the README:
Sampling Parameters:
Use Temperature=0.6 and TopP=0.95 instead of Greedy decoding to avoid endless repetitions.
Use TopK between 20 and 40 to filter out rare token occurrences while maintaining the diversity of the generated output.
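To make those two bullets concrete, here's a hypothetical NumPy sketch of what temperature, top-k, and top-p do to the next-token distribution (this is my own illustration, not code from the QwQ repo):

```python
import numpy as np

def sample_next_token(logits, temperature=0.6, top_k=40, top_p=0.95, rng=None):
    """Sketch of temperature + top-k + top-p (nucleus) sampling."""
    rng = rng or np.random.default_rng(0)
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # Top-k: drop everything below the k-th highest logit.
    k = min(top_k, logits.size)
    kth = np.sort(logits)[-k]
    logits = np.where(logits < kth, -np.inf, logits)

    # Softmax over the surviving tokens.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p: keep the smallest high-probability set whose mass >= top_p.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    kept = np.zeros_like(probs)
    kept[order[:cutoff]] = probs[order[:cutoff]]
    kept /= kept.sum()

    return int(rng.choice(len(probs), p=kept))
```

The intuition: top-k filters out rare-token noise, top-p then trims the remaining tail, and temperature < 1 sharpens what's left, which is why the README pairs them instead of greedy decoding.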
I wrote up more details at https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively on other settings I found to be effective, and it also stops infinite generations!
@erichartford It's honestly a relief to say publicly that you are behaving like a true asshole. And that's not the first time I've seen you acting contemptuous and ungrateful.