recommended generation parameters
https://huggingface.co/Qwen/QwQ-32B/blob/main/generation_config.json
Is this your recommendation?
"repetition_penalty": 1.0,
"temperature": 0.6,
"top_k": 40,
"top_p": 0.95,
Nah, the model thinks for much longer on neutral settings (everything set to one or zero). Trying these produced messy output and stray tokens.
Edit: yeah, that was a one-off; I drew bad conclusions.
Yeah, I'm also noticing much too long thinking, and second-guessing.
I am asking here for the recommended generation parameters.
But that actually does seem to be their recommendation, as it's also stated in the README:
Sampling Parameters:
Use Temperature=0.6 and TopP=0.95 instead of Greedy decoding to avoid endless repetitions.
Use TopK between 20 and 40 to filter out rare token occurrences while maintaining the diversity of the generated output.
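To make those two bullets concrete, here's a hypothetical NumPy sketch of what temperature, top-k, and top-p do to the next-token distribution (this is my own illustration, not code from the QwQ repo):

```python
import numpy as np

def sample_next_token(logits, temperature=0.6, top_k=40, top_p=0.95, rng=None):
    """Sketch of temperature + top-k + top-p (nucleus) sampling."""
    rng = rng or np.random.default_rng(0)
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # Top-k: drop everything below the k-th highest logit.
    k = min(top_k, logits.size)
    kth = np.sort(logits)[-k]
    logits = np.where(logits < kth, -np.inf, logits)

    # Softmax over the surviving tokens.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p: keep the smallest high-probability set whose mass >= top_p.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    kept = np.zeros_like(probs)
    kept[order[:cutoff]] = probs[order[:cutoff]]
    kept /= kept.sum()

    return int(rng.choice(len(probs), p=kept))
```

The intuition: top-k filters out rare-token noise, top-p then trims the remaining tail, and temperature < 1 sharpens what's left, which is why the README pairs them instead of greedy decoding.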
I wrote up more details at https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively on other settings I found to be effective, and it also stops infinite generations!
@erichartford It's honestly a relief to say publicly that you are behaving like a true asshole. And that's not the first time I've seen you acting contemptuous and ungrateful.