Qwen2 or 3, GGUF quants & context size?

#3
by smcleod - opened

I noticed in your config.json it states this is built on qwen2 rather than qwen3? https://huggingface.co/SWE-bench/SWE-agent-LM-32B/blob/main/config.json#L14

It would be great to have some official GGUF quants made available.

Also, just confirming if the context size is the full 128k, or if it's limited to something small like 32k?


Yes, it's built on Qwen 2.5! Qwen 3 was released essentially at the same time as our work - at some point we'll run our pipeline on Qwen 3 as well and see what the performance is like!

I also believe the community has already created several quantizations here - do those address your ask?

And for the context size - it's the same as the original Qwen 2.5, which is 32k. We didn't experiment with RoPE scaling or anything else to increase the context window, but of course this is on the docket!
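Both answers above are visible directly in the repo's config.json: the `model_type` field gives the base architecture, and `max_position_embeddings` gives the native context window. A minimal sketch of checking those fields (the JSON excerpt below is hypothetical, with values matching what the thread states; the real file is at the config.json link above):

```python
import json

# Hypothetical excerpt of a Qwen2.5-based config.json, with only the
# fields relevant to this discussion; the actual file has many more keys.
config = json.loads("""
{
  "model_type": "qwen2",
  "max_position_embeddings": 32768,
  "rope_theta": 1000000.0
}
""")

# "qwen2" is the architecture id used by Qwen 2.5 checkpoints in
# transformers (there is no separate "qwen2.5" model_type).
print(config["model_type"])

# Native context window: 32768 tokens (32k), as confirmed above.
print(config["max_position_embeddings"])
```

Note that `model_type: "qwen2"` does not distinguish Qwen 2 from Qwen 2.5, since both use the same architecture class, which is why the config alone looks ambiguous.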
