Long context length

#1
by SuperSonnix71 - opened

40,960 is a pretty big context length... unsloth with Transformers 4.53.0, a high RoPE theta (1,000,000), and Qwen3RMSNorm throughout, including for the attention projections (q_norm and k_norm). Awesome!

HelpingAI org

Just like Qwen-3, you can easily extend its context to 128K (131,072) tokens using YaRN.
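
For reference, here's a minimal sketch of how that's typically done with Transformers, assuming this model follows Qwen3's config conventions; the model ID below is a placeholder, and the `original_max_position_embeddings` value mirrors Qwen3's documented YaRN recipe:

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "HelpingAI/your-model"  # placeholder; substitute the actual repo ID

# Enable static YaRN rope scaling, following Qwen3's documented setup:
# a 4.0x factor over the 32,768-token native window yields the
# 131,072-token context mentioned above.
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}
config.max_position_embeddings = 131072

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```

Equivalently, you can add the same `rope_scaling` block directly to the model's `config.json`. Since static YaRN scales all positions uniformly, it can slightly degrade quality on short inputs, so it's usually best enabled only when you actually need the long context.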