Thank you!

#1
by owao - opened

Hello, guys! I'm currently experimenting with your model and am really impressed by the generalization so far.
I also wanted to congratulate you on your blog post: I'm not even a researcher in the field, I'm just a simple curious user, but I read it all and really enjoyed it! Your method appears super elegant, and you did an amazing job making it accessible to a wide audience. I'm very grateful for that because it's clearly more effort than just dumping the info!
Thanks again!

POLARIS org

Hey, thank you for your kind words! I’ll keep it up!

Hello!
What is the context length for the 4B model?

@OleLukoe Not directly answering your question but take a look at their https://hkunlp.github.io/blog/2025/Polaris/#evaluation section. They recommend at least 64K max new tokens and set it to 90K in their example. But I'm not sure about the input limit.

edit: I guess we should set the input limit to 32768 since the base model can handle it, so the total context length could be set to 122K without any issue. But I don't know if we can go above that (e.g. 32768 + 100K = 132K).
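To make the arithmetic above explicit, here is a minimal sketch of the context-length budget, assuming the 32768-token input window of the base model and the 90K max-new-tokens value from the blog's example (both numbers come from this thread, not from any official spec):

```python
# Hedged sketch: total context budget = input window + generation budget.
# INPUT_LIMIT is the assumed input window of the base model (32768 tokens);
# MAX_NEW_TOKENS is the 90K generation budget used in the blog's example,
# counting 1K = 1024 tokens.
INPUT_LIMIT = 32768
MAX_NEW_TOKENS = 90 * 1024

max_context = INPUT_LIMIT + MAX_NEW_TOKENS
print(max_context // 1024, "K total ->", max_context, "tokens")
```

A value like this is what you would pass to a serving engine's context-length setting (for example vLLM's `--max-model-len`), though whether the model degrades near that limit is something only the authors can confirm.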

I'll second the thanks, it's an absolutely amazing model. Very underrated.

I even tried it for long QA on a 10-page technical PDF, and it didn't miss important details that were missed when using qwen3, qwen3-a3b, gemma 27b, and GLM4!
Pretty outstanding!
I'm definitely excited to keep experimenting with other out-of-domain tasks!

By the way, I was wondering whether you are pushing the training further or planning to explore new methods from scratch? I'm asking because of the "Preview" part of the name.
Not asking for any ETA, just curious about your next area of research.

I'll definitely follow your journey :)

I have used it for parsing HTML: the model holds and understands 90K of context and doesn't miss any detail.

I'm very impressed for a 4B model, and I want to say thank you to the authors. Keep doing great stuff!
