FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published 4 days ago • 6
Say Anything but This: When Tokenizer Betrays Reasoning in LLMs Paper • 2601.14658 • Published 12 days ago • 1
GutenOCR: A Grounded Vision-Language Front-End for Documents Paper • 2601.14490 • Published 13 days ago • 36
Article How We Built a Semantic Highlight Model To Save Token Cost for RAG 18 days ago • 63
It's All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models Paper • 2601.08500 • Published 20 days ago • 1
Introducing TrGLUE and SentiTurca: A Comprehensive Benchmark for Turkish General Language Understanding and Sentiment Analysis Paper • 2512.22100 • Published Dec 26, 2025 • 3