MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining Paper • 2505.07608 • Published 24 days ago • 77
ConFiguRe: Exploring Discourse-level Chinese Figures of Speech Paper • 2209.07678 • Published Sep 16, 2022
Long Context Alignment with Short Instructions and Synthesized Positions Paper • 2405.03939 • Published May 7, 2024
More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache Compression Paper • 2412.12706 • Published Dec 17, 2024
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19 • 34
A Comprehensive Survey on Long Context Language Modeling Paper • 2503.17407 • Published Mar 20 • 49
Article: Accelerating LLM Inference: Fast Sampling with Gumbel-Max Trick • By cxdu • Oct 24, 2024 • 12
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 76
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published Oct 17, 2024 • 32