Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model Paper • 2501.05122 • Published Jan 9 • 19
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training Paper • 2501.08197 • Published 28 days ago • 8
PokerBench: Training Large Language Models to become Professional Poker Players Paper • 2501.08328 • Published 28 days ago • 15
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 28 days ago • 54
The Geometry of Tokens in Internal Representations of Large Language Models Paper • 2501.10573 • Published 25 days ago • 9
Almost Surely Safe Alignment of Large Language Models at Inference-Time Paper • 2502.01208 • Published 9 days ago • 11
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 4 days ago • 43
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 20 days ago • 79
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 20 days ago • 315
Towards General-Purpose Model-Free Reinforcement Learning Paper • 2501.16142 • Published 15 days ago • 24
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer Paper • 2501.15570 • Published 16 days ago • 23