Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 2 items • Updated 10 days ago • 107
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published 22 days ago • 44
Running on Zero • 156 • Chat with Kimi-VL-A3B-Thinking-2506 🤔 Chat with images, videos, or PDFs to generate text
Article • 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation • By moonshotai and 1 other • about 1 month ago • 65
VDT: General-purpose Video Diffusion Transformers via Mask Modeling Paper • 2305.13311 • Published May 22, 2023
WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training Paper • 2103.06561 • Published Mar 11, 2021
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5, 2024 • 49
DeepSeek-VL: Towards Real-World Vision-Language Understanding Paper • 2403.05525 • Published Mar 8, 2024 • 47
UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling Paper • 2302.06605 • Published Feb 13, 2023
Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs Paper • 2406.09367 • Published Jun 13, 2024
Beyond Filtering: Adaptive Image-Text Quality Enhancement for MLLM Pretraining Paper • 2410.16166 • Published Oct 21, 2024
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Paper • 2503.10615 • Published Mar 13 • 17
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? Paper • 2505.23359 • Published May 29 • 40