The Prompt Report: A Systematic Survey of Prompting Techniques Paper • 2406.06608 • Published Jun 6, 2024 • 65
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 296
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published May 8 • 179
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16 • 254
Mutarjim: Advancing Bidirectional Arabic-English Translation with a Small Language Model Paper • 2505.17894 • Published May 23 • 218
view article Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5, 2024 • 277
Robust Speech Recognition via Large-Scale Weak Supervision Paper • 2212.04356 • Published Dec 6, 2022 • 34
The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer Paper • 2502.15631 • Published Feb 21 • 9
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Paper • 2501.18511 • Published Jan 30 • 20
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 152
MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections Paper • 2502.12170 • Published Feb 13 • 12
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published Feb 16 • 60
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation Paper • 2502.13145 • Published Feb 18 • 38