TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices • Paper • arXiv:2410.00531 • Published Oct 1, 2024 • 28 upvotes
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation • Paper • arXiv:2408.12528 • Published Aug 22, 2024 • 50 upvotes
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs • Paper • arXiv:2408.12060 • Published Aug 22, 2024 • 4 upvotes
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community • Paper • arXiv:2408.08291 • Published Aug 15, 2024 • 9 upvotes
LLM Circuit Analyses Are Consistent Across Training and Scale • Paper • arXiv:2407.10827 • Published Jul 15, 2024 • 4 upvotes
Qwen2 Collection • Qwen2 language models, including pretrained and instruction-tuned variants in five sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B • 39 items • Updated Sep 18 • 346 upvotes
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs • Paper • arXiv:2406.15319 • Published Jun 21, 2024 • 61 upvotes
DataComp-LM: In search of the next generation of training sets for language models • Paper • arXiv:2406.11794 • Published Jun 17, 2024 • 48 upvotes
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization • Paper • arXiv:2406.11431 • Published Jun 17, 2024 • 4 upvotes
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling • Paper • arXiv:2406.07522 • Published Jun 11, 2024 • 37 upvotes
Open-Endedness is Essential for Artificial Superhuman Intelligence • Paper • arXiv:2406.04268 • Published Jun 6, 2024 • 11 upvotes
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration • Paper • arXiv:2406.01014 • Published Jun 3, 2024 • 30 upvotes
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality • Paper • arXiv:2405.21060 • Published May 31, 2024 • 63 upvotes