Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • Jun 21 • 66
Gemma 3 QAT Collection Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated 19 days ago • 207
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated Mar 21 • 22
Instruct Models - Better instruction following. Collection Q6s Instruct models. Other Qs in model listing. In most cases, instruct models are better than "chat" models at one-shot / API-type usage. Listed: old to new. • 88 items • Updated 4 days ago • 5
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 8 days ago • 514
200+ Roleplay, Creative Writing, Uncensored, NSFW models. Collection Oldest models listed first, with Newest models at bottom of the page. Most repos have full examples, instructions, best settings and so on. • 217 items • Updated 4 days ago • 238
Long Context - 16k,32k,64k,128k,200k,256k,512k,1000k Collection Listed oldest to newest. Some with up to 1 million context. • 61 items • Updated 3 days ago • 17
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21, 2024 • 117