Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 91
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning Paper • 2601.11141 • Published Jan 16 • 23
NVIDIA Cosmos 2 Collection The latest open multimodal models for world generation and reasoning in Physical AI. • 3 items • Updated 14 days ago • 13
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, extending the pretraining objectives of V-JEPA (https://ai.meta.com/blog/v-jepa-yann) • 8 items • Updated Jun 13, 2025 • 192
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 5 days ago • 119
Gemma 2 JPN Release Collection A Gemma 2 2B model fine-tuned on Japanese text. It supports Japanese at the same level of performance as English-only queries on Gemma 2. • 3 items • Updated Jul 10, 2025 • 30
TimesFM Release Collection TimesFM (Time Series Foundation Model) is a pretrained foundation model developed by Google Research for time-series forecasting. • 6 items • Updated Oct 4, 2025 • 30
Gemma-APS Release Collection Gemma models for text-to-propositions segmentation. The models are distilled from a fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated Jul 10, 2025 • 24
ImageInWords Release Collection arXiv: https://arxiv.org/abs/2405.02793 • 3 items • Updated Jul 10, 2025 • 4
IndicGenBench Collection Datasets released in "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs" (https://arxiv.org/abs/2404.16816) • 4 items • Updated Jul 10, 2025 • 12
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated Jul 10, 2025 • 63
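As a quick orientation to what the SigLIP checkpoints do, here is a minimal zero-shot classification sketch using the transformers library. The checkpoint name google/siglip-base-patch16-224 is an assumption for illustration; any SigLIP checkpoint from the collection should drop in the same way. The key detail from the paper is the sigmoid (rather than softmax) over the image-text logits.

```python
# Minimal SigLIP zero-shot classification sketch (not from the collection page).
# Assumed checkpoint: google/siglip-base-patch16-224.
import torch
import requests
from PIL import Image
from transformers import AutoProcessor, AutoModel

ckpt = "google/siglip-base-patch16-224"
model = AutoModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = ["a photo of 2 cats", "a photo of 2 dogs"]

# SigLIP's processor pads text to a fixed max length by convention.
inputs = processor(text=texts, images=image, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Sigmoid, not softmax: each image-text pair is scored independently.
probs = torch.sigmoid(outputs.logits_per_image)
print(probs)
```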
Switch-Transformers release Collection This release includes various MoE (Mixture-of-Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Jul 10, 2025 • 18
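Because these are T5-style text-to-text models, the usual span-corruption interface applies regardless of expert count. A minimal sketch, assuming the google/switch-base-8 checkpoint (the 8-expert base model); SwitchTransformersForConditionalGeneration is the transformers class for this release.

```python
# Minimal Switch Transformers sketch; assumed checkpoint: google/switch-base-8.
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

ckpt = "google/switch-base-8"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = SwitchTransformersForConditionalGeneration.from_pretrained(ckpt)

# Switch Transformers are pretrained with T5-style span corruption,
# so the model fills in the <extra_id_N> sentinel tokens.
input_ids = tokenizer(
    "A <extra_id_0> walks into a bar and orders a <extra_id_1> with <extra_id_2> pinch of salt.",
    return_tensors="pt",
).input_ids
outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```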
SEAHORSE release Collection The SEAHORSE metrics (as described in https://arxiv.org/abs/2305.13194). • 12 items • Updated Jul 10, 2025 • 20
MT5 release Collection The MT5 release follows the T5 family but is pretrained on multilingual data. The updated UMT5 models are pretrained on a newer corpus. • 10 items • Updated Jul 10, 2025 • 23
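A minimal sketch of the text-to-text interface these checkpoints share, assuming the google/mt5-small checkpoint. Note that mT5 is pretrained only on multilingual span corruption (no supervised mixture), so out of the box it mostly fills <extra_id_N> sentinels and needs fine-tuning for downstream tasks.

```python
# Minimal mT5 sketch; assumed checkpoint: google/mt5-small.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

ckpt = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = MT5ForConditionalGeneration.from_pretrained(ckpt)

# Span-corruption input: the model predicts the masked <extra_id_0> span.
input_ids = tokenizer(
    "UN Offizier sagt, dass weiter <extra_id_0> werden muss in Syrien.",
    return_tensors="pt",
).input_ids
outputs = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```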