Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 91
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning Paper • 2601.11141 • Published Jan 16 • 23
NVIDIA Cosmos 2 Collection The latest open multimodal models for world generation and reasoning in Physical AI. • 3 items • Updated 14 days ago • 13
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, extending the pretraining objectives of V-JEPA (https://ai.meta.com/blog/v-jepa-yann) • 8 items • Updated Jun 13, 2025 • 192
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 5 days ago • 119
Gemma 2 JPN Release Collection A Gemma 2 2B model fine-tuned on Japanese text. It supports Japanese at the same level of performance as English-only queries on Gemma 2. • 3 items • Updated Jul 10, 2025 • 30
TimesFM Release Collection TimesFM (Time Series Foundation Model) is a pretrained foundation model developed by Google Research for time-series forecasting. • 6 items • Updated Oct 4, 2025 • 30
Gemma-APS Release Collection Gemma models for text-to-propositions segmentation. The models are distilled from a fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated Jul 10, 2025 • 24
ImageInWords Release Collection arXiv: https://arxiv.org/abs/2405.02793 • 3 items • Updated Jul 10, 2025 • 4
IndicGenBench Collection Datasets released in "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs" (https://arxiv.org/abs/2404.16816) • 4 items • Updated Jul 10, 2025 • 12
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated Jul 10, 2025 • 63
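As a quick orientation to what the SigLIP checkpoints do, here is a minimal zero-shot classification sketch using the transformers library. The checkpoint name google/siglip-base-patch16-224 is an assumption for illustration; any SigLIP checkpoint from the collection should drop in the same way. The key detail from the paper is the sigmoid (rather than softmax) over the image-text logits.

```python
# Minimal SigLIP zero-shot classification sketch (not from the collection page).
# Assumed checkpoint: google/siglip-base-patch16-224.
import torch
import requests
from PIL import Image
from transformers import AutoProcessor, AutoModel

ckpt = "google/siglip-base-patch16-224"
model = AutoModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = ["a photo of 2 cats", "a photo of 2 dogs"]

# SigLIP's processor pads text to a fixed max length by convention.
inputs = processor(text=texts, images=image, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Sigmoid, not softmax: each image-text pair is scored independently.
probs = torch.sigmoid(outputs.logits_per_image)
print(probs)
```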
Switch-Transformers release Collection This release includes various MoE (Mixture-of-Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Jul 10, 2025 • 18
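Because these are T5-style text-to-text models, the usual span-corruption interface applies regardless of expert count. A minimal sketch, assuming the google/switch-base-8 checkpoint (the 8-expert base model); SwitchTransformersForConditionalGeneration is the transformers class for this release.

```python
# Minimal Switch Transformers sketch; assumed checkpoint: google/switch-base-8.
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

ckpt = "google/switch-base-8"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = SwitchTransformersForConditionalGeneration.from_pretrained(ckpt)

# Switch Transformers are pretrained with T5-style span corruption,
# so the model fills in the <extra_id_N> sentinel tokens.
input_ids = tokenizer(
    "A <extra_id_0> walks into a bar and orders a <extra_id_1> with <extra_id_2> pinch of salt.",
    return_tensors="pt",
).input_ids
outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```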
SEAHORSE release Collection The SEAHORSE metrics (as described in https://arxiv.org/abs/2305.13194). • 12 items • Updated Jul 10, 2025 • 20
MT5 release Collection The MT5 release follows the T5 family but is pretrained on multilingual data. The updated UMT5 models are pretrained on a newer corpus. • 10 items • Updated Jul 10, 2025 • 23
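A minimal sketch of the text-to-text interface these checkpoints share, assuming the google/mt5-small checkpoint. Note that mT5 is pretrained only on multilingual span corruption (no supervised mixture), so out of the box it mostly fills <extra_id_N> sentinels and needs fine-tuning for downstream tasks.

```python
# Minimal mT5 sketch; assumed checkpoint: google/mt5-small.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

ckpt = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = MT5ForConditionalGeneration.from_pretrained(ckpt)

# Span-corruption input: the model predicts the masked <extra_id_0> span.
input_ids = tokenizer(
    "UN Offizier sagt, dass weiter <extra_id_0> werden muss in Syrien.",
    return_tensors="pt",
).input_ids
outputs = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```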