Dl

Dlbk

AI & ML interests

None yet

Recent Activity

liked a model 8 days ago

black-forest-labs/FLUX.1-Kontext-dev

liked a model 18 days ago

moonshotai/Kimi-Dev-72B

liked a model about 1 month ago

ByteDance-Seed/BAGEL-7B-MoT

upvoted a collection about 2 months ago

Gemma 3n Preview

4 items • Updated 24 days ago • 149

upvoted a collection 2 months ago

NextCoder

NextCoder family of code-editing LMs developed with Selective Knowledge Transfer and its training data. • 5 items • Updated May 5 • 49

upvoted an article 2 months ago

Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time

Article • Published Feb 18 • 33

upvoted a collection 2 months ago

Qwen3

72 items • Updated 20 days ago • 824

upvoted 2 collections 3 months ago

GLM-4-0414

GLM-4-0414 series model • 8 items • Updated 4 days ago • 129

Kimi-VL-A3B

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated 4 days ago • 68

upvoted a collection 5 months ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 15 items • Updated Apr 18 • 233

upvoted 2 articles 5 months ago

Open-source DeepResearch – Freeing our search agents

Article • Published Feb 4 • 1.26k

Open-R1: Update #1

Article • Published Feb 2 • 305

upvoted a collection 5 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 11 items • Updated Apr 28 • 498

upvoted a paper 5 months ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 66

upvoted 2 collections 6 months ago

OuteTTS

10 items • Updated Apr 7 • 17

OuteTTS 0.3

4 items • Updated Apr 7 • 17

upvoted a paper 6 months ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 295

upvoted 2 collections 6 months ago

QwQ

Qwen with Questions • 6 items • Updated Apr 28 • 97

QVQ

QVQ: Qwen models for visual reasoning • 7 items • Updated Apr 28 • 51

upvoted a collection 7 months ago

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated May 21 • 86

upvoted a paper 7 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 146

upvoted 2 collections 7 months ago

GLM-4

GLM-4 Open Models • 14 items • Updated 4 days ago • 118

DeepSeek-V2.5

2 items • Updated Dec 10, 2024 • 41