ZhangJin

Benjamin0

AI & ML interests

None yet

Recent Activity

liked a dataset 4 days ago

SynthLabsAI/Big-Math-RL-Verified

upvoted an article 28 days ago

SmolLM3: smol, multilingual, long-context reasoner

upvoted a paper 29 days ago

Pre-Trained Policy Discriminators are General Reward Models

Organizations

None yet

upvoted an article 28 days ago

SmolLM3: smol, multilingual, long-context reasoner

Article • and 22 others • 30 days ago • 611

upvoted a paper 29 days ago

Pre-Trained Policy Discriminators are General Reward Models

Paper • 2507.05197 • Published 30 days ago • 38

upvoted an article about 1 month ago

Open-source DeepResearch – Freeing our search agents

Article • and 4 others • Feb 4 • 1.28k

upvoted an article 2 months ago

The Common Pile v0.1

Article • and 2 others • Jun 6 • 46

upvoted an article 3 months ago

PipelineRL

Article • and 3 others • Apr 25 • 29

upvoted a paper 4 months ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 280

upvoted an article 4 months ago

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

Article • and 1 other • Apr 4 • 14

upvoted 2 articles 5 months ago

What changed in the Transformer architecture

Article • Mar 8 • 15

Common AI Model Formats

Article • Feb 27 • 47

upvoted a paper 5 months ago

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published Feb 24 • 73

upvoted 2 articles 6 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7 • 197

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

Article • and 1 other • Jan 16 • 75
