
Dokyoon

leeloolee

Eruly

AI & ML interests

Recent Activity

reacted to imnotkitty's post with 🔥 29 days ago

https://huggingface.co/spaces/tencent/Hy3-preview is out: an open-weights MoE reasoning model.

✅ 295B total / 21B active / 256K context
✅ Fused fast-and-slow thinking in a single model
✅ First model trained on Hunyuan's rebuilt pretraining + RL infra (Feb → Apr)

Benchmarks:
👉 SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, WideSearch: competitive results, particularly strong on agentic tool use
👉 Top score on Tsinghua's 2026 Spring math PhD qualifying exam
👉 Strong context-learning and instruction-following on Tencent's CL-bench / CL-bench-Life

More details can be found in my article: https://huggingface.co/blog/imnotkitty/hy3-preview

upvoted a paper 29 days ago

Reinforcement-aware Knowledge Distillation for LLM Reasoning

liked a model about 1 month ago

microsoft/maira-2-sae


Organizations

upvoted a paper 29 days ago

Reinforcement-aware Knowledge Distillation for LLM Reasoning

Paper • 2602.22495 • Published Feb 26 • 5

upvoted 2 papers about 2 months ago

Grounding Everything in Tokens for Multimodal Large Language Models

Paper • 2512.10554 • Published Dec 11, 2025 • 1

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Paper • 2603.22117 • Published Mar 23 • 29

upvoted 2 articles about 2 months ago

Scaling OpenEnv: From Free Usage to Thousands of Concurrent Environments

Article • burtenshaw • Jan 20 • 12

Build a Domain-Specific Embedding Model in Under a Day

Article • nvidia • Mar 20 • 73

upvoted a paper 2 months ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published Mar 17 • 109

upvoted a collection 2 months ago

Activation Oracles

Collection

12 items • Updated Dec 26, 2025 • 18

upvoted 3 papers 3 months ago

upvoted an article 3 months ago

OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments

Article • christian-washington, ajasuja, santosh-iima, lewtun, burtenshaw • Feb 12 • 32

upvoted 2 papers 3 months ago

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Paper • 2602.10604 • Published Feb 11 • 198

Towards Pixel-Level VLM Perception via Simple Points Prediction

Paper • 2601.19228 • Published Jan 27 • 19

upvoted an article 4 months ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Article • Jan 27 • 75

upvoted 6 papers 4 months ago

VISTA-PATH: An interactive foundation model for pathology image segmentation and quantitative analysis in computational pathology

Paper • 2601.16451 • Published Jan 23 • 3

Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow

Paper • 2601.14243 • Published Jan 20 • 23

Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders

Paper • 2601.10332 • Published Jan 15 • 32

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 195

VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation

Paper • 2601.02256 • Published Jan 5 • 33

Enhancing Inflation Nowcasting with LLM: Sentiment Analysis on News

Paper • 2410.20198 • Published Oct 26, 2024 • 1
