siyeng feng

siyengfeng

AI & ML interests

None yet

Recent Activity

liked a model about 9 hours ago

lerobot/pi0

upvoted an article about 9 hours ago

Open-source DeepResearch – Freeing our search agents

upvoted a paper about 12 hours ago

The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

Organizations

None yet

siyengfeng's activity

upvoted an article about 9 hours ago

Open-source DeepResearch – Freeing our search agents

Article • Published 2 days ago • 536

upvoted 4 papers about 12 hours ago

The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

Paper • 2502.01081 • Published 3 days ago • 9

LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information

Paper • 2502.02095 • Published 2 days ago • 3

Improving Transformer World Models for Data-Efficient RL

Paper • 2502.01591 • Published 2 days ago • 8

Fast Encoder-Based 3D from Casual Videos via Point Track Processing

Paper • 2404.07097 • Published Apr 10, 2024 • 3

upvoted 3 papers 3 days ago

upvoted 4 papers 5 days ago

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

Paper • 2501.18511 • Published 6 days ago • 17

CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation

Paper • 2501.16609 • Published 9 days ago • 6

PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding

Paper • 2501.16411 • Published 9 days ago • 17

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 6 days ago • 48

upvoted a paper 7 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 8 days ago • 100

upvoted a paper 8 days ago

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published 9 days ago • 24

upvoted a paper 13 days ago

Autonomy-of-Experts Models

Paper • 2501.13074 • Published 14 days ago • 40

upvoted 4 papers 14 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 15 days ago • 86

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 15 days ago • 296

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published 15 days ago • 39

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published 17 days ago • 31

upvoted an article 14 days ago

Process Reinforcement through Implicit Rewards

Article • Published Jan 3 • 22