Harold Chen

Harold328

https://haroldchen19.github.io/

HaroldChen19

AI & ML interests

Computer Vision

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

Action Images: End-to-End Policy Learning via Multiview Video Generation

Paper • 2604.06168 • Published 3 days ago • 9

upvoted a paper 2 days ago

Vero: An Open RL Recipe for General Visual Reasoning

Paper • 2604.04917 • Published 4 days ago • 23

upvoted a paper 3 days ago

Self-Distilled RLVR

Paper • 2604.03128 • Published 7 days ago • 145

upvoted 2 papers 6 days ago

EgoSim: Egocentric World Simulator for Embodied Interaction Generation

Paper • 2604.01001 • Published 9 days ago • 36

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published 8 days ago • 91

upvoted 3 papers 8 days ago

upvoted a paper 10 days ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published 14 days ago • 154

upvoted 2 papers 13 days ago

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

Paper • 2603.12254 • Published 28 days ago • 21

UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation

Paper • 2603.23500 • Published 16 days ago • 35

upvoted a paper 16 days ago

Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model

Paper • 2603.21986 • Published 17 days ago • 121

upvoted a paper 17 days ago

Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

Paper • 2603.17051 • Published 23 days ago • 108

upvoted 4 papers 20 days ago

MosaicMem: Hybrid Spatial Memory for Controllable Video World Models

Paper • 2603.17117 • Published 23 days ago • 87

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 23 days ago • 136

FASTER: Rethinking Real-Time Flow VLAs

Paper • 2603.19199 • Published 21 days ago • 57

ESPIRE: A Diagnostic Benchmark for Embodied Spatial Reasoning of Vision-Language Models

Paper • 2603.13033 • Published 27 days ago • 13

upvoted 2 papers 21 days ago

GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

Paper • 2603.13875 • Published 26 days ago • 34

Demystifing Video Reasoning

Paper • 2603.16870 • Published 23 days ago • 367

upvoted a paper 23 days ago

Learning Latent Proxies for Controllable Single-Image Relighting

Paper • 2603.15555 • Published 24 days ago • 8