Yuhang Zang

yuhangzang

https://yuhangzang.github.io/

AI & ML interests

🤗 HuggingFace is all you need

Recent Activity

liked a Space 3 days ago

CodeGoat24/UniGenBench_Leaderboard

authored a paper 18 days ago

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

upvoted a paper 19 days ago

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Organizations

authored a paper 18 days ago

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Paper • 2508.04700 • Published 19 days ago • 47

authored a paper 19 days ago

Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Paper • 2508.00819 • Published 24 days ago • 62

authored a paper about 1 month ago

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction

Paper • 2507.15852 • Published Jul 21 • 38

authored a paper 2 months ago

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

Paper • 2506.19848 • Published Jun 24 • 26

authored 3 papers 3 months ago

Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings

Paper • 2506.04997 • Published Jun 5

Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning

Paper • 2505.14677 • Published May 20 • 15

Visual Agentic Reinforcement Fine-Tuning

Paper • 2505.14246 • Published May 20 • 32

authored a paper 4 months ago

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

Paper • 2505.03318 • Published May 6 • 94

authored 2 papers 5 months ago

MM-IFEngine: Towards Multimodal Instruction Following

Paper • 2504.07957 • Published Apr 10 • 34

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

Paper • 2504.06232 • Published Apr 8 • 14

authored 4 papers 6 months ago

authored 5 papers 7 months ago

WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation

Paper • 2407.02165 • Published Jul 2, 2024

VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

Paper • 2407.11691 • Published Jul 16, 2024 • 14

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published Feb 7 • 66

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published Jan 21 • 46

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published Jan 9 • 44

authored a paper 8 months ago

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

Paper • 2501.03218 • Published Jan 6 • 37
