Zhengyuan Yang

zyang39

https://zhengyuan.info/

Recent Activity

authored a paper about 1 month ago

Computer-Use Agents as Judges for Generative User Interface

upvoted a paper about 1 month ago

Computer-Use Agents as Judges for Generative User Interface

Paper • 2511.15567 • Published Nov 19, 2025 • 52

upvoted 3 papers 3 months ago

SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models

Paper • 2510.06917 • Published Oct 8, 2025 • 34

EdiVal-Agent: An Object-Centric Framework for Automated, Scalable, Fine-Grained Evaluation of Multi-Turn Editing

Paper • 2509.13399 • Published Sep 16, 2025 • 5

InfoAgent: Advancing Autonomous Information-Seeking Agents

Paper • 2509.25189 • Published Sep 29, 2025 • 11

upvoted a paper 4 months ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31, 2025 • 84

upvoted a paper 6 months ago

STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models

Paper • 2507.15375 • Published Jul 21, 2025 • 30

upvoted 2 papers 7 months ago

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs

Paper • 2506.10128 • Published Jun 11, 2025 • 22

Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations

Paper • 2506.04633 • Published Jun 5, 2025 • 19

upvoted a paper 8 months ago

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

Paper • 2505.08617 • Published May 13, 2025 • 41

upvoted 2 papers 9 months ago

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

Paper • 2504.07934 • Published Apr 10, 2025 • 20

V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models

Paper • 2504.06148 • Published Apr 8, 2025 • 13

upvoted a paper 11 months ago

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11, 2025 • 45

upvoted a paper 12 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 433

upvoted 4 papers about 1 year ago

OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation

Paper • 2412.09585 • Published Dec 12, 2024 • 11

Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension

Paper • 2412.03704 • Published Dec 4, 2024 • 6

GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published Nov 4, 2024 • 20

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

Paper • 2410.23277 • Published Oct 30, 2024 • 9

upvoted 2 papers over 1 year ago

MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities

Paper • 2408.00765 • Published Aug 1, 2024 • 13

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Paper • 2404.16375 • Published Apr 25, 2024 • 18

upvoted a paper about 2 years ago

Interfacing Foundation Models' Embeddings

Paper • 2312.07532 • Published Dec 12, 2023 • 12
