Siteng Huang

huangsiteng

https://kyonhuang.top/

AI & ML interests

vision-language models

Organizations

None yet

Recent Activity

upvoted a paper about 1 month ago

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published Jun 26 • 39

upvoted 2 papers 2 months ago

Shifting AI Efficiency From Model-Centric to Data-Centric Compression

Paper • 2505.19147 • Published May 25 • 146

VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL

Paper • 2505.15791 • Published May 21 • 5

commented on a paper 2 months ago

VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL

Paper • 2505.15791 • Published May 21 • 5

upvoted a paper 2 months ago

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning

Paper • 2505.12448 • Published May 18 • 10

commented on a paper 2 months ago

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning

Paper • 2505.12448 • Published May 18 • 10

upvoted a paper 3 months ago

OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation

Paper • 2505.03912 • Published May 6 • 8

commented on a paper 3 months ago

OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation

Paper • 2505.03912 • Published May 6 • 8

upvoted a paper 4 months ago

Unicorn: Text-Only Data Synthesis for Vision Language Model Training

Paper • 2503.22655 • Published Mar 28 • 40

authored a paper 4 months ago

Exploring the Evolution of Physics Cognition in Video Generation: A Survey

Paper • 2503.21765 • Published Mar 27 • 11

upvoted a paper 4 months ago

Exploring the Evolution of Physics Cognition in Video Generation: A Survey

Paper • 2503.21765 • Published Mar 27 • 11

commented on a paper 4 months ago

Exploring the Evolution of Physics Cognition in Video Generation: A Survey

Paper • 2503.21765 • Published Mar 27 • 11

authored 3 papers 8 months ago

Accelerating Diffusion Transformers with Token-wise Feature Caching

Paper • 2410.05317 • Published Oct 5, 2024

Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration

Paper • 2411.17686 • Published Nov 26, 2024 • 21

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction

Paper • 2412.06782 • Published Dec 9, 2024 • 7

upvoted a paper 8 months ago

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction

Paper • 2412.06782 • Published Dec 9, 2024 • 7

commented on a paper 8 months ago

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction

Paper • 2412.06782 • Published Dec 9, 2024 • 7

upvoted a paper 8 months ago

Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration

Paper • 2411.17686 • Published Nov 26, 2024 • 21

commented on a paper 8 months ago

Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration

Paper • 2411.17686 • Published Nov 26, 2024 • 21

authored a paper 11 months ago

PiTe: Pixel-Temporal Alignment for Large Video-Language Model

Paper • 2409.07239 • Published Sep 11, 2024 • 15
