Joya Chen PRO

chenjoya

https://chenjoya.github.io/

AI & ML interests

Video LLM

Recent Activity

upvoted a paper 12 days ago

EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models

upvoted a paper 12 days ago

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

upvoted a paper 19 days ago

Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans?

authored 6 papers 8 months ago

Is Heuristic Sampling Necessary in Training Deep Object Detectors?

Paper • 1909.04868 • Published Sep 11, 2019

Bootstrapping SparseFormers from Vision Foundation Models

Paper • 2312.01987 • Published Dec 4, 2023 • 1

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Paper • 2306.08640 • Published Jun 14, 2023 • 26

Learning Video Context as Interleaved Multimodal Sequences

Paper • 2407.21757 • Published Jul 31, 2024

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation

Paper • 2408.16730 • Published Aug 29, 2024

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11, 2025 • 154

authored a paper 9 months ago

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Paper • 2504.16030 • Published Apr 22, 2025 • 36

authored 2 papers over 1 year ago

One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos

Paper • 2409.19603 • Published Sep 29, 2024 • 19

VideoLLM-online: Online Video Large Language Model for Streaming Video

Paper • 2406.11816 • Published Jun 17, 2024 • 26

authored a paper over 2 years ago

UniVTG: Towards Unified Video-Language Temporal Grounding

Paper • 2307.16715 • Published Jul 31, 2023 • 11