Jie Shao

hehesang

http://www.lamda.nju.edu.cn/shaoj/

hehesangsj

AI & ML interests

computer vision, ai for science

Recent Activity

liked a model 17 days ago

AIDC-AI/Ovis-U1-3B

liked a dataset 21 days ago

OpenGVLab/MMBench-GUI

upvoted a paper 2 months ago

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5 • 79

upvoted 2 papers 3 months ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 277

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published Apr 3 • 70

upvoted 2 papers 4 months ago

Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy

Paper • 2503.19757 • Published Mar 25 • 52

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published Mar 13 • 37

upvoted a paper 7 months ago

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Paper • 2412.09604 • Published Dec 12, 2024 • 39

upvoted a paper 8 months ago

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published Nov 15, 2024 • 82

upvoted a paper about 1 year ago

Needle In A Multimodal Haystack

Paper • 2406.07230 • Published Jun 11, 2024 • 55