Demystifying the Visual Quality Paradox in Multimodal Large Language Models • arXiv:2506.15645 • Published Jun 18, 2025
SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems • arXiv:2506.07564 • Published Jun 9, 2025
Generative AI for Autonomous Driving: Frontiers and Opportunities • arXiv:2505.08854 • Published May 13, 2025
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models • arXiv:2505.24864 • Published May 30, 2025
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning • arXiv:2505.24871 • Published May 30, 2025
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models • arXiv:2505.24025 • Published May 29, 2025
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? • arXiv:2505.23359 • Published May 29, 2025
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training • arXiv:2504.13161 • Published Apr 17, 2025
mechanistic interpretability with sparse autoencoders • Collection (9 items) • Updated Sep 3, 2024 • Papers useful for learning to use sparse autoencoders to find interpretable features in language models
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving • arXiv:2503.24381 • Published Mar 31, 2025
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization • arXiv:2502.13146 • Published Feb 18, 2025