Chi Chen

carboncoo

carboncoo's activity

upvoted a paper 4 days ago

An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes

Paper • 2504.15270 • Published 5 days ago • 10

authored a paper 24 days ago

AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization

Paper • 2503.23733 • Published 26 days ago • 11

upvoted a paper 24 days ago

AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization

Paper • 2503.23733 • Published 26 days ago • 11

commented on a paper 24 days ago

AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization

Paper • 2503.23733 • Published 26 days ago • 11

upvoted a paper about 1 month ago

Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs

Paper • 2503.12303 • Published Mar 16 • 7

authored a paper about 1 month ago

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Paper • 2503.12797 • Published Mar 17 • 30

upvoted a paper about 1 month ago

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Paper • 2503.12797 • Published Mar 17 • 30

commented on a paper about 1 month ago

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Paper • 2503.12797 • Published Mar 17 • 30

upvoted a paper 3 months ago

ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation

Paper • 2501.06598 • Published Jan 11 • 1

authored 2 papers 3 months ago

ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation

Paper • 2501.06598 • Published Jan 11 • 1

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 30

upvoted a paper 3 months ago

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 30

commented on a paper 3 months ago

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 30

authored a paper 4 months ago

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Paper • 2412.13871 • Published Dec 18, 2024 • 18

upvoted 4 papers 4 months ago

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

Paper • 2411.03628 • Published Nov 6, 2024 • 2

Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

Paper • 2308.13437 • Published Aug 25, 2023 • 4

Densing Law of LLMs

Paper • 2412.04315 • Published Dec 5, 2024 • 19

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Paper • 2412.13871 • Published Dec 18, 2024 • 18

liked a dataset 5 months ago

mjuicem/StreamingBench

Viewer • Updated Nov 15, 2024 • 4.55k • 2.14k • 6

authored a paper 5 months ago

Mask-Align: Self-Supervised Neural Word Alignment

Paper • 2012.07162 • Published Dec 13, 2020