Charles Cai

charlescai2016

AI & ML interests

None yet

Recent Activity

upvoted 2 papers 2 days ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published 5 days ago • 227

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published 6 days ago • 34

upvoted an article 4 days ago

Model2Vec: Distill a Small Fast Model from any Sentence Transformer

Article • and 1 other • Oct 14, 2024 • 96

upvoted a paper 4 days ago

Tinker: Diffusion's Gift to 3D--Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization

Paper • 2508.14811 • Published 6 days ago • 39

upvoted an article 6 days ago

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 658

upvoted a paper 9 days ago

Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

Paper • 2508.09736 • Published 13 days ago • 51

upvoted an article 9 days ago

🪆 Introduction to Matryoshka Embedding Models

Article • and 2 others • Feb 23, 2024 • 157

upvoted 2 papers 10 days ago

A Survey on Diffusion Language Models

Paper • 2508.10875 • Published 12 days ago • 33

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published 12 days ago • 136

upvoted a paper 17 days ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120

upvoted an article 18 days ago

Vision Language Model Alignment in TRL ⚡️

Article • and 4 others • 19 days ago • 73

upvoted a paper 23 days ago

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 64

upvoted 2 papers 26 days ago

ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents

Paper • 2507.22827 • Published 27 days ago • 94

Geometric-Mean Policy Optimization

Paper • 2507.20673 • Published 29 days ago • 31

upvoted a paper 28 days ago

The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm

Paper • 2507.18553 • Published Jul 24 • 39

upvoted 3 papers about 1 month ago

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 126

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 128

Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

Paper • 2507.07095 • Published Jul 9 • 54

upvoted 2 papers 2 months ago

Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published Jun 19 • 87

Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Paper • 2506.09482 • Published Jun 11 • 46
