Kehan Li

lkhl

AI & ML interests

None yet

Recent Activity

upvoted a paper 14 days ago

Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published 14 days ago • 121

upvoted a paper 20 days ago

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published 20 days ago • 39

upvoted a paper 22 days ago

Unified Vision-Language-Action Model

Paper • 2506.19850 • Published 22 days ago • 25

upvoted a paper 24 days ago

Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details

Paper • 2506.16504 • Published 27 days ago • 23

upvoted a paper 27 days ago

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

Paper • 2506.15681 • Published 28 days ago • 37

upvoted 2 papers about 1 month ago

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Paper • 2506.07044 • Published Jun 8 • 108

EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?

Paper • 2506.05287 • Published Jun 5 • 15

upvoted 3 papers about 2 months ago

upvoted a paper 2 months ago

SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations

Paper • 2505.02094 • Published May 4 • 19

upvoted an article 3 months ago

Efficient MoE Align & Sort design in SGLang Fused MoE

Article • Mar 25 • 3

upvoted a paper 3 months ago

OpenCodeReasoning: Advancing Data Distillation for Competitive Coding

Paper • 2504.01943 • Published Apr 2 • 15

upvoted an article 4 months ago

From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate

Article • and 3 others • Jun 13, 2024 • 55

upvoted a paper 5 months ago

LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

Paper • 2502.13922 • Published Feb 19 • 28

upvoted a collection 6 months ago

VideoLLaMA3

Collection • Frontier Multimodal Foundation Models for Video Understanding • 14 items • Updated 27 days ago • 14

upvoted 4 papers 6 months ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22 • 91

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 295

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 48

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107