Xilin Jiang

xi-j

AI & ML interests

None yet

upvoted a paper 4 months ago

AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking

Paper • 2601.17645 • Published Jan 25 • 23

upvoted a paper 10 months ago

DMOSpeech 2: Reinforcement Learning for Duration Prediction in Metric-Optimized Speech Synthesis

Paper • 2507.14988 • Published Jul 20, 2025 • 8

upvoted 6 papers about 1 year ago

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 172

CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing

Paper • 2503.10613 • Published Mar 13, 2025 • 79

S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information

Paper • 2503.05085 • Published Mar 7, 2025 • 47

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7, 2025 • 124

AAD-LLM: Neural Attention-Driven Auditory Scene Understanding

Paper • 2502.16794 • Published Feb 24, 2025 • 5

Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published Feb 19, 2025 • 69

upvoted 11 papers over 1 year ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24, 2025 • 77

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9, 2025 • 98

Enhancing Human-Like Responses in Large Language Models

Paper • 2501.05032 • Published Jan 9, 2025 • 62

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Paper • 2501.06282 • Published Jan 10, 2025 • 53

UniMuMo: Unified Text, Music and Motion Generation

Paper • 2410.04534 • Published Oct 6, 2024 • 19

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 182

Presto! Distilling Steps and Layers for Accelerating Music Generation

Paper • 2410.05167 • Published Oct 7, 2024 • 18

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 133

Foundation Models for Music: A Survey

Paper • 2408.14340 • Published Aug 26, 2024 • 44

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 62

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 42

upvoted a paper almost 2 years ago

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 118