Li Dong

unilm

AI & ML interests

Language Model Pre-Training

Recent Activity

updated a Space 1 day ago

microsoft/VibeVoice-ASR

liked a Space 1 day ago

microsoft/VibeVoice-ASR

published a Space 1 day ago

microsoft/VibeVoice-ASR

Organizations

authored a paper 3 days ago

LLM-in-Sandbox Elicits General Agentic Intelligence

Paper • 2601.16206 • Published 7 days ago • 80

authored 4 papers 9 days ago

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

Paper • 2412.07067 • Published Dec 10, 2024

submitted a paper to Daily Papers 9 days ago

Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Paper • 2601.08808 • Published 16 days ago • 38

authored 7 papers 3 months ago

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published Nov 13, 2025 • 51

Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective

Paper • 2509.22613 • Published Sep 26, 2025 • 10

DocReward: A Document Reward Model for Structuring and Stylizing

Paper • 2510.11391 • Published Oct 13, 2025 • 27

Information-Preserving Reformulation of Reasoning Traces for Antidistillation

Paper • 2510.11545 • Published Oct 13, 2025 • 2

BitNet Distillation

Paper • 2510.13998 • Published Oct 15, 2025 • 58

Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs

Paper • 2510.24514 • Published Oct 28, 2025 • 22

The Era of Agentic Organization: Learning to Organize with Language Models

Paper • 2510.26658 • Published Oct 30, 2025 • 28

authored 2 papers 4 months ago

AdaPrompt: Adaptive Model Training for Prompt-based NLP

Paper • 2202.04824 • Published Feb 10, 2022

Thinking Augmented Pre-training

Paper • 2509.20186 • Published Sep 24, 2025 • 23

authored 4 papers 5 months ago

SeerAttention-R: Sparse Attention Adaptation for Long Reasoning

Paper • 2506.08889 • Published Jun 10, 2025 • 23

Model as a Game: On Numerical and Spatial Consistency for Generative Games

Paper • 2503.21172 • Published Mar 27, 2025

Data Efficacy for Language Model Training

Paper • 2506.21545 • Published Jun 26, 2025 • 11

VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 143

authored a paper 8 months ago

Think Only When You Need with Large Hybrid-Reasoning Models

Paper • 2505.14631 • Published May 20, 2025 • 20
