Yuancheng Wang

Hecheng0625

https://hecheng0625.github.io/

AI & ML interests

ML, DL, Speech, Audio, NLP

Recent Activity

upvoted a paper about 12 hours ago

TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling

Paper • 2508.16790 • Published 7 days ago • 3

authored 12 papers 2 days ago

AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models

Paper • 2304.00830 • Published Apr 3, 2023 • 2

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Paper • 2403.03100 • Published Mar 5, 2024 • 39

RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Paper • 2404.03204 • Published Apr 4, 2024 • 10

SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words

Paper • 2406.13340 • Published Jun 19, 2024

Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

Paper • 2407.05361 • Published Jul 7, 2024 • 2

MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer

Paper • 2409.00750 • Published Sep 1, 2024 • 4

Overview of the Amphion Toolkit (v0.2)

Paper • 2501.15442 • Published Jan 26 • 3

Metis: A Foundation Speech Generation Model with Masked Generative Pre-training

Paper • 2502.03128 • Published Feb 5 • 1

AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement

Paper • 2501.15417 • Published Jan 26 • 1

DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec for Speech Generation

Paper • 2505.13000 • Published May 19 • 1

NVSpeech: An Integrated and Scalable Pipeline for Human-Like Speech Modeling with Paralinguistic Vocalizations

Paper • 2508.04195 • Published 23 days ago • 1

TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling

Paper • 2508.16790 • Published 7 days ago • 3

New activity in amphion/TaDiCodec-TTS-MGM 3 days ago

Add `library_name: transformers` to metadata

#1 opened 3 days ago by

New activity in amphion/TaDiCodec-TTS-AR-Qwen2.5-0.5B 3 days ago

Add library_name metadata for Transformers compatibility

#1 opened 3 days ago by

New activity in amphion/TaDiCodec-TTS-AR-Qwen2.5-3B 3 days ago

Add `library_name: transformers` to model card metadata

#1 opened 3 days ago by

New activity in amphion/TaDiCodec 3 days ago

Improve model card: Add library_name, paper/project/GitHub links, and full abstract

#1 opened 3 days ago by

commented on a paper 3 days ago

TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling

Paper • 2508.16790 • Published 7 days ago • 3

updated 2 models 3 days ago

amphion/TaDiCodec-TTS-AR-Qwen2.5-0.5B

Text-to-Speech • 0.5B • Updated 3 days ago • 34 • 3

amphion/TaDiCodec-TTS-AR-Qwen2.5-3B

Text-to-Speech • 3B • Updated 3 days ago • 32 • 2