Less is More: Task-aware Layer-wise Distillation for Language Model Compression • arXiv:2210.01351 • Published Oct 4, 2022
Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization • arXiv:2208.09770 • Published Aug 21, 2022
GODEL: Large-Scale Pre-Training for Goal-Directed Dialog • arXiv:2206.11309 • Published Jun 22, 2022
Generation-Augmented Retrieval for Open-domain Question Answering • arXiv:2009.08553 • Published Sep 17, 2020
POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models • arXiv:2305.00350 • Published Apr 29, 2023
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning • arXiv:2303.10512 • Published Mar 18, 2023
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback • arXiv:2302.12813 • Published Feb 24, 2023
Guiding Large Language Models via Directional Stimulus Prompting • arXiv:2302.11520 • Published Feb 22, 2023
Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling • arXiv:2310.06389 • Published Oct 10, 2023
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization • arXiv:1911.03437 • Published Nov 8, 2019
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective • arXiv:2310.11451 • Published Oct 17, 2023
DeBERTa: Decoding-enhanced BERT with Disentangled Attention • arXiv:2006.03654 • Published Jun 5, 2020
Query Rewriting for Retrieval-Augmented Large Language Models • arXiv:2305.14283 • Published May 23, 2023
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing • arXiv:2111.09543 • Published Nov 18, 2021
Deep Reinforcement Learning from Hierarchical Weak Preference Feedback • arXiv:2309.02632 • Published Sep 6, 2023