Less is More: Task-aware Layer-wise Distillation for Language Model Compression • arXiv:2210.01351 • Published Oct 4, 2022
Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization • arXiv:2208.09770 • Published Aug 21, 2022
GODEL: Large-Scale Pre-Training for Goal-Directed Dialog • arXiv:2206.11309 • Published Jun 22, 2022
Generation-Augmented Retrieval for Open-domain Question Answering • arXiv:2009.08553 • Published Sep 17, 2020
POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models • arXiv:2305.00350 • Published Apr 29, 2023
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning • arXiv:2303.10512 • Published Mar 18, 2023
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback • arXiv:2302.12813 • Published Feb 24, 2023
Guiding Large Language Models via Directional Stimulus Prompting • arXiv:2302.11520 • Published Feb 22, 2023
Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling • arXiv:2310.06389 • Published Oct 10, 2023
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization • arXiv:1911.03437 • Published Nov 8, 2019
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective • arXiv:2310.11451 • Published Oct 17, 2023
DeBERTa: Decoding-enhanced BERT with Disentangled Attention • arXiv:2006.03654 • Published Jun 5, 2020
Query Rewriting for Retrieval-Augmented Large Language Models • arXiv:2305.14283 • Published May 23, 2023
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing • arXiv:2111.09543 • Published Nov 18, 2021
Deep Reinforcement Learning from Hierarchical Weak Preference Feedback • arXiv:2309.02632 • Published Sep 6, 2023