Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23, 2025 • 78
Learning diverse attacks on large language models for robust red-teaming and safety tuning Paper • 2405.18540 • Published May 28, 2024 • 1
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training Paper • 2503.18929 • Published Mar 24, 2025 • 4
FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA Paper • 2505.12805 • Published May 19, 2025 • 22
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates Paper • 2503.07216 • Published Mar 10, 2025 • 32
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models Paper • 2502.12464 • Published Feb 18, 2025 • 28
Self-Supervised Dataset Distillation for Transfer Learning Paper • 2310.06511 • Published Oct 10, 2023
DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models Paper • 2305.16943 • Published May 26, 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks Paper • 2305.18395 • Published May 28, 2023 • 1
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation Paper • 2012.07280 • Published Dec 14, 2020
Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries Paper • 2402.13043 • Published Feb 20, 2024 • 2
Self-Distillation for Further Pre-training of Transformers Paper • 2210.02871 • Published Sep 30, 2022 • 1
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models Paper • 2410.01524 • Published Oct 2, 2024 • 3
Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation Paper • 2208.12401 • Published Aug 26, 2022 • 1