Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning Paper • 2306.00477 • Published Jun 1, 2023 • 1
Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token Paper • 2211.04898 • Published Nov 9, 2022
3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability Paper • 2409.00119 • Published Aug 28, 2024
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation Paper • 2505.06027 • Published May 9, 2025 • 18
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments Paper • 2504.02918 • Published Apr 3, 2025
DiSCo Meets LLMs: A Unified Approach for Sparse Retrieval and Contextual Distillation in Conversational Search Paper • 2410.14609 • Published Oct 18, 2024 • 1
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks Paper • 2406.18403 • Published Jun 26, 2024
Reward-Guided Speculative Decoding for Efficient LLM Reasoning Paper • 2501.19324 • Published Jan 31, 2025 • 40
Gradient-based Parameter Selection for Efficient Fine-Tuning Paper • 2312.10136 • Published Dec 15, 2023 • 1
Cross-modal Information Flow in Multimodal Large Language Models Paper • 2411.18620 • Published Nov 27, 2024 • 2
Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective Paper • 2411.18615 • Published Nov 27, 2024 • 1
The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models Paper • 2406.19999 • Published Jun 28, 2024 • 5
The LAMBADA dataset: Word prediction requiring a broad discourse context Paper • 1606.06031 • Published Jun 20, 2016
Interpretable Word Sense Representations via Definition Generation: The Case of Semantic Change Analysis Paper • 2305.11993 • Published May 19, 2023