Heeding the Inner Voice: Aligning ControlNet Training via Intermediate Features Feedback Paper • 2507.02321 • Published 2 days ago • 33
DreamBoothDPO: Improving Personalized Generation using Direct Preference Optimization Paper • 2505.20975 • Published May 27 • 36
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published 29 days ago • 126
Image Reconstruction as a Tool for Feature Analysis Paper • 2506.07803 • Published 26 days ago • 28
ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models Paper • 2505.22569 • Published May 28 • 56
cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning Paper • 2505.22914 • Published May 28 • 35
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published Apr 8 • 110
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20 • 73
One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation Paper • 2503.13358 • Published Mar 17 • 96
A Primer on the Inner Workings of Transformer-based Language Models Paper • 2405.00208 • Published Apr 30, 2024 • 10
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 175
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published Feb 10 • 90
Finally, a Replacement for BERT: Introducing ModernBERT Article • By bclavie and 14 others • Dec 19, 2024 • 660
CLEAR: Character Unlearning in Textual and Visual Modalities Paper • 2410.18057 • Published Oct 23, 2024 • 210
Mechanistic Permutability: Match Features Across Layers Paper • 2410.07656 • Published Oct 10, 2024 • 20
Interpretability & Analysis of LMs Collection • Outstanding research in LM interpretability and evaluation, summarized • 119 items • Updated 4 days ago • 107
PaliGemma – Google's Cutting-Edge Open Vision Language Model Article • By merve and 2 others • May 14, 2024 • 255
Layerwise Recurrent Router for Mixture-of-Experts Paper • 2408.06793 • Published Aug 13, 2024 • 33
Linear Transformers with Learnable Kernel Functions are Better In-Context Models Paper • 2402.10644 • Published Feb 16, 2024 • 82