Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models
Abstract
Normalized Attention Guidance (NAG) enhances diffusion models by providing effective negative guidance across regimes and modalities without retraining.
Negative guidance -- explicitly suppressing unwanted attributes -- remains a fundamental challenge in diffusion models, particularly in few-step sampling regimes. While Classifier-Free Guidance (CFG) works well in standard settings, it fails under aggressive sampling step compression due to divergent predictions between positive and negative branches. We present Normalized Attention Guidance (NAG), an efficient, training-free mechanism that applies extrapolation in attention space with L1-based normalization and refinement. NAG restores effective negative guidance where CFG collapses while maintaining fidelity. Unlike existing approaches, NAG generalizes across architectures (UNet, DiT), sampling regimes (few-step, multi-step), and modalities (image, video), functioning as a universal plug-in with minimal computational overhead. Through extensive experimentation, we demonstrate consistent improvements in text alignment (CLIP Score), fidelity (FID, PFID), and human-perceived quality (ImageReward). Our ablation studies validate each design component, while user studies confirm significant preference for NAG-guided outputs. As a model-agnostic inference-time approach requiring no retraining, NAG provides effortless negative guidance for all modern diffusion frameworks -- pseudocode in the Appendix!
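The core mechanism can be sketched as follows. This is a minimal illustrative sketch based on the abstract's description (extrapolation in attention space, L1-based normalization, and a refinement blend), not the authors' implementation; the function name, hyperparameter names, and default values (`scale`, `tau`, `alpha`) are assumptions -- see the paper's Appendix for the actual pseudocode.

```python
import numpy as np

def nag_guidance(z_pos, z_neg, scale=4.0, tau=2.5, alpha=0.5):
    """Sketch of attention-space negative guidance with L1 normalization.

    z_pos, z_neg: attention outputs of shape (..., tokens, dim) from the
    positive- and negative-prompt branches. All hyperparameters here are
    illustrative assumptions, not values from the paper.
    """
    # Extrapolate away from the negative branch in attention space.
    z_ext = z_pos + scale * (z_pos - z_neg)

    # Per-token L1 norms of the extrapolated and positive outputs.
    norm_ext = np.abs(z_ext).sum(axis=-1, keepdims=True)
    norm_pos = np.abs(z_pos).sum(axis=-1, keepdims=True)
    ratio = norm_ext / np.maximum(norm_pos, 1e-12)

    # Normalization: rescale tokens whose L1 norm grew by more than tau,
    # so extrapolation cannot push features far out of distribution.
    z_norm = z_ext * np.minimum(1.0, tau / np.maximum(ratio, 1e-12))

    # Refinement: blend back toward the positive branch for fidelity.
    return alpha * z_norm + (1.0 - alpha) * z_pos
```

Because the guidance is applied to attention outputs rather than to the final score prediction, it stays stable even when the positive and negative branch predictions diverge, which is where CFG collapses in few-step sampling.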
Community
Project Page: https://chendaryen.github.io/NAG.github.io/
arXiv Paper: https://arxiv.org/abs/2505.21179
Online Demo: https://huggingface.co/spaces/ChenDY/NAG_FLUX.1-schnell, https://huggingface.co/spaces/ChenDY/NAG_FLUX.1-dev
TL;DR:
- We introduce NAG, a universal, training-free attention guidance method that provides stable, controllable negative guidance across the diffusion model ecosystem.
- We restore effective negative guidance in few-step diffusion models where traditional CFG fails completely, while also enhancing negative control in multi-step diffusion when integrated with existing guidance methods.
- We validate NAG's generalization to video diffusion without domain-specific modifications, improving both semantic alignment and motion characteristics through effective negative guidance.
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering (2025)
- Interactive Video Generation via Domain Adaptation (2025)
- SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training (2025)
- Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models (2025)
- Few-Step Diffusion via Score identity Distillation (2025)
- Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking (2025)
- Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts (2025)