DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion
Abstract
DiffusionBlocks is a training framework that optimizes neural network blocks as independent denoising operations in a diffusion process, achieving memory efficiency and competitive performance on generative tasks.
Training large neural networks with end-to-end backpropagation creates significant memory bottlenecks, limiting access to state-of-the-art AI research. We propose DiffusionBlocks, a novel training framework that interprets neural network blocks as performing denoising operations in a continuous-time diffusion process. By partitioning the network into independently trainable blocks and optimizing noise-level assignments so that each block covers an equal share of the cumulative probability mass, our approach achieves significant memory savings while remaining competitive with traditional backpropagation on generative tasks. Experiments on image generation and language modeling demonstrate memory reduction proportional to the number of blocks while achieving superior performance. DiffusionBlocks offers a promising path toward democratizing large-scale neural network training under limited computational resources.
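To make the equal-cumulative-probability-mass assignment concrete, here is a minimal sketch of how block boundaries on the noise-level axis could be computed. It assumes an EDM-style log-normal distribution over noise levels; the function name `block_boundaries` and the parameters `p_mean` and `p_std` are illustrative assumptions, not taken from the paper, whose actual noise schedule and parameterization may differ.

```python
import numpy as np
from scipy.stats import norm

def block_boundaries(num_blocks: int, p_mean: float = -1.2, p_std: float = 1.2) -> np.ndarray:
    """Split the noise-level axis into `num_blocks` intervals that each carry
    equal cumulative probability mass under an assumed log-normal noise
    distribution, i.e. ln(sigma) ~ N(p_mean, p_std**2) (EDM-style assumption).

    Returns `num_blocks + 1` sigma values; block k handles noise levels in
    [sigmas[k], sigmas[k + 1]].
    """
    # Equally spaced quantiles of the CDF give equal probability mass per block.
    quantiles = np.linspace(0.0, 1.0, num_blocks + 1)
    # Clip the endpoints to avoid infinite sigma at quantiles 0 and 1.
    quantiles = np.clip(quantiles, 1e-4, 1.0 - 1e-4)
    # Invert the CDF of ln(sigma), then exponentiate to get noise levels.
    log_sigmas = norm.ppf(quantiles, loc=p_mean, scale=p_std)
    return np.exp(log_sigmas)

if __name__ == "__main__":
    print(block_boundaries(num_blocks=4))  # four blocks, five boundary noise levels
```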
Community
We propose DiffusionBlocks, a novel training framework that eliminates end-to-end backpropagation by interpreting neural network blocks as denoising operations in a continuous-time diffusion process.
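As a rough illustration of the blockwise training idea (a sketch under stated assumptions, not the authors' implementation), the snippet below trains each block independently as a denoiser on its own noise-level interval, so gradients never cross block boundaries and only one block's activations and optimizer state are resident at a time. The block architecture, toy data, and log-uniform noise sampling are assumptions made for the example.

```python
import math
import torch
import torch.nn as nn

def make_block(dim: int = 64) -> nn.Module:
    # Toy block: conditions on the noise level by concatenating sigma to the input.
    return nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

def train_block(block, sigma_lo, sigma_hi, data, steps=200, lr=1e-3, batch=32):
    """Train one block as a denoiser on its assigned noise-level interval.
    No gradients flow to any other block, so only this block's activations
    and optimizer state need to be in memory."""
    opt = torch.optim.Adam(block.parameters(), lr=lr)
    for _ in range(steps):
        x0 = data[torch.randint(len(data), (batch,))]
        # Sample noise levels log-uniformly from this block's interval (illustrative choice).
        sigma = torch.empty(batch, 1).uniform_(math.log(sigma_lo), math.log(sigma_hi)).exp()
        noisy = x0 + sigma * torch.randn_like(x0)
        pred = block(torch.cat([noisy, sigma], dim=-1))  # predict the clean sample
        loss = ((pred - x0) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return block

if __name__ == "__main__":
    dim, num_blocks = 64, 4
    data = torch.randn(1024, dim)          # toy stand-in for real training data
    sigmas = [0.02, 0.1, 0.5, 2.0, 10.0]   # boundaries, e.g. from the partition sketch above
    blocks = [train_block(make_block(dim), sigmas[k], sigmas[k + 1], data)
              for k in range(num_blocks)]  # blocks are trained one at a time, independently
```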
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- Diffusion Models with Double Guidance: Generate with aggregated datasets (2025)
- DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling (2025)
- Evolution Meets Diffusion: Efficient Neural Architecture Generation (2025)
- DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning (2025)
- Identifying Memorization of Diffusion Models Through p-Laplace Analysis (2025)
- Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches (2025)
- Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory (2025)