arXiv:2506.14202

DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion

Published on Jun 17 · Submitted by mkshing on Jun 17

AI-generated summary

A novel training framework called DiffusionBlocks optimizes neural network blocks as denoising operations in a diffusion process, achieving memory efficiency and competitive performance in generative tasks.

Abstract

Training large neural networks with end-to-end backpropagation creates significant memory bottlenecks, limiting accessibility to state-of-the-art AI research. We propose DiffusionBlocks, a novel training framework that interprets neural network blocks as performing denoising operations in a continuous-time diffusion process. By partitioning the network into independently trainable blocks and optimizing noise level assignments based on equal cumulative probability mass, our approach achieves significant memory efficiency while maintaining competitive performance compared to traditional backpropagation in generative tasks. Experiments on image generation and language modeling tasks demonstrate memory reduction proportional to the number of blocks while achieving superior performance. DiffusionBlocks provides a promising pathway for democratizing access to large-scale neural network training with limited computational resources.
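The "equal cumulative probability mass" criterion can be made concrete: if training noise levels σ are drawn from a distribution p(σ), giving each of B blocks equal mass means splitting σ at the 1/B, 2/B, … quantiles of p(σ). A minimal sketch, assuming a log-normal p(σ) (an EDM-style choice; the abstract does not specify the distribution or its parameters):

```python
# Minimal sketch: split a noise-level distribution p(sigma) into B bands of
# equal probability mass, one band per trainable block. The log-normal form
# and its (p_mean, p_std) parameters are assumptions, not taken from the paper.
import numpy as np
from scipy.stats import norm

def noise_band_edges(num_blocks: int, p_mean: float = -1.2, p_std: float = 1.2) -> np.ndarray:
    """Return num_blocks + 1 sigma edges; each band holds mass 1 / num_blocks."""
    qs = np.linspace(0.0, 1.0, num_blocks + 1)    # cumulative masses 0, 1/B, ..., 1
    qs = np.clip(qs, 1e-4, 1.0 - 1e-4)            # avoid infinite quantiles at 0 and 1
    return np.exp(norm.ppf(qs) * p_std + p_mean)  # quantiles of log-normal p(sigma)

print(noise_band_edges(4))  # 5 edges -> 4 bands; block i handles sigmas in band i
```

Under this assignment every block sees the same share of training noise levels, rather than an equal slice of the σ axis, concentrating block capacity where p(σ) puts its mass.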

Community

Paper author · Paper submitter

We propose DiffusionBlocks, a novel training framework that eliminates end-to-end backpropagation by interpreting neural network blocks as denoising operations in a continuous-time diffusion process.
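To illustrate what trading end-to-end backpropagation for blockwise denoising could look like, here is a hedged PyTorch sketch: each block is trained as a standalone denoiser on its assigned noise band, so only one block's parameters, activations, and optimizer state need to be resident at a time. The block architecture, x0-prediction loss, and log-uniform σ sampling within a band are illustrative assumptions, not the paper's exact recipe.

```python
# Sketch of blockwise training without end-to-end backprop (PyTorch).
# Gradients never cross block boundaries, so peak memory scales with a
# single block rather than the full network. All design details below
# (architecture, loss, sigma sampling) are placeholder assumptions.
import math
import torch
import torch.nn as nn

class DenoiserBlock(nn.Module):
    """One independently trainable block, conditioned on the noise level."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, x_noisy: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
        cond = sigma.log().unsqueeze(-1)  # (batch, 1) noise-level conditioning
        return self.net(torch.cat([x_noisy, cond], dim=-1))

def train_block(block, batches, sigma_lo: float, sigma_hi: float, lr=1e-4, device="cpu"):
    """Train a single block on its assigned noise band [sigma_lo, sigma_hi]."""
    block = block.to(device)
    opt = torch.optim.AdamW(block.parameters(), lr=lr)
    for x0 in batches:                             # iterable of clean data batches
        x0 = x0.to(device)
        u = torch.rand(x0.size(0), device=device)  # log-uniform sigma inside the band
        sigma = torch.exp(u * (math.log(sigma_hi) - math.log(sigma_lo)) + math.log(sigma_lo))
        x_noisy = x0 + sigma.unsqueeze(-1) * torch.randn_like(x0)
        loss = ((block(x_noisy, sigma) - x0) ** 2).mean()  # x0-prediction MSE (assumed)
        opt.zero_grad(); loss.backward(); opt.step()
    return block.cpu()                             # free device memory for the next block
```

Blocks could then be trained one after another, or in parallel on separate devices, over the bands produced by the quantile sketch above.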


Models citing this paper: 0


Datasets citing this paper: 0


Spaces citing this paper: 0


Collections including this paper: 2