Update README.md
README.md
CHANGED
@@ -2,7 +2,7 @@
 LocalDiT is a lightweight Diffusion Transformer model for high-quality text-to-image generation that incorporates local attention mechanisms to improve computational efficiency while maintaining generation quality.
 
 # Model Description
-LocalDiT builds upon the architecture of PixArt
+LocalDiT builds upon the architecture of [PixArt-α](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS), introducing local attention mechanisms to reduce computational complexity and memory requirements. By processing image patches in local windows rather than with global attention, the model achieves faster inference and training while preserving image generation quality.
 
 - **Type**: Diffusion Transformer (DiT) with Local Attention
 - **Parameters**: 0.52B
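
The updated description says the model attends within local windows of image patches instead of globally. The sketch below illustrates why that can lower attention cost: it counts how many query-key pairs are scored in each case. It is not LocalDiT's implementation. The README does not state the window size or windowing scheme, so the non-overlapping square windows, the 64 x 64 patch grid, the 8 x 8 window, and the helper names `window_partition` and `attention_pairs` are all assumptions made for illustration.

```python
def window_partition(height, width, window):
    """Group patch indices of a height x width grid into non-overlapping square windows."""
    if height % window or width % window:
        raise ValueError("grid size must be divisible by the window size")
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            # Row-major token index of every patch inside this window.
            windows.append([
                row * width + col
                for row in range(top, top + window)
                for col in range(left, left + window)
            ])
    return windows


def attention_pairs(height, width, window=None):
    """Count query-key pairs scored by global (window=None) or windowed attention."""
    tokens = height * width
    if window is None:
        # Global attention: every token attends to every token.
        return tokens * tokens
    # Local attention: tokens attend only to tokens in the same window.
    return sum(len(w) ** 2 for w in window_partition(height, width, window))


# Hypothetical sizes (not taken from the README): 64 x 64 patches, 8 x 8 windows.
global_pairs = attention_pairs(64, 64)
local_pairs = attention_pairs(64, 64, window=8)
print(global_pairs // local_pairs)  # prints 64
```

Under these assumed sizes the number of scored pairs drops by a factor equal to the number of windows. Actual speed and memory gains depend on the model's real window configuration and attention kernel.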