Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 103
Causal Diffusion Transformers for Generative Modeling Paper • 2412.12095 • Published Dec 16, 2024 • 23
Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging Paper • 2504.08635 • Published Apr 11, 2025 • 5
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation Paper • 2504.09454 • Published Apr 13, 2025 • 12
Efficient Generative Model Training via Embedded Representation Warmup Paper • 2504.10188 • Published Apr 14, 2025 • 12
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax Paper • 2504.20966 • Published Apr 2025 • 26