Prebuilt Wheels | Python Versions | PyTorch Versions | CUDA Versions | Source |
---|---|---|---|---
Flash-Attention 2.7.4.post1 | 3.12 | 2.8.0.dev | 12.8.1 | Dao-AILab/flash-attention
SageAttention 2.2.0 | 3.12 | 2.8.0.dev | 12.8.1 | jt-zhang/SageAttention2_plus
Flash-Attention 2.8.1 | 3.12 | 2.9.0.dev | 12.9.1 | Dao-AILab/flash-attention
xformers 0.0.31.post1 | 3.12 | 2.9.0.dev | 12.9.1 | facebookresearch/xformers
SageAttention3 (pending official release) | 3.12 | 2.9.0.dev | 12.9.1 | INSERT |
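Each wheel is built against one exact Python/PyTorch/CUDA combination, so the local stack must match a row in the table before installing. Below is a minimal sketch of such a check; the `WHEELS` mapping is transcribed from the table above, and the `matching_wheels` helper name is an assumption for illustration, not part of any of these packages.

```python
# Build matrix transcribed from the table above: (Python, PyTorch, CUDA).
# The dict keys are informal labels, not official package names.
WHEELS = {
    "flash-attn 2.7.4.post1": ("3.12", "2.8.0.dev", "12.8.1"),
    "sageattention 2.2.0":    ("3.12", "2.8.0.dev", "12.8.1"),
    "flash-attn 2.8.1":       ("3.12", "2.9.0.dev", "12.9.1"),
    "xformers 0.0.31.post1":  ("3.12", "2.9.0.dev", "12.9.1"),
}

def matching_wheels(python_ver: str, torch_ver: str, cuda_ver: str) -> list[str]:
    """Return the wheel labels whose build matrix matches the local stack.

    PyTorch nightly versions look like '2.9.0.dev20250710+cu129', so we
    match on the numeric prefix of the '.dev' entries from the table.
    """
    return [
        name
        for name, (py, pt, cu) in WHEELS.items()
        if python_ver.startswith(py)
        and torch_ver.startswith(pt.replace(".dev", ""))
        and cuda_ver == cu
    ]
```

In practice the arguments would come from `platform.python_version()`, `torch.__version__`, and `torch.version.cuda` on the target machine.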