You can get flash-attention 3 ⚡️ directly from the Hub now using `kernels-community/flash-attn3` and the `kernels` library.
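For example, loading it looks roughly like this (a minimal sketch using the `kernels` package's `get_kernel`; the tensor shapes are illustrative, and the exact function surface of the loaded module is an assumption based on the upstream flash-attn API):

```python
# pip install kernels
import torch
from kernels import get_kernel

# Downloads the kernel from the Hub on first use, then loads it from cache.
flash_attn3 = get_kernel("kernels-community/flash-attn3")

# Illustrative fp16 tensors: (batch, seq_len, num_heads, head_dim).
q = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")

# Assumed to mirror upstream flash_attn_func; FA3 also returns the
# log-sum-exp, so we keep only the attention output here.
out = flash_attn3.flash_attn_func(q, k, v, causal=True)[0]
```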
`kernelize` will pick the kernel depending on whether you are going to do training or inference.
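Roughly, that selection looks like this (a sketch following the `kernels` library's `kernelize`/`Mode` API; the `SiluAndMul` layer name is just the illustrative example from its docs):

```python
import torch.nn as nn
from kernels import Mode, kernelize, use_kernel_forward_from_hub

# Opt a layer into Hub kernels by name; its forward can be swapped out.
@use_kernel_forward_from_hub("SiluAndMul")
class SiluAndMul(nn.Module):
    def forward(self, x):
        d = x.shape[-1] // 2
        return nn.functional.silu(x[..., :d]) * x[..., d:]

model = nn.Sequential(SiluAndMul())

# kernelize swaps in a Hub kernel suited to the chosen mode:
model = kernelize(model, mode=Mode.INFERENCE)  # forward-only kernels
model = kernelize(model, mode=Mode.TRAINING)   # kernels with backward support
# Modes can also be combined, e.g. Mode.TRAINING | Mode.TORCH_COMPILE.
```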
`kernels` makes it possible to load compute kernels directly from the Hub! 🚀 You get `torch.compile` support and integration with the `transformers` `generate` interface 🤓 with minimal effort. You'll also have access to all Hub features: a landing page for your creation, discussions, usage metrics, ... 🤓
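A sketch of that `transformers` integration (assuming a recent `transformers` that accepts Hub kernel repos via `attn_implementation`; the checkpoint is just an example):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"  # example checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
    # Pull the attention kernel straight from the Hub:
    attn_implementation="kernels-community/flash-attn3",
)

inputs = tok("The Hub now serves kernels:", return_tensors="pt").to("cuda")
print(tok.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```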