This CUDA extension implements the fused dropout + residual + LayerNorm operation from the flash-attention repo.
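
For orientation, the computation being fused can be sketched in plain Python as an unfused reference. This is only an illustration and not the extension's API. The function name `dropout_add_layer_norm_reference`, its parameters, and the defaults `p=0.1` and `eps=1e-5` are hypothetical. It assumes the usual ordering: dropout is applied to the input, the residual is added, and LayerNorm is applied over the last dimension.

```python
import math
import random


def dropout_add_layer_norm_reference(x, residual, weight, bias,
                                     p=0.1, eps=1e-5, training=True, rng=None):
    """Unfused reference: LayerNorm(dropout(x) + residual), row by row.

    x, residual: lists of rows (each row is a list of floats).
    weight, bias: per-feature LayerNorm parameters (lists of floats).
    All names and defaults here are illustrative assumptions.
    """
    rng = rng or random.Random()
    keep = 1.0 - p
    out = []
    for x_row, r_row in zip(x, residual):
        # Step 1: dropout with inverted scaling (only active in training).
        if training and p > 0.0:
            dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x_row]
        else:
            dropped = list(x_row)
        # Step 2: residual add.
        h = [d + r for d, r in zip(dropped, r_row)]
        # Step 3: LayerNorm over the feature dimension (biased variance).
        mean = sum(h) / len(h)
        var = sum((v - mean) ** 2 for v in h) / len(h)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.append([(v - mean) * inv_std * w + b
                    for v, w, b in zip(h, weight, bias)])
    return out


# Evaluation mode (no dropout), zero residual, identity affine parameters.
y = dropout_add_layer_norm_reference(
    [[1.0, 2.0, 3.0]], [[0.0, 0.0, 0.0]],
    weight=[1.0, 1.0, 1.0], bias=[0.0, 0.0, 0.0], training=False,
)
```

A fused kernel performs these three steps in one pass over the data, which avoids writing the intermediate results to memory between steps.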