# Softpick: No Attention Sink, No Massive Activations with Rectified Softmax

This model is from the paper: arxiv.org/abs/2504.20966

See code: https://github.com/zaydzuhri/softpick-attention

This model is only usable through these repositories:

- https://github.com/zaydzuhri/flash-linear-attention/tree/softpick-attention
- https://github.com/zaydzuhri/flame/tree/softpick-attention
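For orientation, below is a minimal PyTorch sketch of a rectified softmax in the spirit of the paper's title. The function name `softpick`, the `relu(exp(x) - 1)` numerator with an absolute-value normalizer, and the `eps` term are my reading of the paper's formulation, not code taken from the repositories above; those repositories provide the actual fused attention kernels this model requires.

```python
import torch


def softpick(x: torch.Tensor, dim: int = -1, eps: float = 1e-6) -> torch.Tensor:
    """Rectified softmax sketch: relu(exp(x) - 1) / (sum |exp(x) - 1| + eps).

    Unlike softmax, entries with score <= 0 get exactly zero weight, and the
    outputs may sum to less than one, so no position is forced to soak up
    leftover probability mass.
    """
    # Shift by max(x, 0) for numerical stability; the ratio is unchanged
    # because numerator and denominator are both scaled by exp(-m).
    m = x.amax(dim=dim, keepdim=True).clamp(min=0.0)
    e = torch.exp(x - m)
    shift = torch.exp(-m)
    num = torch.relu(e - shift)
    den = (e - shift).abs().sum(dim=dim, keepdim=True)
    return num / (den + eps)


if __name__ == "__main__":
    scores = torch.tensor([[2.0, -1.0, 0.5]])
    print(softpick(scores))        # the negative-score slot is exactly 0
    print(softpick(scores).sum())  # sums to less than 1, unlike softmax
```

Under this reading, the weights are not forced to sum to one, so no token has to absorb leftover probability mass, which is plausibly how the "no attention sink" claim in the title relates to the formula.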