Gated Linear Attention Transformers with Hardware-Efficient Training — Paper • arXiv:2312.06635 • Published Dec 11, 2023