NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models
Paper • 2506.07731
mamba is now available in transformers. Thanks to @tridao and @albertgu for this brilliant model, and for the amazing mamba-ssm kernels powering it!