
based

updated Oct 18, 2024

These language model checkpoints were trained at the 360M and 1.3B parameter scales, for up to 50B tokens on the Pile corpus, and are released for research purposes.
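The checkpoint repos in this collection follow a rough naming scheme of architecture plus scale (with a suffix for the 50B-token 1B runs). As a small illustration, a hypothetical helper that maps an architecture and scale to the repo ids from the listing below might look like this (the `repo_id` function and the key names are my own; the repo ids themselves are taken verbatim from the collection):

```python
# Repo ids for the checkpoints in this collection, keyed by
# (architecture, scale). The ids are copied verbatim from the listing;
# note the inconsistent "50bn" suffix on the attn 1B/50B repo.
BASED_COLLECTION = {
    ("based", "360m"): "hazyresearch/based-360m",
    ("attn", "360m"): "hazyresearch/attn-360m",
    ("mamba", "360m"): "hazyresearch/mamba-360m",
    ("based", "1b"): "hazyresearch/based-1b",
    ("attn", "1b"): "hazyresearch/attn-1b",
    ("mamba", "1b"): "hazyresearch/mamba-1b",
    ("based", "1b-50b"): "hazyresearch/based-1b-50b",
    ("attn", "1b-50b"): "hazyresearch/attn-1b-50bn",
    ("mamba", "1b-50b"): "hazyresearch/mamba-1b-50b",
}


def repo_id(arch: str, scale: str) -> str:
    """Return the Hugging Face repo id for a given architecture and scale."""
    try:
        return BASED_COLLECTION[(arch, scale)]
    except KeyError:
        raise ValueError(f"no checkpoint for arch={arch!r}, scale={scale!r}")


print(repo_id("based", "1b"))  # hazyresearch/based-1b
```

Any of these ids can then be passed to `transformers.AutoModelForCausalLM.from_pretrained(...)`; since the based and mamba checkpoints ship custom modeling code, loading them likely requires `trust_remote_code=True`.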


  • hazyresearch/based-360m: Updated Apr 21, 2024 • 39 • 2
  • hazyresearch/attn-360m: Updated Apr 21, 2024 • 2 • 1
  • hazyresearch/mamba-360m: Updated Apr 21, 2024 • 6
  • hazyresearch/based-1b: Text Generation • Updated Apr 19 • 3 • 8
  • hazyresearch/mamba-1b: Updated Apr 20, 2024 • 13 • 1
  • Simple linear attention language models balance the recall-throughput tradeoff: Paper • arXiv:2402.18668 • Published Feb 28, 2024 • 21
  • hazyresearch/attn-1b: Updated Apr 20, 2024 • 6
  • hazyresearch/based-fda: Viewer • Updated May 19, 2024 • 1.1k • 3.42k • 3
  • hazyresearch/based-squad: Viewer • Updated Jun 1, 2024 • 2.98k • 2.34k • 2
  • hazyresearch/based-swde: Viewer • Updated May 19, 2024 • 1.11k • 2.88k • 4
  • hazyresearch/based-1b-50b: Updated May 5, 2024 • 8 • 1
  • hazyresearch/mamba-1b-50b: Updated Apr 20, 2024 • 4
  • hazyresearch/attn-1b-50bn: Updated May 2, 2024 • 1
  • hazyresearch/my-awesome-model: Updated Oct 18, 2024 • 5 • 2