Bingus-v0.1-60M-Base

A not-so-state-of-the-art 60M-parameter transformer model, using the default OLMo architecture.

Specs

  • Heads: 8
  • Layers: 8
  • Model dimension (d_model): 512
  • MLP dimension (d_mlp): 4096
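As a rough cross-check of the specs above, the following sketch estimates the total parameter count. The OLMo-style padded vocabulary of 50,304, SwiGLU MLP, and tied embeddings are my assumptions, not stated in this card:

```python
# Rough parameter-count estimate from the specs above.
# Assumptions (not stated in the card): OLMo-style padded vocab of
# 50,304, SwiGLU MLP, tied input/output embeddings, no biases.
d_model, d_mlp, n_layers, vocab = 512, 4096, 8, 50_304

emb = vocab * d_model                    # tied token embedding / LM head
attn = 4 * d_model * d_model             # Q, K, V, O projections per layer
mlp = 3 * d_model * d_mlp                # gate, up, down (SwiGLU) per layer
total = emb + n_layers * (attn + mlp)

print(f"{total / 1e6:.1f}M parameters")  # ~84.5M, close to the 85.1M checkpoint
```

Under those assumptions the estimate lands near the 85.1M parameters reported for the Safetensors checkpoint; the "60M" in the model name presumably counts non-embedding parameters.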

Perplexity (eval/v3-small-c4_en-validation): 40.33
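For reference, perplexity is the exponential of the mean cross-entropy loss, so this score corresponds to a validation loss of roughly 3.70 nats per token:

```python
import math

# Perplexity is exp(mean cross-entropy loss), so the reported 40.33
# corresponds to a per-token loss of ln(40.33).
ppl = 40.33
loss = math.log(ppl)
print(f"loss ≈ {loss:.2f} nats/token")  # ≈ 3.70
```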

Training Data

Pretraining:

  • 5B tokens of C4 (preprocessed, from olmo-data.org)
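A minimal usage sketch. It is an assumption on my part that this checkpoint loads through the transformers AutoModel API (recent transformers versions support OLMo natively; older ones may need `trust_remote_code=True`), and it requires `transformers` and `torch` installed:

```python
MODEL_ID = "KT313/Bingus-v0.1-60M-Base"

def generate(prompt: str, max_new_tokens: int = 32) -> str:
    # Lazy imports so the sketch can be read without the deps installed;
    # this call downloads the checkpoint from the Hugging Face Hub.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```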
Checkpoint: 85.1M params (F32, Safetensors)
