license: apache-2.0
datasets:
- HuggingFaceTB/smollm-corpus
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
tags:
- fineweb
- t5
- 1024 ctx
- SiLU activations
- `fineweb-edu-dedup` split of `HuggingFaceTB/smollm-corpus`
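
For reference, a minimal sketch of the SiLU activation listed above, written in plain Python. This only illustrates the function itself, `silu(x) = x * sigmoid(x)`; the helper name `silu` is an assumption for the example and is not taken from this model's code.

```python
import math


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x).

    Illustrative helper only, not this model's implementation.
    The two branches avoid overflow in exp() for large |x|.
    """
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(f"silu({x}) = {silu(x):.4f}")
```

Unlike ReLU, SiLU is smooth and slightly negative for negative inputs, approaching zero as `x` goes to negative infinity.
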
## plots | |
training loss | |
 | |
grad norm | |
 | |
weights norm | |
 | |
--- |