license: apache-2.0
datasets:
- HuggingFaceTB/smollm-corpus
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
tags:
- fineweb
- t5
- 1024 ctx
- SiLU activations
- `fineweb-edu-dedup` split of `HuggingFaceTB/smollm-corpus`
## plots
training loss
![loss](./checkpoints/loss_over_steps.png)
grad norm
![grad](./checkpoints/grad_l2_over_steps.png)
weights norm
![weights](./checkpoints/weights_l2_over_steps.png)
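
The file names `grad_l2_over_steps.png` and `weights_l2_over_steps.png` suggest the grad and weights plots track an L2 norm. The card does not say how the values were computed, so the following is only a minimal sketch of one common definition: a single global L2 norm taken over all tensors as if they were one flat vector. The function name `global_l2_norm` and the toy values are hypothetical; in a real training loop the inputs would be the flattened parameters (or their gradients) of the model.

```python
import math


def global_l2_norm(tensors):
    """Return the L2 norm of every value in `tensors`, treated as one flat vector.

    `tensors` is an iterable of flat sequences of floats, one per parameter.
    This is an assumed definition, not necessarily the one used for the plots.
    """
    total = 0.0
    for values in tensors:
        # Accumulate the squared entries of each tensor.
        total += sum(v * v for v in values)
    return math.sqrt(total)


# Hypothetical toy values standing in for two per-parameter gradient tensors.
grads = [[3.0, 0.0], [0.0, 4.0]]
print(global_l2_norm(grads))  # 5.0
```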
---