---
library_name: transformers
tags:
- language-model
license: odc-by
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
---
# Model Card for AICrossSim/bitflip-clm-600m

A 600M-parameter bitflip-aware language model trained on 22 × 600M (≈13.2B) tokens from the FineWeb-Edu dataset.
## Model Details

bitflip-aixsim-600M is a transformer-based language model with approximately 600 million parameters (excluding embedding parameters). It uses RMSNorm for normalization and is trained on the FineWeb-Edu dataset.
- **Developed by:** AICrossSim
- **Funded by:** ARIA
- **Model type:** Transformer language model
- **Language(s) (NLP):** English
- **Tokenizer:** HuggingFaceTB/cosmo2-tokenizer
- **Repository:** AICrossSim/NewComputeBench
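
## How to Get Started

Since the card lists `transformers` as the library, the model should be loadable through the standard `AutoModelForCausalLM` / `AutoTokenizer` API. Below is a minimal, untested sketch; it assumes the checkpoint on the Hub bundles its tokenizer config (otherwise fall back to HuggingFaceTB/cosmo2-tokenizer), and the prompt and generation settings are illustrative, not values from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is hosted under this Hub id, as named in the card title.
model_id = "AICrossSim/bitflip-clm-600m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = "The FineWeb-Edu dataset is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    # max_new_tokens and greedy decoding are illustrative defaults, not from the card.
    output = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```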
## Training Details

The experiment setup and training logs can be found in the associated wandb run.