# Latent SAE

A series of sparse autoencoders (SAEs) trained on embeddings from nomic-embed-text-v1.5.

The SAEs were trained on the 100BT sample of FineWeb-Edu; see also an example built on the 10BT sample of FineWeb-Edu.

Run the models, or train your own, with Latent SAE, which borrows heavily from https://github.com/EleutherAI/sae.
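
As a rough inference sketch: the embedding step below is the standard sentence-transformers API for nomic-embed-text-v1.5, while the SAE loading and encoding calls are assumptions modeled on the EleutherAI sae interface; check the Latent SAE repository for the actual entry point.

```python
# Sketch: embed a passage with nomic-embed-text-v1.5, then encode it with an SAE.
# NOTE: the SAE calls at the bottom are assumptions (modeled on EleutherAI/sae);
# consult the Latent SAE repo for the real interface.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)
# nomic-embed-text-v1.5 expects a task prefix such as "search_document: ".
emb = encoder.encode(
    ["search_document: How do sparse autoencoders work?"],
    convert_to_tensor=True,
)

# Hypothetical SAE usage (assumed API, not confirmed by this model card):
# sae = Sae.load_from_hub("enjalot/sae-nomic-text-v1.5-FineWeb-edu-100BT")
# latents = sae.encode(emb)  # sparse feature activations, k=64 nonzero per input
```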

## Training

The models were trained on Modal Labs infrastructure with the following command:

```bash
modal run train_modal.py --batch-size 512 --grad-acc-steps 4 --k 64 --expansion-factor 32
```
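
With `--grad-acc-steps 4`, the effective batch size is 512 × 4 = 2048 embeddings per optimizer step. The `--k` and `--expansion-factor` flags set the SAE's shape: conceptually, something like the minimal top-k SAE below (an illustrative PyTorch sketch, not the repository's training code; `d_in = 768` matches nomic-embed-text-v1.5's output dimension).

```python
import torch
import torch.nn as nn


class TopKSAE(nn.Module):
    """Minimal top-k sparse autoencoder (illustrative sketch).

    With d_in=768 and expansion_factor=32 there are 768 * 32 = 24576 latents,
    of which only k=64 stay active for any given input.
    """

    def __init__(self, d_in: int = 768, expansion_factor: int = 32, k: int = 64):
        super().__init__()
        d_sae = d_in * expansion_factor
        self.k = k
        self.encoder = nn.Linear(d_in, d_sae)
        self.decoder = nn.Linear(d_sae, d_in)

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        pre = self.encoder(x)
        # Keep only the k largest pre-activations per example; zero out the rest.
        topk = torch.topk(pre, self.k, dim=-1)
        latents = torch.zeros_like(pre).scatter_(-1, topk.indices, topk.values.relu())
        return self.decoder(latents), latents


sae = TopKSAE()
x = torch.randn(4, 768)               # stand-in for a batch of embeddings
recon, latents = sae(x)               # latents: at most 64 nonzero entries per row
```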

Error and dead latents charts: *(image)*


Dataset used to train enjalot/sae-nomic-text-v1.5-FineWeb-edu-100BT: the 100BT sample of FineWeb-Edu described above.