Doge-tokenizer

A tokenizer for training models on smollm-corpus, with support for R1-style reasoning fine-tuning. It was trained on 2M samples drawn from:

  • FineWeb-Edu 70%
  • Cosmopedia v2 20%
  • Python-Edu 5%
  • FineMath 5%
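
As a rough illustration of what this mixture means in sample counts, the sketch below splits the 2M-sample budget by the listed percentages. It assumes that "2M" means exactly 2,000,000 and that the percentages apply to sample counts rather than tokens; neither is stated on this card. The names `MIXTURE_PERCENT`, `TOTAL_SAMPLES`, and `samples_per_source` are hypothetical and are not part of any published training script.

```python
# Mixture proportions as listed above, in percent of the sample budget.
MIXTURE_PERCENT = {
    "FineWeb-Edu": 70,
    "Cosmopedia v2": 20,
    "Python-Edu": 5,
    "FineMath": 5,
}

# Assumption: "2M samples" is taken to mean exactly 2,000,000.
TOTAL_SAMPLES = 2_000_000


def samples_per_source(total, mixture_percent):
    """Split a sample budget across sources according to integer percentages."""
    if sum(mixture_percent.values()) != 100:
        raise ValueError("mixture percentages must sum to 100")
    # Integer arithmetic keeps the counts exact (no floating-point rounding).
    return {name: total * pct // 100 for name, pct in mixture_percent.items()}


counts = samples_per_source(TOTAL_SAMPLES, MIXTURE_PERCENT)
for name, n in counts.items():
    print(f"{name}: {n:,}")
```

Under these assumptions, FineWeb-Edu contributes 1,400,000 samples, Cosmopedia v2 400,000, and Python-Edu and FineMath 100,000 each.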
Dataset used to train SmallDoge/Doge-tokenizer