---
base_model:
  - BAAI/OpenSeek-Small-v1
license: open-mdw
---

# OpenSeek-Small-v1-SFT Documentation

## Overview

We adopt the OctoThinker recipe to build a strong reasoning foundation. Training proceeds in two phases: a stable mid-training phase on 200 billion tokens of mathematical corpus, followed by a 20-billion-token decay phase. We then fine-tune the model on the Infinity-Instruct dataset to improve instruction following. The model is open-sourced as a baseline for future experiments, such as enhancing the reasoning capabilities of small models through reinforcement learning. The architecture is identical to that of OpenSeek-Small-v1.

## Evaluation

| Metric | GSM8K  | MATH-500 | Minerva Math | OlympiadBench | Avg.   |
|--------|--------|----------|--------------|---------------|--------|
| Pass@1 | 20.698 | 13.100   | 3.470        | 2.741         | 10.002 |
| Pass@4 | 41.768 | 19.100   | 8.415        | 4.997         | 18.570 |
| Pass@8 | 51.838 | 19.599   | 11.680       | 5.185         | 22.075 |
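For reference, Pass@k in tables like the one above is typically computed with the unbiased estimator from the Codex evaluation protocol: with n samples per problem and c of them correct, Pass@k = 1 − C(n−c, k) / C(n, k). A minimal sketch (the sample counts in the example are illustrative, not taken from our evaluation):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: 1 - C(n-c, k) / C(n, k).

    n: total samples generated per problem
    c: number of those samples that are correct
    k: the k in Pass@k
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset
        # must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative: 8 samples per problem, 2 correct
print(pass_at_k(8, 2, 1))  # 0.25, i.e. c/n for k=1
```

Per-problem scores are then averaged over the benchmark to produce the percentages reported in the table.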

## License

OpenMDW 1.0