---
base_model:
  - BAAI/OpenSeek-Small-v1
license: open-mdw
---

# OpenSeek-Small-v1-SFT Documentation

## Overview

We adopt the OctoThinker recipe to build a strong reasoning foundation. Training proceeds in two phases: a stable mid-training phase on 200 billion tokens of mathematical corpus, followed by a 20-billion-token decay phase. We then fine-tune the model on the Infinity-Instruct dataset to improve instruction following. The model is open-sourced as a baseline for future experiments, such as enhancing the reasoning capabilities of small models through reinforcement learning. The architecture is identical to that of OpenSeek-Small-v1.

## Evaluation

| Metric | GSM8K  | MATH-500 | Minerva Math | OlympiadBench | Avg.   |
|--------|--------|----------|--------------|---------------|--------|
| Pass@1 | 20.698 | 13.100   | 3.470        | 2.741         | 10.002 |
| Pass@4 | 41.768 | 19.100   | 8.415        | 4.997         | 18.570 |
| Pass@8 | 51.838 | 19.599   | 11.680       | 5.185         | 22.075 |
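For reference, Pass@k in tables like the one above is typically computed with the unbiased estimator from the Codex evaluation protocol: with n samples per problem and c of them correct, Pass@k = 1 − C(n−c, k) / C(n, k). A minimal sketch (the sample counts in the example are illustrative, not taken from our evaluation):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: 1 - C(n-c, k) / C(n, k).

    n: total samples generated per problem
    c: number of those samples that are correct
    k: the k in Pass@k
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset
        # must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative: 8 samples per problem, 2 correct
print(pass_at_k(8, 2, 1))  # 0.25, i.e. c/n for k=1
```

Per-problem scores are then averaged over the benchmark to produce the percentages reported in the table.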

## License

OpenMDW 1.0