thomas-yanxin/XinYuan-Qwen2.5-7B-0917

The main purpose of this model is to validate the usability of thomas-yanxin/MT-SFT-ShareGPT, i.e., the quality of the data is all you need. We found that when we meticulously extract the data through a better data governance approach, the corresponding model results can be vastly improved, even if only through SFT.

Here are the results from our OpenCompass evaluation：

Classification	Benchmarks	Models
	名称	XinYuan-Qwen2-7B
English	MMLU	73.72
	MMLU-Pro	/
	Theorem QA	/
	GPQA	33.04
	BBH	67.55
	IFEval (Prompt Strict-Acc.)	40.48
	ARC-C	91.19
Math	GSM8K	82.94
	MATH	41.06
Chinese	C-EVAL	81.02
	CMMLU	80.06
Code	MBPP	50.6
	HumanEval	83.99

thomas-yanxin
/

XinYuan-Qwen2.5-7B-0917

Dataset used to train thomas-yanxin/XinYuan-Qwen2.5-7B-0917