Base Model: ReasoningEval/DeepSeek-R1-Distill-Qwen-7B-Huatuo-SFT-all

Training Epochs: 3

Training Objective: Reinforcement learning (RL)

Training Data: ReasoningEval/Huatuo-RL