DeepSeek-R1-Distill-Qwen-7B-RL-length-penalty-low-new / model-00004-of-00004.safetensors

Commit History

Training in progress, step 116
68a5fdc
verified

zijianh committed on

Training in progress, step 110
92fbd35
verified

zijianh committed on

Training in progress, step 100
7ec7d90
verified

zijianh committed on

Training in progress, step 90
fba36d8
verified

zijianh committed on

Training in progress, step 80
cf9383c
verified

zijianh committed on

Training in progress, step 70
833cff8
verified

zijianh committed on

Training in progress, step 60
c4315ae
verified

zijianh committed on

Training in progress, step 50
7780eeb
verified

zijianh committed on

Training in progress, step 40
90c9c5e
verified

zijianh committed on

Training in progress, step 30
936b962
verified

zijianh committed on

Training in progress, step 20
c82d4c6
verified

zijianh committed on

Training in progress, step 10
a988725
verified

zijianh committed on