mradermacher/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter1-4k-GGUF • 15B • Updated 8 days ago • 257 • 1
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter2-4k • Text Generation • 0.0B • Updated 7 days ago • 6 • 1
mradermacher/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter2-4k-GGUF • 15B • Updated 7 days ago • 229 • 1
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-spin-iter1-RPO • Text Generation • 0.0B • Updated 6 days ago • 17 • 1
mradermacher/Qwen2.5-14B-Instruct-ultrafeedback-spin-iter1-RPO-GGUF • 15B • Updated 4 days ago • 259 • 1