nisten/medgrpo-3b-experiment3

Tags: Text Generation, text-generation-inference


Uploaded model

Developed by: nisten
License: apache-2.0
Finetuned from model: unsloth/DeepSeek-R1-Distill-Qwen-1.5B

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
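
The card gives no usage instructions, so the following is only a minimal sketch of how a checkpoint like this is typically loaded with the Hugging Face transformers library. It assumes the repository ships a standard tokenizer and causal-LM weights, that torch and transformers are installed, and that plain-text prompting is acceptable; the helper name `generate` and its defaults are hypothetical, not part of the model's own tooling.

```python
MODEL_ID = "nisten/medgrpo-3b-experiment3"  # repository name taken from this card


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and return the text generated after `prompt`.

    Hypothetical helper for illustration; it reloads the model on every call.
    """
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: use half precision on a GPU, fall back to float32 on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=dtype).to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("...")` with a prompt of your choice downloads the weights on first use. Whether the fine-tune expects a chat template or a particular prompt format is not stated on the card, so treat the plain-text prompt as an assumption.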

Downloads last month: 4

Safetensors
Model size: 1.78B params
Tensor type: F16
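
As a rough back-of-the-envelope check, the two figures above imply the approximate size of the weights alone. This sketch assumes every parameter is stored as a 2-byte F16 value and ignores file metadata, activations, and the KV cache; the function name `weight_bytes` is hypothetical.

```python
PARAMS = 1.78e9        # "1.78B params" from the card
BYTES_PER_PARAM = 2    # F16 stores each value in 16 bits


def weight_bytes(n_params: float, bytes_per_param: int) -> float:
    """Approximate storage needed for the raw weights, in bytes."""
    return n_params * bytes_per_param


size = weight_bytes(PARAMS, BYTES_PER_PARAM)
print(f"{size / 1e9:.2f} GB ({size / 2**30:.2f} GiB)")  # prints "3.56 GB (3.32 GiB)"
```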


Model tree for nisten/medgrpo-3b-experiment3

Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
Finetuned: unsloth/DeepSeek-R1-Distill-Qwen-1.5B
Finetuned (49): this model