Uploaded model

  • Developed by: tcotter
  • License: apache-2.0
  • Finetuned from model: unsloth/DeepSeek-R1-Distill-Qwen-1.5B-unsloth-bnb-4bit

Fine-tuned Qwen 1.5B (R1-distilled version) on this dataset, which is derived from this dataset with an additional "summary" produced by an in-house synthetic data generator.

This LoRA adapter therefore helps the model return a "\n\nFinal Answer: ..." after its reasoning and initial response steps.
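Since the adapter is meant to make the output end with a "\n\nFinal Answer: ..." segment, downstream code can split on that marker. The following is a minimal sketch, not part of the original training or inference code: the function name `split_final_answer`, the constant name, and the sample string are hypothetical, and it assumes the marker appears literally in the decoded text.

```python
from typing import Optional, Tuple

# Marker the adapter is described as encouraging at the end of a response.
FINAL_ANSWER_MARKER = "\n\nFinal Answer:"


def split_final_answer(generated: str) -> Tuple[str, Optional[str]]:
    """Split decoded model output into (body, final_answer).

    Uses the last occurrence of the marker, so an earlier mention of
    "Final Answer:" inside the reasoning does not cut the text short.
    Returns (generated, None) when the marker is absent.
    """
    body, marker, tail = generated.rpartition(FINAL_ANSWER_MARKER)
    if not marker:
        # Marker not found: hand the text back untouched.
        return generated, None
    return body, tail.strip()


# Hypothetical decoded output, for illustration only.
sample = "<think>2 + 2 is 4.</think>\nThe sum is 4.\n\nFinal Answer: 4"
body, answer = split_final_answer(sample)
```

Returning `None` rather than an empty string lets callers distinguish a missing marker from an empty answer, which is useful because a small model may not always emit the marker.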

See this paper for more details.

This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.

Model tree for tcotter/DeepSeek-R1-Qwen-1.5B-unsloth-bnb-4bit-LoRA-Adapter