license: mit
datasets:
- pr4nav101/COT_TTM_Finetuning
language:
- en
base_model:
- unsloth/Llama-3.2-1B-Instruct-bnb-4bit
- pr4nav101/llama-3-8b-Instruct-bnb-4bit-Tiny-Thought-Model-Large
pipeline_tag: text-generation
library_name: peft
tags:
- COT
- TTM
- LLM
method:
- Knowledge Distillation with Reverse KL Divergence + PEFT Finetuning