Draft Model for Speculative Decoding
#4, opened by SupremeCmdr741
As the title says, I'm wondering whether there is a draft model for speculative decoding, something a fraction of the size of the big one, to speed things up.
Sorry it took so long to get to this.
I don't remember whether sleep built this model from scratch or based it on Cydonia 2.1.
If it is based on Cydonia, you can try this draft model: https://huggingface.co/alamios/Mistral-Small-3.1-DRAFT-0.5B
There is a good chance it will work.
thank you!
No problem!
After looking at the merge information in more detail, yeah, that model should work as a draft.
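
For anyone landing here later, here is a rough sketch of what a draft model does during greedy speculative decoding, and why it has to come from a compatible base: the small model proposes token IDs, and the big model only keeps the ones it would have picked itself. This is a toy illustration, not how any particular backend implements it. The "models" are plain Python functions, and every name in it (`speculative_decode`, `toy_target`, `toy_draft`, `k`) is made up for the example.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token prefix to its greedy next token.
# Both models must use the same token IDs, which is why the draft has to share
# a vocabulary with the big model.
NextToken = Callable[[List[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Plain greedy decoding with a single model, used as the reference."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], float]:
    """Greedy speculative decoding; returns (tokens, draft acceptance rate)."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    accepted = 0
    proposed = 0
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens in a row.
        n = min(k, goal - len(tokens))
        proposal: List[int] = []
        for _ in range(n):
            proposal.append(draft(tokens + proposal))
        proposed += n

        # 2. The target checks each proposed position. A real engine scores all
        #    positions in one batched forward pass; this loop is sequential
        #    only to keep the sketch short.
        for i, tok in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if tok != expected:
                # First mismatch: keep the agreed prefix plus the target's own
                # token, and throw away the rest of the proposal.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                accepted += i
                break
        else:
            # Every proposed token matched what the target would have chosen.
            tokens.extend(proposal)
            accepted += n
    return tokens, accepted / proposed


def toy_target(tokens: List[int]) -> int:
    """Hypothetical stand-in for the big model."""
    return (tokens[-1] * 3 + 1) % 10


def toy_draft(tokens: List[int]) -> int:
    """Hypothetical stand-in for the draft: agrees with the target most of the time."""
    if tokens[-1] == 3:
        return 9  # deliberate disagreement
    return (tokens[-1] * 3 + 1) % 10


output, rate = speculative_decode(toy_target, toy_draft, prompt=[1], n_new=8)
print(output, rate)
```

The output is identical to what the big model would produce alone; the draft only changes how many big-model steps can be checked at once. The more often the draft agrees, the bigger the speedup, which is why a draft trained against the same base family is the one to try first.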
FrenzyBiscuit changed discussion status to closed