# Model description
This is a Vicuna-like model with only 68M parameters, fine-tuned from LLaMA-68m on ShareGPT data.
The training setup follows the Vicuna suite.
The model is mainly developed as a base Small Speculative Model in the MCSD paper. Compared with LLaMA-68m, it aligns better with the Vicuna models while losing little alignment with the LLaMA models.
| Draft Model | Target Model | Alignment |
|---|---|---|
| LLaMA-68/160M | LLaMA-13/33B | π |
| LLaMA-68/160M | Vicuna-13/33B | π |
| Vicuna-68/160M | LLaMA-13/33B | π |
| Vicuna-68/160M | Vicuna-13/33B | π |