Part of the "finetuned smol 220M" collection: smol_llama 220M fine-tunes we did (6 items).
This is BEE-spoke-data/smol_llama-220M-GQA fine-tuned for code generation on:
Both this model and the base model were trained with a context length of 2048 tokens.
An example script for inference testing is available here.
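
For reference, a minimal generation sketch with `transformers`. The repo id below is the *base* model named above (substitute the fine-tuned checkpoint's id), and the prompt is just an illustrative example:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: this is the base model's repo id; swap in the fine-tuned
# checkpoint's repo id when testing the fine-tune itself.
model_id = "BEE-spoke-data/smol_llama-220M-GQA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Illustrative prompt: the model is geared toward short completions
# such as single lines or docstrings.
prompt = "def fibonacci(n: int) -> int:\n"
inputs = tokenizer(prompt, return_tensors="pt")

# The trained context length is 2048 tokens, so keep the prompt plus
# new tokens within that window.
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
```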
It has its limitations at 220M parameters, but it seems decent for single-line or docstring completion, and as a draft model for speculative decoding in those same settings (a sketch follows below).
The screenshot shows generation running on a laptop CPU.
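
Below is a sketch of using the 220M model as the draft in assisted ("speculative") generation via the `assistant_model` argument to `generate` in `transformers`. The target repo id here is purely a placeholder; assisted generation expects the draft and target to share a compatible tokenizer/vocabulary, so pick your target accordingly:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

draft_id = "BEE-spoke-data/smol_llama-220M-GQA"  # or the fine-tuned checkpoint
target_id = "your-org/larger-code-model"         # placeholder: any compatible larger causal LM

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id)
draft = AutoModelForCausalLM.from_pretrained(draft_id)

prompt = '"""Compute the mean of a list of numbers."""\ndef mean(xs):\n'
inputs = tokenizer(prompt, return_tensors="pt")

# Passing `assistant_model` enables assisted generation: the small draft
# proposes several tokens per step and the target verifies them, which
# can speed up decoding while matching the target's outputs.
out = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```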