Any plans for 32B/70B distilled models?
#83, opened by NanaBanana22
Hey. Any plans to distill into Qwen3 32B / Llama 70B?
we want this too!
qwen3 30b a3b
Please, no more distills. They lag so far behind because they use entirely different architectures (in this case, Qwen3).
I'd rather have a DeepSeek R1 Lite. The same model, with the same training data, just scaled down so it can run on consumer hardware.