Original model: https://huggingface.co/SakuraLLM/Sakura-13B-Qwen2beta-v0.9
4-bit AWQ quantization. Untested; not recommended for use.
GroupSize=64
Intended for dual-GPU inference on Kaggle.
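
As a rough illustration of what 4-bit quantization with GroupSize=64 implies for memory, the sketch below estimates the size of the quantized weights and how they would divide across two GPUs. It is a back-of-the-envelope estimate, not a measurement of this checkpoint. The assumptions are: each group of 64 weights stores one 16-bit scale and one 4-bit zero point, the parameter count is the nominal 13B from the model name, and the weights split evenly across GPUs. Unquantized layers, the KV cache, and activations are ignored, so real usage will be higher. The function names are hypothetical helpers written for this example.

```python
def effective_bits_per_weight(bits=4, group_size=64, scale_bits=16, zero_bits=4):
    """Bits stored per weight, including per-group metadata.

    Assumption: every group of `group_size` weights carries one scale
    (`scale_bits`) and one zero point (`zero_bits`).
    """
    return bits + (scale_bits + zero_bits) / group_size


def quantized_weight_gib(num_params, num_gpus=1, **kwargs):
    """Approximate quantized weight size in GiB per GPU, assuming an even split."""
    total_bytes = num_params * effective_bits_per_weight(**kwargs) / 8
    return total_bytes / num_gpus / 2**30


# 4 bits per weight plus 20 bits of metadata per 64 weights
print(effective_bits_per_weight())  # 4.3125

# Nominal 13B parameters (assumed from the model name), split over 2 GPUs
print(round(quantized_weight_gib(13e9, num_gpus=2), 2))  # 3.26
```

Under these assumptions the group metadata adds about 8% on top of the raw 4-bit weights. A smaller group size increases this overhead.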