Why is Int4 slower than unquantized float32 and float16?
#3 opened 5 days ago by hujianmin
Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4 model takes too long to load (nearly 2 hours)
#2 opened 8 months ago by TimVan1

Not working with sample code
3
#1 opened 12 months ago by rupeshs