We look forward to a perfect AWQ or GPTQ quantized version.
1
#2 opened about 1 month ago by su400
How to compress only the non-shared experts within transformer blocks?
1
#1 opened 2 months ago by CobraMamba
