Could you give me a reason why you ignore the kv_a_proj_with_mqa layer when quantizing this model?
#10 · opened by superahn
Could you give me a reason why you ignore the kv_a_proj_with_mqa layer when quantizing this model?
Because some kernel implementations of AWQ only support dimensions divisible by 128, while that layer has a dimension of 64.
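A minimal sketch of that constraint, assuming the common AWQ group size of 128 (the group size and dimension values here are illustrative, not taken from this model's config):

```python
def can_awq_quantize(dim: int, group_size: int = 128) -> bool:
    """AWQ kernels quantize weights in fixed-size groups along a
    dimension, so that dimension must be a multiple of the group size."""
    return dim % group_size == 0

# A typical hidden dimension groups cleanly:
print(can_awq_quantize(5120))  # True
# A 64-wide projection (like the layer discussed above) does not:
print(can_awq_quantize(64))    # False
```

Layers that fail this check are typically excluded from quantization and kept in the original precision.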