could you give me a reason why you ignore kv_a_proj_with_mqa layer when quantizing this model?

#10
by superahn - opened


Cognitive Computations org

Because some kernel implementations of AWQ only support dimensions divisible by 128, while that layer has a dimension of 64.
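To illustrate the constraint: a minimal sketch of how a quantizer might decide which linear layers to leave unquantized because their input dimension is not divisible by the kernel's group size. The group size of 128, the helper name, and the example layer dimensions are assumptions for illustration, not the actual quantization script.

```python
# Sketch: skip layers whose input dimension the AWQ kernel cannot handle.
# group_size=128 and the layer dims below are illustrative assumptions.
GROUP_SIZE = 128

def should_skip(in_features: int, group_size: int = GROUP_SIZE) -> bool:
    # A layer whose input dimension is not a multiple of the group size
    # cannot use the fused AWQ kernel, so it is left in full precision.
    return in_features % group_size != 0

# Hypothetical layer-name -> input-dimension map for a MLA-style model.
layers = {
    "self_attn.q_proj": 5120,                 # 5120 % 128 == 0 -> quantize
    "self_attn.kv_a_proj_with_mqa": 64,       # 64 % 128 != 0  -> skip
}

skipped = [name for name, dim in layers.items() if should_skip(dim)]
print(skipped)
```

In practice such skipped layers are typically passed to the quantizer via a "modules to not convert" list so they stay in full precision.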
