GPTQ first

by rabidaught - opened

How dare you wait to do GPTQ last?
Are you out of your mind?

GGML is fast to quantize. GPTQ is slow to quantize. Even if you wanted everything at exactly the same time, the GGML files would always be ready first.

Thank you for releasing this. I respect your effort.

Is there documentation anywhere that explains the following parameters and their effects?

  • Bits
  • GS
  • Act Order
  • Damp %
  • ExLlama

There's a basic explanation in the README:
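Separately from the README text, here is a rough sketch of how these settings are commonly understood for GPTQ. The `GPTQSettings` class and `approx_bits_per_weight` helper are hypothetical names made up for illustration, not part of any library, and the storage estimate assumes one 16-bit scale plus one packed zero-point per group, which may not match every implementation. ExLlama is not a quantization parameter: as far as I understand it, that column only indicates whether a given file can be loaded by the ExLlama loader.

```python
from dataclasses import dataclass


@dataclass
class GPTQSettings:
    """Hypothetical container for the listed parameters (not a real library class)."""
    bits: int = 4               # Bits: width of each quantized weight
    group_size: int = 128       # GS: number of weights sharing one scale/zero-point;
                                # -1 is assumed here to mean "no grouping"
    act_order: bool = False     # Act Order: process columns in order of activation
                                # importance, usually for better accuracy
    damp_percent: float = 0.01  # Damp %: dampening added to the Hessian diagonal
                                # for numerical stability during quantization


def approx_bits_per_weight(s: GPTQSettings, scale_bits: int = 16) -> float:
    """Estimate stored bits per weight, including per-group metadata.

    Assumption: each group stores one scale (scale_bits wide) and one
    zero-point packed at the same width as the weights.
    """
    if s.group_size <= 0:
        # Without grouping the metadata cost is negligible; ignore it.
        return float(s.bits)
    return s.bits + (scale_bits + s.bits) / s.group_size


if __name__ == "__main__":
    for gs in (32, 64, 128, -1):
        cfg = GPTQSettings(bits=4, group_size=gs)
        print(f"GS={gs:>4}: ~{approx_bits_per_weight(cfg):.3f} bits per weight")
```

Under these assumptions, a smaller group size costs a little more storage and VRAM per weight in exchange for finer-grained scales, which is the usual reason the same model is offered at several GS values.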

