How imatrix is generated

#1
by yarikdevcom - opened

Hi, I'm curious what the process of creating an imatrix is. I want to experiment with a different one, so any direction is appreciated.

the imatrix file is created by llama-imatrix, and the process is relatively straightforward. you should use gpu acceleration for it, because otherwise llama-imatrix will happily run on your cpu and never finish. you need some training (calibration) data, too; for example, bartowski has a good one: https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8

llama.cpp comes with docs about this, you should find and study them, but the above is basically the process.
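
For anyone who wants something concrete to start from, here is a minimal sketch of the process described above, written as a small Python helper that assembles a llama-imatrix command line. The file names (`model-f16.gguf`, `calibration.txt`, `imatrix.dat`) and the helper `build_imatrix_command` are placeholders made up for illustration. The flags shown (`-m`, `-f`, `-o`, `-ngl`) are the usual llama.cpp ones, but check `llama-imatrix --help` for your build before relying on them.

```python
import shlex


def build_imatrix_command(model_path, calibration_path,
                          output_path="imatrix.dat", gpu_layers=99,
                          binary="llama-imatrix"):
    """Return the argv list for a llama-imatrix run (hypothetical helper).

    gpu_layers is passed as -ngl so the layers are offloaded to the GPU;
    99 is just a "more than the model has" placeholder.
    """
    return [
        binary,
        "-m", str(model_path),        # unquantized (or high-precision) GGUF model
        "-f", str(calibration_path),  # plain-text calibration data
        "-o", str(output_path),       # where the imatrix file is written
        "-ngl", str(gpu_layers),      # offload layers to the GPU
    ]


if __name__ == "__main__":
    cmd = build_imatrix_command("model-f16.gguf", "calibration.txt")
    print(shlex.join(cmd))
    # To actually run it (assumes llama-imatrix is on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

The resulting file is then handed to llama-quantize through its `--imatrix` option when producing the quantized model.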

Awesome, thank you for the direction and the link, I will look into it.

Did I get it right that the imatrix is basically used to calibrate which layers to quantize (i.e. to prioritize what is active for the given prompts)?
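
A rough sketch of the idea, as far as it is generally understood: the imatrix does not choose which layers get quantized. Instead, for each weight tensor it records how strongly each input column was activated on the calibration text (roughly an average of squared activations), and the quantizer uses those numbers as weights when deciding which rounding errors matter most. The toy code below is an illustration under that assumption, not llama.cpp's actual implementation; all names and numbers are made up.

```python
def accumulate_importance(activations):
    """Mean squared activation per input column over all calibration rows."""
    n_cols = len(activations[0])
    sums = [0.0] * n_cols
    for row in activations:
        for j, x in enumerate(row):
            sums[j] += x * x
    return [s / len(activations) for s in sums]


def weighted_error(weights, quantized, importance):
    """Importance-weighted squared rounding error for one weight row."""
    return sum(imp * (w - q) ** 2
               for w, q, imp in zip(weights, quantized, importance))


# Toy calibration activations: column 1 is very active, column 2 never fires.
activations = [[0.1, 2.0, 0.0],
               [0.1, -2.0, 0.0]]
importance = accumulate_importance(activations)

weights = [0.3, 0.3, 0.3]
# Two candidate roundings with the same plain (unweighted) error:
err_a = weighted_error(weights, [0.5, 0.3, 0.3], importance)  # error on a quiet column
err_b = weighted_error(weights, [0.3, 0.5, 0.3], importance)  # error on a busy column
print(err_a < err_b)  # the quantizer would prefer candidate A
```

So the calibration data steers where precision is spent inside each tensor, which is why a different calibration text can give a noticeably different quant.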

yarikdevcom changed discussion status to closed
