How imatrix is generated
Hi, I'm curious what the process for creating an imatrix is. I want to experiment with a different one; any direction is appreciated.
the imatrix file is created by llama-imatrix, and the process is relatively straightforward. you should use gpu acceleration for it, as llama-imatrix will otherwise happily run on your cpu and take practically forever. you need some training data (a calibration text file), too, for example bartowski has a good one: https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8
llama.cpp comes with docs about this, you should find and study them, but the above is basically the process.
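For a rough idea of what the two steps look like, here is a minimal Python sketch that only assembles and prints the command lines. The file names (model-f16.gguf, calibration.txt, imatrix.dat, model-Q4_K_M.gguf) and the helper function names are placeholders I made up, and I'm assuming the llama.cpp binaries are on your PATH. Check the llama.cpp docs for the options your build actually supports.

```python
import shlex


def build_imatrix_cmd(model, calibration, output, gpu_layers=99):
    """Argument list for llama-imatrix (hypothetical helper, not part of llama.cpp).

    -m   model to collect statistics from
    -f   calibration text file
    -o   where to write the imatrix
    -ngl number of layers to offload to the GPU
    """
    return ["llama-imatrix", "-m", model, "-f", calibration,
            "-o", output, "-ngl", str(gpu_layers)]


def build_quantize_cmd(imatrix, src, dst, quant_type="Q4_K_M"):
    """Argument list for llama-quantize using a previously generated imatrix."""
    return ["llama-quantize", "--imatrix", imatrix, src, dst, quant_type]


if __name__ == "__main__":
    # Placeholder file names; substitute your own.
    print(shlex.join(build_imatrix_cmd("model-f16.gguf", "calibration.txt", "imatrix.dat")))
    print(shlex.join(build_quantize_cmd("imatrix.dat", "model-f16.gguf", "model-Q4_K_M.gguf")))
```

To experiment with a different imatrix, swapping the calibration file passed to the first command and re-running both steps should be enough.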
Awesome, thank you for the direction and the link, I will look into it.
Did I get it right: the imatrix is basically used to calibrate which layers to quantize (i.e. to prioritize what is active in the prompts)?
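Roughly, though as far as I understand it the imatrix does not pick which layers get quantized. It records, per tensor, how strongly each input column was activated on the calibration text, and the quantizer then uses those values as weights when it minimizes rounding error. A toy pure-Python sketch of that idea follows. All function names here are made up for illustration and this is not llama.cpp's actual code, which works on blocks of weights and is more involved.

```python
def importance_from_activations(activations):
    """Mean squared activation per input column over all calibration rows."""
    n_cols = len(activations[0])
    sums = [0.0] * n_cols
    for row in activations:
        for j, x in enumerate(row):
            sums[j] += x * x
    return [s / len(activations) for s in sums]


def round_to_grid(weights, scale):
    """Toy quantizer: snap every weight to the nearest multiple of `scale`."""
    return [round(w / scale) * scale for w in weights]


def weighted_error(weights, quantized, importance):
    """Squared rounding error, with each column weighted by its importance."""
    return sum(imp * (w - q) ** 2
               for w, q, imp in zip(weights, quantized, importance))


def best_scale(weights, importance, candidates):
    """Pick the candidate scale with the lowest importance-weighted error."""
    return min(candidates,
               key=lambda s: weighted_error(weights, round_to_grid(weights, s), importance))


if __name__ == "__main__":
    # Column 0 is active on the calibration data, column 1 never is.
    importance = importance_from_activations([[1.0, 0.0], [3.0, 0.0]])
    print(importance)
    print(best_scale([1.0, 0.3], importance, [1.0, 0.3]))
```

The point of the toy: the same weights can end up with a different scale depending on which columns the calibration data exercises, which is why changing the calibration text changes the resulting quant.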
Found it: https://www.reddit.com/r/Oobabooga/comments/1dyu2qg/what_is_the_imatrix_file_extension/. Thanks for responding!