mmproj-Q5_K_M
#1 by matwork403 - opened
Hi
Would you share an mmproj-Q5_K_M or mmproj-Q4_K_M?
I wouldn't know how to create those, or which software would support them. How do you want to use those files?
An mmproj can only be F16 or Q8_0 as far as I'm aware. Generally you don't want to go below Q8_0 for the vision layers, because vision is much more affected by quantisation than the LLM layers. I don't think there is anything technically stopping anyone from quantising the vision stack even further, but to my knowledge it is not implemented in llama.cpp because nobody wants terrible vision. If you are so tight on memory that you can't fit the relatively small Q8_0 vision stack, you probably want to just not provide any mmproj file at all, which disables vision.
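For context, here is a minimal sketch of how the mmproj file comes into play at load time, using the llama-cpp-python bindings. The file names are placeholders, and `Llava15ChatHandler` is just one of the multimodal chat handlers that library ships, so check which handler matches your model family:

```python
# Minimal sketch with llama-cpp-python; paths are placeholders and the
# handler class depends on the vision model family you are running.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# With vision: pass the (typically F16 or Q8_0) mmproj via the chat handler.
chat_handler = Llava15ChatHandler(clip_model_path="mmproj-F16.gguf")
llm = Llama(
    model_path="model-Q4_K_M.gguf",  # the LLM weights can be quantised much harder
    chat_handler=chat_handler,
    n_ctx=4096,
)

# Without vision: just don't provide a chat handler / mmproj file.
# The model loads as text-only and the vision memory cost disappears.
llm_text_only = Llama(model_path="model-Q4_K_M.gguf", n_ctx=4096)
```

The point is that the mmproj stays at F16/Q8_0 while the text weights take the heavy quantisation, and omitting the mmproj entirely is the supported way to reclaim that memory.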