Maybe these need a requant and IQ3_S models?
#1 by Cran-May - opened
As the title says.
IQ3_S is just the right fit for 4GB VRAM devices running 8B models. (IQ3_M is best for 7B models.)
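For anyone who wants to sanity-check that sizing, here is a rough back-of-the-envelope sketch. It assumes the nominal bits-per-weight figures llama.cpp lists for these quant types (about 3.44 bpw for IQ3_S and 3.66 bpw for IQ3_M) and round parameter counts of 8B and 7B. The function name and the parameter counts are only illustrative. Real GGUF files come out somewhat larger because some tensors are kept at higher precision, and the KV cache and compute buffers need VRAM on top of the weights.

```python
# Rough estimate of quantized weight size; not an exact GGUF file size.
# The bpw values are nominal figures for these quant types (assumption),
# and the parameter counts are round numbers used for illustration.

NOMINAL_BPW = {"IQ3_S": 3.44, "IQ3_M": 3.66}


def approx_weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate weight size in GiB for a given bpw."""
    total_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 2**30                    # bytes -> GiB


if __name__ == "__main__":
    vram_gib = 4.0
    for label, n_params, quant in [("8B", 8e9, "IQ3_S"), ("7B", 7e9, "IQ3_M")]:
        size = approx_weights_gib(n_params, NOMINAL_BPW[quant])
        headroom = vram_gib - size  # what is left for KV cache and buffers
        print(f"{label} @ {quant}: ~{size:.2f} GiB weights, "
              f"~{headroom:.2f} GiB headroom on a {vram_gib:.0f} GiB card")
```

By this estimate both combinations land at roughly the same weight size, which is consistent with pairing the slightly smaller quant with the slightly larger model on a 4GB card.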
I'd need to redo these quants with the latest llama.cpp, and if I do, I'll include IQ3_S.
These will be reuploaded once they're redone with the new llama.cpp version.
Lewdiculous changed discussion status to closed