Hi sir, thanks for the excellent work. I am curious about how you quantized the hunyuan model; could you share more details?
You need torch and llama.cpp. You could try converting the safetensors to GGUF and testing it first; simply run: ggc t
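Before converting, it can help to check which tensors and dtypes the safetensors file actually holds. Below is a minimal sketch using only the Python standard library. It relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names `read_safetensors_header` and `summarize` are made up for illustration and are not part of ggc or llama.cpp, and the file path in the usage note is a placeholder.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The next header_len bytes are UTF-8 JSON describing every tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Map tensor name -> (dtype, shape), skipping the optional __metadata__ entry."""
    return {
        name: (info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }
```

For example, `summarize(read_safetensors_header("model.safetensors"))` lists every tensor with its dtype and shape, which gives a quick sanity check that the file is complete before running the conversion.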