This model is not even close to Qwen 30B A3B. Is something wrong with the quant?

#4
by gopi87 - opened

Hi, I just checked this model and wib q8, and neither performs well. Is something wrong with the quant creation, or is the model itself the problem? We also need support for MiniMax and ERNIE-4.5.

Owner

Watch these two threads for updates as everyone seems a little confused about what is going on with Hunyuan-80B-A13B:

I haven't touched MiniMax or ERNIE, as I was focusing on getting Hunyuan through first. There is also the new R1T Chimera now; it's big, but at least it's a supported, working architecture, unlike these new ones.

It also looks like a PR for dots1 was opened on ik's fork last night, fwiw.

Yep, I checked dots and it's pretty good tbh, but llama.cpp performance is pretty low, so I jumped to ik_llama and am waiting for dots support.
