Nice performance
#1
by
Evi1ran
- opened
Good job, nice performance!
Could you please release the full weights? (Not quantized with AWQ int4)
It's a quantization of DavidAU/Qwen2.5-QwQ-35B-Eureka-Cubed-abliterated-uncensored.
I just wanted to test lmdeploy's performance compared to vLLM, so I created this.
Sadly, my GPU with 24 GB of VRAM still can't run this quantization on lmdeploy.
By the way, if you have enough RAM and VRAM for the full weights, could you make GPTQ int4 and AWQ int4 quantizations in HF format?
I've encountered some problems with AutoGPTQ and AutoAWQ that are hard to solve.