RuntimeError: expected mat1 and mat2 to have the same dtype, but got: struct c10::Half != struct c10::BFloat16

#9
by rameshch - opened

  File "C:\ProgramData\anaconda3\envs\llamaenv\lib\site-packages\torch\autograd\function.py", line 575, in apply
    return super().apply(*args, **kwargs) # type: ignore[misc]
  File "C:\ProgramData\anaconda3\envs\llamaenv\lib\site-packages\awq\modules\linear\gemm.py", line 63, in forward
    out = torch.matmul(x, out)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: struct c10::Half != struct c10::BFloat16

Team, any updates?

Note: I am on AWQ 0.2.8, Transformers 4.51.1, Accelerate 1.6.0, PyTorch 2.6.0+cu126, CUDA 12.6, flash attn 2.7.4.post1, and Triton version 3.2.0 Post 10.

This AWQ model works with Transformers 4.49.0 (everything else as listed above). The reason is unknown, but the model fails with the latest Transformers version (4.51.1), which I was using earlier.
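
A quick way to check which side of that line an environment is on is to read the installed Transformers version before loading the model. This is a rough sketch using only the standard library. The helper names (`parse_version`, `report_status`, `installed_transformers_version`) are made up for illustration, and the only data points behind it are the two versions reported in this thread; nothing is known about the versions in between or after.

```python
from importlib import metadata

# The only two data points from this thread.
REPORTED_WORKING = (4, 49, 0)
REPORTED_FAILING = (4, 51, 1)


def parse_version(text):
    """Turn '4.51.1' or '2.6.0+cu126' into a tuple like (4, 51, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at suffixes such as 'dev0'
        parts.append(int(digits))
    return tuple(parts[:3])


def report_status(version):
    """Map a version tuple to what this thread says about it."""
    if version == REPORTED_WORKING:
        return "reported working"
    if version == REPORTED_FAILING:
        return "reported failing"
    return "not tested in this thread"


def installed_transformers_version():
    """Return the installed Transformers version tuple, or None if absent."""
    try:
        return parse_version(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None


current = installed_transformers_version()
if current is None:
    print("transformers is not installed")
else:
    print(current, "->", report_status(current))
```

To move to the version reported as working, pinning with `pip install transformers==4.49.0` should do it, assuming the rest of the environment matches the versions listed above.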
