File "C:\ProgramData\anaconda3\envs\llamaenv\lib\site-packages\torch\autograd\function.py", line 575, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
File "C:\ProgramData\anaconda3\envs\llamaenv\lib\site-packages\awq\modules\linear\gemm.py", line 63, in forward
    out = torch.matmul(x, out)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: struct c10::Half != struct c10::BFloat16
Team, any updates?
Note: I am on AWQ 0.2.8, Transformers 4.51.1, Accelerate 1.6.0, PyTorch 2.6.0+cu126, CUDA 12.6, flash-attn 2.7.4.post1, Triton 3.2.0.
This AWQ model works with transformers 4.49.0 (other versions as above). The reason is unknown, but it fails with the latest transformers version (4.51.1), which I was using earlier.