AssertionError: Both operands must be same dtype. Got fp16 and bf16
I get this error when running the demo sample script:
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 100, in make_ir
return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 102:13:
zeros = tl.interleave(zeros, zeros)
zeros = tl.interleave(zeros, zeros)
zeros = tl.broadcast_to(zeros, (BLOCK_SIZE_K, BLOCK_SIZE_N))
offsets_s = N * offsets_szk[:, None] + offsets_sn[None, :]
masks_sk = offsets_szk < K // group_size
masks_s = masks_sk[:, None] & masks_sn[None, :]
scales_ptrs = scales_ptr + offsets_s
scales = tl.load(scales_ptrs, mask=masks_s)
scales = tl.broadcast_to(scales, (BLOCK_SIZE_K, BLOCK_SIZE_N))
b = (b >> shifts) & 0xF
^
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float16 and triton.language.float16')
Environment: Ubuntu 22.04, transformers installed from latest git, triton==3.2.0, autoawq==0.2.8
The error IncompatibleTypeErrorImpl('invalid operands of type triton.language.float16 and triton.language.float16') is solved by ensuring you use torch_dtype="auto". Don't set it to torch.float16 like autoawq recommends.
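Roughly, the load then looks like this (a minimal sketch; the checkpoint id and device_map are assumptions, adjust to your setup):

from transformers import Qwen2_5_VLForConditionalGeneration

# Let from_pretrained pick the dtype stored in the checkpoint config
# instead of forcing torch.float16 as the AutoAWQ examples do.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-32B-Instruct-AWQ",
    torch_dtype="auto",   # not torch_dtype=torch.float16
    device_map="auto",
)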
But now I get this other error:
Traceback (most recent call last):
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/language/core.py", line 35, in wrapper
return fn(*args, **kwargs)
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/
language/core.py", line 1548, in dot
return semantic.dot(input, other, acc, input_precision, max_num_imprecise_acc, out_dtype, _builder)
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/
language/semantic.py", line 1470, in dot
assert lhs.dtype == rhs.dtype, f"Both operands must be same dtype. Got {lhs.dtype} and {rhs.dtype}"
AssertionError: Both operands must be same dtype. Got fp16 and bf16
The above exception was the direct cause of the following exception:
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 100, in make_ir
return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 108:22:
masks_s = masks_sk[:, None] & masks_sn[None, :]
scales_ptrs = scales_ptr + offsets_s
scales = tl.load(scales_ptrs, mask=masks_s)
scales = tl.broadcast_to(scales, (BLOCK_SIZE_K, BLOCK_SIZE_N))
b = (b >> shifts) & 0xF
zeros = (zeros >> shifts) & 0xF
b = (b - zeros) * scales
b = b.to(c_ptr.type.element_ty)
# Accumulate results.
accumulator = tl.dot(a, b, accumulator, out_dtype=accumulator_dtype)
^
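For what it's worth, the assertion fires inside the AWQ Triton GEMM because the activation tile and the dequantized weight tile reach tl.dot with different dtypes (bf16 vs fp16 here). A rough way to inspect which dtypes your loaded model actually ends up with; the buffer name "scales" is an assumption about AutoAWQ's WQLinear layout and may differ by version:

from transformers import Qwen2_5_VLForConditionalGeneration

# Diagnostic sketch (not a fix): print the dtypes involved. The tl.dot
# assertion above fires when the activation dtype and the dtype of the
# dequantized AWQ weights disagree.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-32B-Instruct-AWQ", torch_dtype="auto", device_map="auto"
)
print("model dtype:", model.dtype)

# Assumption: AutoAWQ's quantized linear layers register their scales as
# buffers; adapt the lookup if your version stores them differently.
for name, buf in model.named_buffers():
    if name.endswith("scales"):
        print(name, buf.dtype)
        break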
A temporary fix was to install this version of transformers:
pip install git+https://github.com/huggingface/transformers.git@8ee50537fe7613b87881cd043a85971c85e99519
Does this solution still work? With the older pinned transformers version (4.50.0.dev0), I get this error when loading the Qwen 2.5 VL model:
Traceback (most recent call last):
File "/home/aiscuser/test32_awq.py", line 71, in <module>
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/modeling_utils.py", line 262, in _wrapper
return func(*args, **kwargs)
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/modeling_utils.py", line 4201, in from_pretrained
hf_quantizer.preprocess_model(
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/quantizers/base.py", line 194, in preprocess_model
return self._process_model_before_weight_loading(model, **kwargs)
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/quantizers/quantizer_awq.py", line 107, in _process_model_before_weight_loading
model, has_been_replaced = replace_with_awq_linear(
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/integrations/awq.py", line 134, in replace_with_awq_linear
from awq.modules.linear.gemm import WQLinear_GEMM
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/awq/__init__.py", line 24, in <module>
from awq.models.auto import AutoAWQForCausalLM
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/awq/models/__init__.py", line 18, in <module>
from .qwen3 import Qwen3AWQForCausalLM
File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/awq/models/qwen3.py", line 4, in <module>
from transformers.models.qwen3.modeling_qwen3 import (
ModuleNotFoundError: No module named 'transformers.models.qwen3'
On the latest transformers version (4.51.3), I get the error you mentioned at accumulator = tl.dot(a, b, accumulator, out_dtype=accumulator_dtype) when I run a script from the shell. When I run the model in a notebook, I get this error instead:
reverse_awq_order_tensor = (
(tl.arange(0, 2) * 4)[None, :] + tl.arange(0, 4)[:, None]
).reshape(8)
# Use this to compute a set of shifts that can be used to unpack and
# reorder the values in iweights and zeros.
shifts = reverse_awq_order_tensor * 4
shifts = tl.broadcast_to(shifts[None, :], (BLOCK_SIZE_Y * BLOCK_SIZE_X, 8))
shifts = tl.reshape(shifts, (BLOCK_SIZE_Y, BLOCK_SIZE_X * 8))
# Unpack and reorder: shift out the correct 4-bit value and mask.
iweights = (iweights >> shifts) & 0xF
^
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float32 and triton.language.float32')
Throughout, I am using torch_dtype="auto".
Anyone have any idea how to fix this?
Please fix this asap
same issue here
return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 56:16:
reverse_awq_order_tensor = (
(tl.arange(0, 2) * 4)[None, :] + tl.arange(0, 4)[:, None]
).reshape(8)
# Use this to compute a set of shifts that can be used to unpack and
# reorder the values in iweights and zeros.
shifts = reverse_awq_order_tensor * 4
shifts = tl.broadcast_to(shifts[None, :], (BLOCK_SIZE_Y * BLOCK_SIZE_X, 8))
shifts = tl.reshape(shifts, (BLOCK_SIZE_Y, BLOCK_SIZE_X * 8))
# Unpack and reorder: shift out the correct 4-bit value and mask.
iweights = (iweights >> shifts) & 0xF
^
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float32 and triton.language.float32')
Please fix this asap
Same issue here
The error
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float16 and triton.language.float16')
is solved by ensuring you use torch_dtype="auto". Don't set it to torch.float16 like autoawq recommends. But now I get this other error:
AssertionError: Both operands must be same dtype. Got fp16 and bf16
at accumulator = tl.dot(a, b, accumulator, out_dtype=accumulator_dtype)
I think this might NOT be a problem with this specific Qwen2.5 model. I tried https://huggingface.co/gaunernst/gemma-3-27b-it-int4-awq but got the exact same error.
Here, my AutoAWQ version is 0.2.9, torch is 2.6.0, and transformers is 4.51.3.
Have you solved it?
Same issue here
same issue
Same issue here
Still an issue
Same issue
Same issue here. I ran pip install -U transformers:
torch==2.7.1+cu128
torchaudio 2.7.1+cu128
torchvision 0.22.1+cu128
Successfully installed transformers-4.53.0
Ubuntu 22.04
That solved it.
I had the same issue with transformers 4.51.3 and torch 2.6.0+cu124.
Then I created another Python venv with:
torch==2.7.1+cu126
transformers==4.53.0
both in WSL Ubuntu 20.04, and it finally works.
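If it helps anyone comparing environments, here is a quick sanity check; the version numbers in the comments are just the combinations reported in this thread:

import torch
import transformers

# Combinations reported in this thread:
#   failing: transformers 4.51.3 + torch 2.6.0
#   working: transformers 4.53.0 + torch 2.7.1 (cu126 / cu128)
print("torch       :", torch.__version__)
print("transformers:", transformers.__version__)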