AssertionError: Both operands must be same dtype. Got fp16 and bf16

#8 opened by treehugg3

I get this error when running the demo script:

  File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 100, in make_ir
    return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 102:13:
        zeros = tl.interleave(zeros, zeros)
        zeros = tl.interleave(zeros, zeros)
        zeros = tl.broadcast_to(zeros, (BLOCK_SIZE_K, BLOCK_SIZE_N))

        offsets_s = N * offsets_szk[:, None] + offsets_sn[None, :]
        masks_sk = offsets_szk < K // group_size
        masks_s = masks_sk[:, None] & masks_sn[None, :]
        scales_ptrs = scales_ptr + offsets_s
        scales = tl.load(scales_ptrs, mask=masks_s)
        scales = tl.broadcast_to(scales, (BLOCK_SIZE_K, BLOCK_SIZE_N))

        b = (b >> shifts) & 0xF
             ^
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float16 and triton.language.float16')

Ubuntu 22.04, latest transformers from git, triton==3.2.0, autoawq==0.2.8.

The IncompatibleTypeErrorImpl('invalid operands of type triton.language.float16 and triton.language.float16') error is solved by making sure you load the model with torch_dtype="auto". Don't set it to torch.float16 as autoawq recommends.
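
For reference, this is roughly how I'm loading it now (the repo id and device_map are just what my script happens to use; the only part that matters here is the dtype argument):

from transformers import Qwen2_5_VLForConditionalGeneration

# torch_dtype="auto" lets the checkpoint's own config pick the dtype;
# passing torch.float16 explicitly is what triggered the Triton error above.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-32B-Instruct-AWQ",  # or your local path
    torch_dtype="auto",
    device_map="auto",
)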

But now I get this other error:

Traceback (most recent call last):                                                                         
  File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/language/core.py", line 35, in wrapper                                                                     
    return fn(*args, **kwargs)                                                                             
  File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/
language/core.py", line 1548, in dot                                                                       
    return semantic.dot(input, other, acc, input_precision, max_num_imprecise_acc, out_dtype, _builder)    
  File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/
language/semantic.py", line 1470, in dot                                                                   
    assert lhs.dtype == rhs.dtype, f"Both operands must be same dtype. Got {lhs.dtype} and {rhs.dtype}"    
AssertionError: Both operands must be same dtype. Got fp16 and bf16                                        
                                                                                                           
The above exception was the direct cause of the following exception:                                       

  File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 100, in make_ir
    return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 108:22:
        masks_s = masks_sk[:, None] & masks_sn[None, :]
        scales_ptrs = scales_ptr + offsets_s
        scales = tl.load(scales_ptrs, mask=masks_s)
        scales = tl.broadcast_to(scales, (BLOCK_SIZE_K, BLOCK_SIZE_N))

        b = (b >> shifts) & 0xF
        zeros = (zeros >> shifts) & 0xF
        b = (b - zeros) * scales
        b = b.to(c_ptr.type.element_ty)

        # Accumulate results.
        accumulator = tl.dot(a, b, accumulator, out_dtype=accumulator_dtype)
                      ^
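
For context, that assertion is just the general rule that both operands of a matmul must share a dtype; here the a tile and the dequantized b tile passed to tl.dot are ending up as bf16 and fp16 (or vice versa). A tiny plain-PyTorch illustration of the same restriction, nothing AWQ-specific:

import torch

a = torch.randn(8, 8, dtype=torch.float16)
b = torch.randn(8, 8, dtype=torch.bfloat16)
try:
    a @ b
except RuntimeError as err:
    print(err)  # dtype-mismatch error, analogous to the tl.dot assertion above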
treehugg3 changed discussion title from invalid operands of type triton.language.float16 and triton.language.float16 to AssertionError: Both operands must be same dtype. Got fp16 and bf16

A temporary fix was to pin transformers to this commit:

pip install git+https://github.com/huggingface/transformers.git@8ee50537fe7613b87881cd043a85971c85e99519

Thanks to https://github.com/Deep-Agent/R1-V/issues/105
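
If you try the pin, it's worth sanity-checking that the environment actually picked up that build before re-running (just a quick check, nothing model-specific):

import transformers
print(transformers.__version__)  # the pinned commit reports 4.50.0.dev0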

Does this solution still work? With the old transformers version (4.50.0.dev0), loading Qwen 2.5 VL fails inside from_pretrained when autoawq is imported:

Traceback (most recent call last):
  File "/home/aiscuser/test32_awq.py", line 71, in <module>
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/modeling_utils.py", line 262, in _wrapper
    return func(*args, **kwargs)
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/modeling_utils.py", line 4201, in from_pretrained
    hf_quantizer.preprocess_model(
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/quantizers/base.py", line 194, in preprocess_model
    return self._process_model_before_weight_loading(model, **kwargs)
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/quantizers/quantizer_awq.py", line 107, in _process_model_before_weight_loading
    model, has_been_replaced = replace_with_awq_linear(
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/transformers/integrations/awq.py", line 134, in replace_with_awq_linear
    from awq.modules.linear.gemm import WQLinear_GEMM
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/awq/__init__.py", line 24, in <module>
    from awq.models.auto import AutoAWQForCausalLM
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/awq/models/__init__.py", line 18, in <module>
    from .qwen3 import Qwen3AWQForCausalLM
  File "/home/aiscuser/.conda/envs/slm_ot/lib/python3.9/site-packages/awq/models/qwen3.py", line 4, in <module>
    from transformers.models.qwen3.modeling_qwen3 import (
ModuleNotFoundError: No module named 'transformers.models.qwen3'
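
Presumably the clash is right there in the traceback: this autoawq build imports transformers.models.qwen3 at import time, and the pinned 4.50.0.dev0 build doesn't ship that module. A quick way to check what your install provides (just a sketch):

import importlib.util

# None means the installed transformers has no qwen3 module, so importing awq fails as above.
print(importlib.util.find_spec("transformers.models.qwen3"))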

On the latest transformers version (4.51.3), I get the error you mentioned at accumulator = tl.dot(a, b, accumulator, out_dtype=accumulator_dtype) when I run the script from the shell. When I run the model in a notebook, I get this error instead:

    reverse_awq_order_tensor = (
        (tl.arange(0, 2) * 4)[None, :] + tl.arange(0, 4)[:, None]
    ).reshape(8)

    # Use this to compute a set of shifts that can be used to unpack and
    # reorder the values in iweights and zeros.
    shifts = reverse_awq_order_tensor * 4
    shifts = tl.broadcast_to(shifts[None, :], (BLOCK_SIZE_Y * BLOCK_SIZE_X, 8))
    shifts = tl.reshape(shifts, (BLOCK_SIZE_Y, BLOCK_SIZE_X * 8))

    # Unpack and reorder: shift out the correct 4-bit value and mask.
    iweights = (iweights >> shifts) & 0xF
                ^
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float32 and triton.language.float32')

Throughout all of this, I am using torch_dtype="auto".

Anyone have any idea how to fix this?

Please fix this asap

same issue here

    return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 56:16:
    reverse_awq_order_tensor = (
        (tl.arange(0, 2) * 4)[None, :] + tl.arange(0, 4)[:, None]
    ).reshape(8)

    # Use this to compute a set of shifts that can be used to unpack and
    # reorder the values in iweights and zeros.
    shifts = reverse_awq_order_tensor * 4
    shifts = tl.broadcast_to(shifts[None, :], (BLOCK_SIZE_Y * BLOCK_SIZE_X, 8))
    shifts = tl.reshape(shifts, (BLOCK_SIZE_Y, BLOCK_SIZE_X * 8))

    # Unpack and reorder: shift out the correct 4-bit value and mask.
    iweights = (iweights >> shifts) & 0xF
                ^
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float32 and triton.language.float32')

Please fix this asap

Same issue here
