Any explanation for this bug and what it causes?
#1 by manu - opened
What are we doing by correcting this here?
Thanks
Fix a bug in transformers/models/qwen2_vl/modeling_qwen2_vl.py around line 1774
position_ids = position_ids.unsqueeze(0).expand(3, -1, -1)
# make sure the following three lines are inside the 'else' statement
if cache_position[0] != 0:
    pixel_values = None
    pixel_values_videos = None
I mean, I see that it breaks otherwise, but why do you even need prepare_inputs_for_generation in your case, where everything is a single forward pass?
if cache_position[0] != 0:
The bug here is that cache_position is None at this point, so the if statement needs to be something like if cache_position is not None and cache_position[0] != 0:
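So, if I understand correctly, a minimal sketch of the guarded version (same variable names as the snippet above) would be:

```python
# Only clear the vision inputs once the cache is already primed (decode steps).
# Guard against cache_position being None, which happens when this code path is
# reached outside of generate(), e.g. on a plain forward pass.
if cache_position is not None and cache_position[0] != 0:
    pixel_values = None
    pixel_values_videos = None
```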
I tried to drop prepare_inputs_for_generation and run forward() directly, but it seems the input is not valid.
prepare_inputs_for_generation seems to have some steps that deal with the shape of the pixel value inputs.
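For reference, here is roughly the kind of direct call I was attempting (just a sketch assuming the standard Qwen2-VL processor outputs; the checkpoint name, image path and prompt are placeholders, not exactly what I ran):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Placeholder checkpoint and image path for illustration.
model_id = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},
    ]}
]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

# Single prefill forward pass, passing the vision inputs directly.
# generate() would normally route this through prepare_inputs_for_generation,
# which expands position_ids to the 3D mrope layout and drops pixel_values on
# later decode steps; calling forward() directly skips that preparation, which
# is presumably why the input shapes end up "not valid".
with torch.no_grad():
    outputs = model(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        pixel_values=inputs["pixel_values"],
        image_grid_thw=inputs["image_grid_thw"],
    )
logits = outputs.logits  # (batch, seq_len, vocab)
```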
Yeah, I guess the rope deltas are what's making it break for multi-GPU...
Planning on opening an issue on HF for this?