Spaces: Running on Zero
[BUG] An error is reported immediately, with no further information.
Hi @Paper99. Hmm, it's weird. When I duplicated the Space and ran it on ZeroGPU, I got the following error. Does it ring a bell?
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 216, in thread_wrapper
    res = future.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/user/app/app.py", line 76, in generate_image
    images = pipe(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/app/pipeline.py", line 420, in __call__
    noise_pred = self.unet(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 1112, in forward
    sample, res_samples = downsample_block(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 1160, in forward
    hidden_states = attn(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/transformer_2d.py", line 392, in forward
    hidden_states = block(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/attention.py", line 329, in forward
    attn_output = self.attn1(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 527, in forward
    return self.processor(
  File "/usr/local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1239, in __call__
    query = attn.to_q(hidden_states, *args)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 495, in forward
    result = self.base_layer(x, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
```
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 495, in call_prediction
    output = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 177, in gradio_handler
    raise res.value
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
```
The code doesn't seem to have been updated recently, and the Space had been working fine before, so it's odd; maybe it's due to updates in dependency libraries?
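
For context, the RuntimeError at the bottom of both tracebacks is PyTorch's generic device-mismatch error: somewhere in the forward pass, an activation on cuda:0 reaches a linear layer whose weight is still on the CPU. A minimal sketch, unrelated to this Space's actual code, that triggers the same failure:

```python
import torch
import torch.nn as nn

layer = nn.Linear(8, 8)               # weight and bias are created on the CPU
x = torch.randn(1, 8, device="cuda")  # input lives on cuda:0

# F.linear(input, weight, bias) inside nn.Linear then mixes cuda:0 and cpu
# tensors and raises:
# RuntimeError: Expected all tensors to be on the same device, ...
layer(x)
```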
Thanks for your advice and help! It's working now. But what's strange to me is that I had already moved the model to the device on this line of code.
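
For readers following along, the pattern being described is the usual ZeroGPU setup; here is a hedged sketch (the model id and function body are illustrative, not this Space's actual code). The pipeline is moved to CUDA once at startup, and the handler is decorated with @spaces.GPU so ZeroGPU attaches a GPU for each call:

```python
import spaces
import torch
from diffusers import StableDiffusionXLPipeline

# Load once at startup and move the whole pipeline to the GPU.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.to("cuda")  # the "moved the device" step mentioned above

@spaces.GPU  # ZeroGPU attaches a GPU for the duration of each call
def generate_image(prompt: str):
    return pipe(prompt).images[0]
```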
This seems to be caused by the lack of automatic device conversion for the LoRA weights when calling fuse_lora. It may require the attention of the diffusers maintainers.
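
If that diagnosis is right, the failure mode and workaround would look roughly like this. A hedged sketch; the model and LoRA repo ids are placeholders, not taken from this thread:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Placeholder repo id; the adapter tensors may be loaded onto the CPU.
pipe.load_lora_weights("some-user/some-lora")
pipe.fuse_lora()

# Workaround: move the pipeline to the GPU again after fusing so that no
# layer is left holding CPU weights.
pipe.to("cuda")
```

Until the library moves the adapter weights itself, the extra .to("cuda") after fuse_lora() keeps every layer on one device.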
Yeah, I guessed there was some change in the implementation of .fuse_lora() and tried that. Maybe @sayakpaul or @YiYiXu might know something.
Can we maybe try the most recent versions of peft and diffusers (e.g., pip install -U peft diffusers) and see if this persists?
OK, I'll give it a try and provide feedback.
After upgrading to the latest versions of peft and diffusers, the bug is fixed. Thanks a lot!