Getting error in image generation

#2 opened by ysharma (HF staff)

Congratulations @MoonQiu on the app release, the UI looks slick!

I am unable to generate an image at the moment and am getting the error below. Any help?

```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 541, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1928, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1514, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2505, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 1005, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 833, in wrapper
    response = f(*args, **kwargs)
  File "/home/user/app/app.py", line 43, in infer
    result = infer_gpu_part(pipe, seed, prompt, negative_prompt, ddim_steps, guidance_scale, resolutions_list, fast_mode, cosine_scale, disable_freeu)
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 208, in gradio_handler
    res = worker.res_queue.get()
  File "/usr/local/lib/python3.10/multiprocessing/queues.py", line 367, in get
    return _ForkingPickler.loads(res)
TypeError: StableDiffusionXLPipelineOutput.__init__() missing 1 required positional argument: 'images'
```

Hi, thanks for your interest. This demo is still under testing. I think the problem may be caused by GPU switching on ZeroGPU, but I need more time to check it due to limited GPU quota.
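For reference, a tentative sketch of one possible workaround, assuming the crash comes from unpickling the StableDiffusionXLPipelineOutput dataclass when the result is sent back from the ZeroGPU worker (the signature mirrors the traceback; the body and keyword mapping are illustrative assumptions, not the app's actual code):

```python
import spaces

@spaces.GPU
def infer_gpu_part(pipe, seed, prompt, negative_prompt, ddim_steps,
                   guidance_scale, resolutions_list, fast_mode,
                   cosine_scale, disable_freeu):
    # Illustrative call; the real argument wiring is app-specific.
    output = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=ddim_steps,
        guidance_scale=guidance_scale,
    )
    # Return plain PIL images rather than the StableDiffusionXLPipelineOutput
    # dataclass, so the result pickles cleanly across the worker boundary.
    return output.images
```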

Sure, let us know here if you need any help with these issues. cc: @hysts and @akhaliq for visibility.

Thanks. I think it works well now.

MoonQiu changed discussion status to closed
MoonQiu changed discussion status to open

@ysharma @hysts Hi, do you have any suggestions for speeding up the .cuda() operation? Since ZeroGPU needs to load the model onto the GPU for each inference, it is really time-consuming. After switching to the SDXL-Turbo checkpoint, inference for a 2048x2048 image takes around 10s once the model is loaded, but loading the model and moving it to the GPU takes another 40s.

@MoonQiu
You can call pipe.to("cuda") outside of the functions decorated with @spaces.GPU. On ZeroGPU, CUDA is only available inside decorated functions, but ZeroGPU remembers that .to("cuda") was called and automatically moves the model to the GPU when a decorated function runs. For example, you might want to take a look at this: https://huggingface.co/spaces/black-forest-labs/FLUX.1-dev/blob/2f733451dcd2c6690953bf03ced2b9d89e6546f3/app.py#L11-L15
When the function is executed for the first time, there is some overhead for loading the model, but the model stays on the GPU for a while after the function finishes, so subsequent executions are faster. (The model is offloaded from the GPU again after a certain amount of time has passed since the last execution.)
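A minimal sketch of that pattern, assuming an SDXL-Turbo pipeline as in your comment (the model id, step count, and function body are illustrative, not the app's actual code):

```python
import spaces
import torch
from diffusers import DiffusionPipeline

# Load once at startup. Calling .to("cuda") here, outside any @spaces.GPU
# function, lets ZeroGPU record the move and apply it automatically whenever
# a decorated function is scheduled onto a GPU.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16
).to("cuda")

@spaces.GPU
def infer(prompt: str, seed: int = 0):
    generator = torch.Generator().manual_seed(seed)
    # SDXL-Turbo is typically run with few steps and guidance disabled.
    return pipe(prompt=prompt, num_inference_steps=4, guidance_scale=0.0,
                generator=generator).images[0]
```

After the first call pays the load cost, subsequent calls reuse the model already on the GPU until ZeroGPU offloads it again.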

Thanks. This saves a lot of time.

MoonQiu changed discussion status to closed
