Results are only good when image size is 768x768
That is because this model was trained to perform best at 768x768. If you want to use 512x512, I suggest you use the base 512 model. (https://huggingface.co/stabilityai/stable-diffusion-2-base/blob/main/512-base-ema.ckpt)
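A minimal sketch of that advice, assuming you load the models from the Hugging Face Hub: pick the checkpoint whose training resolution is closest to your target size. `pick_checkpoint` is a hypothetical helper written for this thread, not part of any library.

```python
# Hypothetical helper: map a requested resolution to the SD2 checkpoint
# trained closest to it. The 768 model was trained at 768x768, the base
# model at 512x512 (per the link above).
CHECKPOINTS = {
    768: "stabilityai/stable-diffusion-2",       # 768-v-ema.ckpt
    512: "stabilityai/stable-diffusion-2-base",  # 512-base-ema.ckpt
}

def pick_checkpoint(width: int, height: int) -> str:
    """Return the Hub repo whose training resolution best matches the request."""
    target = max(width, height)
    best = min(CHECKPOINTS, key=lambda res: abs(res - target))
    return CHECKPOINTS[best]

print(pick_checkpoint(768, 768))  # stabilityai/stable-diffusion-2
print(pick_checkpoint(512, 512))  # stabilityai/stable-diffusion-2-base
```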
@Ayitsmatt Thanks, I know that. But what about 512x768, or any other aspect ratio? All the results I could find under the "stablediffusion2" hashtag on Twitter were 768x768, which made me feel the model might not be flexible enough for other sizes. I just wanted to confirm that.
@Ayitsmatt Thanks for your answer.
I don't know what you were using to interface with it, but I noticed that the Colabs that worked best with 1.4 and 1.5 all seemed to have some custom scripts for dealing with images over 512x512, for getting the AI to draw over its training set without starting a totally new image (which it still did, sometimes). I don't think anyone has done that for the new model yet. That's my working theory, anyway.
@maxspire
Thanks for pointing that out. What confused me is that 1.4 and 1.5 are much more flexible about image size: if you feed img2img an irregular input size, they usually produce fairly reasonable results. But with SD 2.0 that flexibility seems to have disappeared, and it does not work well when the input size is not 768x768. That was my question.
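One common workaround for irregular img2img sizes, sketched here as a hypothetical helper (`snap_to_training_size` is not a library function): scale the input so its longer side matches the 768 training resolution, then snap both dimensions to multiples of 8, since the VAE downsamples by a factor of 8 and the pipelines require width and height divisible by 8.

```python
# Hedged sketch: resize an arbitrary input toward the 768x768 training
# resolution while keeping both dimensions divisible by 8.
def snap_to_training_size(width: int, height: int, base: int = 768) -> tuple[int, int]:
    """Scale so the longer side equals `base`, rounding each side to a multiple of 8."""
    scale = base / max(width, height)
    w = max(8, round(width * scale / 8) * 8)
    h = max(8, round(height * scale / 8) * 8)
    return w, h

print(snap_to_training_size(1000, 600))  # -> (768, 464)
print(snap_to_training_size(512, 768))   # -> (512, 768)
```

This does not fix the model's bias toward square 768x768 outputs, but it at least keeps the dimensions valid and close to the training scale.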