Results are only good when image size is 768x768
That is because this model was trained to perform best at 768x768. If you want to use 512x512, I suggest you use the base 512 model. (https://huggingface.co/stabilityai/stable-diffusion-2-base/blob/main/512-base-ema.ckpt)
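A minimal sketch of that advice, assuming you load the models from the Hugging Face Hub: pick the checkpoint whose training resolution is closest to your target size. `pick_checkpoint` is a hypothetical helper written for this thread, not part of any library.

```python
# Hypothetical helper: map a requested resolution to the SD2 checkpoint
# trained closest to it. The 768 model was trained at 768x768, the base
# model at 512x512 (per the link above).
CHECKPOINTS = {
    768: "stabilityai/stable-diffusion-2",       # 768-v-ema.ckpt
    512: "stabilityai/stable-diffusion-2-base",  # 512-base-ema.ckpt
}

def pick_checkpoint(width: int, height: int) -> str:
    """Return the Hub repo whose training resolution best matches the request."""
    target = max(width, height)
    best = min(CHECKPOINTS, key=lambda res: abs(res - target))
    return CHECKPOINTS[best]

print(pick_checkpoint(768, 768))  # stabilityai/stable-diffusion-2
print(pick_checkpoint(512, 512))  # stabilityai/stable-diffusion-2-base
```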
@Ayitsmatt Thanks, I know that. But what about 512x768, or any other aspect ratio? All the results I could find under the "stablediffusion2" hashtag on Twitter were 768x768, which made me feel the model might not be flexible enough for other sizes. I just wanted to confirm that.
@Ayitsmatt Thanks for your answer.
I don't know what you were using to interface with it, but I noticed that the Colabs that worked best with 1.4 and 1.5 all seemed to have some custom scripts for dealing with images over 512x512, for getting the AI to draw over its training set without starting a totally new image (which it still did, sometimes). I don't think anyone has done that for the new model yet. That's my working theory, anyway.
@maxspire
Thanks for pointing that out. What confused me is that 1.4 and 1.5 are much more flexible about image size: if you feed img2img an irregular input size, they usually produce fairly reasonable results. But with SD 2.0 that flexibility seems to have disappeared, and it does not work well when the input size is not 768x768. That was my question.
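One common workaround for irregular img2img sizes, sketched here as a hypothetical helper (`snap_to_training_size` is not a library function): scale the input so its longer side matches the 768 training resolution, then snap both dimensions to multiples of 8, since the VAE downsamples by a factor of 8 and the pipelines require width and height divisible by 8.

```python
# Hedged sketch: resize an arbitrary input toward the 768x768 training
# resolution while keeping both dimensions divisible by 8.
def snap_to_training_size(width: int, height: int, base: int = 768) -> tuple[int, int]:
    """Scale so the longer side equals `base`, rounding each side to a multiple of 8."""
    scale = base / max(width, height)
    w = max(8, round(width * scale / 8) * 8)
    h = max(8, round(height * scale / 8) * 8)
    return w, h

print(snap_to_training_size(1000, 600))  # -> (768, 464)
print(snap_to_training_size(512, 768))   # -> (512, 768)
```

This does not fix the model's bias toward square 768x768 outputs, but it at least keeps the dimensions valid and close to the training scale.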