Could you give a concrete example of informative prior initialization (IPI) for one-step generation with LoRA? Specifically, how do I generate a latent for the informative initialization?
The README already has a demo of IPI.
I think it is like the SDXL mechanism: take the latent from the base model and feed it to the refiner model. But I got a WORSE result in the end.
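For reference, this is the base-to-refiner handoff I have in mind, as documented for SDXL in diffusers (the model IDs and the 0.8 split are the standard example values, nothing YOSO-specific):

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16
).to("cuda")

# the base model denoises the first 80% of the trajectory and returns a latent
latent = base(prompt="a motorcycle", denoising_end=0.8, output_type="latent").images
# the refiner continues from that latent for the remaining 20%
image = refiner(prompt="a motorcycle", denoising_start=0.8, image=latent).images[0]
```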
Here is how I got my latent:
```python
image = pipeline_yoso_lora(
    prompt=prompt,
    num_inference_steps=steps,
    num_images_per_prompt=1,
    generator=torch.Generator(device="cuda").manual_seed(seed),
    guidance_scale=cfg,
    negative_prompt=negative_prompt,
    output_type="latent",
).images

bs = 1
latents = image  # maybe some latent codes of real images or an SD generation
latent_mean = latents.mean(dim=0)
noise = torch.randn([1, bs, 64, 64])
noise = noise.to("cuda")
# sample timesteps uniformly over the full training range
timesteps = torch.randint(
    0, pipeline_yoso_lora.scheduler.config.num_train_timesteps,
    (bs,), device=latents.device,
)
timesteps = timesteps.long()
input_latent = pipeline_yoso_lora.scheduler.add_noise(
    latent_mean.repeat(bs, 1, 1, 1), noise, timesteps
)
input_latent = input_latent.to(torch.float16)
```
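As I understand it, the `add_noise` call above just applies the forward-diffusion marginal; a sketch of the equivalent computation, assuming a DDPM-style diffusers scheduler:

```python
# What scheduler.add_noise computes for DDPM-style schedulers, spelled out:
#   x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * noise
alphas_cumprod = pipeline_yoso_lora.scheduler.alphas_cumprod.to(latents.device)
a_bar = alphas_cumprod[timesteps].view(-1, 1, 1, 1)
x_t = a_bar.sqrt() * latent_mean.repeat(bs, 1, 1, 1) + (1 - a_bar).sqrt() * noise
```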
This code was based on the old version of the README. I found a bug in the old version and updated the README today; please try again following the new version.
Also, there is a serious bug in your code: you sample timesteps uniformly in [0, 1000), which definitely produces bad results. Please check your own code before raising questions.
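To see why this matters, look at how the weight on the informative latent decays with the timestep (a sketch, assuming SD's usual scaled_linear training schedule):

```python
from diffusers import DDPMScheduler

sched = DDPMScheduler(num_train_timesteps=1000, beta_start=0.00085,
                      beta_end=0.012, beta_schedule="scaled_linear")
for t in [0, 250, 500, 750, 999]:
    w = sched.alphas_cumprod[t].sqrt().item()
    print(f"t={t:4d}  weight on the informative latent = {w:.4f}")
```

A one-step model expects inputs noised at the terminal timestep, so a latent noised at a random smaller t is far off-distribution.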
Your code can be updated as follows:
```python
image = pipeline_yoso_lora(
    prompt=prompt,
    num_inference_steps=steps,
    num_images_per_prompt=1,
    generator=torch.Generator(device="cuda").manual_seed(seed),
    guidance_scale=cfg,
    negative_prompt=negative_prompt,
    output_type="latent",
).images

bs = 1
latents = image  # maybe some latent codes of real images or an SD generation
latent_mean = latents.mean(dim=0)
noise = torch.randn([bs, 1, 64, 64])
noise = noise.to("cuda")
# use the terminal timestep (999) instead of a random one
timesteps = torch.ones(bs).to(latents.device) * 999
timesteps = timesteps.long()
# perturb the mean with the estimated std before noising
init_latent = latent_mean.repeat(bs, 1, 1, 1) + latents.std() * torch.randn_like(noise)
input_latent = pipeline_yoso_lora.scheduler.add_noise(init_latent, noise, timesteps)
input_latent = input_latent.to(torch.float16)
```
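The resulting `input_latent` then goes into the one-step call; a sketch, assuming `pipeline_yoso_lora` accepts the standard diffusers `latents` argument:

```python
image = pipeline_yoso_lora(
    prompt=prompt,
    num_inference_steps=1,
    num_images_per_prompt=bs,
    guidance_scale=cfg,
    negative_prompt=negative_prompt,
    latents=input_latent,  # informative initialization instead of pure noise
).images[0]
```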
However, I followed your fixed code and still got WORSE results. Here is an example:
prompt: a motorcycle
I am curious how to correctly use an SD-generated latent for IPI in one-step generation.
There is still a bug in the code; you should fix it. Moreover, we use the training set to compute the mean and variance; it is suboptimal to estimate them from generated samples.
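As an illustration of what computing the statistics from the training set could look like (a hypothetical sketch; `train_images` is a placeholder batch of training images preprocessed to [-1, 1]):

```python
import torch

@torch.no_grad()
def estimate_latent_stats(pipe, train_images):
    # encode a batch of training images into the latent space
    z = pipe.vae.encode(train_images.to(pipe.device, pipe.dtype)).latent_dist.sample()
    z = z * pipe.vae.config.scaling_factor
    return z.mean(dim=0), z.std()

latent_mean, latent_std = estimate_latent_stats(pipeline_yoso_lora, train_images)
```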
Considering the potential complexity of using IPI for the community, we may release a version that distills from IPI soon.